SOAP question

Alphathree33

Platinum Member
Dec 1, 2000
2,419
0
0
Suppose I have a SOAP request that has:

<soap:sometag> some content </soap:sometag>

Now obviously I can't have any raw < or > characters in "some content", so I have a function that finds them and replaces them with the appropriate escape sequences.

Unfortunately, the server is still barfing at a rather complex content string and I can't figure out which character(s) are causing it.

(Normal strings like <soap:sometag>Hi there, how are you?</soap:sometag> are working fine.)

Thoughts?
 

DaveSimmons

Elite Member
Aug 12, 2001
40,730
670
126
Many SOAP connections require UTF-8 character encoding instead of ASCII. This requires sending characters above 127 as multi-byte sequences (two to four bytes each; Google for how), or you can use numeric character references like &#xxx; in some cases.

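Servers also can't accept most of the ASCII control characters 0-31 as-is (XML 1.0 allows only \t, \n and \r), so check for anything below 32 other than those three. The "smart quotes" that Word uses are a related trap: they sit above 127 (145-148 in Windows-1252), so they need proper UTF-8 encoding too.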
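
A minimal sketch of both checks in JavaScript (the language used later in this thread). The names `findSuspectChars` and `toNumericRefs` are made up for illustration, and whether a given server accepts numeric character references is an assumption to verify against that server.

```javascript
// Hypothetical helper: list characters that commonly trip up XML/SOAP parsers.
// Flags control characters below 32 (except tab, LF, CR) and anything above 127.
function findSuspectChars(text) {
  const suspects = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    const isAllowedControl = code === 0x09 || code === 0x0a || code === 0x0d;
    if ((code < 0x20 && !isAllowedControl) || code > 0x7f) {
      suspects.push({ index: i, code: code });
    }
  }
  return suspects;
}

// Hypothetical helper: replace every non-ASCII character with a decimal
// numeric character reference such as &#233;
function toNumericRefs(text) {
  let out = "";
  for (const ch of text) { // for...of walks the string by code point
    const cp = ch.codePointAt(0);
    out += cp > 0x7f ? "&#" + cp + ";" : ch;
  }
  return out;
}
```

Running `findSuspectChars` on the failing string first narrows down whether the problem is an encoding issue at all before any escaping logic is changed.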
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
What language are you working in? Are there no usable SOAP clients already available to you?

But, can you post one of your complex strings for inspection? Aside from <'s and >'s you will also have to escape &'s, and quotes too if the text ends up inside an attribute value. But, a decent XML library should be doing this for you. Are you formulating your SOAP messages as text?
 

Alphathree33

I'm working in JavaScript =)

Hang on, I'll get the string... it's nasty.
 

Alphathree33

This is it. It's what MSN gives me when I copy from a random conversation I was having.

<br>
<script></script><!--
D(["mb","The "They" says:<br /-->The dev env idea reminds me of an idea we\'d brought up before; having<br>a development environment that facilitates extreme programming by<br><script></script><!--
D(["mb","Keymaker says:<br /-->y<br>well, i think what i am gonna do now is sleep on it and see if i have<br>any questions<br>The "They" says:<br>Sounds good.<br>Keymaker says:<br>sounds good<br>Tom says:<br>gooood night<br>Keymaker says:<br>do we want to schedule another meeting?<br>Keymaker says:<br>or leave it<br>Keymaker says:<br>for nwo<br>Tom says:<br>leave it fornow<br>Tom says:<br>we\'ve always got email<br>The "They" says:<br>The next thing to do is spec out the design, which I don\'t think is<br>
 

DaveSimmons

Apply binary search to find the offending part of the string.

And double-check the output of your <> fixing function somehow. An alert box?
 

Alphathree33

I have already checked the output of my fixing function with an alert box... it does indeed work correctly. :)

Please elaborate on what you mean by "apply binary search to find offending part of the string"... I assume you mean binary as in "at the binary level" and not the search technique of sorting and dividing in half. :)

But the whole problem is that I don't know what values to search/replace.
 

DaveSimmons

No, an ancient debugging trick for finding the "bad part" of anything is to cut it in half and try the first half.

If the first half fails, divide it in two and repeat.

Otherwise try the second half, and when it fails, divide that in half and repeat

... and so on, just like how binary search finds a match in a sorted list.
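
The halving procedure above can be sketched in JavaScript roughly as follows. `fails` is a hypothetical stand-in for "send this text to the server and report whether it was rejected"; a real check would be asynchronous. The sketch also assumes that a single character is enough to trigger the failure, so that any substring containing it fails too.

```javascript
// Hypothetical helper: find the index of the first character that makes
// `fails(text)` return true, by repeatedly halving the failing range.
// `fails` is an assumed predicate supplied by the caller.
function findOffendingIndex(text, fails) {
  if (!fails(text)) {
    return -1; // nothing to find
  }
  let lo = 0;
  let hi = text.length; // invariant: text.slice(lo, hi) fails
  while (hi - lo > 1) {
    const mid = lo + Math.floor((hi - lo) / 2);
    if (fails(text.slice(lo, mid))) {
      hi = mid; // problem is in the first half
    } else {
      lo = mid; // first half is clean, so it must be in the second half
    }
  }
  return lo;
}

// Stand-in predicate for demonstration only: pretend the server chokes on "&".
const pretendServerRejects = (s) => s.includes("&");
const sample = "we've always got email & chat";
const badIndex = findOffendingIndex(sample, pretendServerRejects);
```

Each round halves the range, so the number of requests grows with the logarithm of the string length rather than with the length itself.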
 

Alphathree33

Excellent (and extremely painstaking) suggestion. The offending character, as it turns out, is:

&

How might I escape that in a SOAP request?
 

DaveSimmons

Try &amp; since that's the standard escape for it.

You might also be able to use the "CDATA" wrapper around text blocks; Google for details.
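
A minimal sketch of both options in JavaScript, with made-up function names. `escapeXml` covers the five predefined XML entities. `wrapCdata` also has to deal with the one sequence a CDATA section cannot contain, `]]>`, which it handles here by splitting the text across two sections.

```javascript
// Hypothetical helper: escape text for use as XML element content or
// attribute values. "&" is replaced first so the "&" in the entities
// produced by the later replacements is not escaped a second time.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: wrap text in a CDATA section. Any "]]>" inside the
// text would end the section early, so it is split into "]]" and ">" with
// a new section started in between.
function wrapCdata(text) {
  return "<![CDATA[" + text.replace(/\]\]>/g, "]]]]><![CDATA[>") + "]]>";
}

const escapedExample = escapeXml('Tom & "They" <br>');
const cdataExample = wrapCdata("a]]>b");
```

Note that CDATA only removes the need to escape & and <; it does nothing about disallowed control characters or character encoding.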
 

Alphathree33

Okay I've got the <![CDATA[ working on things I send to the server, but I'm wondering how to get a similar effect on the things returned from the server.

I'm sending HTML files upstream inside <![CDATA[ tags. These HTML files may themselves, for example, be tutorials on how to use HTML.

So I might have

<P> &lt;img&gt; tags are cool! </P>

Now this sends to the server fine, but when it comes back to me, the server escapes all of the actual < and > characters, leaving me unable to tell which ones were intended to be escaped and which ones were not.
 

DaveSimmons

The message from the server should be 100% escaped, so a simple one-pass replacement of &amp; and &lt; and &gt; should fix that up.

So if the server had as its message/data
"use <b>&lt;b&gt;</b> to turn on bold"

the SOAP service should translate that as
"use &lt;b&gt;&amp;lt;b&amp;gt;&lt;/b&gt; to turn on bold"

and if you do a one-pass replacement of &amp; and &lt; and &gt; you'll see the original message is restored. (If using a built-in search/replace function, replace &amp; last!)

The other issue you'll run into is that the web service may send characters above 127 as multi-byte UTF-8, which you'll need to convert back to ASCII or HTML &#nnn; codes.
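
A minimal sketch of that one-pass replacement in JavaScript. The function name and entity table are made up for illustration. Because a single regular expression scans the text once and never rescans its own output, the ordering concern with chained search/replace calls does not arise here.

```javascript
// Assumed lookup table for the five predefined XML entities.
const XML_ENTITIES = { lt: "<", gt: ">", amp: "&", quot: '"', apos: "'" };

// Hypothetical helper: undo XML escaping in a single left-to-right pass.
// Text produced by a replacement is never examined again, so "&amp;lt;"
// becomes the literal text "&lt;" and not "<".
function unescapeXml(text) {
  return text.replace(/&(lt|gt|amp|quot|apos);/g, (match, name) => XML_ENTITIES[name]);
}

const received = "use &lt;b&gt;&amp;lt;b&amp;gt;&lt;/b&gt; to turn on bold";
const restored = unescapeXml(received);
```

Applied to the escaped example above, `restored` holds the server's original message, with the inner `&lt;b&gt;` still escaped as intended.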
 

Alphathree33

Thanks Dave, very helpful once again. I'll sort this out and let you know how that goes.
 

kamper

Just out of curiosity, have you tried to find JavaScript SOAP libraries? They do exist. Even an XML library will handle all the escaping for you seamlessly.

I'm not trying to tell you that you can't do it the way you're doing it ;) but XML can be cumbersome even when dealing with nice handling libraries :)
 

Alphathree33

I'd like to keep as much of the implementation under my own watch as possible. I'm perfectly capable and happy to write the algorithms myself so long as I understand how the SOAP portion of it actually works.

I did briefly search for a javascript library but it wasn't really worth my time=)
 

UCJefe

Senior member
Jan 27, 2000
302
0
0
Just base64 encode it and forget about it. That is, if you control both ends (client + server) and you can do base64 from JavaScript (you might have to roll your own routine). This is what is normally done with SOAP and things like that (ASP.NET Viewstate for example).
 

DaveSimmons

That's usually only done for binary data, not text data that just needs a few escapes. I'll grant it's simpler though if you don't mind the increase in message size.
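
For comparison, a sketch of the base64 route in JavaScript. It assumes an environment that provides the standard `TextEncoder`, `TextDecoder`, `btoa`, and `atob` globals (modern browsers and recent Node.js); without them you would have to roll your own routine. The helper names are made up here.

```javascript
// Hypothetical helper: encode text as UTF-8 bytes, then as base64.
// btoa only accepts characters 0-255, so the bytes are first mapped to a
// "binary string" with one character per byte.
function toBase64(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b);
  }
  return btoa(binary);
}

// Hypothetical helper: reverse of toBase64.
function fromBase64(b64) {
  const binary = atob(b64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

const encoded = toBase64("a & b");
const decoded = fromBase64(encoded);
```

The output contains only letters, digits, +, / and =, so nothing in it needs XML escaping. The cost is that every 3 bytes of input become 4 characters of output, which is the size increase mentioned above.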