XML - serializing data and type together

dighn

Lifer
Aug 12, 2001
22,820
4
81
So at work we have an XML-based system where we need to serialize a list of arguments between a server and a client. The list is defined on the server, serialized to the client for processing, and sent back to the server by the client. This list consists of an arbitrary set of properties, each with its own type.

For now just ignore why we aren't using XSD or some other standard system for doing this. We have two choices:

1) Set the XML element name to the property name and add a type attribute to specify the type, e.g.

<Age type="int">10</Age>
<FirstName type="string">John</FirstName>
<LastName type="string">Doe</LastName>

2) Set the XML element name to the type and specify the property name in an attribute:

<Int name="Age">10</Int>
<String name="FirstName">John</String>
<String name="LastName">Doe</String>

Now both of these accomplish the same thing, so there is a bit of a debate on which is better. Personally I favor 1) because it seems more in line with how XML is typically used, and the client could send just the property names and values, e.g. <Age>10</Age>, provided the server knows the schema (which it does - the client cannot modify the types). Also, a more specialized client coded against a specific instance of such a list wouldn't even need the type information itself. However, the counter-argument for using 2) is that it more naturally reflects how the generic client parser might process the data.
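To make the comparison concrete, here is a rough Python sketch of how a generic reader might pull (name, type, value) triples out of either layout. The `<Args>` wrapper element and the function names are made up for illustration; they are not part of our actual system.

```python
import xml.etree.ElementTree as ET

# Hypothetical <Args> wrapper so each sample is a well-formed document.
OPTION_1 = """<Args>
  <Age type="int">10</Age>
  <FirstName type="string">John</FirstName>
  <LastName type="string">Doe</LastName>
</Args>"""

OPTION_2 = """<Args>
  <Int name="Age">10</Int>
  <String name="FirstName">John</String>
  <String name="LastName">Doe</String>
</Args>"""

def read_option_1(xml_text):
    """Element name is the property name; the type lives in an attribute."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.get("type"), el.text or "") for el in root]

def read_option_2(xml_text):
    """Element name is the type; the property name lives in an attribute."""
    root = ET.fromstring(xml_text)
    return [(el.get("name"), el.tag.lower(), el.text or "") for el in root]

props_1 = read_option_1(OPTION_1)
props_2 = read_option_2(OPTION_2)

# Looking up a single property by name:
# option 1 is a plain tag lookup, option 2 needs an attribute predicate.
age_1 = ET.fromstring(OPTION_1).find("Age")
age_2 = ET.fromstring(OPTION_2).find("./*[@name='Age']")
```

Both readers produce the same triples, so the difference is mostly in how lookups read: option 1 finds a property by tag name, while option 2 has to filter on the name attribute.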

What do you guys think? I don't think there's any right or wrong here, but just want to see what people generally think.
 

cabri

Diamond Member
Nov 3, 2012
3,616
1
81
If the parser does not know the field type (a flawed design), then treat each field as numeric until a non-numeric character is found - then it becomes a string.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,698
4,659
75
cabri said: "If the parser does not know the field type (a flawed design), then treat each field as numeric until a non-numeric character is found - then it becomes a string."

I generally agree, except that I think you have that backwards. Everything is a string, or a portion of a string, until proven otherwise. For that reason I'd go with option 1: make fields easily searchable by tag name, parse everything as strings, then atoi() the ints based on the type attribute.
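A minimal sketch of that flow in Python, with int() standing in for atoi(). The CONVERTERS table, the typed_value helper, and the `<Args>` wrapper are assumptions for illustration only; note that unlike atoi(), int() raises on bad input rather than returning 0.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the type attribute to a converter.
CONVERTERS = {"int": int, "string": str}

def typed_value(el):
    """Read the element text as a string, then convert based on its type attribute."""
    raw = el.text or ""
    # Unknown or missing types fall back to plain strings (an assumption).
    convert = CONVERTERS.get(el.get("type", "string"), str)
    return convert(raw)

root = ET.fromstring(
    '<Args><Age type="int">10</Age><FirstName type="string">John</FirstName></Args>'
)
age = typed_value(root.find("Age"))          # converted to an int
first = typed_value(root.find("FirstName"))  # left as a string
```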
 

cabri

Ken g6 said: "I generally agree, except that I think you have that backwards. Everything is a string, or a portion of a string, until proven otherwise. For that reason I'd go with option 1: make fields easily searchable by tag name, parse everything as strings, then atoi() the ints based on the type attribute."

If you are actually parsing the data in the field character by character: by assuming that you have a number, detecting the first non-numeric character forces the field to a string.

Otherwise, you have to parse the complete data first and then do a number test.

Now, if the data itself is not being parsed, then running a number conversion and having it fail will indicate a string.
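Both approaches can be sketched in a few lines of Python. This is only an illustration with made-up function names; the character scan handles unsigned ASCII digits only (no sign, no decimals).

```python
def looks_numeric(raw):
    """Scan character by character; the first non-digit forces a string."""
    if raw == "":
        return False
    for ch in raw:
        if not ("0" <= ch <= "9"):
            return False
    return True

def infer(raw):
    """Attempt a number conversion; a failure indicates a string."""
    try:
        return "int", int(raw)
    except ValueError:
        return "string", raw

age = infer("10")        # conversion succeeds, treated as an int
name = infer("John")     # conversion fails, treated as a string
zip_code = infer("90210")  # also treated as an int
```

One catch with inferring types from the data: a value like "90210" that is meant to be a string still comes out as an int, which is a reason to carry the type explicitly.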

I probably overthought this at a lower level :$
 

dighn

Oh, we are parsing everything as a string. The type information is used later on for processing and/or GUI interactions. We are using this for a plug-in system where a user defines a set of properties on the server that can be configured by a client. The client in question here must deal with that set generically.

Oh and there are also composite types e.g.

<PrimaryAddress type="Address">
<Street />
<State />
</PrimaryAddress>
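For what it's worth, a generic reader can handle composites like this with a small recursive walk. The sketch below is only an illustration: the read_property name, the dict layout, and the default of "string" for untyped leaf elements are all assumptions.

```python
import xml.etree.ElementTree as ET

def read_property(el):
    """Return a nested description of one property element."""
    children = list(el)
    if children:
        # Composite type: recurse into each child field.
        return {
            "type": el.get("type"),
            "fields": {child.tag: read_property(child) for child in children},
        }
    # Leaf: keep the raw text as a string; default type is an assumption.
    return {"type": el.get("type", "string"), "value": el.text or ""}

doc = """<PrimaryAddress type="Address">
<Street />
<State />
</PrimaryAddress>"""

address = read_property(ET.fromstring(doc))
```

Note that this treats any element without children as a leaf, so an empty composite would need the type attribute checked against a list of known composite types.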
 

bzb_Elder

Member
May 25, 2011
86
13
71
If presented with these two options, I would also choose number 1. Each property (name, age, phone, etc.) is expressed as an XML element, along with the attributes required for the consuming application to process the data correctly.

My two cents.
 

Ken g6

It also occurs to me: shouldn't data types be stored in a different kind of document? An XSD or something?
 

dighn

Ken g6 said: "It also occurs to me: shouldn't data types be stored in a different kind of document? An XSD or something?"

We should, but some components of the system are very low-level embedded devices (not Unix-based) for which we have to hand-code the XML parser. XSD is the kind of complication that we want to avoid. Maybe in a future iteration we could consider that. For now, encoding types in attributes or element names is pretty ingrained in the system.
 