Writing out XML: regular file IO or use DOM/some XML object facility?

SunnyD · Mar 27, 2008

What is generally considered the "preferred" way to write out XML data from an application? I'm trying to make things as lightweight as possible in an application I'm writing, and from that standpoint I think it would be more efficient to just use simple file IO to generate XML output to a file, though I can see some pitfalls of doing it like that (such as forgetting to close a tag somewhere or whatnot).

What is the actual generally preferred way to output XML to a file? Are there any major benefits to using DOM or some custom XML parsing object that can serialize out?

Crusty · Mar 27, 2008

For me it would depend on the application/environment. I generally shy away from raw file IO for XML though, especially in .NET with all the great serialization tools, and the ability to read/write XML files natively using a DataSet object.

SunnyD · Mar 27, 2008

Originally posted by: Crusty
For me it would depend on the application/environment. I generally shy away from raw file IO for XML though, especially in .NET with all the great serialization tools, and the ability to read/write XML files natively using a DataSet object.

Console app in native C++/Win32, running in the background and never seen. The only meaningful interaction it will ever have is the XML files it outputs when run. Also, as mentioned, the app is supposed to be as light as possible.

Markbnj · Mar 27, 2008

I'm with Crusty on this one. Even if you have just a simple, static XML schema that is highly resistant to change, there's little sense in spending your time writing and debugging the code to do what the DOM does effortlessly. In terms of efficiency you won't be adding much in either execution time or code size by using it.

troytime · Mar 27, 2008

for me it depends on the data and how it's being used

if the data is constantly changing and accessed regularly, build the xml on the fly for every request

edit: doesn't really apply here i guess. I didn't read the thread (had to pee like a mofo)

SunnyD · Mar 27, 2008

Originally posted by: Markbnj
I'm with Crusty on this one. Even if you have just a simple, static XML schema that is highly resistant to change, there's little sense in spending your time writing and debugging the code to do what the DOM does effortlessly. In terms of efficiency you won't be adding much in either execution time or code size by using it.

What about the scenario where you can't guarantee what libraries or versions of libraries will be on the machine that the app will be run on? Can DOM be linked in statically?

Markbnj · Mar 27, 2008

Originally posted by: SunnyD

Originally posted by: Markbnj
I'm with Crusty on this one. Even if you have just a simple, static XML schema that is highly resistant to change, there's little sense in spending your time writing and debugging the code to do what the DOM does effortlessly. In terms of efficiency you won't be adding much in either execution time or code size by using it.

What about the scenario where you can't guarantee what libraries or versions of libraries will be on the machine that the app will be run on? Can DOM be linked in statically?

I don't know if the DOM can be statically linked. I assume this is a C++ app, since you're asking about static linkage. I don't know how the DOM is packaged for C++ consumption, but I assume it is a DLL. A little searching should answer that.

kamper · Mar 27, 2008

Originally posted by: Markbnj
I'm with Crusty on this one. Even if you have just a simple, static XML schema that is highly resistant to change, there's little sense in spending your time writing and debugging the code to do what the DOM does effortlessly. In terms of efficiency you won't be adding much in either execution time or code size by using it.

I'd agree with you if he was reading xml but the op only mentions writing it. Outputting xml in text form isn't really that hard to do correctly and reliably. And DOM programming (at least with the raw w3c api) is not exactly fun.

Markbnj · Mar 27, 2008

Originally posted by: kamper

Originally posted by: Markbnj
I'm with Crusty on this one. Even if you have just a simple, static XML schema that is highly resistant to change, there's little sense in spending your time writing and debugging the code to do what the DOM does effortlessly. In terms of efficiency you won't be adding much in either execution time or code size by using it.

I'd agree with you if he was reading xml but the op only mentions writing it. Outputting xml in text form isn't really that hard to do correctly and reliably. And DOM programming (at least with the raw w3c api) is not exactly fun.

Maybe I'm just spoiled by the .Net XmlDocument family of classes, easy accessibility of xpath queries, etc. There's really not much to it.

Yeah, you're right, if all the OP wants to do is spit out some XML it isn't that big a deal, especially if you don't bother about making the code generic and reusable.

But I am all about not reinventing the wheel these days. Must be having teenagers and a really tight schedule.

DaveSimmons · Mar 27, 2008

For simple XML in a win32 C++ app I'd generate it "by hand" to reduce dependency on outside code/libraries that could change over time.

We use grid, editor, spreadsheet, spellcheck, image, SOAP etc. libraries in our main applications where the dependency is well worth the features but printf or cout of some tags is easy enough to do without.

If I was going to use outside code I'd want it as a statically linked library not a DLL or COM component. I'd also static link the C++ runtime (and MFC if applicable).

kamper · Mar 29, 2008

Originally posted by: Markbnj
Maybe I'm just spoiled by the .Net XmlDocument family of classes, easy accessibility of xpath queries, etc. There's really not much to it.

Sure, xpath is great, but he doesn't need it.

Yeah, you're right, if all the OP wants to do is spit out some XML it isn't that big a deal, especially if you don't bother about making the code generic and reusable.

The code can be plenty reusable. As long as he doesn't need to reuse it to generate a dom document. Seems by far the simpler choice if he's trying to avoid pulling in an extra library.

But I am all about not reinventing the wheel these days. Must be having teenagers and a really tight schedule.

Well, I'm just being obstinate now, but I don't see how spitting out text is reinventing the wheel. If the DOM is the wheel, then he doesn't actually need one 😛 I'm sure the code for either way would be of similar complexity and would take a similar amount of time to write (unless the coder isn't already familiar with the dom).

Markbnj · Mar 29, 2008

The code can be plenty reusable. As long as he doesn't need to reuse it to generate a dom document. Seems by far the simpler choice if he's trying to avoid pulling in an extra library.

Sure, any code can be made reusable. But with respect to .Net, it already offers two complete means of translating classes and their runtime state into a properly formatted XML document: serialization and the DOM. If you don't use them, then you have to write code to generate a proper XML doc containing properly formatted elements to represent each instance of an object that you want persisted in the file. Sure, it's just "spitting out text", but I don't agree that the code would have similar levels of complexity. On the one hand you're dealing with objects and their properties, while on the other you're emitting tokens in a file format. Complexity isn't strictly a measure of the number of instructions, but also the level of abstraction you're working at.

This is a highly academic debate, obviously, if the number of objects and their runtime relationships is small. Then I agree it really doesn't matter. But in any non-trivial setting where robustness and maintainability are concerns, why would you write code to generate XML using stream I/O?

kamper · Mar 29, 2008

Originally posted by: Markbnj
This is a highly academic debate, obviously, if the number of objects and their runtime relationships are small. Then I agree it really doesn't matter. But in any non-trivial setting where robustness and maintainability is a concern, why would you write code to generate XML using stream I/O?

Well I keep wondering why you would write code to generate dom nodes when you don't need it. Assuming that everything that can go wrong will seems a bit fuddish, but I admit there could be scenarios you're thinking of that I'm not. I'm mostly playing devil's advocate by now.

Markbnj · Mar 29, 2008

Originally posted by: kamper

Originally posted by: Markbnj
This is a highly academic debate, obviously, if the number of objects and their runtime relationships are small. Then I agree it really doesn't matter. But in any non-trivial setting where robustness and maintainability is a concern, why would you write code to generate XML using stream I/O?

Well I keep wondering why you would write code to generate dom nodes when you don't need it. Assuming that everything that can go wrong will seems a bit fuddish, but I admit there could be scenarios you're thinking of that I'm not. I'm mostly playing devil's advocate by now.

The suggestion of the DOM might be overkill. The OP never really says how complicated the application is, or how many objects need to be written out. He just says he wants to keep it lightweight. Serialization would probably be a good choice as you can get some decent control over what gets generated, and also put in place a framework that makes it possible to serialize anything you add to the runtime model.

I don't think I suggested that "everything that can go wrong will." What I do mean to suggest is that the lower the level of abstraction you have to work at, the greater the opportunities to get things wrong, because you are simply working with smaller pieces. If I had to write out a very small amount of XML, one time, guaranteed to never change, I might use text i/o. But if those conditions were really satisfied I would probably just script it out, or type it in by hand.

I'm mostly playing devil's advocate by now.

That's what we do here on the Internets 😉.

DaveSimmons · Mar 29, 2008

Another hypothetical reason why you might choose text I/O instead of an XML library is memory use. In win32 a bunch of couts, printfs or file writes of string buffers should have almost zero memory footprint, while some XML libraries will do a memory allocation for each node / tag in the file (and not free them until the complete doc is written).

But yes, for a complex XML document a library is less likely to let you shoot yourself in the foot than writing tags by hand.

SunnyD · Mar 31, 2008

The reason I asked, as mentioned somewhere along the line, mostly revolves around two important issues:

1. Dependency
2. Memory

I won't go into the details of the application itself, but by nature the app has to use as little in terms of resources as possible. That's why avoiding something like DOM would be preferable. On top of that, it has to be as "portable" as possible, being able to run up and down the Windows line from 95 to Vista, with all possible platform scenarios involved. That's why whatever API is used would have to be statically linked, along with its dependencies. I can't rely on external libraries, as they simply might not exist.
