pros/cons of using XML in Unix

MrChad

Lifer
Aug 22, 2001
13,507
3
81
:confused:

Using XML for what? XML is just a means of organizing/storing/transmitting data. It may have many potential uses on a Unix-based system, each of which has its own pros and cons.
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
When I see XML mentioned in relation to *nix it's usually for logs and configuration files. I think it's a bad idea for each of those things. It's overly complicated, and bloats just about everything it touches.
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
You don't really use xml for an operating system which I guess is what n0c was explaining. You use xml for much higher level things like passing structured information between completely different systems. html, for example is really just a form of xml.
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: notfred
Originally posted by: kamper
html, for example is really just a form of xml.

No it's not.

Good enough. I figured that between xhtml being xml and older versions being so similar I could use it as an example. At any rate I never write anything except xhtml and I just call it html.
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: kamper
You don't really use xml for an operating system which I guess is what n0c was explaining. You use xml for much higher level things like passing structured information between completely different systems. html, for example is really just a form of xml.

I meant that plain text just makes more sense for plenty of things. People/companies try and XMLize everything. There have been requests to make logs into XML documents, and Macromedia used XML config files for Coldfusion MX. These are just the wrong place for it, IMO.
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: n0cmonkey
I meant that plain text just makes more sense for plenty of things. People/companies try and XMLize everything. There have been requests to make logs into XML documents, and Macromedia used XML config files for Coldfusion MX. These are just the wrong place for it, IMO.

I agree that log files are a bad place for xml because of the bloat, but I think config files are a great place for it. From the developer's perspective it makes things simple because you just define a document structure and any old xml library will make extracting the required info very easy. It saves you from defining a new syntax, writing a parser for it, and then expecting the user to understand the new syntax.

If there's an xml based document of some kind (config files being one example, things like build scripts are another) I can read the xsd and understand most of what I am supposed to be able to specify fairly quickly. That's better than some free-form format where I have to rely on the developer's willingness and ability to document. In that case the code is usually well ahead of the documentation and it takes too long to get up to speed on how to configure things.
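
To make the "any old xml library" point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The config document, its element names and the load_config helper are all made up for illustration; they don't come from any real application.

```python
import xml.etree.ElementTree as ET

# Hypothetical config document; every element and attribute name is invented.
CONFIG_XML = """
<server name="example">
  <!-- where the daemon listens -->
  <listen host="127.0.0.1" port="8080"/>
  <logging level="info">
    <file>/var/log/example.log</file>
  </logging>
</server>
"""

def load_config(text):
    """Pull the settings we care about out of the parsed tree."""
    root = ET.fromstring(text)
    listen = root.find("listen")
    logging = root.find("logging")
    return {
        "name": root.get("name"),
        "host": listen.get("host"),
        "port": int(listen.get("port")),      # attributes arrive as strings
        "log_level": logging.get("level"),
        "log_file": logging.findtext("file"),
    }

config = load_config(CONFIG_XML)
print(config)
```

No tokenizer or grammar had to be written here; the library hands back a tree and the application only names the pieces it wants.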
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: kamper
I agree that log files are a bad place for xml because of the bloat but I think config files are a great place for it. From the developers perspective it makes things simple because you just define a document structure and any old xml library will make extracting the required info very easy. It saves you from defining a new syntax, writing a parser for it, and then expecting the user to understand the new syntax.

If there's an xml based document of some kind (config files being one example, things like build scripts are another) I can read the xsd and understand most of what I am supposed to be able to specify fairly quickly. That's better than some free-form format where I have to rely on the developers willingness and ability to document. In that case the code is usually well ahead of the documentation and it takes too long to get up to speed on how to configure things.

Documentation will still be key. How many COMPLICATED (I'm stressing that word for a reason) XML config files have you dealt with? Take a peek at Coldfusion MX's config file, and tell me how easy it would be to mess with it assuming no knowledge of XML.
 

drag

Elite Member
Jul 4, 2002
8,708
0
0
I set up an Ice2 Ogg audio server once and it used XML.

It was a hateful thing, although I suppose they had their reasons. I am used to dealing with text-based configuration files, and they are easy for an end user to deal with when you provide a sample configuration with the various options commented out and explained in comments. Take a look at the OpenSSH sshd configuration files; they are a good example of good config files. Lots of documentation, and when you combine that with the man pages, that's about all you need to know to set up a well configured server.

To the end user there is very little advantage that I can see in changing that over to XML. And the same goes for configuration files in general.

Another good example is the XF86Config. It uses its own syntax but it's designed to be self documenting. You know that you're dealing with the video monitor setup, for instance, and combined with the man page that's about 80% of everything you need to know about setting up your X Windows server.

The advantages I can foresee are things like making it easier to build GUI programs that configure your apps for you, or to make tools that let people search through log files or collect error data from log files and present it in a pleasant manner.

That has some advantages, but they really aren't that significant IMO. Right now, as an end user you have to learn the syntax and layout of plain text configuration files in order to work with them. If you were to change completely over to XML, as an end user you would still have to learn the syntax and layout of the configuration files in order to work with them, but you'd also have to know XML. Kind of a pain.

What I am thinking is that it's important to avoid using things like XML to shift the workload off of the developers and onto the end users.

On some stuff it's very nice though. For example, people rarely mess around with the user preferences in their ~/.gconf directory. They just aren't that important; if something is messed up you just blow away the entire config and start from scratch again when you're an end user. So if XML parsing makes things easier for the developer and makes things potentially more stable then it's a good idea.
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: n0cmonkey
Documentation will still be key. How many COMPLICATED (I'm stressing that word for a reason) XML config files have you dealt with? Take a peek at Coldfusion MX's config file, and tell me how easy it would be to mess with it assuming no knowledge of XML.

I've never seen a coldfusion config file but I've dealt with very large ant build scripts, jboss config files, log4j config files and the like, most of them with next to no documentation. I find that the xml syntax helps a great deal in clarifying the scope and purpose of each item that I am configuring. Got a large coldfusion file handy? Show it to me.

Granted, if you have no prior understanding of xml it wouldn't be easy, but my point is that once you understand the syntax and the structural implications (and really, most computer geeks should) any xml config file becomes easily accessible, as opposed to a free-form file which may have different idioms depending on the developer's whim and lacks the implicit expressive power of xml. With free-form files you may very well have to learn a new syntax for each config file.

But, at any rate, I suppose it also depends on the background of your users. I come from a web services java world where xml is absolutely huge so it makes sense to use it. I would understand that for other applications it may not be as effective, like for ssh and stuff that drag mentions.

<-- shakes head
"I don't understand why you unix folks always seem to want to do things the hard way" :p
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: kamper

I've never seen a coldfusion config file but I've dealt with very large ant build scripts, jboss config files, log4j config files and the like; most of them with next to no documentation. I find that the xml syntax helps a great deal in clarifying the scope and purpose of each item that I am configuring. Got a large coldfusion file handy? Show it to me.

Granted, if you have no prior understanding of xml it wouldn't be easy but my point is that once you understand the syntax and the structural implications (and really, most computer geeks should) any xml config file becomes easily accessible as opposed to a free form file which may have different idioms depending on the developers whim and lacks the implicit expressive power of xml. You may very well have to learn a new syntax for each config file.

But, at any rate, I suppose it also depends on the backgrounds of your user. I come from a web services java world where xml is absolutely huge so it makes sense to use it. I would understand that for other applications it may not be as effective, like for ssh and stuff that drag mentions.

<-- shakes head
"I don't understand why you unix folks always seem to want to do things the hard way" :p

Luckily I don't have to admin CFMX anymore, or I would have provided you with a copy. As an admin, not a programmer, not a web developer, but just as an admin, it was a pain in the butt. It was too busy, and that made it hard to read. It increased the size considerably, which made it overwhelming.

If config files started switching to XML I'd consider spending the time (what appears to be years by the looks of things) to learn XML, but it shouldn't be necessary for someone that just wants to set up a system.

There should be some consistency between daemon software too. If you change one config over to XML, you should probably change them all. And that will be a large undertaking.

Proper documentation solves everything. :)
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: n0cmonkey
Here is one of the config files for CF MX.

Heh. Not to belittle you but that is trivial. It's not even coldfusion, it's the generic web app config file that must be handled by any servlet/jsp container. Interestingly enough, config files like that are frequently generated automatically out of meaningful comments embedded in application code (good use of xml :)) by tools like xdoclet. Maybe not web.xml but larger, more complicated ones like ejb descriptors.

Originally posted by: n0cmonkey
If config files started switching to XML I'd consider spending the time (what appears to be years by the looks of things) to learn XML, but it shouldn't be necessary for someone that just wants to setup a system.

You got some tags. The tags must close and be properly nested in a tree structure (that's the real kicker). Tags can have attributes. You place data in between tags, giving it meaning. There's a 5 second lesson that handles just about everything you need to know to read that config file; everything else is application specific (you'd have to figure it out no matter what syntax you're using). Oh, and <!-- --> denotes a comment.

Edit: I guess your concerns over the web.xml are justified. The content of that file should not have to be edited by an administrator, it deals largely with development issues. A properly packaged web application should come with the web.xml long since completed and all you should have to do is drop the war file into the proper place.
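
For anyone who wants to see the 5 second lesson in action, here is a minimal sketch in Python with the standard library's xml.etree.ElementTree. The document and the outline helper are hypothetical, invented only to show tags, nesting, attributes, data between tags and a comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: tags, nesting, one attribute, data, and a comment.
DOC = """
<app>
  <!-- a comment: the default parser skips it -->
  <database driver="example-driver">
    <url>db://localhost/test</url>
  </database>
</app>
"""

def outline(elem, depth=0):
    """Return one line per element, indented by its nesting depth."""
    line = "  " * depth + elem.tag
    if elem.attrib:
        line += " " + str(dict(elem.attrib))
    text = (elem.text or "").strip()
    if text:
        line += " = " + text
    lines = [line]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines

lines = outline(ET.fromstring(DOC))
print("\n".join(lines))
```

The indentation in the printed outline is the tree structure; what the individual tags mean is still up to the application.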
 

Sunner

Elite Member
Oct 9, 1999
11,641
0
76
Originally posted by: n0cmonkey
Documentation will still be key. How many COMPLICATED (I'm stressing that word for a reason) XML config files have you dealt with? Take a peek at Coldfusion MX's config file, and tell me how easy it would be to mess with it assuming no knowledge of XML.

I agree.
I'm sure XML is good for lots of stuff, but IMO config files aren't one of those things.
When I first ran into Apache the config files were easy to understand with relatively little help from the documentation.
When I first ran into Tomcat, it eventually pissed me off so much that I gave up and went with another solution altogether.

XML has become such a huge buzzword that everyone wants to use it for everything, which is FUBAR IMO.
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: kamper
Originally posted by: n0cmonkey
Here is one of the config files for CF MX.

Heh. Not to belittle you but that is trivial. It's not even coldfusion, it's the generic web app config file that must be handled by any servlet/jsp container. Interestingly enough, config files like that are frequently generated automatically out of meaningful comments embedded in application code (good use of xml :)) by tools like xdoclet. Maybe not web.xml but larger, more complicated ones like ejb descriptors.

Originally posted by: n0cmonkey
If config files started switching to XML I'd consider spending the time (what appears to be years by the looks of things) to learn XML, but it shouldn't be necessary for someone that just wants to setup a system.

You got some tags. The tags must close and be properly nested in a tree structure (that's the real kicker). Tags can have attributes. You place data in between tags, giving it meaning. There's a 5 second lesson that handles just about everything you need to know to read that config file; everything else is application specific (you'd have to figure it out no matter what syntax you're using). Oh, and <!-- --> denotes a comment.

Edit: I guess your concerns over the web.xml are justified. The content of that file should not have to be edited by an administrator, it deals largely with development issues. A properly packaged web application should come with the web.xml long since completed and all you should have to do is drop the war file into the proper place.

That was just an example I had found. It doesn't look like the one I had to edit before (just to fix CF MX). Compare a complex XML config file to Apache's config file, and you might see what I mean. ;)
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: n0cmonkey
That was just an example I had found. It doesn't look like the one I had to edit before (just to fix CF MX). Compare a complex XML config file to Apache's config file, and you might see what I mean. ;)

You're not going to impress me by telling me that you've worked with more complex xml files and that apache's makes more sense to you. Here's a few config files that I've found on my hard drive. Most are simply the defaults that come with jboss and tomcat, anything used in a deployable application will get much bigger.

JBoss database configuration:
http://kamper.dyndns.org/xml/standardjbosscmp-jdbc.xml
That has no comments but I can still figure out what it's talking about.

Tomcat server configuration:
http://kamper.dyndns.org/xml/server.xml
Plenty of comments, making it fully editable by anyone who knows what they are configuring. Look at it in any browser and the syntactic structure becomes fairly clear.

A properly documented web.xml (same as what you posted before except that it's not for a specific application):
http://kamper.dyndns.org/xml/tomcat-web.xml

To me, the data presented in these files is organized in a much more coherent manner than httpd.conf and I would take the xml before the .conf any day.

Edit: unfortunately, firefox butchers the comments in the 3rd file listed. It's easier to read in ie or any text editor with xml syntax highlighting
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Sorry, I'm getting a little snappy here. I guess my point is that xml still scales very well for those that understand it. You can use what you like and I'll use what I like and the world will be a happier place for it. :)
 

Sunner

Elite Member
Oct 9, 1999
11,641
0
76
Originally posted by: kamper
Originally posted by: n0cmonkey
That was just an example I had found. It doesn't look like the one I had to edit before (just to fix CF MX). Compare a complex XML config file to Apache's config file, and you might see what I mean. ;)

You're not going to impress me by telling me that you've worked with more complex xml files and that apache's makes more sense to you. Here's a few config files that I've found on my hard drive. Most are simply the defaults that come with jboss and tomcat, anything used in a deployable application will get much bigger.

JBoss database configuration:
http://kamper.dyndns.org/xml/standardjbosscmp-jdbc.xml
That has no comments but I can still figure out what it's talking about.

Tomcat server configuration:
http://kamper.dyndns.org/xml/server.xml
Plenty of comments, making it fully editable by anyone who knows what they are configuring. Look at it in any browser and the syntactic structure becomes fairly clear.

A properly documented web.xml (same as what you posted before except that it's not for a specific application):
http://kamper.dyndns.org/xml/tomcat-web.xml

To me, the data presented in these files is organized in a much more coherent manner than httpd.conf and I would take the xml before the .conf anyday.

Edit: unfortunately, firefox butchers the comments in the 3rd file listed. It's easier to read in ie or any text editor with xml syntax highlighting

And to me it's exactly the other way around.
Compared to httpd.conf (or proftpd.conf, sshd.conf, or most any .conf I've run into) those make no sense at all, while the .conf files make perfect sense.
Of course, I'm an admin, not a programmer, and I'm also used to .conf files. My first experience with XML config files was just 2 years ago, with Tomcat, and like I said, it pissed me off enough that I just deleted Tomcat and went another route.
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: kamper
Originally posted by: n0cmonkey
That was just an example I had found. It doesn't look like the one I had to edit before (just to fix CF MX). Compare a complex XML config file to Apache's config file, and you might see what I mean. ;)

You're not going to impress me by telling me that you've worked with more complex xml files and that apache's makes more sense to you. Here's a few config files that I've found on my hard drive. Most are simply the defaults that come with jboss and tomcat, anything used in a deployable application will get much bigger.

I wasn't trying to impress, just get my point across. :)

If everything switched to XML tomorrow, I'd be researching it right now. But the world won't switch, especially over night, so I don't think we have to worry too much either way. :p

JBoss database configuration:
http://kamper.dyndns.org/xml/standardjbosscmp-jdbc.xml
That has no comments but I can still figure out what it's talking about.

What does <alias-max-length>30</alias-max-length> mean? Off the top of your head. :)

Tomcat server configuration:
http://kamper.dyndns.org/xml/server.xml
Plenty of comments, making it fully editable by anyone who knows what they are configuring. Look at it in any browser and the syntactic structure becomes fairly clear.

A properly documented web.xml (same as what you posted before except that it's not for a specific application):
http://kamper.dyndns.org/xml/tomcat-web.xml

To me, the data presented in these files is organized in a much more coherent manner than httpd.conf and I would take the xml before the .conf anyday.

The httpd.conf is simple. Each line is just variable_name value. How much simpler could it get? The comments between variables help a lot too.
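
A flat name/value format like that takes very little code to read. Here is a minimal sketch in Python, assuming only flat "name value" lines and # comments; the sample text and the parse_conf helper are made up for illustration, and a real httpd.conf also has sectioned blocks that this sketch does not handle.

```python
# Hypothetical directive-style config; the directive names are only examples.
CONF_TEXT = """
# Port the server listens on
Listen 8080

# Where documents are served from
DocumentRoot /var/www/html
ServerAdmin admin@example.com
"""

def parse_conf(text):
    """Map each directive name to the rest of its line."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        parts = line.split(None, 1)       # split on the first run of whitespace
        settings[parts[0]] = parts[1] if len(parts) > 1 else ""
    return settings

settings = parse_conf(CONF_TEXT)
print(settings)
```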

Sorry, I'm getting a little snappy here. I guess my point is that xml still scales very well for those that understand it. You can use what you like and I'll use what I like and the world will be a happier place for it. :)

I probably got a bit snappy earlier, so I wasn't going to take it personally or anything. :p
 

Barnaby W. Füi

Elite Member
Aug 14, 2001
12,343
0
0
xml isn't really meant for human viewing/editing. If the job requires either, then xml is probably a bad choice. Look how damn noisy that CFMX config file is! No sane person would want to type all that crap by hand.

(although it is tempting for programmers: parsers are such an annoyance to write)
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: n0cmonkey

What does <alias-max-length>30</alias-max-length> mean? Off the top of your head. :)

I'd have to see it in context (that's the whole point of xml). I'd also have to understand what I'm configuring, that's integral to any configuration file (and I'm no object-relational datamapping expert).

That being said, I cheated and found an example from that file. My best guess would be that it is the maximum allowed length of a name used to alias a table in a generated sql statement.

alias-max-length = 30 wouldn't be any more useful. I would have had more trouble realizing that it was a top level configuration parameter of the type mapping for an Ingres database if that hadn't been nicely apparent from the xml layout :p (The example I found was in the Ingres section. I could also tell you that your example applies to an Oracle database, as every other database in the file supports 32).
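
Here is a minimal sketch in Python of what "seeing it in context" means, using the standard library's xml.etree.ElementTree. The fragment is only loosely modeled on the layout described above; the database names and the alias_limits helper are placeholders and nothing here is copied from the actual JBoss file.

```python
import xml.etree.ElementTree as ET

# Placeholder fragment: structure and names are assumptions for illustration.
MAPPINGS_XML = """
<type-mappings>
  <type-mapping>
    <name>ExampleDB-A</name>
    <alias-max-length>30</alias-max-length>
  </type-mapping>
  <type-mapping>
    <name>ExampleDB-B</name>
    <alias-max-length>32</alias-max-length>
  </type-mapping>
</type-mappings>
"""

def alias_limits(text):
    """Map each type mapping's name to its alias-max-length value."""
    root = ET.fromstring(text)
    return {
        mapping.findtext("name"): int(mapping.findtext("alias-max-length"))
        for mapping in root.findall("type-mapping")
    }

limits = alias_limits(MAPPINGS_XML)
print(limits)
```

The enclosing type-mapping element is what says which database each limit belongs to; a bare alias-max-length = 30 line would need a naming convention or a section header to carry the same information.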
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: kamper
Originally posted by: n0cmonkey

What does <alias-max-length>30</alias-max-length> mean? Off the top of your head. :)

I'd have to see it in context (that's the whole point of xml). I'd also have to understand what I'm configuring, that's integral to any configuration file (and I'm no object-relational datamapping expert).

Experienced admins can probably take a random configuration for apache, or sendmail, or qmail, or postfix, or xntpd, or openntpd, or openssh, etc. and explain it. ;)
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: BingBongWongFooey
xml isn't really meant for human viewing/editing. If the job requires either, then xml is probably a bad choice. Look how damn noisy that CFMX config file is! No sane person would want to type all that crap by hand.

(although it is tempting for programmers: parsers are such an annoyance to write)

I beg to differ (sheesh, I'm arguing with everyone today :roll: )

Another important part of xml is that it sits nicely in between an ideal computer format and an ideal human readable format. If it was never meant to be read by humans it would probably be binary and not nearly so verbose. I have no trouble reading the posted xml file, perhaps because I'm used to it.

Get a decent xml editor and you get things like automatic tag closure (type "<foo>" and it appends "</foo>"), automatic formatting (because, of course, white space and nesting are of the utmost importance for human readability) and syntax highlighting. Syntax highlighting is especially handy when the editor can find the dtd or xsd, parse your xml file and tell you when you've created an invalid document and why.

It's the same as editing source code in any programming language. It's a compromise between straight up english and straight up binary.
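
As a rough idea of the kind of feedback an editor can give, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It only checks that the document is well formed (tags closed and properly nested); it does not validate against a dtd or xsd, which the standard library parser doesn't do. The find_problem helper and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def find_problem(text):
    """Return (line, column) of the first syntax error, or None if well formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position
    return None

GOOD = "<foo>\n  <bar/>\n</foo>"
BAD = "<foo>\n  <bar>\n</foo>"     # <bar> is never closed

print(find_problem(GOOD))
print(find_problem(BAD))
```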