I'd like to get everyone's opinion on a problem I'm working on. I have an incoming batch of data from a provider that consists of many (3000-5000) individual pieces of digital content. Each piece is a file stored at the end of a long directory structure, one file per leaf directory, along with a smallish chunk of XML metadata.
I have to search through these metadata files and, based on the properties of their elements, reassemble the pieces of content into a new structure by copying them to new directories and running the metadata through an XSLT.
After puzzling over it for part of today, I concluded that what I need to do is assemble all the existing metadata files into a single XML document, which can then be queried with XPath to select the nodes and relationships I want for the new structure.
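To make the idea concrete, here is a minimal sketch of the merge step in Python, using only the standard library. The `<collection>` and `<item>` wrapper elements, the `dir` and `file` attributes, the `.xml` suffix test, and the `<metadata><type>` shape in the usage note are all assumptions for illustration, not the provider's actual format. Note that `xml.etree.ElementTree` supports only a subset of XPath; full XPath 1.0 would need something like lxml.

```python
import os
import xml.etree.ElementTree as ET


def build_combined_index(root_dir, metadata_suffix=".xml"):
    """Wrap every metadata file under root_dir in one <collection> document.

    Assumes each leaf directory holds one metadata file ending in
    metadata_suffix and that the content file itself does not.
    """
    collection = ET.Element("collection")  # hypothetical wrapper element
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if not name.endswith(metadata_suffix):
                continue
            # Record where the metadata came from so the content file
            # sitting next to it can be located and copied later.
            item = ET.SubElement(collection, "item", {"dir": dirpath, "file": name})
            item.append(ET.parse(os.path.join(dirpath, name)).getroot())
    return ET.ElementTree(collection)


def find_item_dirs(tree, child_path):
    """Return the source directory of every item whose metadata matches child_path."""
    return [
        item.get("dir")
        for item in tree.getroot().findall("item")
        if item.find(child_path) is not None
    ]
```

With metadata shaped like `<metadata><type>video</type></metadata>` (again, a made-up shape), `find_item_dirs(tree, "metadata[type='video']")` would return the leaf directories of all video items. The combined tree can also be saved with `tree.write("combined.xml")` so the directory walk only has to happen once.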
The alternative would be to implement a search of the XML metadata in the leaf directories every time I need to select out a set of nodes with certain properties.
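For comparison, a rough sketch of that per-query search might look like the following. The function name, the `.xml` suffix test, and the predicate-as-callable design are assumptions made for illustration.

```python
import os
import xml.etree.ElementTree as ET


def scan_leaf_dirs(root_dir, predicate, metadata_suffix=".xml"):
    """Re-walk the tree and re-parse every metadata file for one query.

    predicate is a hypothetical callable that takes the parsed metadata
    root element and returns True when the item should be selected.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if not name.endswith(metadata_suffix):
                continue
            # Every call pays the full cost of opening and parsing each file.
            metadata_root = ET.parse(os.path.join(dirpath, name)).getroot()
            if predicate(metadata_root):
                matches.append(dirpath)
    return matches
```

A call such as `scan_leaf_dirs(root, lambda m: m.findtext("type") == "video")` works, but each selection touches every file on disk. If the restructuring really needs on the order of 1000 selections over 3000-5000 files, that adds up to several million file parses, which is the cost the single combined document avoids.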
I like the idea of building one XML file out of all the small metadata files for two reasons. First, there are a lot of pieces, and pieces of pieces, in these sets, so the leaf nodes will be visited a lot, maybe 1000 times, to build the whole structure. Second, I think the logic will be easier to write and maintain in XPath.
Any thoughts?
