
Database with large files

gregulator

Senior member
I am designing a small LAN of CNC machines that will be fed files from a main server. The server will hold all the information (machine files, part files, etc.) and even large graphics files (100MB+). I am wondering what the best and easiest structure for this kind of environment would be. I am sure I could whip up something in C# real quick that just uses folders in a filesystem, but we all know that scales poorly at some point (though it is really easy to navigate graphically with Explorer). So should I start investing time into something like MySQL and putting a custom/graphical app on top? Does it handle large files well, or am I going to need to set up links to large files, which then means files go to filesystem folders anyway (but MySQL keeps track)? Any suggestions are appreciated. Thanks!
 
It depends on the volume of data you plan on handling. I have worked on solutions with either approach - storing blobs (byte arrays or binary data, to be specific) directly in the DB, as well as just storing the relative (not absolute!) paths of files in the DB (the actual files go on the file system). The former only scales while the volume of data being handled stays low. Usually, exploiting the FS and the DB together is a sound design and scales well.
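A minimal sketch of the second approach (paths in the DB, bytes on disk), using Python and SQLite as stand-ins for whatever server/DB combo you end up with - the table, column, and directory names here are all made up:

```python
import os
import sqlite3

STORE_ROOT = "cnc_store"  # hypothetical root; only relative paths go in the DB

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    rel_path TEXT UNIQUE NOT NULL,  -- relative, so the root can move later
    size_bytes INTEGER NOT NULL
)""")

def register_file(src_path, rel_path):
    """Copy a file under the store root and record its relative path in the DB."""
    dest = os.path.join(STORE_ROOT, rel_path)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    with open(src_path, "rb") as fin, open(dest, "wb") as fout:
        fout.write(fin.read())
    conn.execute("INSERT INTO files (rel_path, size_bytes) VALUES (?, ?)",
                 (rel_path, os.path.getsize(dest)))
    conn.commit()
```

The DB row stays tiny regardless of file size, so the 100MB+ graphics files never touch the database's buffer pool or transaction log.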

Think about it (this is specific to SQL Server) - if you're storing 100 MB files in the DB as blobs, just the load on the system in terms of ambient SQL Server transactions, page splits, indexes, etc. can be huge. I wouldn't be surprised if your transactions time out during the commit phase because other transactions are still being processed. Nevertheless, when you know up-front that you won't be handling many transactions, storing files as blobs might result in less cryptic code.

Also, think about what kind of machine (RAM, CPU) you will need... processing 100 MB files will definitely require more RAM.
 
Yeah, it's a trade-off. Using a database simplifies things over designing and implementing your own fault-tolerant file system. You might try a hybrid approach: store smaller files as blobs, and store the really big image files in their own folder system, with appropriate integrity checks.
 
I would go with a hybrid approach. A database is not going to be faster at serving the data than the filesystem.

The power of the DB will be in querying the data/properties of the files.


For example: I'm writing a Python/Django based photo gallery. The power in the system will be in storing all the properties (EXIF) and tags (favorite, person, place) about each photo in a DB for quick queries (e.g. give me all photos around Fred's birthday from 2002-2006, show me all photos I took at ISO 200, etc.). The photos themselves hold no real power; the data about them does.
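That metadata-in-the-DB idea can be sketched like this - schema and sample values are invented, and SQLite stands in for whatever backend the gallery actually uses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE photos (
    path  TEXT PRIMARY KEY,  -- where the file lives; the DB never stores pixels
    taken TEXT,              -- ISO-8601 date pulled from EXIF
    iso   INTEGER,           -- EXIF ISO speed
    tag   TEXT               -- e.g. a person, place, or 'favorite'
)""")
conn.executemany("INSERT INTO photos VALUES (?, ?, ?, ?)", [
    ("2004/fred_bday.jpg", "2004-06-12", 200, "fred"),
    ("2005/beach.jpg",     "2005-08-01", 400, "place"),
    ("2003/fred_cake.jpg", "2003-06-11", 200, "fred"),
])

# "All photos of fred between 2002 and 2006, shot at ISO 200" -- answered
# without ever opening an image file.
rows = conn.execute("""SELECT path FROM photos
                       WHERE tag = 'fred' AND iso = 200
                         AND taken BETWEEN '2002-01-01' AND '2006-12-31'
                       ORDER BY taken""").fetchall()
```

The query touches only small indexed columns, which is exactly why the DB shines here while the image bytes themselves can sit anywhere.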
 
I don't really think it's much of a trade-off; it's more of an option of last resort, or for when performance isn't an issue for you.

Database servers do not make good fileservers. A few large file requests will kill the performance of your database as they overwrite and fill your buffer cache with data that no one else will access. And unless your database is using direct disk access, there isn't a chance in hell it can retrieve the data faster than a well-tuned filesystem could (since the DB would have to get it from the filesystem itself first).
 