
Database with large files

gregulator

Senior member
I am designing a small LAN of CNC machines that will be fed files from a main server. The server will hold all the information (machine files, part files, etc.) and even large graphics files (100MB+). I am wondering what the best and easiest structure for this kind of environment would be. I am sure I could whip up something in C# real quick that just uses folders in a filesystem, but we all know that scales poorly at some point (though it is really easy to navigate graphically with Explorer). So should I start investing time into something like MySQL and putting a custom/graphical app on top? Does it handle large files well, or am I going to need to set up links to large files, which then means files go to filesystem folders anyway (but MySQL keeps track)? Any suggestions are appreciated. Thanks!
 
It depends on the volume of data you plan on handling. I have worked on solutions with either approach - storing blobs (byte arrays or binary data, to be specific) directly in the DB, as well as just storing the relative (not absolute!) paths of files in the DB (the actual files go on the file system). The former only scales while the volume of data being handled stays low. Usually, exploiting the FS and the DB together is a sound design and scales well.
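A minimal sketch of the second approach (paths in the DB, bytes on disk), using Python and SQLite as stand-ins for whatever server/DB combo you end up with - the table, column, and directory names here are all made up:

```python
import os
import sqlite3

STORE_ROOT = "cnc_store"  # hypothetical root; only relative paths go in the DB

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    rel_path TEXT UNIQUE NOT NULL,  -- relative, so the root can move later
    size_bytes INTEGER NOT NULL
)""")

def register_file(src_path, rel_path):
    """Copy a file under the store root and record its relative path in the DB."""
    dest = os.path.join(STORE_ROOT, rel_path)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    with open(src_path, "rb") as fin, open(dest, "wb") as fout:
        fout.write(fin.read())
    conn.execute("INSERT INTO files (rel_path, size_bytes) VALUES (?, ?)",
                 (rel_path, os.path.getsize(dest)))
    conn.commit()
```

The DB row stays tiny regardless of file size, so the 100MB+ graphics files never touch the database's buffer pool or transaction log.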

Think about it (this is specific to SQL Server) - if you're storing 100 MB files in the DB as blobs, just the load on the system in terms of ambient SQL Server transactions, page splits, indexes, etc. can be huge. I wouldn't be surprised if your transactions time out during the commit phase because other transactions are still being processed. Nevertheless, when you know up-front that you won't be handling many transactions, storing files as blobs might result in less cryptic code.

Also, think about what kind of machine (RAM, CPU) you will need... processing 100 MB files will definitely require more RAM.
 
Yeah, it's a trade-off. Using a database simplifies things over designing and implementing your own fault-tolerant file system. You might try a hybrid approach: store smaller files as blobs, and store the really big image files in their own folder system, with appropriate integrity checks.
 
I would go with a hybrid approach. A database is not going to be faster at serving the data than the filesystem.

The power of the DB will be in querying the data/properties of the files.


For example: I'm writing a Python/Django based photo gallery. The power in the system will be in storing all the properties (EXIF) and tags (favorite, person, place) about each photo in a DB for quick queries (e.g. give me all photos around Fred's birthday from 2002-2006, show me all photos I took at ISO 200, etc.). The photos themselves hold no real power; the data about them does.
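That metadata-in-the-DB idea can be sketched like this - schema and sample values are invented, and SQLite stands in for whatever backend the gallery actually uses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE photos (
    path  TEXT PRIMARY KEY,  -- where the file lives; the DB never stores pixels
    taken TEXT,              -- ISO-8601 date pulled from EXIF
    iso   INTEGER,           -- EXIF ISO speed
    tag   TEXT               -- e.g. a person, place, or 'favorite'
)""")
conn.executemany("INSERT INTO photos VALUES (?, ?, ?, ?)", [
    ("2004/fred_bday.jpg", "2004-06-12", 200, "fred"),
    ("2005/beach.jpg",     "2005-08-01", 400, "place"),
    ("2003/fred_cake.jpg", "2003-06-11", 200, "fred"),
])

# "All photos of fred between 2002 and 2006, shot at ISO 200" -- answered
# without ever opening an image file.
rows = conn.execute("""SELECT path FROM photos
                       WHERE tag = 'fred' AND iso = 200
                         AND taken BETWEEN '2002-01-01' AND '2006-12-31'
                       ORDER BY taken""").fetchall()
```

The query touches only small indexed columns, which is exactly why the DB shines here while the image bytes themselves can sit anywhere.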
 
I don't really think it's much of a trade-off; it's more of an option of last resort, or for when performance isn't an issue for you.

Database servers do not make good fileservers. A few large file requests will kill the performance of your database as they overwrite and fill your buffer cache with data that no one else will access. And unless your database is using direct disk access, there isn't a chance in hell it can retrieve the data faster than a well-tuned filesystem could (since the DB would have to get it from the filesystem itself first).
 