
Database Question ( MySQL particularly... ) LOTS of tables, or ONE BIG ONE?

Superwormy

Golden Member
Feb 7, 2001
1,637
0
0
All kinda theoretical at this point but...

Using MySQL currently, though that might change later. Say I have a table with only about 4 KB per record, but an anticipated number of records somewhere around 20 MILLION. That's a lot of data.
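
Just to make it concrete, I'm picturing something like this (column names made up, the real record has more fields):

CREATE TABLE records (
    myprimarykey INT UNSIGNED NOT NULL,
    payload      TEXT NOT NULL,          -- stand-in for the ~4 KB of record data
    PRIMARY KEY (myprimarykey)
);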

Now, would it be better to break the table into multiple tables, say something like this:

table_a ( records whose main field starts with the letter a )
table_b ( records whose main field starts with the letter b )
table_c ( ... etc., you get the point )

OR

Would it be better to just leave it in one HUGE table?


If it helps any, the only selects that are done select by the IDX of the record, the Primary key. So the select is just:
SELECT * FROM table WHERE myprimarykey = 233

If it was split into multiple tables, it would just change to:
SELECT * FROM table_" . $firstcharacter . " WHERE myprimarykey = 233


Any thoughts? How much faster would 26 smaller tables be than 1 huge table when selecting by index?
 

EagleKeeper

Discussion Club Moderator / Elite Member
Staff member
Oct 30, 2000
42,589
5
0
With a smart system, multiple tables would be better (faster) if you can split them across drives.
 

notfred

Lifer
Feb 12, 2001
38,241
4
0
Would it be faster to look through a card catalog if it was just one long drawer, or a different drawer for each letter of the alphabet?
 

manly

Lifer
Jan 25, 2000
13,292
4,064
136
I'm gonna be contrarian and say the manual split-table approach won't necessarily help out that much. You can physically partition a database (or the underlying data storage) without modifying the logical database anyway.
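
(Purely as a sketch of what I mean by that, and assuming a MySQL version that has native partitioning at all -- the syntax below is from later releases, so treat it as hypothetical. The point is the queries stay identical to the one-table case:)

CREATE TABLE records (
    myprimarykey INT UNSIGNED NOT NULL,
    payload      TEXT NOT NULL,
    PRIMARY KEY (myprimarykey)
)
PARTITION BY HASH(myprimarykey)   -- physical split into 26 pieces, invisible to the queries
PARTITIONS 26;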

For starters, it's unlikely you'll achieve any kind of uniform distribution across the 26 tables. Secondly, this sounds like the basic case where a secondary index would help out a lot. As usual, take my opinion with a grain of salt. I'm just another AT geek, not a professional DBA. ;)

The only real question (that I can't answer) is how scalable is MySQL? Are 10+ GB tables* in common use with MySQL?

* 10 GB nowadays isn't as big as it sounds ;)
 

Superwormy

Golden Member
Feb 7, 2001
1,637
0
0
Anyone else? Other opinions?

I guess another issue to keep in mind is that I believe MySQL limits the total number of tables that can be open at one time... will tables have to be opened/closed more often with this many more tables, and will this slow things down?
 

Armitage

Banned
Feb 23, 2001
8,086
0
0
I think that indexing the table should logically accomplish what you hope to accomplish physically by splitting up the tables. Being able to split the tables between multiple spindles may help, but you may be better off RAIDing those spindles together.
But I'm guessing that the only way to tell is to try it.
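
One cheap way to "try it" is to ask MySQL how it plans to run the lookup (table name assumed from the sketch earlier in the thread):

EXPLAIN SELECT * FROM records WHERE myprimarykey = 233;

If the primary key is doing its job you should see type = const and rows = 1, i.e. a single index lookup regardless of how many rows are in the table.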
 

glugglug

Diamond Member
Jun 9, 2002
5,340
1
81
The time that it takes your code to figure out what table to look in for a particular piece of data is going to be much more than the difference in time for SQL to check the index on a larger table.

The time it takes SQL to union results out of multiple tables is going to be a hell of a lot more than it would take to just query with the big table.

So you're much better off staying with the one big table.
 

Superwormy

Golden Member
Feb 7, 2001
1,637
0
0
Aight, but it won't really take me any extra time to determine which table to grab from if I sort everything by the first letter of the main record, right?

So I can just do ( PHP code here ):
SELECT * FROM table_" . $singlechar . " WHERE ID = 25


And I'll only ever be doing single selects, I really should never need to do a JOIN to join the result sets... so then do I still go with one table?
 

Armitage

Banned
Feb 23, 2001
8,086
0
0
If it's all going to be on one physical disk spindle, I don't think you'll see any advantage to splitting up the tables. Across multiple spindles it might help when the load really cranks up. But as I said before, RAID might buy you the same advantage without the hassle of dealing with multiple tables.
What kind of load do you expect?

Sounds like an interesting experiment though ... sometimes all you can do is test it.
Let us know what you find, cuz I think most of the responses you've had so far are just intuition or WAGS.
 

Superwormy

Golden Member
Feb 7, 2001
1,637
0
0
I will definitely be trying it both ways if I can, and I'll let you guys know what I find out. Might be at least another few weeks before I find out, though. At least.
 

EagleKeeper

Discussion Club Moderator / Elite Member
Staff member
Oct 30, 2000
42,589
5
0
Originally posted by: Superwormy
Anyone else? Other opinions?

I guess another issue to keep in mind is that I believe MySQL limits the total number of tables that can be open at one time... will tables have to be opened/closed more often with this many more tables, and will this slow things down?

I know that Access has a limit of 30-odd tables that can be open at once.
I suspect that MySQL may have a similar limit.

 

glugglug

Diamond Member
Jun 9, 2002
5,340
1
81
You are better off calling an sp, with the key value you are looking for as a param, so that SQL only has to parse the select statement and determine an optimization strategy once, rather than having PHP build the select statement dynamically.

Although, for the multiple-table case, PHP would handle building the table name better than using a SQL exec for that kind of logic inside an sp -- having SQL just query one huge table still ends up being by far the best option.

The only exception to this would be if you actually had multiple database servers and distributed the data across the servers, especially if the data was too big to become cached on one server.
 

chsh1ca

Golden Member
Feb 17, 2003
1,179
0
0
Access is a flatfile database format akin to DBF.
MySQL is a database server capable of handling hundreds of simultaneous connections.

They are nothing alike.

The answer to your question on open tables is that it's configurable: http://www.mysql.com/doc/en/Table_cache.html

Generally speaking, the multi-table approach is a BAD idea. We've got a database of image information that's indexed here: something along the lines of 3 million rows taking up roughly 63.7 GB of data, all in a single table on MS SQL Server 2000. The drives the thing is on are in RAID 5, and more importantly, the server has 8 GB of RAM.

RAM is more likely to be a key factor in indexing. If the entire index is loaded into RAM with room to spare, you shouldn't have any issues pulling up any record, depending on how MySQL handles indexes.

You're far more likely to run into slowdowns within your application than on the DB side, unless you're talking 10+ million rows.
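
If you want to sanity-check that on your own box, MySQL will tell you how big the index actually is and how much cache it has to work with (variable names here are the MyISAM-era ones, table name from the earlier sketch):

SHOW TABLE STATUS LIKE 'records';          -- Index_length = total index size in bytes
SHOW VARIABLES LIKE 'key_buffer_size';     -- RAM set aside for caching MyISAM index blocks
SHOW VARIABLES LIKE 'table_cache';         -- the open-tables limit from the link above

As long as Index_length fits comfortably inside key_buffer_size, a primary-key lookup should stay fast no matter how many rows are underneath it.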
 

Superwormy

Golden Member
Feb 7, 2001
1,637
0
0
Well, heh, that's the thing, we're talking 20+ million rows :)

glugglug, what do you mean by this:
You are better off calling an sp, with the key value you are looking for as a param, so that SQL only has to parse the select statement and determine an optimization strategy once, rather than having PHP build the select statement dynamically.


Thanks!
 

glugglug

Diamond Member
Jun 9, 2002
5,340
1
81
If you send the SQL server "select * from xtable where ID=123", that statement needs to get parsed and the optimizer needs to figure out that the index on ID is the best one to use before it executes, and this happens every time you send the server the select.

If you call a sp that looks like this:

create procedure blah @IDparam as int
AS
select * from xtable where ID = @IDparam


that SP gets compiled and run through the optimizer ONCE the first time you run it, and the query plan to use the index gets cached. For a select of a single row based on an indexed field like this, the actual search itself is tiny compared to the optimization process.
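
(That's MSSQL-style syntax; if your MySQL version supports stored procedures at all, the rough equivalent, using the made-up names from earlier in the thread, would be:)

DELIMITER //
CREATE PROCEDURE get_record(IN p_id INT)
BEGIN
    SELECT * FROM records WHERE myprimarykey = p_id;
END //
DELIMITER ;

CALL get_record(233);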
 

manly

Lifer
Jan 25, 2000
13,292
4,064
136
Particularly for the one-table approach, you could also use a prepared statement, which is also cached by the DBMS. You don't need to implement a stored procedure until you know you need one.
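
(Sketch only, using MySQL's server-side PREPARE syntax and the made-up names from above; most client libraries expose the same thing through their own API:)

PREPARE get_rec FROM 'SELECT * FROM records WHERE myprimarykey = ?';
SET @id = 233;
EXECUTE get_rec USING @id;
DEALLOCATE PREPARE get_rec;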
 

ztadmin

Member
Feb 11, 2002
199
0
0
What the hell are you doing that needs a 20-million-row table? I mean, that's pretty big.
 

joinT

Lifer
Jan 19, 2001
11,172
0
0
Originally posted by: notfred
Would it be faster to look through a card catalog if it was just one long drawer, or a different drawer for each letter of the alphabet?

ROFL - good answer