• We’re currently investigating an issue related to the forum theme and styling that is impacting page layout and visual formatting. The problem has been identified, and we are actively working on a resolution. There is no impact to user data or functionality, this is strictly a front-end display issue. We’ll post an update once the fix has been deployed. Thanks for your patience while we get this sorted.

MySQL select row order

neegotiator

Golden Member
Had a question regarding a simple SELECT in MySQL. Assuming the table has no primary key or index defined, I don't really care the order records were inserted, just how they're stored in table.

If I do a SELECT * FROM `table` LIMIT n, will I get back the rows in the same order as how it's stored in the table? I've tried on small sample and seems the case, but just wanted to make sure.

Just for some context, there's a ~3 billion row table that we want to try and transfer from AWS RDS MySQL instance to S3. So just want to be able to transfer in chunks rather than whole thing at once.
 
A DBA once told me that there is no guarantee on the order that comes back, but I've never seen anything except the order of the PK. Still, it is recommended to use ORDER BY if you have to guarantee an ordering.

According to this, MySQL will do a table scan if no ORDER BY is given: http://forums.mysql.com/read.php?21,239471,239688#msg-239688

SELECT * FROM tbl -- this will do a "table scan". If the table has never had any DELETEs/REPLACEs/UPDATEs, the records will happen to be in the insertion order, hence what you observed.

If you had done the same statement with an InnoDB table, they would have been delivered in PRIMARY KEY order, not INSERT order. Again, this is an artifact of the underlying implementation, not something to depend on.

Even then, that commenter says not to depend on it... so in general it sounds like you should always use ORDER BY if you want to guarantee an ordering for a SELECT statement.
 
Last edited:
Thanks much Savatar.

Yeah if the table had an auto-incremented ID that was the PK, I wouldn't expect that to be in same order either. But since this table has none, does not currently experience inserts/deletes, and I don't care about insertion order (just the order its currently stored), was hoping to slide by without an ORDER BY of some kind and just depend on LIMIT. Even still, I'm nervous without one. The intention was to transfer from the table in chunks to CSV's in S3 rather than the whole table in one go.

Will have to think this over some more.
 
while i'm not an expert in mysql (i'm and oracle dba), all RDBMS are pretty similar in this regard. regular tables are called heap tables.

that means that the rows can be stored the way the databases feels is most appropriate.
most databases has some implementation of a sorted index table or IOT (Index Organized Table in oracle) where the data is actually stored in a btree index structure.

this means that a pk is a must and the data is ordered by the pk.

baring that, you can do a
"Create Table As Select" or CTAS using an ordered select clause and it should give you what you want.
 
Why not simply dump the table once to CSV locally and cut it into chunks locally and then transfer the chunks? That will guaranty that you get everything as of the time of dumping the table as well.

The problem I see with doing the "order by" and limit is that it will need to process through the entire database before it can start outputting records, and with a large dataset, this will incur a massive amount of computation and memory requirement, and it will be repeated each time you shift to the next set of output values (unless you simply dump the entire results into a new table using a primary key)..
 
Last edited:
Thanks all for the comments!

Exactly, wanted to avoid using resources to process that table (i.e. - ORDER BY, temp table with PK, etc...). We came up to essentially same conclusion as well that dump into CSV on another instance, and using that to split then transfer is best option.
 
Thanks all for the comments!

Exactly, wanted to avoid using resources to process that table (i.e. - ORDER BY, temp table with PK, etc...). We came up to essentially same conclusion as well that dump into CSV on another instance, and using that to split then transfer is best option.

There's no guarantee if you don't use an order by clause. There's no performance penalty to ordering by a table's key, so do that with a limit to transfer it in chunks.
 
There's no guarantee if you don't use an order by clause. There's no performance penalty to ordering by a table's key, so do that with a limit to transfer it in chunks.

There are no primary keys/indexes. That is the problem. So any/all "order by"'s will incur a performance penalty.
 
Back
Top