Need help with design logic in MS SQL

TechBoyJK

Lifer
Oct 17, 2002
16,699
60
91
I have a client with a table with about 40 columns of data and about 5 million records. I know with different forms of layouts, some say, break it up into smaller tables.. but 75% of the time, requests for the info in the tables is a request for everything. rarely is there a request for little bits of data from each record, here and there. I was thinking it would be better in this case, to leave it all in one table to avoid creating additional keys when they are not really needed...
 

WannaFly

Platinum Member
Jan 14, 2003
2,811
1
0
Are there performance issues?
40 Columns is not that much, an in all relativity 5mil rows isnt that many for MSSQL. If there are performance issues, perhaps look into your indexes and tune the server a bit better?

Can you use Horizontal partitioning, I.E. Table1 for lastname A-M, Table2 for N-Z? This is a viable solution in many cases. You can create a SP or view to put the logic to work with the correct table and it is seamless to the end user.
 

Markbnj

Elite Member <br>Moderator Emeritus
Moderator
Sep 16, 2005
15,682
14
81
www.markbetz.net
Ah, the old normalization vs. denormalization argument. I love the smell of ERD purists in the morning... er... ewww, no I don't.

The argument in favor of normalization:

Seperate entities in the database along the logical and semantic boundaries that exist in the problem domain, and you will make the design more flexible, modular, and maintainable. Changes in a single entity from the problem domain will not affect large swaths of the database. If you have different entities that both refer to the same base entity type (i.e. the database equivalent of class inheritance, where the logical entities with data in table b, and table c, both also include references to table a) then you won't repeat columns.

The argument in favor of denormalization:

You've given it. Performance. If you don't have any repeating subtypes, and you are always querying for a set of tables using a join, then why not roll the tables up into a single flat table with all the columns?

In general I prefer a normalized schema that reflects the problem domain. It's just going to cause less trouble in terms of comprehension in the future. And I would venture to say that on modern platforms the performance impact of properly constructed joins just isn't that serious to justify cramming the schema into one table. But if it were, for a given application, I wouldn't hesitate.