:( Some people use SQL Server like "Giant excel spread sheets"

Cogman

Lifer
Sep 19, 2000
10,286
145
106
Wow, this is really FRUSTRATING. I have the lovely task of setting up a new database system. The problem is, it must
A. Take data from the old system and
B. Maintain the old system.

The problem? The old system was set up by a retarded monkey (I swear). It is a RELATIONAL database, and they treated it as unrelational as possible. Most tables have no private key. Data is duplicated in several places. And whats worse, it looks like some tables are just plain ignored.

Here's an example. There is one table named "Companies" It as a short 3 letter abbreviation and the companies name. the state, the manager, ect. Well, instead of giving that table a private key and referencing that private key in other tables, The other tables will just put in the 3 letter abbreviation. Now, that wouldn't be so bad, if they kept a consistent naming scheme! Seriously, they did different abbreviations all over the place that have nothing to do with the company table. Effectively making the company table worthless. One company was name "Cordova" with a 3 letter abbr. of "CRD" but instead of using that abbr. the table has "CORDOVA".

My goodness, how can people be this dense. In that example above, I've seen at least 3 different tables that reference the same company in 3 different ways...
 
Last edited:

Cogman

Lifer
Sep 19, 2000
10,286
145
106
Here's a better example of the retardedness in action. There is one table with two columns Employees and Who. The ONLY difference I can tell between those two columns is that one column contains the employees username in caps, and the other is lowercase.... I'm not joking.


And a quick query confirms....
 
Last edited:

Cogman

Lifer
Sep 19, 2000
10,286
145
106
Why does this surprise you? I see this more often than not.

Oh but it makes things hard to work with. From just about any stand point.

The worst is columns like
Mo Tu We Th Fr Sa Su

Which serve, what purpose? Do people REALLY care about how many hours you worked in all the monday's of the month? It serves no purpose.
 

Kirby

Lifer
Apr 10, 2006
12,028
2
0
Here's a better example of the retardedness in action. There is one table with two columns Employees and Who. The ONLY difference I can tell between those two columns is that one column contains the employees username in caps, and the other is lowercase.... I'm not joking.


And a quick query confirms....

Replace all the Who entries with lyrics from The Who.
 

Cogman

Lifer
Sep 19, 2000
10,286
145
106
Replace all the Who entries with lyrics from The Who.

I would, but I seriously don't know how it is being used in older applications. For all I know, half look up who and the other half look up employees... Damn legacy software
 

KIAman

Diamond Member
Mar 7, 2001
3,342
23
81
Oh but it makes things hard to work with. From just about any stand point.

The worst is columns like
Mo Tu We Th Fr Sa Su

Which serve, what purpose? Do people REALLY care about how many hours you worked in all the monday's of the month? It serves no purpose.

Nobody has to justify why they want data and how they use it. I set up several hundred OLAP cubes for a BI data warehouse system and I was surprised at how many different ways people want to see data, but in the end, all you have to do is make it available and keep integrity.

If you already reached the step where you can identify many of the issues, you could proactively clean things up. If this is MSSQL, then you can do plenty of modifications without disrupting production systems.

Are there applications that rely on this schema or is the database a means for storage?
 

troytime

Golden Member
Jan 3, 2006
1,996
1
0
that sucks.

i once inherited a website and mysql db that had a column called 'yes'
it was an enum with two possibilities... 'yes' and 'no'

swear to christ.
 

Cogman

Lifer
Sep 19, 2000
10,286
145
106
Nobody has to justify why they want data and how they use it. I set up several hundred OLAP cubes for a BI data warehouse system and I was surprised at how many different ways people want to see data, but in the end, all you have to do is make it available and keep integrity.

If you already reached the step where you can identify many of the issues, you could proactively clean things up. If this is MSSQL, then you can do plenty of modifications without disrupting production systems.

Are there applications that rely on this schema or is the database a means for storage?

Unfortunately there are applications that rely on the schema. So just dumping the old setup is, sadly, out of the question. I know the reason they have the Mo-Su Columns.. It is because when entering information they are able to do it a week at a time. So instead of calculating what day each entry is, they do a straight insert into the DB.

This DB is split off from the production DB, (I'll keep it that way until I'm satisfied I won't break anything) but my thinking is to use triggers to get a new database communicating with the old one.
 

airdata

Diamond Member
Jul 11, 2010
4,987
0
0
Cogman : that's why you're making the big bucks :)

It does make things difficult though when you need to work with or around poorly setup structure.
 

brandonb

Diamond Member
Oct 17, 2006
3,731
2
0
I've had people at my company complain that something was "too normalized" because they had to use 4-5 joins in a SQL statement to link accounts to people to whatever it was they were trying to get. I'm like 4-5 joins big whoop. The thing that irks me the most is when you have a decent system, and some (not really) hotshot developer thinks its too complicated and rewrites it to be like a giant excel spread sheet.
 

Cogman

Lifer
Sep 19, 2000
10,286
145
106
I've had people at my company complain that something was "too normalized" because they had to use 4-5 joins in a SQL statement to link accounts to people to whatever it was they were trying to get. I'm like 4-5 joins big whoop. The thing that irks me the most is when you have a decent system, and some (not really) hotshot developer thinks its too complicated and rewrites it to be like a giant excel spread sheet.

Man, that does suck. I'm a big fan of "Store data once, in one place" not "Duplicate it wherever you feel like".

It makes things so much clearer to be able to just be able to say "I need information from this table, this table, and this table" because if you ever have to change something, it is as simple as editing one table, not every table.
 

KIAman

Diamond Member
Mar 7, 2001
3,342
23
81
I've had people at my company complain that something was "too normalized" because they had to use 4-5 joins in a SQL statement to link accounts to people to whatever it was they were trying to get. I'm like 4-5 joins big whoop. The thing that irks me the most is when you have a decent system, and some (not really) hotshot developer thinks its too complicated and rewrites it to be like a giant excel spread sheet.

It is all situational and there are lots compromises between a highly normalized structure vs denormalized structure but being too "complicated" should never be justification, LOL.

It makes things so much clearer to be able to just be able to say "I need information from this table, this table, and this table" because if you ever have to change something, it is as simple as editing one table, not every table.

Ahh, you are thinking like a developer here. Sometimes the tradeoff is better to have complicated schema editing to satisfy a bigger requirement, like performance, for example.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,695
4,658
75
Then again, at an insurance company I worked at for awhile, the boss wanted all the tables to contain field names in one column and values in another. Basically, he wanted a NoSQL database, but it didn't exist (or at least wasn't in common use) back then.
 

Markbnj

Elite Member <br>Moderator Emeritus
Moderator
Sep 16, 2005
15,682
14
81
www.markbetz.net
It is all situational and there are lots compromises between a highly normalized structure vs denormalized structure but being too "complicated" should never be justification, LOL.



Ahh, you are thinking like a developer here. Sometimes the tradeoff is better to have complicated schema editing to satisfy a bigger requirement, like performance, for example.

The database should always be as normalized as possible within performance constraints, imo. Have people forgotten what views were invented for? :).

I'm in a similar situation this week, Cogman. I work on the back-end content management for a very large digital music archive (1 million+ tracks). We routinely absorb content into the repository from labels large and small. The largest ones send us feeds almost daily in standardized formats with xml metadata and its all handled automatically. But then we get these one-off dumps that we have to process outside the pipe.

They decided to pick up this big chunk of Indian content. The publisher sent a bunch of folders with subdirectories for content and art, and a spreadsheet for each folder to tie the metadata in with the content.

Which would be fine except: a) the spreadsheets are all in different formats: different number of columns, different column ordering, different data formatting ("244" vs. "4:04" for duration, for example, switched right in the middle of a column for no reason); b) the artist, album, and track names in the spreadsheets almost never match the filenames on disk: in one case they preserved spaces, in another subbed in dashes, or they're misspelled. It's ridiculous. We already have two full days in massaging code to suck this stupid crap up.

/sigh
 

Cogman

Lifer
Sep 19, 2000
10,286
145
106
The database should always be as normalized as possible within performance constraints, imo. Have people forgotten what views were invented for? :).

I'm in a similar situation this week, Cogman. I work on the back-end content management for a very large digital music archive (1 million+ tracks). We routinely absorb content into the repository from labels large and small. The largest ones send us feeds almost daily in standardized formats with xml metadata and its all handled automatically. But then we get these one-off dumps that we have to process outside the pipe.

They decided to pick up this big chunk of Indian content. The publisher sent a bunch of folders with subdirectories for content and art, and a spreadsheet for each folder to tie the metadata in with the content.

Which would be fine except: a) the spreadsheets are all in different formats: different number of columns, different column ordering, different data formatting ("244" vs. "4:04" for duration, for example, switched right in the middle of a column for no reason); b) the artist, album, and track names in the spreadsheets almost never match the filenames on disk: in one case they preserved spaces, in another subbed in dashes, or they're misspelled. It's ridiculous. We already have two full days in massaging code to suck this stupid crap up.

/sigh

:) I hear yah. I talked with one of my coworkers about this (I'm the only one working on it.) And he said "Well, just don't judge the database too harshly, it has gone through several database servers and been managed by several programmers. It has just sort of grown organically".

To which my response was "I wish I had some roundup to kill that organic growth."

I really wish I could just start from scratch. For the system I'm fixing, it would be easier. It is all the interdependencies that really kills me though. Good luck with your project Mark. Screwing around with strange data sucks.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Cogman and Markbnj:
Code:
SELECT *
FROM omg INNER JOIN bloody_sucks WHERE 
  omg.wtf = bloody_sucks.craptastic
I feel very sorry for you both.
 

lozina

Lifer
Sep 10, 2001
11,711
8
81
that sucks.

i once inherited a website and mysql db that had a column called 'yes'
it was an enum with two possibilities... 'yes' and 'no'

swear to christ.

haha that's funny. The query must look great:

SELECT * FROM STUPID_TABLE WHERE YES = 'NO'
 

Evadman

Administrator Emeritus<br>Elite Member
Feb 18, 2001
30,990
5
81
select * from users where clue > 0
-------
(0) rows returned
 

tatteredpotato

Diamond Member
Jul 23, 2006
3,934
0
76
I recently got a project that involves a C program that runs on an embedded device. What I got initially was ALL the code in one file and heavy usage of global variables which I have been trying to add structure to without breaking other things (which is easy to do when you have so many globals). The desktop app to interface with it had over 30k lines in one file too.
 

Ancalagon44

Diamond Member
Feb 17, 2010
3,274
202
106
If anyone with more than 2 years of development experience insists on the Excel style database format, they should be euthanized. If you need to denormalize your tables that much for performance reasons, you need faster hardware.

@Cogman

Starting a new schema, appealing though it may be, is a huge task that should ideally involve requirements elicitation, analysis, and design. If you arent paid that well, dont do it, unless you think it would be less work than maintaining what they have.

I remember reading an opinion piece written I think in 2007 or 2008 by a guy who claimed that classic ASP is superior to ASP.Net. I lolled.
 

Cogman

Lifer
Sep 19, 2000
10,286
145
106
I recently got a project that involves a C program that runs on an embedded device. What I got initially was ALL the code in one file and heavy usage of global variables which I have been trying to add structure to without breaking other things (which is easy to do when you have so many globals). The desktop app to interface with it had over 30k lines in one file too.

See, I could handle that :) At least changing the structure of a program doesn't really break things (if done right).
 

Apathetic

Platinum Member
Dec 23, 2002
2,587
6
81
I really wish we could tell you that what you're running into is rare but, sadly, it's far too common. This is why I keep telling people "I'm not jaded, I'm experienced".

Dave
 

tatteredpotato

Diamond Member
Jul 23, 2006
3,934
0
76
See, I could handle that :) At least changing the structure of a program doesn't really break things (if done right).

It's a pain but at least I can have a somewhat elegant solution at the end (no external interfaces even need preserved). I hate working on something knowing it's being done in a terrible way.