SQL Server Issues

DaShen

Lifer
Dec 1, 2000
10,710
1
0
This is a really weird question, but I am having issues uniquely identifying records from a flat file that I need to import into a SQL Server DB. The problem is that the best and most efficient way to uniquely indentify the records, because the records can match in almost everyway other than a free text field, is with a time stamp. I don't care as to what "time" old records have because all that was given was a date, but I wanted to figure out a way to set up the system so the first record would have 00:00:00 AM, the second record would have 00:00:01 AM.

I was wondering if anyone has worked in SQL Server enough to figure out a way to automatically do this. The database is for small business needs only, and from previous analysis, I am expecting that the amount of records in the flat file format is roughly between 500,000-700,000 records. Anyone know a good way to do this? I would be extremely grateful.

This is a major project overload because someone's old poorly written project has crashed and they want a solution from me soon. Please help.
 

KLin

Lifer
Feb 29, 2000
30,232
574
126
Why not set a new field as the identity field, and increment it by 1? That's basically the same as an autonumber field from MS Access.
 

DaShen

Lifer
Dec 1, 2000
10,710
1
0
The problem with that is the high traffic and many thread on the server that will accompany this program on rollout. If I use an IDENTITY row key, I make the server have to calculate the row id in the background for many different processes going on.

In other words, I could have 200 people all INSERT a record at once and the server would have to queue up the INSERTS in a way to handle the IDENTITY row. With a timestamp, you are guaranteed unique records even if multiple users INSERT at the same time. And no overhead is done because the server just "stamps" the time and moves on to the next record.
 

MrChad

Lifer
Aug 22, 2001
13,507
3
81
Originally posted by: DaShen
The problem with that is the high traffic and many thread on the server that will accompany this program on rollout. If I use an IDENTITY row key, I make the server have to calculate the row id in the background for many different processes going on.

In other words, I could have 200 people all INSERT a record at once and the server would have to queue up the INSERTS in a way to handle the IDENTITY row. With a timestamp, you are guaranteed unique records even if multiple users INSERT at the same time. And no overhead is done because the server just "stamps" the time and moves on to the next record.

Isn't that going to depend on the granularity of the timestamp? I wouldn't rely on timestamps for uniqueness. An identity record is the safer option here.
 

bunker

Lifer
Apr 23, 2001
10,572
0
71
Use an identity field. 500,000 to 700,000 records is nothing for sql server.
 

DaShen

Lifer
Dec 1, 2000
10,710
1
0
Originally posted by: bunker
Use an identity field. 500,000 to 700,000 records is nothing for sql server.

:) Yeah it is a small business need only. Not even a terabyte. ;) But it is expected to grow to that amount within the next year or so because of the high usage.

So the IDENTITY field is really okay in this situation. Has anyone done a TRACE analysis on anything similar to my needs thanks.
 

bunker

Lifer
Apr 23, 2001
10,572
0
71
Originally posted by: DaShen
Originally posted by: bunker
Use an identity field. 500,000 to 700,000 records is nothing for sql server.

:) Yeah it is a small business need only. Not even a terabyte. ;) But it is expected to grow to that amount within the next year or so because of the high usage.

So the IDENTITY field is really okay in this situation. Has anyone done a TRACE analysis on anything similar to my needs thanks.

500 to 700k records, assuming all are text/numeric fields, shouldn't even approach a GB, let alone a TB of storage.

I've got a database with 670 tables, many of which have well over a million rows, and the whole thing is under 10GB (not including the log file).

Use the identity field :).
 

NeoV

Diamond Member
Apr 18, 2000
9,504
2
81
another vote for identity field...no sure thing that timestamping will be unique, a second is a long time to a computer
 

DaShen

Lifer
Dec 1, 2000
10,710
1
0
Bad news guys.... :( My boss refuses to use an identity field. When she has her mind set on something, she won't change it.

But I talked her into doing the time field stamping herself using a SAS batch program. Once she figures out how much of a pain it is going to be and how much time it will take to do that, even in SAS, I think the IDENTITY field will win out. Oh well. Thanks for the help guys.


***************************

Originally posted by: bunker
500 to 700k records, assuming all are text/numeric fields, shouldn't even approach a GB, let alone a TB of storage.

I've got a database with 670 tables, many of which have well over a million rows, and the whole thing is under 10GB (not including the log file).

Use the identity field :).

I was exaggerating. ;) It probbably won't even approach 10 Gigs. ;)
 

bunker

Lifer
Apr 23, 2001
10,572
0
71
Any chance you could use a multi-field key? Would any combinations of fields in a row create a unique identifier?

What is her objection to using an IDENTITY field? If she even gave you a reason that is ;).
 

MrChad

Lifer
Aug 22, 2001
13,507
3
81
Using an identity field should be faster, as an integer column only occupies 4 bytes of spaces, making the index generated for the identity smaller and faster.
 

DaShen

Lifer
Dec 1, 2000
10,710
1
0
Originally posted by: bunker
Any chance you could use a multi-field key? Would any combinations of fields in a row create a unique identifier?

What is her objection to using an IDENTITY field? If she even gave you a reason that is ;).

Nope.

I already analyzed for that. The old flat file allowed for duplications except for the large text field. Think 4000-8000 characters long. :roll:

I am not indexing that.
 

ryokum

Member
May 5, 2004
112
0
0
Glad I don't work for your boss. Is she a DBA also? Tell her to figure it out since she apparently has a vast knowledge of SQL.
 

Rage187

Lifer
Dec 30, 2000
14,276
4
81
Originally posted by: DaShen
Bad news guys.... :( My boss refuses to use an identity field. When she has her mind set on something, she won't change it.

But I talked her into doing the time field stamping herself using a SAS batch program. Once she figures out how much of a pain it is going to be and how much time it will take to do that, even in SAS, I think the IDENTITY field will win out. Oh well. Thanks for the help guys.


hey look, I found your underlying problem.
 

bunker

Lifer
Apr 23, 2001
10,572
0
71
Originally posted by: Rage187
Originally posted by: DaShen
Bad news guys.... :( My boss refuses to use an identity field. When she has her mind set on something, she won't change it.

But I talked her into doing the time field stamping herself using a SAS batch program. Once she figures out how much of a pain it is going to be and how much time it will take to do that, even in SAS, I think the IDENTITY field will win out. Oh well. Thanks for the help guys.
hey look, I found your underlying problem.
My boss, female, is a damn good Oracle DBA. It has nothing to do with being a woman, it has everything to due with being a clueless IT manager.