need help with an MS SQL import issue

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
I'm trying to import a rather large flat file of data (15GB, unknown number of rows, 6 columns)

I am creating a package via MS Visual Studio and have the data being imported via a "Data Flow" task.

One of the fields is a date, and I have its datatype in the Flat File Connection properties set to DT_DBDATE (database date).

My column in SQL is set to "date" datatype.

When importing I get the following error for row 2492473 (rows before it import just fine). I can't edit the flat file, as it's too large to open in anything, and I can't seem to get the package to ignore this error.

Suggestions?


Information: 0x40043008 at Data Flow Task, SSIS.Pipeline: Post Execute phase is beginning.
Information: 0x402090DD at Data Flow Task, Flat File Source [1]: The processing of file "C:\!!!! SQL STORAGE\CLS table files\ACTIVE.TAB" has ended.
Error: 0xC0202009 at Data Flow Task, SQL Server Destination [66]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E07.
An OLE DB record is available. Source: "Microsoft SQL Server Native Client 10.0" Hresult: 0x80040E07 Description: "Error converting data type DBTYPE_DBDATE to date.".
Error: 0xC02092B5 at Data Flow Task, SQL Server Destination [66]: A commit failed.
Error: 0xC0047018 at Data Flow Task, SSIS.Pipeline: component "SQL Server Destination" (66) failed the post-execute phase and returned error code 0xC02092B5.
Information: 0x4004300B at Data Flow Task, SSIS.Pipeline: "component "SQL Server Destination" (66)" wrote 2492473 rows.
 

Evadman

Administrator Emeritus / Elite Member
Feb 18, 2001
30,990
5
81
The data in that row is outside the range allowed by the date datatype in MS SQL, or it is not a date at all. DT_DBDATE can only take a year, month, and day. If the month and day are flipped in that data element, it won't convert. I've never tried using DT_DBDATE on a field with an hour, minute, and second, so that may be in the row too. The date may also be earlier than the server allows, and you don't mention the version of SQL Server you are importing into: 2008 R2's date type will take any AD date, but none BC, while if you are importing into 2005, the earliest datetime is January 1, 1753. Importing a date in the 15th century will fail conversion.

Regardless, there's something wrong with that row. You need to pull that row out of the file so you can inspect it, and then the answer should be clear. Use .NET or SSIS to pull out that row and write it to another file you can open.
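
To see why a given value fails, you can check the same rules outside SSIS with a few lines of .NET. A rough sketch only; the "yyyy-MM-dd" format string is an assumption about how dates look in your file, so adjust it to match:

Code:
using System;
using System.Globalization;

class DateCheck
{
    // datetime (SQL 2005) bottoms out at 1753-01-01; the date type in 2008+
    // accepts anything from year 1 on.
    static readonly DateTime DatetimeFloor = new DateTime(1753, 1, 1);

    // True if the raw string would survive conversion to a datetime column.
    static bool IsLoadable(string raw)
    {
        DateTime parsed;
        // "yyyy-MM-dd" is an assumed format; match it to your flat file.
        if (!DateTime.TryParseExact(raw, "yyyy-MM-dd", CultureInfo.InvariantCulture,
                                    DateTimeStyles.None, out parsed))
            return false;               // not a date, or month/day flipped past 12
        return parsed >= DatetimeFloor; // in range for a datetime column
    }

    static void Main()
    {
        Console.WriteLine(IsLoadable("2010-06-15")); // True
        Console.WriteLine(IsLoadable("2010-15-06")); // False - no month 15
        Console.WriteLine(IsLoadable("1492-10-12")); // False for datetime targets
    }
}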
 
Last edited:

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
The data in that row is outside the range allowed by the date datatype in MS SQL, or it is not a date at all. DT_DBDATE can only take a year, month, and day. If the month and day are flipped in that data element, it won't convert. I've never tried using DT_DBDATE on a field with an hour, minute, and second, so that may be in the row too. The date may also be earlier than the server allows, and you don't mention the version of SQL Server you are importing into: 2008 R2's date type will take any AD date, but none BC, while if you are importing into 2005, the earliest datetime is January 1, 1753. Importing a date in the 15th century will fail conversion.

Regardless, there's something wrong with that row. You need to pull that row out of the file so you can inspect it, and then the answer should be clear. Use .NET or SSIS to pull out that row and write it to another file you can open.

Understood on all the above.
I am just getting my feet wet here with more serious SQL stuff...
How exactly can I use SSIS to pull/view that one row?
 

KLin

Lifer
Feb 29, 2000
29,500
125
106
Why not import the data into a table where all the fields are text, and see what the date value is for that row number?
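
If you go that route, pulling the suspect row back out is one query away. Here's a rough .NET sketch, assuming the staging table gets an IDENTITY column (called RowNum here) to preserve the file order; all of the names are placeholders:

Code:
using System;
using System.Data.SqlClient;

class InspectRow
{
    static void Main()
    {
        // Connection string and object names are made up - adjust to your server.
        const string connStr = "Server=.;Database=Staging;Integrated Security=true";
        using (SqlConnection conn = new SqlConnection(connStr))
        using (SqlCommand cmd = new SqlCommand(
            "SELECT * FROM dbo.ActiveStaging WHERE RowNum = @n", conn))
        {
            cmd.Parameters.AddWithValue("@n", 2492473);
            conn.Open();
            using (SqlDataReader rdr = cmd.ExecuteReader())
                while (rdr.Read())
                    for (int i = 0; i < rdr.FieldCount; i++)
                        Console.WriteLine("{0} = {1}", rdr.GetName(i), rdr[i]);
        }
    }
}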
 

Evadman

Administrator Emeritus / Elite Member
Feb 18, 2001
30,990
5
81
Understood on all the above.
I am just getting my feet wet here with more serious SQL stuff...
How exactly can I use SSIS to pull/view that one row?

Sorry, I didn't realize that you were that new. Basically, you can define a number of rows to skip and skip that many, or you can keep track of the row number as you read and, when you get close to that row, write the data out to another flat file in the same format.
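
For instance, something along these lines (a minimal sketch; the path is the one from your log, and a header row, if the file has one, will throw the count off by one):

Code:
using System;
using System.IO;

class ExtractRow
{
    static void Main()
    {
        string source = @"C:\!!!! SQL STORAGE\CLS table files\ACTIVE.TAB";
        const long target = 2492473; // the row the SSIS log complained about

        long rowNum = 0;
        using (StreamReader reader = new StreamReader(source))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                rowNum++;
                // Write the suspect row plus one neighbor on each side for context.
                if (rowNum >= target - 1 && rowNum <= target + 1)
                    Console.WriteLine("{0}: {1}", rowNum, line);
                if (rowNum > target + 1)
                    break; // no need to read the rest of a 15 GB file
            }
        }
    }
}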

Why not import the data into a table where all the fields are text, and see what the date value is for that row number?

I agree with this. It is a great way that doesn't require specialized knowledge. It is possible that something is missing in that row though, like the row or column delimiter. That may cause a text import to fail too. Hopefully, that isn't the problem, because those suck to solve.
 

Oyster

Member
Nov 20, 2008
151
0
0
You can always put breakpoints inside your packages. The best way to handle this *inside* SSIS is to create a script task - basically, to debug the contents. Something like

if (++rowIndex == 2492474)                 // rowIndex: a counter you increment per row
    System.Diagnostics.Debugger.Break();   // break here and inspect the row's values

The above is a C# snippet - I am assuming you're using VS 2008 or above. I agree with the other posts - you can always bring the data into a temp table inside SQL Server and then see what's going on with your data.

If you deploy your package to production, you'll probably want to handle these kinds of errors gracefully instead of bailing on the entire process.
 

KLin

Lifer
Feb 29, 2000
29,500
125
106
Sorry, I didn't realize that you were that new. Basically, you can define a number of rows to skip and skip that many, or you can keep track of the row number as you read and, when you get close to that row, write the data out to another flat file in the same format.



I agree with this. It is a great way that doesn't require specialized knowledge. It is possible that something is missing in that row though, like the row or column delimiter. That may cause a text import to fail too. Hopefully, that isn't the problem, because those suck to solve.

I'm going to guess it's not a valid date value.
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
So even if I identify the problem by importing via text, then what? How do I eliminate that error? Manually edit the offending data, export the text database to a flat file, then reimport that new flat file to the proper database?

Edit: Err.... just go text database to proper database, I guess. No need for the flat-file middleman. Duh.
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
(excuse me while I think out loud for a bit)

you know... as I'm thinking more about this, the problem is a bit bigger. This flat file is created via an export from a different proprietary database we use in our office. The idea is that tables are exported from this database daily/weekly/monthly into SQL so we can have a (much) better reporting tool to use (the database program's own reporting tools suck).

So the data being exported is "wrong" to start with. Every night this table is exported and then in turn imported, I'd have errors with it. I need to either allow these errors to be ignored/resolved on import within the package, or I am going to get a failed import whenever there is "bad" data... and I can't have that happen.

I think what I need to do is have bad data ignored and dumped into an alternative database... "ERROR.ACTIVE" (the database name is "ACTIVE") or something.

Anyone care to point me in that direction? Does that make sense?
 

WannaFly

Platinum Member
Jan 14, 2003
2,811
1
0
I had a similar problem.
My solution was adding a script task that took in the date columns and, if they were invalid, defaulted them to 1/1/1900.

Code:
Try
    'check if DOB is valid
    If IsDate(Row.dob) Then
        'good date
        If DatePart("yyyy", Row.dob) < 1700 Then
            Row.OutDOB = CDate("1-JAN-1900")
            Row.OutBirthday = CDate("1-JAN-1900")
        Else
            Row.OutDOB = CType(Row.dob, Date)
            Row.OutBirthday = CType(Row.dob, Date)
        End If
    Else
        'bad date
        Row.OutDOB = CDate("1-JAN-1900")
        Row.OutBirthday = CDate("1-JAN-1900")
    End If

    'birthday is always set to the same value as DOB
Catch ex As Exception
    Row.OutBirthday = CDate("1-JAN-1900")
    Row.OutDOB = CDate("1-JAN-1900")
End Try
 

Markbnj

Elite Member / Moderator Emeritus
Moderator
Sep 16, 2005
15,682
13
81
www.markbetz.net
It's not that unusual to have to massage data between the extract and the load. First step is to know exactly what the error is and how to resolve it. You can then add an additional step, i.e. export to a text file, massage the text file, import the text file into another db.
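
As a sketch of that middle "massage" step in .NET (the paths, the tab delimiter, and the date sitting in field 3 are all assumptions about the file):

Code:
using System;
using System.IO;

class MassageFile
{
    static void Main()
    {
        using (StreamReader src = new StreamReader(@"C:\export\ACTIVE.TAB"))
        using (StreamWriter dst = new StreamWriter(@"C:\export\ACTIVE.clean.TAB"))
        {
            string line;
            while ((line = src.ReadLine()) != null)
            {
                string[] fields = line.Split('\t');
                DateTime d;
                // Default unparseable dates, along the lines of WannaFly's script above.
                if (fields.Length > 3 && !DateTime.TryParse(fields[3], out d))
                    fields[3] = "1900-01-01";
                dst.WriteLine(string.Join("\t", fields));
            }
        }
    }
}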
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
It's not that unusual to have to massage data between the extract and the load. First step is to know exactly what the error is and how to resolve it. You can then add an additional step, i.e. export to a text file, massage the text file, import the text file into another db.

Right.
I understand (now) that that is a step I could do.
But what I really need is a safety valve for when this flatfile is imported nightly/weekly. I need the package to NOT error out, and to take any bad data and dump it somewhere else for identification and clean up.

I'm pretty sure that is possible. I just need to figure out how to do it. :)
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
I'm now importing just 4 columns as varchar(50) to get the data in.
I'll then retrieve row 2492473 just to see WTF I'm dealing with.
 

WannaFly

Platinum Member
Jan 14, 2003
2,811
1
0
Homerboy: did you see my recommendation? You can also put breakpoints in the script task, which will let you see the problematic data.

Have you tried opening the file with Notepad++? I've used it on 2GB files with no problems, but yours is much larger, so I'm not sure if it'll work for you or not.
 

AyashiKaibutsu

Diamond Member
Jan 24, 2004
9,306
3
81
Yea, I'd try Notepad++. It's worked for me on files Notepad chokes on, but I haven't used it on anything that large either.
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
Craaaaaaaaap.
54M rows in (dumping all to txt columns)


Error: 0xC02020C7 at Data Flow Task, SQL Server Destination [66]: The attempt to send a row to SQL Server failed with error code 0x80004005.
Error: 0xC0047022 at Data Flow Task, SSIS.Pipeline: SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component "SQL Server Destination" (66) failed with error code 0xC02020C7 while processing input "SQL Server Destination Input" (82). The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.
Error: 0xC02020C4 at Data Flow Task, Flat File Source [1]: The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020.
Error: 0xC0047038 at Data Flow Task, SSIS.Pipeline: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Flat File Source" (1) returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
Information: 0x40043008 at Data Flow Task, SSIS.Pipeline: Post Execute phase is beginning.
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
I've used Notepad++ on 2GB+ files myself.
13.7GB scares me though :)


Again, even cleaning up this ONE flat file doesn't help the bigger picture.
I need a way for error rows to A) not bomb the whole import and B) get dumped to an error database or error flat file to be reviewed.
 

Markbnj

Elite Member / Moderator Emeritus
Moderator
Sep 16, 2005
15,682
13
81
www.markbetz.net
Right.
I understand (now) that that is a step I could do.
But what I really need is a safety valve for when this flatfile is imported nightly/weekly. I need the package to NOT error out, and to take any bad data and dump it somewhere else for identification and clean up.

I'm pretty sure that is possible. I just need to figure out how to do it. :)

Well, the package won't error out if you know what the problem is, and you massage it out before the import.
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
sidenote/question:

If I have a time column in my flat file formatted as:


14:29:40
14:02:36
9:32:07
14:19:43
14:19:43
14:19:43
11:00:35
11:00:35

What should my datatype be for the column in SQL?
I assume DT_DBTIME, correct?

(if so, I'm not sure of the best way to convert to it)
 

Evadman

Administrator Emeritus<br>Elite Member
Feb 18, 2001
30,990
5
81
How wide is a row (in bytes), and is the OS 32-bit or 64-bit?
 

Oyster

Member
Nov 20, 2008
151
0
0
Here is a snapshot from one of my packages with some simple logic. I have blurred out the company-specific stuff, but it shows the logic for dealing with error rows. In my case I insert these rows into a DB table, but you could write them to a text file instead.

Notice that I also have a script task that does some other things before I insert the error rows into a DB table. Basically, I get data from a Web service and transform it. All rows successfully transformed (from UTF-8 to DATETIME) are inserted into a reporting table, while the errors get redirected to an "errors" table.

http://imgur.com/ZohWf

To make your debugging task easier, try to divide the problem. Instead of importing all columns at once, bring just the datetime column into the SSIS pipeline. For any error rows, you can log the row number, which will point you directly to the problem.
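
In code form, the routing inside a script component looks roughly like this. It's a sketch only: "GoodRows" and "BadRows" are output names you'd define yourself (SSIS generates a DirectRowTo method for each output you add), and the column names are invented:

Code:
// Inside the script component's generated class - not drop-in code.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    DateTime parsed;
    // RawDate/CleanDate are hypothetical columns on the buffer.
    if (DateTime.TryParse(Row.RawDate, out parsed) && parsed.Year >= 1753)
    {
        Row.CleanDate = parsed;
        Row.DirectRowToGoodRows();  // on to the reporting table
    }
    else
    {
        Row.DirectRowToBadRows();   // redirected to the errors table
    }
}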
 

Fallen Kell

Diamond Member
Oct 9, 1999
6,039
431
126
This is why I don't use Windows for real computer work. On Linux/Unix/OS X, it's easy to see the data on that line:

1) open a command line terminal

2) cat <sql filename> | head -2492473 | tail -1

3) .... there is no step three because you are done....
 
Last edited:

LokutusofBorg

Golden Member
Mar 20, 2001
1,065
0
76
You can easily pipe error rows to a different destination. I don't know if that EMC link covers all the bases, as it's been a while since I delved deep into SSIS or DTS (old school), but you should be able to get what you want from inside your package just fine. You can either pipe error rows to a different destination, or use a script task or whatever (as mentioned above) in the transform to sanitize the data on the way in, so everything goes to your destination.
 

WannaFly

Platinum Member
Jan 14, 2003
2,811
1
0
This is why I don't use Windows for real computer work. On Linux/Unix/OS X, it's easy to see the data on that line:

1) open a command line terminal

2) cat <sql filename> | head -2492473 | tail -1

3) .... there is no step three because you are done....

Not to go off topic, but you piqued my interest with this... OP, if you have PowerShell you can do basically the same thing, though I'm not sure how well it will work for large files, as I don't have one to test on.

cat <sql filename> | select -skip 2492470 | select -first 10