Hi folks.
I'm looking for a software package, database, or method to accurately and easily work with upwards of 200 million rows of data with 3-5 parameter columns.
Here is what I'm looking to do...
1. Pull out and save every 1500th and 1501st row. This reduces a 200-million-row x 3-column data set to ~266,666 rows x 3 columns. Once it is reduced to that size, I will likely drop it into Excel for further manipulation.
Here is a 3 column example:
Date, Time, Price
Row1 4/1/2010 08:30:05 1202.25 <- Save
Row2 4/1/2010 08:30:05 1202.50 <- Remove
Row3 4/1/2010 08:30:05 1202.25 <- Remove
..... <- Remove
..... <- Remove
Row1500 4/1/2010 08:31:02 1201.50 <- Save
Row1501 4/1/2010 08:31:02 1201.50 <- Save
......
......
Row3000 4/1/2010 08:31:58 1201.25
Row3001 4/1/2010 08:31:58 1201.00
.....
.....
Row4500
Row4501
.....
..... etc
Row150,000,000
Row150,000,001
I want to be able to easily change the selection criterion so that I can save every 1200th/1201st row, or every 1000th/1001st row, etc.
The data is either in ASCII format or a .txt file from the original source.
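To make the selection rule concrete, here is a rough sketch in Python of what I have in mind. It is only an illustration, not a tested solution: the function names filter_rows and filter_file are made up, and it assumes one row per line with no header line in the file. It reads the file one line at a time, so the 200 million rows never need to fit in memory, and it keeps a row when its 1-based row number is a multiple of the step or one past a multiple (which also keeps Row1, as in the example above).

```python
def filter_rows(lines, step=1500):
    """Yield rows whose 1-based position is a multiple of `step`
    or one past a multiple (1, 1500, 1501, 3000, 3001, ...)."""
    for row_number, line in enumerate(lines, start=1):
        if row_number % step in (0, 1):
            yield line


def filter_file(src_path, dst_path, step=1500):
    """Stream src_path to dst_path, keeping only the selected rows.

    Files are opened in binary mode so the rows are copied byte for byte.
    Assumption: one row per line and no header line.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.writelines(filter_rows(src, step))
```

Switching to every 1200th/1201st row would then just be a matter of calling filter_file with step=1200 (the file paths are whatever the source and output files happen to be).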
Any input much appreciated.