Question about tokens and atoi function

RedArmy

Platinum Member
Mar 1, 2005
2,648
0
0
Text file format:

14 43 85 05 64
35 17 67 43 64
35 43 85 05 64
04 85 62 98 22
72 43 85 05 64
************
02 19 34 05 64
14 43 85 05 64
35 17 67 43 64
35 43 85 05 64
04 85 62 98 22

Edit: New question in my post from 10/02/2009 04:12 AM

Any help is greatly appreciated (even if I don't understand it completely :()
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
I can think of two quick ways to do this, out of a hundred possibilities: with stringstream à la C++, or with strtok() from C. Of those two, I prefer strtok().

Here's a rough sketch of what to do. I have assumed you have defined a Matrix class with the following member functions:
void addRowElement(int);
void addColumn();
I further assume that Matrix starts with zero columns and zero rows.

#include <cstdio>   // fopen, fgets, fclose
#include <cstdlib>  // atoi
#include <cstring>  // strtok
#include <list>
using namespace std;

list<Matrix*> l_matrices;
FILE * infile = fopen("matrix.txt", "r"); // check error cond. in real code
char buff[1024];
Matrix * p_matrix = new Matrix();

while(fgets(buff,1024,infile)) {
    // This loop body runs once for each line of the file

    char * token = strtok(buff," \t\n");
    if(token==NULL) continue; // blank line

    if(token[0] == '*') {
        // Start of next matrix
        l_matrices.push_back(p_matrix);
        p_matrix = new Matrix();

    } else {
        p_matrix->addColumn();
        while(token != NULL) {
            // Each iteration of this loop is one row element
            p_matrix->addRowElement( atoi(token) );
            token = strtok(NULL, " \t\n");
        }

        // At this point, you are at the end of the row
    }
}

// The last matrix has no '*' line after it, so store it here
l_matrices.push_back(p_matrix);

fclose(infile);

// At this point, l_matrices contains a list of all matrices in the file

I hope this helps.

EDIT: Added some vertical whitespace. We really need that 'Attach Code' button to work.
 

EagleKeeper

Discussion Club Moderator
Elite Member
Staff member
Oct 30, 2000
42,589
5
0
You are going to have to read the input sequentially.

Process each block until you reach the *** line or the end of the data file.
If the block you are processing ends with ***, restart your analysis on the line after the ***.

The same logic for processing a block holds for every block.
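
A minimal sketch of that sequential approach, assuming each row is a line of whitespace-separated integers and that any line starting with '*' is a separator. The names IntMatrix and readMatrices are made up for illustration and do not come from anyone's posted code.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type for illustration: a matrix stored as a list of rows.
typedef std::vector<std::vector<int> > IntMatrix;

// Reads every block from `in`. A line starting with '*' ends the current
// block; the end of input ends the last one. Blank lines are skipped.
std::vector<IntMatrix> readMatrices(std::istream& in)
{
    std::vector<IntMatrix> matrices;
    IntMatrix current;
    std::string line;

    while (std::getline(in, line)) {
        if (!line.empty() && line[0] == '*') {
            // Separator: store the finished block and start a new one.
            if (!current.empty()) {
                matrices.push_back(current);
                current.clear();
            }
            continue;
        }

        // Same row logic for every block: pull the integers out of the line.
        std::istringstream fields(line);
        std::vector<int> row;
        int value;
        while (fields >> value) {
            row.push_back(value);
        }
        if (!row.empty()) {
            current.push_back(row);
        }
    }

    if (!current.empty()) {
        matrices.push_back(current);  // the last block has no separator after it
    }
    return matrices;
}
```

To use it on a file, open a std::ifstream and pass it in; each returned element is then one matrix, with its row count given by size() and its column count by the size() of any row.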
 

Sc4freak

Guest
Oct 22, 2004
953
0
0
Using C++, if all you want is the number of rows and columns:

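#include <fstream>
#include <algorithm>
#include <string>

using namespace std;

int main()
{
    ifstream fs("file.txt");

    int columns1 = 0; // cols in matrix 1
    int rows1 = 0; // rows in matrix 1
    int columns2 = 0; // cols in matrix 2
    int rows2 = 0; // rows in matrix 2
    string s;

    // Matrix 1: every line up to the separator line of asterisks
    while(getline(fs, s))
    {
        if(s.empty()) continue; // skip blank lines
        if(s[0] == '*') break; // separator: matrix 1 is complete
        if(rows1 == 0) columns1 = count(s.begin(), s.end(), ' ') + 1;
        ++rows1;
    }

    // Matrix 2: every remaining line. Testing getline() itself, rather than
    // fs.eof(), avoids counting one extra row when the file ends in a newline.
    while(getline(fs, s))
    {
        if(s.empty()) continue;
        if(rows2 == 0) columns2 = count(s.begin(), s.end(), ' ') + 1;
        ++rows2;
    }

    // The counts are only computed here; print them to see the result.
}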

EDIT: This forum sucks for posting code.
http://pastebin.ca/1587064
 

Net

Golden Member
Aug 30, 2003
1,592
3
81
I'd also add \r to the strtok() delimiters, so files with Windows line endings parse cleanly:

const char * whitespace = " \r\t\n";
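
A small sketch of why the extra delimiter matters, assuming the file was saved with "\r\n" line endings. Without '\r' in the delimiter set, a blank line yields a "\r" token instead of NULL, and the last token on each line keeps a trailing '\r'. The helper name parseLine is hypothetical.

```cpp
#include <cstdlib>
#include <cstring>
#include <vector>

// Splits one line (modified in place, as strtok requires) into integers.
// Including '\r' in the delimiters handles Windows "\r\n" line endings.
std::vector<int> parseLine(char* line)
{
    const char* whitespace = " \r\t\n";
    std::vector<int> values;

    for (char* token = std::strtok(line, whitespace);
         token != NULL;
         token = std::strtok(NULL, whitespace)) {
        values.push_back(std::atoi(token));
    }
    return values;
}
```

With this delimiter set, a line holding only "\r\n" produces no tokens at all, so the blank-line check in the strtok() loop keeps working.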
 

RedArmy

Platinum Member
Mar 1, 2005
2,648
0
0
Thanks for all the suggestions so far.

degibson: I wish I could understand all of that. It uses some functions that I tried and failed to implement, but I believe it's over my head.

Sc4freak: I was doing something along the same lines as you. Just for kicks I ran your code to see if it returned the same values as mine. It compiled fine, but it didn't output anything, so I don't know what the problem is there. I added some lines to show the row and column counts to check the output, and that's when I noticed the problem.
 

RedArmy

Platinum Member
Mar 1, 2005
2,648
0
0
Edit: Here's what I have so far: http://pastebin.com/m413e11b8

I did what seemed most logical to me, since I'm restricted to what I learned in the only two CS classes I took in college, and unfortunately I don't have the time to learn more advanced methods.

Would it be possible to use fsetpos or fseek (not 100% sure which) to set the file position indicator to the asterisks, and then process the second matrix the same way I did the first? I ask because the number of asterisks varies with the size of the matrices, so I can't compare against a fixed string of asterisks.

I guess in pseudo code it would be like:

if (next char != '*')
{
    advance file position;
}
From current file position
{
    do
    {
        same method I did with first matrix while testing for EOF
    }
}
Since I'm only testing for the first asterisk, the second matrix will pick up an extra '\n' from wherever the asterisks end, so I could just subtract one row when computing its rows and columns. That's not too big of a deal.

I guess I just want to know if what I'm proposing is possible or if I'm an idiot. Thanks again for all the help.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: RedArmy
I guess I just want to know if what I'm proposing is possible or if I'm an idiot. Thanks again for all the help.

What you're proposing is entirely possible, though using fseek isn't necessary. If you want to skip to matrix N:
- Start a counter at 0
- Read characters until you find a '*'
- Read characters until you find a non-'*'
- Increment your counter. If your counter == N, you've found the matrix you want, and you can start parsing.
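
The steps above can be sketched with fgetc() roughly as follows. This is only an illustration: the function name skipToMatrix is made up, and it assumes '*' never appears inside a matrix. Matrix 0 is the first one in the file, so N counts how many separator runs to pass.

```cpp
#include <cstdio>

// Advances `in` past `n` separator runs of '*', leaving the file position at
// the start of matrix n (matrix 0 is the first one in the file).
// Returns false if the file ends before n separators were seen.
bool skipToMatrix(std::FILE* in, int n)
{
    int seen = 0;  // separators passed so far
    while (seen < n) {
        int c;

        // Read characters until we find a '*'.
        while ((c = std::fgetc(in)) != EOF && c != '*') {
        }
        if (c == EOF) {
            return false;
        }

        // Read characters until we find a non-'*'.
        while ((c = std::fgetc(in)) == '*') {
        }
        if (c != EOF) {
            std::ungetc(c, in);  // put the first non-'*' character back
        }
        ++seen;
    }
    return true;
}
```

Because the run of asterisks is consumed one character at a time, its length does not matter, and no fseek() call is needed.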

Originally posted by: RedArmy
degibson: I wish I could understand all that since it has some functions that I tried to implement but failed to do so, however I believe that's over my head.

Try thinking about a matrix as a vector<vector<int> >. If you don't know what a vector<> is, see cppreference.com.

You can start your vector<vector<int> > (let's call it vv for short) with 0 cols, 0 rows:
- Each addColumn() simply performs a { vector<int> v; vv.push_back( v ); }.
- Each addRowElement(int x) simply does a { vv[currentColumn].push_back(x); }.
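
Put together, a Matrix class along those lines might look like the sketch below. It treats "currentColumn" as the most recently added inner vector, which is an assumption. The three accessors at the end are additions for illustration and are not part of the interface assumed by the parsing loop earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the Matrix class assumed by the strtok() parsing loop.
class Matrix {
public:
    // Starts a new, empty inner vector (called once per line of the file).
    void addColumn() { vv.push_back(std::vector<int>()); }

    // Appends x to the most recently started inner vector.
    void addRowElement(int x)
    {
        assert(!vv.empty());  // addColumn() must be called first
        vv.back().push_back(x);
    }

    // Hypothetical accessors, added here only so the contents can be inspected.
    std::size_t lineCount() const { return vv.size(); }
    std::size_t lineLength(std::size_t i) const { return vv.at(i).size(); }
    int at(std::size_t i, std::size_t j) const { return vv.at(i).at(j); }

private:
    std::vector<std::vector<int> > vv;  // starts with 0 columns, 0 rows
};
```

After the parsing loop has run over a 5x5 block, lineCount() would be 5 and lineLength(0) would be 5.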

 

RedArmy

Platinum Member
Mar 1, 2005
2,648
0
0
Originally posted by: degibson
What you're proposing is entirely possible, though using fseek isn't necessary. If you want to skip to matrix N:
- Start a counter at 0
- Read characters until you find a '*'
- Read characters until you find a non-'*'
- Increment your counter. If your counter == N, you've found the matrix you want, and you can start parsing.

I understand the general idea of what you're saying and it makes sense from a practical stand-point. Can I do an fgetc approach to check for an asterisk? I only ask this since once I find an asterisk, I need to do another search from that point until I don't find an asterisk.

Maybe I interpreted the last part of your method wrong but should it go something like:
1. Set counter variable to 0
2. Search until asterisk is found, incrementing counter by one for each character
3. Somehow set the counter variable to start at its current value and continue from there searching for a non-* character. Edit: nvm, I wasn't thinking, I got this part easily
4. Once it finds a non-* character, I use the counter's current value to set the beginning of where I start reading for the lower matrix.

Edit: Awesome! I just got it. All I had to do was set count equal to my current position (variable for counter), then parse the data like normal. I didn't realize it would be that simple. Score one for the engineer.
 

RedArmy

Platinum Member
Mar 1, 2005
2,648
0
0
Alright, I have another quick question. I now know all the values I could possibly want for both my matrices (rows, columns, number of elements in each, the size of the output matrix if the cross product is taken, etc). However, since I'm reading from a text file, those 'numbers' are really characters right now. What I'm trying to do is dynamically allocate memory and put both matrices into an array.

I know the sizes of both matrices from what I did earlier, so that's not the problem; I can set up the malloc call just fine. My issue is how to convert those one-, two-, or three-digit 'characters' into integers and then store them in the array. If I were just putting them straight into an array that would be easy, but I need to do the conversion first. I figure the atoi function will suit my needs, but does that mean I need to read whole lines of data at a time, then use a token to recognize the spaces separating the numbers from one another, and THEN pass them into the array?

I guess I'm just looking for the most straightforward (not the most efficient) way to do this. Thanks for taking the time to read this, and as always I appreciate the help.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: RedArmy
I figure the atoi function will suit my needs but does that mean I need to read whole lines of data in at a time then use a token to recognize spaces to separate the numbers from one another and THEN pass them into the array?

atoi() and strtok() go together like a horse and carriage. However, atoi() converts an ASCII string (read: const char*) to an integer, hence 'A to I'; it does not work on single characters. So if you're reading one character at a time, you need to put all of your characters 'together' into a character array, then call atoi() on that array when the number is complete.

e.g.

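string s;
int c = fgetc(infile); // infile is an already-open FILE*

while( isdigit(c) ) { // c is a character from a number (isdigit is in <cctype>)
    s += (char)c;
    c = fgetc(infile); // next character
}

int myInt = atoi(s.c_str());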

Similarly for floats and doubles, but use strtof() or strtod() instead.

double myDouble = strtod(s.c_str(), NULL);

Fair warning: a failed atoi() just returns zero, so be sure you're passing in numeric characters.
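
If you need to tell a genuine 0 apart from a failed conversion, strtol() from <cstdlib> reports where parsing stopped. The sketch below swaps strtol() in for atoi(); the helper name parseInt is made up for illustration.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Converts `text` to an int. Returns false (leaving `out` untouched) if the
// text is empty, contains non-numeric characters, or is out of range for int.
bool parseInt(const char* text, int& out)
{
    char* end = NULL;
    errno = 0;
    long value = std::strtol(text, &end, 10);

    if (end == text) return false;      // no digits were consumed
    if (*end != '\0') return false;     // trailing junk such as "12abc"
    if (errno == ERANGE) return false;  // does not fit in a long
    if (value < INT_MIN || value > INT_MAX) return false;  // does not fit in an int

    out = static_cast<int>(value);
    return true;
}
```

Each token returned by strtok() can be passed straight to a helper like this, and a false return tells you the file contained something that was not a number.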

 

Sc4freak

Guest
Oct 22, 2004
953
0
0
It's worth noting that C and C++ are different languages - C++ is not really a strict superset of C as is commonly believed. Even if it were, C++ comes with its own set of idiomatic practices, separate from C.

All this business regarding atoi, strtok, malloc, and the like is idiomatic C, not C++. In C++, the idiomatic way to convert between strings and numeric types is the stringstream class. You should also prefer the new keyword to malloc.

I know that you need to work under the constraints of your CS class, but for future reference: if you're using C++, use C++. Prefer not to use the archaic C functions when better alternatives exist.

For example, prefer std::stringstream to atoi because it is more generic: it can convert both to and from strings and numeric data. It also has better error reporting: atoi returns 0 when the input string is not a number, whereas a stringstream signals the error. The new keyword should be preferred to malloc because new is typesafe and calls constructors for non-POD types, whereas malloc does not.

Dealing with individual characters is also not needed in C++: the stream extraction operator (>>) on IO objects is by default designed to automatically "tokenise" the input according to whitespace.
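
A rough sketch of both points, whitespace tokenising and error signalling, using only the standard library. The function name extractInts is hypothetical. Each token is converted through its own istringstream so that input like "12abc" is rejected rather than silently read as 12.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Extracts whitespace-separated integers from `text` and appends them to `out`.
// Stops and returns false at the first token that is not a complete integer.
bool extractInts(const std::string& text, std::vector<int>& out)
{
    std::istringstream in(text);
    std::string token;

    while (in >> token) {  // operator>> splits on spaces, tabs and newlines
        std::istringstream conv(token);
        int value;
        char leftover;
        if (!(conv >> value) || (conv >> leftover)) {
            return false;  // the stream signalled a bad or partial conversion
        }
        out.push_back(value);
    }
    return true;
}
```

The same loop works on a std::ifstream directly, since file streams and string streams share the extraction operators.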

A C++ example to copy all the numbers into two vectors:
http://pastebin.com/f14e18665

It's a bit long due to all the comments, but I think you get the general idea.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: Sc4freak
....if you're using C++, use C++. Prefer not to use the archaic C functions when better alternatives exist....

Sc4freak is correct in this case. When reading in ASCII files, the C++-flavored I/O can make things more intuitive.

However, there is great advantage to knowing and loving libc's functionality, because there are many little gotchas (many in the I/O space) that don't have a C++ abstraction built around them. That's when it's useful to know about FILE*, fgets(), fgetc(), etc. Admittedly, not so much atoi(), but it does get rather awkward to mix C I/O and C++ conversion routines.
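
For the curious, here is roughly what that mix looks like: fgets() on a FILE* for the I/O and an istringstream for the conversion. It is a sketch under the assumption that no line is longer than the buffer, and the function name readRow is made up.

```cpp
#include <cstdio>
#include <sstream>
#include <vector>

// Reads one line from a C FILE* with fgets() and converts it with a C++ stream.
// Returns false at end of file. Assumes lines shorter than 1024 bytes.
bool readRow(std::FILE* in, std::vector<int>& row)
{
    char buff[1024];
    if (std::fgets(buff, sizeof buff, in) == NULL) {
        return false;
    }

    row.clear();
    std::istringstream fields(buff);  // copies the C string into a C++ stream
    int value;
    while (fields >> value) {
        row.push_back(value);
    }
    return true;
}
```

It works, but every line gets copied from the C buffer into a C++ string first, which is the awkward part. A separator line of asterisks simply comes back as an empty row.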