java help please~!

Rallispec

Lifer
Jul 26, 2001
12,375
10
81
I'm trying to make a Java program that reads in two text documents and compares them for similar sentences. But I have no idea how or where to start. Can somebody clue me in on how to convert a text document into a String, or at least point me to somewhere that will teach me?

This isn't exactly homework; it's more of an independent research project. Our teacher just told us to make a program that does anything. But we haven't really learned any Java: we spent all our time talking about "the factory method" and other design patterns, and never any actual Java programming. I didn't learn the basic output function until yesterday.
Thanks~!

-danny
 

Ameesh

Lifer
Apr 3, 2001
23,686
1
0
Use a stream reader (in Java, a BufferedReader wrapped around a FileReader) and its readLine() method. It returns a String, so you can use that to compare with.
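
For reference, here is a minimal sketch of reading a whole text file into one String with a BufferedReader. The class name FileToString and the method readAll are made up for illustration, and joining lines with a single space is an assumption (it keeps a sentence that wraps across lines in one piece). It also assumes a Java version with try-with-resources.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical helper class; the name is not from any library.
class FileToString {
    // Reads the file at the given path line by line and joins the
    // lines with a single space (an assumption, see above).
    static String readAll(String path) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            // readLine() returns null once the end of the file is reached
            while ((line = reader.readLine()) != null) {
                text.append(line).append(' ');
            }
        }
        return text.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // Usage: java FileToString somefile.txt
        System.out.println(readAll(args[0]));
    }
}
```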
 

Rallispec



<< 2 separate texts, or 2 in one file? >>




2 separate texts.

Thanks for the help, Ameesh... looking up stream readers now.
 

Rallispec



<< or you can use a BufferedReader, which has a readLine() method. >>



Okay, tried that and it worked, but not how I wanted it to. Instead of reading a line at a time, I need it to go through one sentence at a time, or possibly one character at a time. Is there a readChar function somewhere?
 

manly

Lifer
Jan 25, 2000
13,589
4,239
136
This isn't a trivial assignment (unless the spec really only wants the most basic functionality). You are essentially implementing a mini-parser with some semantic analysis (sentence matching).

The simplest way to get a "sentence" is to use a StringTokenizer with ".?!" as the delimiter string. This should work correctly in most cases. You could resort to a regexp if you need more sophisticated matching (Jakarta ORO is a great regexp library).

The sentence matching is where you have to implement some logic. Without knowing what "similar sentences" means, I couldn't give any concrete suggestions. The simplest approach would be calling String.equals() on the sentences.
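
A rough sketch of the tokenizer idea, in case it helps. The class name SentenceTokenizer and the method sentences are placeholders; trimming each token and dropping empty ones are extra assumptions so that String.equals() is not thrown off by leading spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class; the name is only for illustration.
class SentenceTokenizer {
    // Splits text on '.', '?' and '!' and returns the trimmed pieces.
    // The punctuation itself is not included in the returned sentences.
    static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(text, ".?!");
        while (tokens.hasMoreTokens()) {
            String sentence = tokens.nextToken().trim();
            if (!sentence.isEmpty()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : sentences("Hi there. How are you? Fine!")) {
            System.out.println(s);
        }
    }
}
```

Note that this splits on every period, so a number such as 3.50 ends up cut in two.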
 

Rallispec

Okay, here is a more detailed explanation of what I'm doing.

The user would put two text documents into a folder. The program would read one sentence from the first file, then go through all the sentences of the second file and check for exact matches. Then the program moves on to the second sentence of the first file, and so on, until the end of the file is reached.

It's supposed to simulate a basic plagiarism or cheating detector.

The main thing I'm trying to work out is reading the first file one sentence at a time. Once I get that figured out, I think I know how to do the rest.
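
The comparison loop described above could be sketched roughly like this, assuming both files have already been split into lists of sentences. ExactMatchFinder and matches are made-up names, and stopping at the first hit for each sentence is just one possible choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is only for illustration.
class ExactMatchFinder {
    // For each sentence of the first document, scans every sentence of
    // the second document and records it if an exact match is found.
    static List<String> matches(List<String> first, List<String> second) {
        List<String> found = new ArrayList<>();
        for (String a : first) {
            for (String b : second) {
                if (a.equals(b)) {
                    found.add(a);
                    break; // one hit is enough for this sentence
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> one = List.of("The sky is blue", "I like tea");
        List<String> two = List.of("I like tea", "Cats sleep a lot");
        System.out.println(matches(one, two));
    }
}
```

This does one pass over the second document per sentence of the first, which is fine for small files; putting the second document's sentences into a HashSet would avoid the inner loop.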
 

Ameesh

As long as the documents are grammatically correct, manly is right: using a string tokenizer is the easiest way. Just break the input you read apart at each instance of '.', '?' or '!'. The only other thing I would check for is whether there is a space after the punctuation, so you don't break apart decimal numbers.
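
One way to get the "punctuation followed by a space" check is a regular expression with String.split(), which needs java.util.regex (so a JDK that ships it). This is only a sketch; SentenceSplitter and split are invented names, and the pattern is my own guess at the rule described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is only for illustration.
class SentenceSplitter {
    // Splits wherever '.', '?' or '!' is followed by whitespace.
    // The lookbehind keeps the punctuation attached to its sentence,
    // and "3.14" stays whole because no whitespace follows the period.
    static List<String> split(String text) {
        List<String> result = new ArrayList<>();
        for (String piece : text.trim().split("(?<=[.?!])\\s+")) {
            if (!piece.isEmpty()) {
                result.add(piece);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : split("Pi is about 3.14. Is that right? Yes!")) {
            System.out.println(s);
        }
    }
}
```

Abbreviations such as "Dr. Smith" would still be split, so this is a heuristic rather than a real sentence parser.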
 

ggavinmoss

Diamond Member
Apr 20, 2001
4,798
1
0
Don't forget about ellipsis periods too: ...

I'm not sure if they would cause a problem, but it's something to think about.

-geoff
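
A quick way to check the ellipsis question is to count tokens. StringTokenizer treats a run of delimiters as a single separator, so "..." does not produce empty tokens, but it does still cut the sentence at that point. EllipsisDemo and countTokens are made-up names for this sketch.

```java
import java.util.StringTokenizer;

// Hypothetical demo class; the name is only for illustration.
class EllipsisDemo {
    // Counts the pieces StringTokenizer produces for the given text
    // when '.', '?' and '!' are used as delimiters.
    static int countTokens(String text) {
        return new StringTokenizer(text, ".?!").countTokens();
    }

    public static void main(String[] args) {
        // One sentence containing an ellipsis comes out as two tokens.
        System.out.println(countTokens("I thought... maybe not."));
    }
}
```

So the ellipsis is only a problem if a sentence that continues after "..." has to be matched as one piece.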