Java - How to see if a string contains anything that you DON'T want

hans030390

Diamond Member
Feb 3, 2005
7,326
2
76
This is indirectly related to a homework assignment. I'm creating a program where the user inputs a binary number, and it converts it to decimal form.

I got that all down, but I'm trying to make the program "nicer". I have the basics down...I'll probably get an A on this portion of it, but I have an extra day to work on it and figured I'd do a bit extra for my own learning.

Is there any way to check whether the binary string the user inputs contains anything but 0's or 1's? I didn't know if there was a quick method or API code (I think it's called that) that could do this for me...if not, could anyone offer some quick help on creating a method that would check this for me (I'm not asking you to make the code for me, but if you do, that's OK with me ;))?
 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
The proper way would be to use a regular expression...

http://java.sun.com/docs/books...orial/essential/regex/

Has some good tips on it.

What it comes down to is this: you specify a pattern that defines what your input can contain (in your case, only 1's and 0's) and apply that regular expression to your input string. If the string matches, it fits your 'definition' of valid input; if it doesn't, the input string has bad characters in it.
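
A minimal sketch of that idea, for illustration only. The class name BinaryRegexCheck and the method name isBinary are made up for this example, and the choice of "[01]+" (which rejects an empty string) is an assumption rather than something the thread settles on.

```java
import java.util.regex.Pattern;

class BinaryRegexCheck {
    // One or more characters, each of which must be '0' or '1'.
    // Assumption: an empty string should count as invalid input.
    private static final Pattern BINARY = Pattern.compile("[01]+");

    // Hypothetical helper: true only if the whole string is 0's and 1's.
    static boolean isBinary(String input) {
        // matches() succeeds only when the entire string fits the pattern.
        return input != null && BINARY.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isBinary("10110")); // true
        System.out.println(isBinary("10210")); // false
        System.out.println(isBinary(""));      // false
    }
}
```

Compiling the pattern once and reusing it is a design choice here; calling input.matches("[01]+") directly would also work for a one-off check.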
 

awal

Senior member
Oct 13, 1999
953
0
0
You could actually check if the input is in the correct format in a few different ways.

1. Regular Expression
2. a try catch with Integer.parseInt
3. A loop over your input string checking each character

The fastest approach would be the loop.
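
As a rough sketch of option 2, assuming the two-argument Integer.parseInt with radix 2 is what is meant (the post does not say which overload). The class and method names are hypothetical.

```java
class BinaryParseCheck {
    // Hypothetical helper: lets Integer.parseInt decide whether the
    // string is a valid base-2 number.
    static boolean isBinary(String input) {
        try {
            Integer.parseInt(input, 2); // radix 2: only binary digits allowed
            return true;
        } catch (NumberFormatException e) {
            // Thrown for null, empty strings, other characters,
            // and values too large for an int.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isBinary("101"));  // true
        System.out.println(isBinary("102"));  // false
        System.out.println(isBinary("-101")); // true (see note below)
    }
}
```

Two caveats with this approach: parseInt accepts a leading sign, so "-101" passes, and any value that does not fit in an int (for example, 32 ones in a row) is rejected even though it contains only 0's and 1's. The regex and loop approaches do not have those quirks.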
 

Onund

Senior member
Jul 19, 2007
287
0
0
Crusty is correct but there's an elegant way to do it in input fields in Java. You can replace the default model used to input the text so it will only accept key presses based on certain criteria.

If you look here and read the part that talks about customized fields, the last section in the description before the 'Warning' part it will explain how to do it.

I used this technique to create a text field that only accepts characters if the input is still a valid digit input. So when you type in the field, if it doesn't match my regular expression, the character you tried inputting gets thrown away.

Essentially you would be replacing this section:
char[] upper = str.toCharArray();
for (int i = 0; i < upper.length; i++) {
    upper[i] = Character.toUpperCase(upper[i]);
}
super.insertString(offs, new String(upper), a);

with custom code that does some regex matching.
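
A sketch of what that custom code might look like for the binary case, assuming a PlainDocument subclass as in the snippet above. The class name BinaryDocument is hypothetical, and rejecting the whole insertion when any character is invalid is one possible policy, not necessarily the one used in the text field described here.

```java
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.PlainDocument;

// Hypothetical document model that only accepts the characters 0 and 1.
class BinaryDocument extends PlainDocument {
    @Override
    public void insertString(int offs, String str, AttributeSet a)
            throws BadLocationException {
        // Assumption: if any character is not 0 or 1, the whole
        // insertion (typed key or pasted text) is thrown away.
        if (str != null && str.matches("[01]*")) {
            super.insertString(offs, str, a);
        }
    }
}
```

It could then be attached to a field with something like new JTextField(new BinaryDocument(), "", 10), so invalid key presses never reach the text.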
 

hans030390

Diamond Member
Feb 3, 2005
7,326
2
76
Originally posted by: awal
You could actually check if the input is in the correct format in a few different ways.

1. Regular Expression
2. a try catch with Integer.parseInt
3. A loop over your input string checking each character

The fastest approach would be the loop.

Thanks, I made a loop and it does the job. I hadn't done any loops yet involving strings, but it turned out to be pretty easy for what I needed. :)
 
Sep 29, 2004
18,656
67
91
Originally posted by: Onund
Crusty is correct but there's an elegant way to do it in input fields in Java. You can replace the default model used to input the text so it will only accept key presses based on certain criteria.

To clarify, this is the document model, not the model.

 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: Crusty
The proper way would be to use a regular expression...

... or just check, for each character c, if( c=='0' || c=='1' )
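
Spelled out as a complete method, that check might look like the sketch below. The names are hypothetical, and treating an empty string as invalid is an assumption.

```java
class BinaryLoopCheck {
    // Hypothetical helper: walks the string and stops at the first
    // character that is neither '0' nor '1'.
    static boolean isBinary(String input) {
        if (input == null || input.isEmpty()) {
            return false; // assumption: empty input is not a valid number
        }
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != '0' && c != '1') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBinary("0110")); // true
        System.out.println(isBinary("01a0")); // false
    }
}
```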
 

Onund

Senior member
Jul 19, 2007
287
0
0
Originally posted by: degibson
Originally posted by: Crusty
The proper way would be to use a regular expression...

... or just check, for each character c, if( c=='0' || c=='1' )

this is much easier to write:
if(input.matches("[01]*"))

There's a reason why regular expressions are around.
 

chronodekar

Senior member
Nov 2, 2008
721
1
0
Originally posted by: Onund
Originally posted by: degibson
Originally posted by: Crusty
The proper way would be to use a regular expression...

... or just check, for each character c, if( c=='0' || c=='1' )

this is much easier to write:
if(input.matches("[01]*"))

There's a reason why regular expressions are around.

We are talking about just 2 characters. Not some fancy sequence or something. So I think this approach is more than sufficient.

Now for a learning experience, regular expressions for this is good practice.
 

Onund

Senior member
Jul 19, 2007
287
0
0
Originally posted by: chronodekar
Originally posted by: Onund
Originally posted by: degibson
Originally posted by: Crusty
The proper way would be to use a regular expression...

... or just check, for each character c, if( c=='0' || c=='1' )

this is much easier to write:
if(input.matches("[01]*"))

There's a reason why regular expressions are around.

We are talking about just 2 characters. Not some fancy sequence or something. So I think this approach is more than sufficient.

Now for a learning experience, regular expressions for this is good practice.

Oh crap... I'm reading over the OP again and I just made the assumption that the inputs are coming in from a text field of some sort. I suppose this could be command line based, eh? I'm just so used to defaulting to Perl when I don't need a GUI...

Still though, comparing one character at a time requires the programmer to set up a loop and pull characters out of the string at each index; a regex works on the entire string at once, in one line.

Bah, maybe it's just me, but I always hated string manipulation in programs until I worked with it in Perl. Once Java included regex in its API I was much happier. The fewer for loops I have to write for strings, the happier I am.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: Onund
Originally posted by: degibson
Originally posted by: Crusty
The proper way would be to use a regular expression...

... or just check, for each character c, if( c=='0' || c=='1' )

this is much easier to write:
if(input.matches("[01]*"))

There's a reason why regular expressions are around.

I'm all in favor of heavyweight mechanisms for heavyweight matching -- but why entangle an entire library to match 0x30 and 0x31?
 

Crusty

Lifer
Sep 30, 2001
12,684
2
81
Originally posted by: degibson
Originally posted by: Onund
Originally posted by: degibson
Originally posted by: Crusty
The proper way would be to use a regular expression...

... or just check, for each character c, if( c=='0' || c=='1' )

this is much easier to write:
if(input.matches("[01]*"))

There's a reason why regular expressions are around.

I'm all in favor of heavyweight mechanisms for heavyweight matching -- but why entangle an entire library to match 0x30 and 0x31?

Because shortly the OP will be dealing with more complex inputs and learning the right way the first time will save time in the future. How are regex's heavyweight by any means?
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: Crusty
How are regex's heavyweight by any means?

Compared to a lot of other things, regexes are lightweight. But compared to c == '0' || c == '1', even a function call is heavyweight. Presumably, a regular expression matcher is going to need a function call to get in, and more than two conditional branches to work, even on something simple. There's a good chance it's going to go through the dynamic loader, it's going to take page faults, etc.

Besides, the OP is clearly at a level of learning where learning elegant syntax is trumping clever use of regexes.

Edit: Looking again, I see that valid inputs are in fact strings of [01]*. So to do it outside of a regex compiler, you need three conditional branches, not two. ;)
 

awal

Senior member
Oct 13, 1999
953
0
0
Regular expressions have their place, but many times the same result can be produced without them. Benchmarking the RegEx against the for loop, the for loop is approximately 5-6 times faster over 1,000,000 evaluations. So while a RegEx is definitely faster to write, if performance is your concern, avoid them whenever possible.
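
The benchmark itself was not posted, so the following is only a crude sketch of how such a comparison could be set up. The class name, the sample input, and the use of a precompiled Pattern are all assumptions. It ignores JIT warm-up and other effects that a proper harness would control for, so the numbers it prints should be treated as rough indications at best.

```java
import java.util.regex.Pattern;

class RegexVsLoopTiming {
    // Compiled once; String.matches() would recompile it on every call.
    private static final Pattern BINARY = Pattern.compile("[01]*");

    static boolean matchesRegex(String s) {
        return BINARY.matcher(s).matches();
    }

    static boolean matchesLoop(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '0' && c != '1') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String input = "1011001110001111"; // arbitrary sample input
        int runs = 1_000_000;
        int hits = 0; // used below so the work is not optimized away

        long t0 = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            if (matchesRegex(input)) hits++;
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            if (matchesLoop(input)) hits++;
        }
        long t2 = System.nanoTime();

        System.out.println("regex: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("loop:  " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("matches counted: " + hits);
    }
}
```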
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: awal
Regular expressions have their place, but many times the same result can be produced without them. Benchmarking the RegEx against the for loop, the for loop is approximately 5-6 times faster over 1,000,000 evaluations. So while a RegEx is definitely faster to write, if performance is your concern, avoid them whenever possible.

Cool. Java I presume? In retrospect, most of my points were basically moot because of the language...
 

awal

Senior member
Oct 13, 1999
953
0
0
Yeah, my tests were in Java, but I've seen the same kind of performance gap in C#.
 

Onund

Senior member
Jul 19, 2007
287
0
0
Seriously guys, context. The OP was asking about user inputs. Let's assume the fastest someone is going to input from the keyboard is 120 words per minute; that's 500ms per word, with an average length of 4.5 letters per word (got that from some random Google search). So, 111ms per letter... on a modern computer the performance is not going to be an issue.

I'm sure there are lots of more computationally efficient ways of doing things, but for probably 99.9% of the programming I personally do, programmer efficiency usually wins. Just the increased probability of a syntax error when setting up a loop would discourage me from doing it that way, but I'm lazy like that.
 

leoloco

Junior Member
Feb 2, 2009
3
0
0
All you need to do is write a common method in Java as part of your validation check. Iterate through the user-entered string and check every single character. Something similar to the following:

String xyz = "010110"; // example value; use the input string here

boolean valid = xyz != null && xyz.length() > 0;
if (valid) {
    for (int k = 0; k < xyz.length(); k++) {
        // Pull out the single character at position k.
        String temp = xyz.substring(k, k + 1);
        if (temp.equals("0") || temp.equals("1")) {
            // That means good
        } else {
            // That means wrong input
            valid = false;
            break;
        }
    }
}
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: Onund
Seriously guys, context. The OP was asking about user inputs. Let's assume the fastest someone is going to input from the keyboard is 120 words per minute; that's 500ms per word, with an average length of 4.5 letters per word (got that from some random Google search). So, 111ms per letter... on a modern computer the performance is not going to be an issue.

I'm sure there are lots of more computationally efficient ways of doing things, but for probably 99.9% of the programming I personally do, programmer efficiency usually wins. Just the increased probability of a syntax error when setting up a loop would discourage me from doing it that way, but I'm lazy like that.

Good point. Most apps are I/O bound, yes. It's Java anyway. But consider this:
1) OP is a young programmer -- just learning. Thanks to this thread, OP has learned a lot of ways to do things, along with all their tradeoffs.
2) Habits formed are habits kept. The next time OP writes a checker, it may not be I/O bound. Or it may be bound by I/O that is a lot quicker than keyboard.
3) Not all languages support RegEx's as seamlessly as Java.
4) 99% of programmers write in high-productivity languages because 99% of code executed is written by 1% of programmers. OP may one day be in that 1%. (numbers are approximate...there may be more decimal points involved)
 

ahurtt

Diamond Member
Feb 1, 2001
4,283
0
0
Originally posted by: chronodekar
Originally posted by: Onund
Originally posted by: degibson
Originally posted by: Crusty
The proper way would be to use a regular expression...

... or just check, for each character c, if( c=='0' || c=='1' )

this is much easier to write:
if(input.matches("[01]*"))

There's a reason why regular expressions are around.

We are talking about just 2 characters. Not some fancy sequence or something. So I think this approach is more than sufficient.

Now for a learning experience, regular expressions for this is good practice.

I agree, bite the bullet and learn how to work with regular expressions as early in your programming career (if that is your chosen path) as you can. RE's are an immensely powerful tool in your programmer's arsenal but they can be a bit of an art form to master in and of themselves. You can often do with 1 simple regular expression what might take many lines of code and loops to do in other ways. Why reinvent the wheel? RE's exist for a reason, don't ignore them. They are too powerful when used correctly.