a java question

Journer

Banned
Jun 30, 2005
4,355
0
0
so i wrote a little program for a class that converts numbers into English. Well, I'd like to modify this program so that it turns numbers into Japanese. Are there any packages I can import into my program that will allow me to use Japanese in my program?

Ex.
if i was sending the number 1 through the switch, it would set a string equal to 'one'
instead of 'one' i just want to change it to '?'

any help is appreciated
 

Argo

Lifer
Apr 8, 2000
10,045
0
0
What do you mean '?'? Are you actually asking for something like a check:

1,328 - One thousand three hundred twenty eight?

Never heard of such a thing, although I'd imagine it might be pretty tough to do, considering different languages have different rules. For example, some languages might have special words for certain numbers (Russian, for example).
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
I don't know how japanese does numbers, but assuming you're either looking for single characters that correspond to our arabic numerals or a word corresponding to our words (like 'one'), just use unicode escapes in your string: "\uddbf" where ddbf is a hex number corresponding to the unicode character you're looking for.
 

Journer

Banned
Jun 30, 2005
4,355
0
0
一 is the Japanese kanji for いち, or in romaji, ichi. it means one.
i know there are lots of rules for different languages concerning numbers. but i know the rules for japanese.
i just don't know how to get my compiler to understand japanese >.>

passing 1328 as an argument to my program would return what you said. what i want to be able to do is take the same arg, but return:
千三百二十八, or in hiragana せんさんびゃくにじゅうはち, or in romaji: sen san byaku ni jyu hachi
 

Journer

Banned
Jun 30, 2005
4,355
0
0
Originally posted by: kamper
I don't know how japanese does numbers, but assuming you're either looking for single characters that correspond to our arabic numerals or a word corresponding to our words (like 'one'), just use unicode escapes in your string: "\uddbf" where ddbf is a hex number corresponding to the unicode character you're looking for.

how do i find said unicodes? is there anything special i need to do to implement the usage of them?

so...let's say the unicode for い (i) is 123
could i do:

switch(x[a]){
case '1': num='\u123';
break;
etc...

could i do:

case '\u123': num=1;
break;
...etc...

 

mundane

Diamond Member
Jun 7, 2002
5,603
8
81
It looks like you could use a resource such as the Unicode charts to look up the corresponding character for each Arabic numeral and 'power' (thousand, million, billion, etc).
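
For what it's worth, here is a minimal sketch of how those chart lookups could feed the conversion. It assumes only the range 1 to 9999, and it assumes the code points U+4E00 (一) through U+4E5D (九) for the digits and U+5341 (十), U+767E (百), U+5343 (千) for the powers, as listed in the Unicode charts. The names KanjiNumbers and toKanji are made up for illustration and are not from the original program.

```java
class KanjiNumbers {
    // Kanji for the digits 1-9, written as Unicode escapes so the source
    // file stays plain ASCII. Index 0 is unused.
    private static final String[] DIGITS = {
        "", "\u4e00", "\u4e8c", "\u4e09", "\u56db",
        "\u4e94", "\u516d", "\u4e03", "\u516b", "\u4e5d"
    };

    // Kanji for the place values: ones (none), ju = 10, hyaku = 100, sen = 1000.
    private static final String[] UNITS = { "", "\u5341", "\u767e", "\u5343" };

    // Converts 1..9999 to kanji numerals (assumed range for this sketch).
    static String toKanji(int n) {
        if (n < 1 || n > 9999) {
            throw new IllegalArgumentException("only 1..9999 supported: " + n);
        }
        StringBuilder sb = new StringBuilder();
        int divisor = 1000;
        for (int place = 3; place >= 0; place--) {
            int digit = (n / divisor) % 10;
            if (digit != 0) {
                // A 1 in front of ju/hyaku/sen is dropped: 10 is "ju", not "ichi ju".
                if (digit > 1 || place == 0) {
                    sb.append(DIGITS[digit]);
                }
                sb.append(UNITS[place]);
            }
            divisor /= 10;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Whether this shows kanji or '?' depends on the console encoding.
        System.out.println(toKanji(1328));
    }
}
```

Calling toKanji(1328) builds the six characters sen, san, hyaku, ni, ju, hachi in that order.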
 

ppdes

Senior member
May 16, 2004
739
0
0
If you want to use Japanese directly in Strings and whatnot in your Java source code you need to use the encoding argument to javac. Example with a file saved in UTF-16 format:
javac -encoding utf-16 unicode/TestJapanese.java
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: ppdes
If you want to use Japanese directly in Strings and whatnot in your Java source code you need to use the encoding argument to javac. Example with a file saved in UTF-16 format:
javac -encoding utf-16 unicode/TestJapanese.java
That's interesting. You'll always have to make sure everyone you give the source code to understands that, and you'll always have to use editors that play well with whichever encoding you use, but the source would be a heck of a lot more readable, at least for those who understand the Japanese characters. I wonder how tools like svn would deal with it (I know you could just treat the file as binary, but how well would diffing work then?).
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
Originally posted by: Journer
Originally posted by: kamper
I don't know how japanese does numbers, but assuming you're either looking for single characters that correspond to our arabic numerals or a word corresponding to our words (like 'one'), just use unicode escapes in your string: "\uddbf" where ddbf is a hex number corresponding to the unicode character you're looking for.

how do i find said unicodes? is there anything special i need to do to implement the usage of them?

so...let's say the unicode for い (i) is 123
could i do:

switch(x[a]){
case '1': num='\u123';
break;
etc...

could i do:

case '\u123': num=1;
break;
...etc...
That looks roughly correct, as long as you're only dealing with single characters. Iirc, Java doesn't do switches on strings (but don't quote me on that). Just keep in mind that the number is hex, not decimal, so if your code point was one hundred and twenty three in decimal, your Unicode escape is \u007B. I believe you always need all 4 hex digits, so '\u123' won't compile, but you might want to check a reference for what to do if you need a higher number. I'd bet Japanese is within the first two bytes though.
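
A minimal sketch of both directions of that switch with complete four-digit escapes, assuming 一, 二 and 三 sit at U+4E00, U+4E8C and U+4E09 as the Unicode charts list them. DigitSwitch, toKanji and fromKanji are hypothetical names, and only three digits are shown. For reference, switch on String is allowed from Java 7 onward, but a switch on char like this works on older versions too.

```java
class DigitSwitch {
    // Maps an ASCII digit to its kanji; only 1-3 are shown to keep this short.
    static char toKanji(char digit) {
        switch (digit) {
            case '1': return '\u4e00'; // ichi
            case '2': return '\u4e8c'; // ni
            case '3': return '\u4e09'; // san
            default:
                throw new IllegalArgumentException("unsupported digit: " + digit);
        }
    }

    // The reverse direction: escapes are also legal in case labels.
    static int fromKanji(char kanji) {
        switch (kanji) {
            case '\u4e00': return 1;
            case '\u4e8c': return 2;
            case '\u4e09': return 3;
            default:
                throw new IllegalArgumentException("unsupported kanji");
        }
    }

    public static void main(String[] args) {
        System.out.println(fromKanji(toKanji('2'))); // prints 2
    }
}
```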
 

Journer

Banned
Jun 30, 2005
4,355
0
0
mundane: thanks for the link, the charts are helpful
ppdes: i've got a linux box i'll try that on in a bit, but any idea how to tell eclipse to compile that way?
kamper: thanks

i tried using the unicode escape in a simple System.out statement but it just printed out a '?'
i'm assuming that is because i didn't have that special compile arg that ppdes talked about.

anywho...doing all this in unicode is going to be a huge bitch, lol. i wonder if there are any programs that will convert my kanji into unicode. looking it up would be a pain. there are more than a few thousand kanji used in japanese, although i only need about 20. >.>
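
A converter like that is only a few lines of Java. This is a rough sketch, assuming the kanji reach the program intact (for example as command-line arguments on a system whose default encoding handles Japanese). EscapeKanji and escape are made-up names. The JDK of this era also ships a native2ascii tool that does the same job on whole files.

```java
class EscapeKanji {
    // Returns the text with every non-ASCII UTF-16 code unit rewritten as a
    // backslash, the letter u, and four hex digits, ready to paste into source.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 128) {
                sb.append(c); // plain ASCII passes through unchanged
            } else {
                sb.append("\\u").append(String.format("%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Each argument is escaped and printed on its own line.
        for (String arg : args) {
            System.out.println(escape(arg));
        }
    }
}
```

The output is pure ASCII, so it prints correctly even on a console that cannot display the kanji themselves.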
 

Journer

Banned
Jun 30, 2005
4,355
0
0
ok i logged onto my linux box...this is the source code i'm using:


import static java.lang.System.out;
public class jp {
    public static void main(String[] args) {
        out.println("\u304f");
    }
}

here is the error i get when compiling:
javac -encoding utf-16 jp.java
jp.java:1: warning: unmappable character for encoding utf-16
import static java.lang.System.out;
^
jp.java:1: 'class' or 'interface' expected
import static java.lang.System.out;
^
jp.java:1: illegal character: \8307
import static java.lang.System.out;
^
jp.java:1: illegal character: \11884
import static java.lang.System.out;
^
jp.java:1: illegal character: \11887
import static java.lang.System.out;
^
jp.java:1: illegal character: \8307
import static java.lang.System.out;
^
jp.java:1: illegal character: \10323
import static java.lang.System.out;
^
jp.java:1: illegal character: \10619
import static java.lang.System.out;
^
jp.java:1: illegal character: \2671
import static java.lang.System.out;
^
jp.java:1: illegal character: \11888
import static java.lang.System.out;
^
jp.java:1: illegal character: \10274
import static java.lang.System.out;
^
jp.java:1: illegal character: \13104
import static java.lang.System.out;
^
jp.java:1: illegal character: \8745
import static java.lang.System.out;
^
jp.java:1: illegal character: \2685
import static java.lang.System.out;
^
jp.java:1: illegal character: \65533
import static java.lang.System.out;
^
14 errors
1 warning

any ideas?
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
You're mixing two different types of advice. My advice to use \u escapes means that you write your code in regular old ASCII. ppdes's advice to use UTF-16 means you can type real Japanese characters right into your source code so that you can actually see them while coding. It also means that the whole file has to be saved as UTF-16, which uses at least two bytes per character. Your jp.java is plain ASCII, so javac is reading each pair of bytes as a single character, hence the illegal character errors. You'll have to tell Eclipse to save your file as UTF-16. My guess is that it would also be smart enough to compile it that way if you configure that but, if not, there should be some kind of option you can set in your project build options.
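
To see the mismatch concretely, here is a small sketch that reads ASCII bytes back as big-endian UTF-16, which appears to be what javac did with jp.java. WrongEncodingDemo and misread are hypothetical names, and the big-endian byte order is an assumption that happens to line up with the numbers in the error output.

```java
import java.nio.charset.StandardCharsets;

class WrongEncodingDemo {
    // Encodes the text as one byte per character (ASCII), then decodes those
    // bytes as if every two bytes formed one big-endian UTF-16 code unit.
    static String misread(String asciiText) {
        byte[] bytes = asciiText.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        // " s" is the byte pair 0x20 0x73, read as the single unit 0x2073.
        System.out.println((int) misread(" s").charAt(0)); // prints 8307
        // ".l" is the byte pair 0x2E 0x6C, read as the single unit 0x2E6C.
        System.out.println((int) misread(".l").charAt(0)); // prints 11884
    }
}
```

8307 and 11884 are the same values javac reported as illegal characters: the " s" in "import static" and the ".l" in "java.lang".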
 

Journer

Banned
Jun 30, 2005
4,355
0
0
kamper: ok. i changed the character from the Unicode escape to what i can type in from the IME. the code compiles (using the -encoding switch) but the output is still just a '?'
 

kamper

Diamond Member
Mar 18, 2003
5,513
0
0
What's the IME? Can you post your code for perusal? (although the forums will probably brutally butcher it) If you can zip it or tarball it and post it somewhere then I can download it with some assurance that it is exactly the same as you coded it.

It might be that your console doesn't know how to deal with multi-byte output. I don't really know how all that works, like whether Java tries to convert its Unicode strings to some other encoding before printing them and what happens when that conversion isn't possible. A safe way would probably be to use Swing to display your output, assuming it can find Japanese fonts. What platform are you running this on?

Basically, trying to use extended character sets with a variety of software that may or may not support more than ASCII very well is just messy.
 

ppdes

Senior member
May 16, 2004
739
0
0
To output Japanese as something other than question marks in the Vista Command Window I have to wrap System.out like this:
PrintStream out = new PrintStream(System.out, true, "Shift-JIS");

Then I output to that instead. Shift-JIS is the code page Windows uses for the Command Window on my machine. I do have a lot of Windows settings flipped to Japanese, though. No clue about Linux.

You can change the default encoding for files in Eclipse here:
Window -> Preferences... -> General -> Workspace -> Text file encoding

You can change the encoding for a particular file by:
right-click on file in package explorer -> Properties -> Resource -> Text file encoding

Sometimes you have to close the editor and restart after that before it will let you save Japanese.

Eclipse seems to compile properly based on those settings without any special arguments. It has its own compiler anyway and doesn't use Sun's.

I haven't found a good way to set the default console encoding in Eclipse. Generally I run the file once then go to:
Run -> Open Run Dialog... -> Common tab -> console encoding

There is no Shift-JIS option there, but you can use UTF-8 or similar and then wrap like this:
PrintStream out = new PrintStream(System.out, true, "UTF-8");
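
Put together, that wrapping might look like the sketch below. It assumes the console has been set to UTF-8 as described above. EncodedOutput and encode are hypothetical names, and the encode helper exists only to show which bytes the wrapped stream emits.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class EncodedOutput {
    // Writes the text through a PrintStream using the named charset and
    // returns the raw bytes, so the effect of the encoding can be inspected.
    static byte[] encode(String text, String charsetName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(buffer, true, charsetName);
            out.print(text);
            out.flush();
            return buffer.toByteArray();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Wrap System.out so strings are encoded as UTF-8 instead of the
        // platform default, then print through the wrapper.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println("\u5343\u4e09\u767e\u4e8c\u5341\u516b");

        // The kanji for one takes three bytes in UTF-8.
        System.out.println(encode("\u4e00", "UTF-8").length); // prints 3
    }
}
```

If the console is not actually decoding UTF-8, the first line will still come out as garbage or question marks, so the charset name has to match whatever the console expects.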