UTF-8 to \udddd Conversion with 'native2ascii -encoding'

This section provides a tutorial example on how to convert UTF-8 character strings to \udddd Unicode code sequences with the 'native2ascii -encoding' command.

Now let's see how we can fix the encoding problem with HelloUtf8.java demonstrated in the previous section.

1. Convert HelloUtf8.java to \udddd Unicode code sequences using the "native2ascii -encoding utf-8" command:

C:\herong>native2ascii -encoding utf-8 HelloUtf8.java HelloUtf8Converted.java
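
To make the conversion less of a black box, here is a minimal sketch of what the command does to each line, assuming it simply decodes the input as UTF-8 and replaces every non-ASCII character with its \udddd form. The class name EscapeSketch and the method toEscapes are hypothetical names used only for this illustration. They are not part of the JDK, and the real tool may handle edge cases differently.

```java
import java.nio.charset.StandardCharsets;

class EscapeSketch {
   // Replace every char above 0x7F with a backslash-u escape of
   // 4 lowercase hex digits; ASCII chars are copied unchanged.
   // (Hypothetical helper, not the actual native2ascii code.)
   static String toEscapes(String text) {
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < text.length(); i++) {
         char c = text.charAt(i);
         if (c > 0x7F) {
            sb.append(String.format("\\u%04x", (int) c));
         } else {
            sb.append(c);
         }
      }
      return sb.toString();
   }

   public static void main(String[] a) {
      // Simulate the UTF-8 bytes of one source line as stored on disk.
      String original = "System.out.println(\"\u4e16\u754c\u4f60\u597d\uff01\");";
      byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

      // Decode with the declared encoding, then escape non-ASCII chars.
      String decoded = new String(utf8, StandardCharsets.UTF_8);
      System.out.println(toEscapes(decoded));
   }
}
```

The sketch itself is written with escapes only, so it compiles the same way no matter which encoding javac assumes for the source file.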

2. Change the class name in HelloUtf8Converted.java to HelloUtf8Converted with an editor, so that it matches the file name:

public class HelloUtf8Converted {
   public static void main(String[] a) {
      System.out.println("Hello world!");
      System.out.println("\u4e16\u754c\u4f60\u597d\uff01");
   }
}

3. Compile and run HelloUtf8Converted.java:

C:\herong>javac HelloUtf8Converted.java

C:\herong>java HelloUtf8Converted

Hello world!
?????

What happened to the Chinese string printed on the console? Why am I not getting the Chinese characters back in the output?

The problem is not caused by the \udddd Unicode code sequences used to represent the Chinese string. Those sequences correctly stored the Chinese characters in the string. The problem is caused by the default encoding used by the "out" stream, which cannot represent those characters and prints "?" in their place. See the next section on how to fix this problem.
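
The explanation above can be checked with a small sketch that inspects the string directly instead of printing it as text. The class name StringStorageCheck and the method roundTrip are hypothetical names for this illustration. US-ASCII is used only as a stand-in for an encoding that cannot represent Chinese characters, because the actual default encoding of the console in the test above is not stated.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StringStorageCheck {
   // Encode the text with the given charset and decode it back,
   // which mimics what survives a stream that uses that charset.
   // (Hypothetical helper for illustration only.)
   static String roundTrip(String text, Charset cs) {
      return new String(text.getBytes(cs), cs);
   }

   public static void main(String[] a) {
      String s = "\u4e16\u754c\u4f60\u597d\uff01";

      // The string holds 5 chars with the expected code points.
      System.out.println("length = " + s.length());
      for (int i = 0; i < s.length(); i++) {
         System.out.println(Integer.toHexString(s.charAt(i)));
      }

      // US-ASCII cannot represent these chars, so each becomes '?'.
      System.out.println(roundTrip(s, StandardCharsets.US_ASCII));

      // UTF-8 can represent them, so the round trip is lossless.
      System.out.println(roundTrip(s, StandardCharsets.UTF_8).equals(s));
   }
}
```

The hex values printed are 4e16, 754c, 4f60, 597d and ff01, which shows that the characters are intact in memory. The "?????" line appears only once the string passes through an encoding that has no mapping for them.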

Last update: 2015.
