Issue
I made a program in NetBeans that takes an input .txt file and writes output to the console. It works fine, but when I try to test it using JUnit, the program reads the file incorrectly.
For example, instead of 'ö'
it reads 'Ã¶'
Is there any way to solve this problem of JUnit not reading non-English characters?
Solution
I suspect that the problem is actually in your program or your unit tests, not in JUnit itself.
If the evidence is as you describe, then I expect that your code does something like this:
Reader r = new FileReader(filename);
which opens the file and sets up a charset decoder based on the default charset.
When you run the code in NetBeans, the default charset is UTF-8, so you are reading the file (which is UTF-8 encoded) correctly.
When you run it in the context of a JUnit test, the default charset is (apparently) LATIN-1, which doesn't match the encoding of the input file.
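A quick way to confirm this diagnosis is to print the JVM's default charset in both contexts. A minimal sketch (the class name is my own):

import java.nio.charset.Charset;

public class CharsetCheck {
    public static void main(String[] args) {
        // Prints the charset the JVM falls back on when none is given explicitly.
        System.out.println("Default charset: " + Charset.defaultCharset());
    }
}

If this prints UTF-8 when run from NetBeans but something like ISO-8859-1 when run by the JUnit test runner, the mismatch is confirmed.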
It is possibly incorrect for your code to use the default charset to infer the encoding of its input file. Alternatively, it could be that your JUnit test is incorrect because it does not set the JVM default charset to match the test file.
The way to open the file with a specific charset (UTF-8) would be:
// Java 11
Reader r = new FileReader(filename, StandardCharsets.UTF_8);
// Java 8 and earlier
Reader r = new InputStreamReader(new FileInputStream(filename), "UTF-8");
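Putting that together, here is a minimal self-contained sketch (the file name input.txt and the surrounding structure are my own; it assumes Java 11 or later) that reads a UTF-8 file and echoes it to the console:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class Utf8FileEcho {
    public static void main(String[] args) throws IOException {
        // Decode the file explicitly as UTF-8, ignoring the platform default.
        try (BufferedReader in = new BufferedReader(
                new FileReader("input.txt", StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

Because the charset is stated explicitly, this reads the file the same way whether it is run from NetBeans or from a JUnit test.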
You can't change a running JVM's default charset. But you could possibly override the platform default charset in the JVM options when you start the JVM that runs the JUnit tests, for example by passing -Dfile.encoding=UTF-8 (see Setting the default Java character encoding); note that this mechanism was not officially specified before Java 18.
It is also possible that you have misinterpreted the evidence and the encoding problem is actually on the output side; i.e. there is a mismatch between the default charset and the console's actual charset ... in the context where you are running the JUnit tests.
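If the output side does turn out to be the culprit, a minimal sketch of a workaround (assuming the console itself expects UTF-8) is to write through a PrintStream with an explicit charset rather than relying on the default:

import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    public static void main(String[] args) throws UnsupportedEncodingException {
        // Encode console output explicitly as UTF-8 instead of the default charset.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println("ö");
    }
}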
Answered By - Stephen C
Answer Checked By - Gilberto Lyons (JavaFixing Admin)