java:如何将文件转换为utf8
我有一个文件有一些非utf8 caracter(如“ISO-8859-1”),所以我想将该文件(或读取)转换为UTF8编码,我该怎么做?
代码是这样的:
File file = new File("some_file_with_non_utf8_characters.txt");
/* some code to convert the file to an utf8 file */
...
编辑:放置编码示例
我有一个文件有一些非utf8 caracter(如“ISO-8859-1”),所以我想将该文件(或读取)转换为UTF8编码,我该怎么做?
代码是这样的:
File file = new File("some_file_with_non_utf8_characters.txt");
/* some code to convert the file to an utf8 file */
...
编辑:放置编码示例
下面的代码将文件从 srcEncoding 转换为 tgtEncoding:
public static void transform(File source, String srcEncoding, File target, String tgtEncoding) throws IOException {
BufferedReader br = null;
BufferedWriter bw = null;
try{
br = new BufferedReader(new InputStreamReader(new FileInputStream(source),srcEncoding));
bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(target), tgtEncoding));
char[] buffer = new char[16384];
int read;
while ((read = br.read(buffer)) != -1)
bw.write(buffer, 0, read);
} finally {
try {
if (br != null)
br.close();
} finally {
if (bw != null)
bw.close();
}
}
}
--编辑--
使用 Try-with-resources (Java 7):
public static void transform(File source, String srcEncoding, File target, String tgtEncoding) throws IOException {
try (
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(source), srcEncoding));
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(target), tgtEncoding)); ) {
char[] buffer = new char[16384];
int read;
while ((read = br.read(buffer)) != -1)
bw.write(buffer, 0, read);
}
}
String charset = "ISO-8859-1"; // or what corresponds
BufferedReader in = new BufferedReader(
new InputStreamReader (new FileInputStream(file), charset));
String line;
while( (line = in.readLine()) != null) {
....
}
在那里,您可以解码文本。您可以通过simmetric Writer/OutputStream方法使用您喜欢的编码(例如UTF-8)编写它。