Strange Java Unicode Regular Expression StringIndexOutOfBoundsException

2022-09-04 04:31:05

My question is simple but puzzling. There may be a simple switch that fixes this, but I don't have much experience with Java regular expressions...

String line = "						

Answer 1

The characters you mentioned are actually double-byte characters, meaning that two bytes form one character. For Java to interpret them correctly, the encoding must be passed explicitly whenever it differs from the default platform encoding; otherwise the default platform encoding is used.
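As a rough illustration of the point about explicit encodings, here is a minimal sketch. The class name `EncodingDemo`, the helper `decode`, and the sample character `é` are all hypothetical and are not taken from the question; the sketch only assumes the bytes were produced as UTF-8. It decodes the same two bytes once with the matching charset and once with a mismatched one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Hypothetical helper: decode raw bytes with an explicitly named charset
    // rather than relying on the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Assumed sample: U+00E9 (é), which UTF-8 encodes as two bytes.
        String sample = "\u00e9";
        byte[] bytes = sample.getBytes(StandardCharsets.UTF_8);

        // Matching charset: the two bytes become one character again.
        String matched = decode(bytes, StandardCharsets.UTF_8);

        // Mismatched charset: each byte is read as its own character.
        String mismatched = decode(bytes, StandardCharsets.ISO_8859_1);

        System.out.println("bytes: " + bytes.length);
        System.out.println("matched length: " + matched.length());
        System.out.println("mismatched length: " + mismatched.length());
    }
}
```

When text is read from a file or stream, the same idea applies by passing the charset to the reader, for example `new InputStreamReader(in, StandardCharsets.UTF_8)`.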

To demonstrate this, consider the following:

String line = "						

Answer 2