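The problem is that the "\uXXXX" notation takes exactly 4 hexadecimal digits and therefore denotes a single 16-bit char.
Unicode code points beyond the 16-bit range, such as U+1F1EB and U+1F1F7, do not fit in one char. Each of them is represented by two chars, a so-called surrogate pair.
You can create the string from the code points: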
int[] codepoints = {0x1F1EB, 0x1F1F7};
String s = new String(codepoints, 0, codepoints.length);
Or use the surrogate pairs, which can be derived like this:
System.out.print("\"");
for (char ch : s.toCharArray()) {
    System.out.printf("\\u%04X", (int)ch);
}
System.out.println("\"");
This prints:
"\uD83C\uDDEB\uD83C\uDDF7"
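The escapes above follow from the UTF-16 encoding rule: subtract 0x10000 from the code point, put the upper 10 bits into a high surrogate starting at 0xD800 and the lower 10 bits into a low surrogate starting at 0xDC00. Here is a minimal sketch of that arithmetic. The class and method names are made up for illustration; in real code `Character.toChars`, `Character.highSurrogate` and `Character.lowSurrogate` already do this for you.

```java
public class SurrogateDemo {
    // Hypothetical helper: high (leading) surrogate of a supplementary code point.
    public static char high(int codePoint) {
        return (char) (0xD800 + ((codePoint - 0x10000) >> 10));
    }

    // Hypothetical helper: low (trailing) surrogate of a supplementary code point.
    public static char low(int codePoint) {
        return (char) (0xDC00 + ((codePoint - 0x10000) & 0x3FF));
    }

    public static void main(String[] args) {
        int cp = 0x1F1EB; // REGIONAL INDICATOR SYMBOL LETTER F
        // Print both surrogates as 4-digit hex escapes: D83C then DDEB.
        System.out.printf("\\u%04X\\u%04X%n", (int) high(cp), (int) low(cp));
        // Cross-check the manual arithmetic against the JDK helpers.
        System.out.println(high(cp) == Character.highSurrogate(cp)
                && low(cp) == Character.lowSurrogate(cp));
    }
}
```

The reverse direction is `Character.toCodePoint(high, low)`, or simply `s.codePointAt(index)` on a string.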
Response to the comment: how to decode
"\uD83C\uDDEB" is two 16-bit surrogate chars that together represent U+1F1EB, and "\uD83C\uDDF7" is the surrogate pair for U+1F1F7.
// U+1F1E6 is REGIONAL INDICATOR SYMBOL LETTER A; the letters A-Z follow consecutively.
private static final int CP_REGIONAL_INDICATOR = 0x1F1E6;

/**
 * Get the flag code of two (or one) regional indicator symbols.
 * @param s string starting with 1 or 2 regional indicator symbols.
 * @return one or two ASCII letters for the flag, or null.
 */
public static String regionalIndicator(String s) {
    int cp0 = regionalIndicatorCodePoint(s);
    if (cp0 == -1) {
        return null;
    }
    StringBuilder sb = new StringBuilder();
    sb.append((char)(cp0 - CP_REGIONAL_INDICATOR + 'A'));
    // Skip the chars of the first code point (2 for a surrogate pair).
    int n0 = Character.charCount(cp0);
    int cp1 = regionalIndicatorCodePoint(s.substring(n0));
    if (cp1 != -1) {
        sb.append((char)(cp1 - CP_REGIONAL_INDICATOR + 'A'));
    }
    return sb.toString();
}

/** Returns the leading code point if it is a regional indicator symbol, else -1. */
private static int regionalIndicatorCodePoint(String s) {
    if (s.isEmpty()) {
        return -1;
    }
    int cp0 = s.codePointAt(0);
    return CP_REGIONAL_INDICATOR > cp0 || cp0 >= CP_REGIONAL_INDICATOR + 26 ? -1 : cp0;
}

System.out.println("Flag: " + regionalIndicator("\uD83C\uDDEB\uD83C\uDDF7"));

Flag: FR
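If the flag can appear anywhere in the string rather than only at the start, the same offset arithmetic can be applied over `String.codePoints()` (Java 8+). The following is only a sketch: the class `FlagCodes` and the methods `flagLetters` and `toFlag` are hypothetical names, and `toFlag` assumes its input consists solely of the uppercase letters A-Z.

```java
public class FlagCodes {
    private static final int RI_A = 0x1F1E6; // REGIONAL INDICATOR SYMBOL LETTER A
    private static final int RI_Z = 0x1F1FF; // REGIONAL INDICATOR SYMBOL LETTER Z

    // Collect the ASCII letters of all regional indicator symbols found in s.
    public static String flagLetters(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints()
            .filter(cp -> cp >= RI_A && cp <= RI_Z)
            .forEach(cp -> sb.append((char) (cp - RI_A + 'A')));
        return sb.toString();
    }

    // Inverse direction: map uppercase A-Z to regional indicator symbols.
    // Assumption: the caller passes only 'A'..'Z'; no validation is done here.
    public static String toFlag(String letters) {
        StringBuilder sb = new StringBuilder();
        for (char c : letters.toCharArray()) {
            sb.appendCodePoint(RI_A + (c - 'A'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String flag = toFlag("FR");
        System.out.println(flag.length());               // number of chars, not code points
        System.out.println(flagLetters("x" + flag + "y")); // letters recovered from the flag
    }
}
```

Iterating by code point avoids the `substring` and `charCount` bookkeeping, at the cost of no longer distinguishing where one flag ends and the next begins.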