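How do I determine whether a character is a letter in Java?
How do I check whether a one-character String is a letter, including any letters with accents?
I had to work this out recently, so I'll answer it myself now that a recent VB6 question has reminded me.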
Character.isLetter() is much faster than string.matches(), because string.matches() compiles a new Pattern on every call. Even with the Pattern cached, I expect isLetter() to beat it.
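EDIT: I just ran across this again and thought I'd try to come up with some actual numbers. Here is my attempt at a benchmark that checks all three methods (matches() with and without a cached Pattern, and Character.isLetter()). I also made sure to test both valid and invalid characters, so as not to skew the results.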
import java.util.regex.*;

class TestLetter {
    // Compiled once and reused by testMatches().
    private static final Pattern ONE_CHAR_PATTERN = Pattern.compile("\\p{L}");
    private static final int NUM_TESTS = 10000000;

    public static void main(String[] args) {
        // 1. Cached Pattern. The inputs cycle through chars 0..127,
        //    so both letters and non-letters are tested.
        long start = System.nanoTime();
        int counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testMatches(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of Pattern.matches() took " +
                (System.nanoTime() - start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");

        /*********************************/
        // 2. Character.isLetter()
        start = System.nanoTime();
        counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testCharacter(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of isLetter() took " +
                (System.nanoTime() - start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");

        /*********************************/
        // 3. String.matches(), which compiles the pattern on every call
        start = System.nanoTime();
        counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testMatchesNoCache(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of String.matches() took " +
                (System.nanoTime() - start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");
    }

    private static boolean testMatches(final String c) {
        return ONE_CHAR_PATTERN.matcher(c).matches();
    }

    private static boolean testMatchesNoCache(final String c) {
        return c.matches("\\p{L}");
    }

    private static boolean testCharacter(final String c) {
        return Character.isLetter(c.charAt(0));
    }
}
My output:
10000000 tests of Pattern.matches() took 4325146672 ns.
There were 4062500/10000000 valid characters
10000000 tests of isLetter() took 546031201 ns.
There were 4062500/10000000 valid characters
10000000 tests of String.matches() took 11900205444 ns.
There were 4062500/10000000 valid characters
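So isLetter() is almost 8 times faster, even compared with a cached Pattern. (And the uncached version is nearly 3 times slower than the cached one.)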
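Just checking whether the letter is in A-Z is not enough, because that range does not include letters with accents or letters in other alphabets.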
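To illustrate the gap, here is a minimal sketch comparing a naive range check with Character.isLetter(). The class name AsciiVersusUnicode and the helper isAsciiLetter are made up for this example and do not come from any library; the sample characters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
public class AsciiVersusUnicode {
    // Hypothetical helper: the naive ASCII-only range check.
    static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        // a, Z, e-acute, sharp s, Cyrillic Zhe, a digit, a hyphen
        char[] samples = {'a', 'Z', '\u00e9', '\u00df', '\u0416', '7', '-'};
        for (char c : samples) {
            // Print the code point in hex so the result does not depend on console encoding.
            System.out.println("U+" + Integer.toHexString(c)
                    + " ascii=" + isAsciiLetter(c)
                    + " isLetter=" + Character.isLetter(c));
        }
    }
}
```

The two checks agree on the ASCII samples and disagree on the accented and Cyrillic ones, which the range check rejects.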
I found out that you can use the regular expression class for "Unicode letter", or one of its case-sensitive variations:
string.matches("\\p{L}"); // Unicode letter
string.matches("\\p{Lu}"); // Unicode upper-case letter
You can also do this with the Character class:
Character.isLetter(character);
But this is less convenient if you need to check more than one letter.
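For the multi-letter case, a rough sketch of both approaches follows, assuming Java 8 or later. The class name AllLetters and its two methods are invented for illustration. The regex version reuses a Pattern compiled once, and the loop version walks the string's code points and calls Character.isLetter(int) on each.

```java
import java.util.regex.Pattern;

public class AllLetters {
    // Compiled once and reused, so the regex is not recompiled on every call.
    private static final Pattern LETTERS_ONLY = Pattern.compile("\\p{L}+");

    // Regex approach: true when the string is non-empty and consists only of Unicode letters.
    static boolean allLettersRegex(String s) {
        return LETTERS_ONLY.matcher(s).matches();
    }

    // Character approach: test every code point with Character.isLetter(int).
    static boolean allLettersLoop(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(Character::isLetter);
    }

    public static void main(String[] args) {
        // "h\u00e9llo" contains an accented e; "abc1" contains a digit.
        for (String s : new String[] {"h\u00e9llo", "abc1", ""}) {
            System.out.println("length=" + s.length()
                    + " regex=" + allLettersRegex(s)
                    + " loop=" + allLettersLoop(s));
        }
    }
}
```

Both methods treat the empty string as not being all letters; drop the isEmpty() check and change `+` to `*` if an empty input should pass.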