shlex alternative for Java

2022-09-03 14:41:52

Is there an alternative to shlex for Java? I'd like to be able to split a quote-delimited string the way a shell would process it. For example, if I send:

one two "three four"
and split it, I'd like to receive the tokens
one
two
three four

Answer 1

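I ran into a similar problem today, and none of the standard options, such as StringTokenizer, StrTokenizer, or Scanner, looked like a good fit. However, implementing the basics is not hard.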

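This example handles all of the edge cases currently raised in comments on the other answers. Be warned that I have not yet checked it for full POSIX compliance. A Gist including unit tests is available on GitHub, released into the public domain under the Unlicense.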

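import java.util.ArrayList;
import java.util.List;

// Splits a string into tokens roughly the way a shell would: unquoted whitespace
// separates tokens, single or double quotes group words, and a backslash escapes
// the next character (except inside single quotes, where it is taken literally).
public List<String> shellSplit(CharSequence string) {
    List<String> tokens = new ArrayList<String>();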
    boolean escaping = false;
    char quoteChar = ' ';
    boolean quoting = false;
    int lastCloseQuoteIndex = Integer.MIN_VALUE;
    StringBuilder current = new StringBuilder();
    for (int i = 0; i<string.length(); i++) {
        char c = string.charAt(i);
        if (escaping) {
            current.append(c);
            escaping = false;
        } else if (c == '\\' && !(quoting && quoteChar == '\'')) {
            escaping = true;
        } else if (quoting && c == quoteChar) {
            quoting = false;
            lastCloseQuoteIndex = i;
        } else if (!quoting && (c == '\'' || c == '"')) {
            quoting = true;
            quoteChar = c;
        } else if (!quoting && Character.isWhitespace(c)) {
            if (current.length() > 0 || lastCloseQuoteIndex == (i - 1)) {
                tokens.add(current.toString());
                current = new StringBuilder();
            }
        } else {
            current.append(c);
        }
    }
    if (current.length() > 0 || lastCloseQuoteIndex == (string.length() - 1)) {
        tokens.add(current.toString());
    }

    return tokens;
}
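
To see how the method above treats a few of the edge cases, here is a minimal, self-contained sketch. The class name ShellSplitDemo and the sample inputs are assumptions made for illustration only; the splitting logic is copied from the method above and merely made static so that main can call it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; its name and the sample inputs are illustrative assumptions.
class ShellSplitDemo {

    // Same logic as the shellSplit method above, made static for the demo.
    static List<String> shellSplit(CharSequence string) {
        List<String> tokens = new ArrayList<>();
        boolean escaping = false;
        char quoteChar = ' ';
        boolean quoting = false;
        int lastCloseQuoteIndex = Integer.MIN_VALUE;
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < string.length(); i++) {
            char c = string.charAt(i);
            if (escaping) {
                // The previous character was a backslash: take this one literally.
                current.append(c);
                escaping = false;
            } else if (c == '\\' && !(quoting && quoteChar == '\'')) {
                // A backslash starts an escape, except inside single quotes.
                escaping = true;
            } else if (quoting && c == quoteChar) {
                // Closing quote: remember where it was so "" can yield an empty token.
                quoting = false;
                lastCloseQuoteIndex = i;
            } else if (!quoting && (c == '\'' || c == '"')) {
                quoting = true;
                quoteChar = c;
            } else if (!quoting && Character.isWhitespace(c)) {
                // Unquoted whitespace ends the current token, if there is one.
                if (current.length() > 0 || lastCloseQuoteIndex == (i - 1)) {
                    tokens.add(current.toString());
                    current = new StringBuilder();
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0 || lastCloseQuoteIndex == (string.length() - 1)) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The input from the question.
        System.out.println(shellSplit("one two \"three four\""));  // [one, two, three four]
        // An empty quoted string becomes an empty token.
        System.out.println(shellSplit("a \"\" b"));                // [a, , b]
        // A backslash-escaped space does not split the token.
        System.out.println(shellSplit("foo\\ bar"));               // [foo bar]
        // Backslash is literal in single quotes, but escapes inside double quotes.
        System.out.println(shellSplit("'a\\b' \"c\\\"d\""));       // [a\b, c"d]
    }
}
```

One difference from a real shell is that, inside double quotes, this method lets a backslash escape any following character, whereas POSIX only treats it as an escape before a few specific characters. That is in line with the caveat above about POSIX compliance not having been verified.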

Answer 2

Take a look at Apache Commons Lang.

org.apache.commons.lang.text.StrTokenizer should be able to do what you want:

new StrTokenizer("one two \"three four\"", ' ', '"').getTokenArray();
