在java中拆分为最大长度的大字符串

2022-09-01 12:33:27
String input = "THESE TERMS AND CONDITIONS OF SERVICE (the Terms) ARE A LEGAL AND BINDING AGREEMENT BETWEEN YOU AND NATIONAL GEOGRAPHIC governing your use of this site, www.nationalgeographic.com, which includes but is not limited to products, software and services offered by way of the website such as the Video Player, Uploader, and other applications that link to these Terms (the Site). Please review the Terms fully before you continue to use the Site. By using the Site, you agree to be bound by the Terms. You shall also be subject to any additional terms posted with respect to individual sections of the Site. Please review our Privacy Policy, which also governs your use of the Site, to understand our practices. If you do not agree, please discontinue using the Site. National Geographic reserves the right to change the Terms at any time without prior notice. Your continued access or use of the Site after such changes indicates your acceptance of the Terms as modified. It is your responsibility to review the Terms regularly. The Terms were last updated on 18 July 2011.";

//text copied from http://www.nationalgeographic.com/community/terms/

我想将这个大字符串拆分为行,并且每行中的行内容不应超过MAX_LINE_LENGTH个字符。

到目前为止我尝试了什么

int MAX_LINE_LENGTH = 20;    
System.out.print(Arrays.toString(input.split("(?<=\\G.{MAX_LINE_LENGTH})")));
//maximum length of line 20 characters

输出:

[THESE TERMS AND COND, ITIONS OF SERVICE (t, he Terms) ARE A LEGA, L AND B ...

它会导致单词的断裂。我不想要这个。而不是我想得到这样的输出:

[THESE TERMS AND , CONDITIONS OF , SERVICE (the Terms) , ARE A LEGAL AND B ...

另外一个条件是添加:如果单词长度大于MAX_LINE_LENGTH则该单词应该被拆分。

解决方案应该是没有外部罐子的帮助。


答案 1

只需逐个字迭代字符串,并在单词超过限制时进行中断。

public String addLinebreaks(String input, int maxLineLength) {
    StringTokenizer tok = new StringTokenizer(input, " ");
    StringBuilder output = new StringBuilder(input.length());
    int lineLen = 0;
    while (tok.hasMoreTokens()) {
        String word = tok.nextToken();

        if (lineLen + word.length() > maxLineLength) {
            output.append("\n");
            lineLen = 0;
        }
        output.append(word);
        lineLen += word.length();
    }
    return output.toString();
}

我只是徒手打字,你可能不得不推动和推动一下才能编译。

Bug:如果输入中的单词比它长,它将被附加到当前行,而不是在它自己的太长的行上。我假设您的行长度约为80或120个字符,在这种情况下,这不太可能成为问题。maxLineLength


答案 2

最佳 : 使用 Apache Commons Lang :

org.apache.commons.lang.WordUtils

/**
 * <p>Wraps a single line of text, identifying words by <code>' '</code>.</p>
 * 
 * <p>New lines will be separated by the system property line separator.
 * Very long words, such as URLs will <i>not</i> be wrapped.</p>
 * 
 * <p>Leading spaces on a new line are stripped.
 * Trailing spaces are not stripped.</p>
 *
 * <pre>
 * WordUtils.wrap(null, *) = null
 * WordUtils.wrap("", *) = ""
 * </pre>
 *
 * @param str  the String to be word wrapped, may be null
 * @param wrapLength  the column to wrap the words at, less than 1 is treated as 1
 * @return a line with newlines inserted, <code>null</code> if null input
 */
public static String wrap(String str, int wrapLength) {
    return wrap(str, wrapLength, null, false);
}