为什么这个使用 Collections.sort 的程序只对大小为 32 或更大的列表失败?

2022-09-03 01:26:57

以下程序将引发以下异常:

java.lang.IllegalArgumentException: Comparison method violates its general contract!

我理解 .请参阅无法复制:“比较方法违反了其总合同!Comparator

我不明白为什么它只对32号或更大的s失败。谁能解释一下?List

class Experiment {

    private static final class MyInteger {
        private final Integer num;

        MyInteger(Integer num) {
            this.num = num;
        }
    }

    private static final Comparator<MyInteger> COMPARATOR = (r1, r2) -> {
        if (r1.num == null || r2.num == null)
            return 0;
        return Integer.compare(r1.num, r2.num);
    };

    public static void main(String[] args) {
        MyInteger[] array = {new MyInteger(0), new MyInteger(1), new MyInteger(null)};
        Random random = new Random();
        for (int length = 0;; length++) {
            for (int attempt = 0; attempt < 100000; attempt++) {
                List<MyInteger> list = new ArrayList<>();
                int[] arr = new int[length];
                for (int k = 0; k < length; k++) {
                    int rand = random.nextInt(3);
                    arr[k] = rand;
                    list.add(array[rand]);
                }
                try {
                    Collections.sort(list, COMPARATOR);
                } catch (Exception e) {
                    System.out.println(arr.length + " " + Arrays.toString(arr));
                    return;
                }
            }
        }
    }
}

答案 1

这取决于实现,但在 openjdk 8 中,数组的大小根据 MIN_MERGE 进行检查,该值等于 32。这避免了对 / 的调用,从而引发异常。mergeLomergeHi

From JDK / jdk / openjdk / 7u40-b43 8-b132 7-b147 - 8-b132 / java.util.TimSort

static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c,
                     T[] work, int workBase, int workLen) {
    assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;

    int nRemaining  = hi - lo;
    if (nRemaining < 2)
        return;  // Arrays of size 0 and 1 are always sorted

    // If array is small, do a "mini-TimSort" with no merges
    if (nRemaining < MIN_MERGE) {
        int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
        binarySort(a, lo, hi, lo + initRunLen, c);
        return;
    }

    /**
     * March over the array once, left to right, finding natural runs,
     * extending short natural runs to minRun elements, and merging runs
     * to maintain stack invariant.
     */
    TimSort<T> ts = new TimSort<>(a, c, work, workBase, workLen);
    int minRun = minRunLength(nRemaining);
    do {
        // Identify next run
        int runLen = countRunAndMakeAscending(a, lo, hi, c);

        // If run is short, extend to min(minRun, nRemaining)
        if (runLen < minRun) {
            int force = nRemaining <= minRun ? nRemaining : minRun;
            binarySort(a, lo, lo + force, lo + runLen, c);
            runLen = force;
        }

        // Push run onto pending-run stack, and maybe merge
        ts.pushRun(lo, runLen);
        ts.mergeCollapse();

        // Advance to find next run
        lo += runLen;
        nRemaining -= runLen;
    } while (nRemaining != 0);

    // Merge all remaining runs to complete sort
    assert lo == hi;
    ts.mergeForceCollapse();
    assert ts.stackSize == 1;
}

答案 2

Java 8 使用 TimSort 算法对输入进行排序。当长度至少为32时,会发生合并阶段。当长度小于 32 时,将使用一个简单的排序算法,该算法可能不会检测到合同违规。让源代码注释自己说话:TimSortTimSort.java

class TimSort<T> {
    /**
     * This is the minimum sized sequence that will be merged.  Shorter
     * sequences will be lengthened by calling binarySort.  If the entire
     * array is less than this length, no merges will be performed.
     *
     * This constant should be a power of two.  It was 64 in Tim Peter's C
     * implementation, but 32 was empirically determined to work better in
     * this implementation.  In the unlikely event that you set this constant
     * to be a number that's not a power of two, you'll need to change the
     * {@link #minRunLength} computation.
     *
     * If you decrease this constant, you must change the stackLen
     * computation in the TimSort constructor, or you risk an
     * ArrayOutOfBounds exception.  See listsort.txt for a discussion
     * of the minimum stack length required as a function of the length
     * of the array being sorted and the minimum merge sequence length.
     */
    private static final int MIN_MERGE = 32;