Java通过比较器排序<T>大部分时间都花在比较（对象，对象）上

performance sorting java comparator generics

2022-09-04 03:36:10

在分析我们的代码库时，我注意到了一些奇怪的事情。似乎使用类型化比较器（例如）进行排序总是首先调用一个方法，然后调用该方法。此外，绝大多数时间都花在 .为了进一步探索，我做了一个小测试程序：Comparator<MyClass>Comparator<MyClass>.compare(Object,Object)Comparator<MyClass>.compare(MyClass,MyClass)Comparator<MyClass>.compare(Object,Object)

public class Sandbox {
    public static void main(String argv[]) {
        for(int j=0; j<100; j++) {
            int n = 10000;
            SortMe[] sortMes = new SortMe[n];
            for (int i=0; i<n; i++) {
                sortMes[i] = new SortMe(Math.random());
            }
            Arrays.sort(sortMes, new SortMeComp());
            System.out.println(Arrays.toString(sortMes));
        }
        for(int j=0; j<100; j++) {
            int n = 10000;
            SortMe[] sortMes = new SortMe[n];
            for (int i=0; i<n; i++) {
                sortMes[i] = new SortMe(Math.random());
            }
            Arrays.sort(sortMes, new SortMeCompTypeless());
            System.out.println(Arrays.toString(sortMes));
        }
    }
}

类型比较器：

public class SortMeComp implements Comparator<SortMe>{
    public int compare(SortMe one, SortMe two) {
        if(one.getValue()>two.getValue()) {
            return -1;
        } else if (one.getValue()<two.getValue()) {
            return 1;
        } else {
            return 0;
        }
    }
}

我为比较而制作的非类型比较器：

public class SortMeCompTypeless implements Comparator{
    public int compare(Object oneObj, Object twoObj) {
        SortMe one = (SortMe) oneObj;
        SortMe two = (SortMe) twoObj;
        if(one.getValue()>two.getValue()) {
            return -1;
        } else if (one.getValue()<two.getValue()) {
            return 1;
        } else {
            return 0;
        }
    }
}

以下是结果（来自YourKit分析器;如果您更喜欢屏幕截图，请告诉我）：

+----------------------------------------------------+-----------------+-----------------+--------------------+
|                        Name                        |    Time (ms)    |  Own Time (ms)  |  Invocation Count  |
+----------------------------------------------------+-----------------+-----------------+--------------------+
|  +---java.util.Arrays.sort(Object[], Comparator)   |  23,604  100 %  |          8,096  |               200  |
|    |                                               |                 |                 |                    |
|    +---SortMeComp.compare(Object, Object)          |  11,395   48 %  |          7,430  |        12,352,936  |
|    | |                                             |                 |                 |                    |
|    | +---SortMeComp.compare(SortMe, SortMe)        |   3,965   17 %  |          3,965  |        12,352,936  |
|    |                                               |                 |                 |                    |
|    +---SortMeCompTypeless.compare(Object, Object)  |   4,113   17 %  |          4,113  |        12,354,388  |
+----------------------------------------------------+-----------------+-----------------+--------------------+

我在没有过滤的情况下运行了配置文件，并且您看到了对mergeSort的递归调用（这只会使其难以阅读），但没有任何意义。

这到底是怎么回事呢？SortMeComp.compare（Object，Object）的方法来自哪里？我们认为这是Java内部为处理泛型而创建的东西，但是什么可能需要这么长时间呢？我认为jvm只会将泛型方法视为“非类型化”/对象方法。如您所见，简单的施法要快得多。除此之外，我认为即使jvm需要做前期类型的东西，这也将被JITed删除。这是怎么回事？

顺便一提：

$ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

编辑：

为了响应 savinos 的答案，我尝试使用“非类型化”比较器模拟额外的方法调用，该比较器只是简单地转换为类型化比较：

public class SortMeCompMethodCalls implements Comparator{
    public int compare(Object oneObj, Object twoObj) {
        return compare((SortMe)oneObj, (SortMe)twoObj);
    }
    public int compare(SortMe one, SortMe two) {
        if(one.getValue()>two.getValue()) {
            return -1;
        } else if (one.getValue()<two.getValue()) {
            return 1;
        } else {
            return 0;
        }
    }
}

结果如下：

+---------------------------------------------------------+-----------------+-----------------+--------------------+
|                          Name                           |    Time (ms)    |  Own Time (ms)  |  Invocation Count  |
+---------------------------------------------------------+-----------------+-----------------+--------------------+
|  +---java.util.Arrays.sort(Object[], Comparator)        |  31,044  100 %  |          8,061  |               200  |
|    |                                                    |                 |                 |                    |
|    +---SortMeComp.compare(Object, Object)               |  11,554   37 %  |          7,617  |        12,354,392  |
|    | |                                                  |                 |                 |                    |
|    | +---SortMeComp.compare(SortMe, SortMe)             |   3,936   13 %  |          3,936  |        12,354,392  |
|    |                                                    |                 |                 |                    |
|    +---SortMeCompMethodCalls.compare(Object, Object)    |  11,427   37 %  |          7,613  |        12,352,146  |
|      |                                                  |                 |                 |                    |
|      +---SortMeCompMethodCalls.compare(SortMe, SortMe)  |   3,814   12 %  |          3,814  |        12,352,146  |
+---------------------------------------------------------+-----------------+-----------------+--------------------+

所以看起来萨维诺斯是对的！额外的时间只是额外的方法调用（加上一点强制转换）。这在我看来是疯狂的。你会认为那会被JITed带走吗？嗯。

我删除了编辑2，并将其添加为最初应该的答案。

答案 1

我可能是错的，但我想说的是，“Object”比较器和类型化比较器（由泛型比较器调用）之间的增量仅仅是由于额外的函数调用。

假设您正在执行 12，352，936 次调用，这意味着每个函数调用大约需要 5.7*10^-7 秒，这并非不合理。

答案 2

有点跑题，但你必须希望它很快...

使用随机数据，如果将内部 compare（）时间缩短约 50%，如果将其更改为以下方式：

public int compare(SortMe one, SortMe two) {
    return one.getValue() - two.getValue();
}

但是，仅当输入范围的幅度小于 2^31 时，这才有效。如果较大，则差值溢出。