Java通过比较器排序<T>大部分时间都花在比较(对象,对象)上

在分析我们的代码库时,我注意到了一些奇怪的事情。似乎使用类型化比较器(例如)进行排序总是首先调用一个方法,然后调用该方法。此外,绝大多数时间都花在 .为了进一步探索,我做了一个小测试程序:Comparator<MyClass>Comparator<MyClass>.compare(Object,Object)Comparator<MyClass>.compare(MyClass,MyClass)Comparator<MyClass>.compare(Object,Object)

public class Sandbox {
    public static void main(String argv[]) {
        for(int j=0; j<100; j++) {
            int n = 10000;
            SortMe[] sortMes = new SortMe[n];
            for (int i=0; i<n; i++) {
                sortMes[i] = new SortMe(Math.random());
            }
            Arrays.sort(sortMes, new SortMeComp());
            System.out.println(Arrays.toString(sortMes));
        }
        for(int j=0; j<100; j++) {
            int n = 10000;
            SortMe[] sortMes = new SortMe[n];
            for (int i=0; i<n; i++) {
                sortMes[i] = new SortMe(Math.random());
            }
            Arrays.sort(sortMes, new SortMeCompTypeless());
            System.out.println(Arrays.toString(sortMes));
        }
    }
}

类型比较器:

public class SortMeComp implements Comparator<SortMe>{
    public int compare(SortMe one, SortMe two) {
        if(one.getValue()>two.getValue()) {
            return -1;
        } else if (one.getValue()<two.getValue()) {
            return 1;
        } else {
            return 0;
        }
    }
}

我为比较而制作的非类型比较器:

public class SortMeCompTypeless implements Comparator{
    public int compare(Object oneObj, Object twoObj) {
        SortMe one = (SortMe) oneObj;
        SortMe two = (SortMe) twoObj;
        if(one.getValue()>two.getValue()) {
            return -1;
        } else if (one.getValue()<two.getValue()) {
            return 1;
        } else {
            return 0;
        }
    }
}

以下是结果(来自YourKit分析器;如果您更喜欢屏幕截图,请告诉我):

+----------------------------------------------------+-----------------+-----------------+--------------------+
|                        Name                        |    Time (ms)    |  Own Time (ms)  |  Invocation Count  |
+----------------------------------------------------+-----------------+-----------------+--------------------+
|  +---java.util.Arrays.sort(Object[], Comparator)   |  23,604  100 %  |          8,096  |               200  |
|    |                                               |                 |                 |                    |
|    +---SortMeComp.compare(Object, Object)          |  11,395   48 %  |          7,430  |        12,352,936  |
|    | |                                             |                 |                 |                    |
|    | +---SortMeComp.compare(SortMe, SortMe)        |   3,965   17 %  |          3,965  |        12,352,936  |
|    |                                               |                 |                 |                    |
|    +---SortMeCompTypeless.compare(Object, Object)  |   4,113   17 %  |          4,113  |        12,354,388  |
+----------------------------------------------------+-----------------+-----------------+--------------------+

我在没有过滤的情况下运行了配置文件,并且您看到了对mergeSort的递归调用(这只会使其难以阅读),但没有任何意义。

这到底是怎么回事呢?SortMeComp.compare(Object,Object)的方法来自哪里?我们认为这是Java内部为处理泛型而创建的东西,但是什么可能需要这么长时间呢?我认为jvm只会将泛型方法视为“非类型化”/对象方法。如您所见,简单的施法要快得多。除此之外,我认为即使jvm需要做前期类型的东西,这也将被JITed删除。这是怎么回事?

顺便一提:

$ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

编辑:

为了响应 savinos 的答案,我尝试使用“非类型化”比较器模拟额外的方法调用,该比较器只是简单地转换为类型化比较:

public class SortMeCompMethodCalls implements Comparator{
    public int compare(Object oneObj, Object twoObj) {
        return compare((SortMe)oneObj, (SortMe)twoObj);
    }
    public int compare(SortMe one, SortMe two) {
        if(one.getValue()>two.getValue()) {
            return -1;
        } else if (one.getValue()<two.getValue()) {
            return 1;
        } else {
            return 0;
        }
    }
}

结果如下:

+---------------------------------------------------------+-----------------+-----------------+--------------------+
|                          Name                           |    Time (ms)    |  Own Time (ms)  |  Invocation Count  |
+---------------------------------------------------------+-----------------+-----------------+--------------------+
|  +---java.util.Arrays.sort(Object[], Comparator)        |  31,044  100 %  |          8,061  |               200  |
|    |                                                    |                 |                 |                    |
|    +---SortMeComp.compare(Object, Object)               |  11,554   37 %  |          7,617  |        12,354,392  |
|    | |                                                  |                 |                 |                    |
|    | +---SortMeComp.compare(SortMe, SortMe)             |   3,936   13 %  |          3,936  |        12,354,392  |
|    |                                                    |                 |                 |                    |
|    +---SortMeCompMethodCalls.compare(Object, Object)    |  11,427   37 %  |          7,613  |        12,352,146  |
|      |                                                  |                 |                 |                    |
|      +---SortMeCompMethodCalls.compare(SortMe, SortMe)  |   3,814   12 %  |          3,814  |        12,352,146  |
+---------------------------------------------------------+-----------------+-----------------+--------------------+

所以看起来萨维诺斯是对的!额外的时间只是额外的方法调用(加上一点强制转换)。这在我看来是疯狂的。你会认为那会被JITed带走吗?嗯。

我删除了编辑2,并将其添加为最初应该的答案。


答案 1

我可能是错的,但我想说的是,“Object”比较器和类型化比较器(由泛型比较器调用)之间的增量仅仅是由于额外的函数调用。

假设您正在执行 12,352,936 次调用,这意味着每个函数调用大约需要 5.7*10^-7 秒,这并非不合理。


答案 2

有点跑题,但你必须希望它很快...

使用随机数据,如果将内部 compare() 时间缩短约 50%,如果将其更改为以下方式:

public int compare(SortMe one, SortMe two) {
    return one.getValue() - two.getValue();
}

但是,仅当输入范围的幅度小于 2^31 时,这才有效。如果较大,则差值溢出。


推荐