如果不分析输入,很难给出一个好的答案。无论如何,我会分享我的结果:
我为单个做了一个容器,为单个做了一个容器。此外,我使所有测试都以将数据复制到普通数组中结束。此外,我还添加了两种方法:Foo
long
MyObj
Foo
4). 简单数组
@Benchmark
public void simpleArray(Blackhole bh) {
MyObj[] ret = new MyObj[foos.size()];
for (int i=0;i<foos.size();i++)
ret[i] = new MyObj(foos.get(i));
Arrays.sort(ret);
int lastDistinct = 0;
for (int i = 1; i < ret.length; i++) {
if (ret[i].equals(ret[lastDistinct])) {
continue;
}
lastDistinct++;
ret[lastDistinct] = ret[i];
}
MyObj[] ret2 = new MyObj[lastDistinct + 1];
System.arraycopy(ret, 0, ret2, 0, lastDistinct + 1);
bh.consume(ret2);
}
和 (2) 的反转顺序:distinct
order
@Benchmark
public void simpleStream_distinctAfterSort(Blackhole bh) {
List<MyObj> ret = foos.stream().map(MyObj::new)
.sorted().distinct()
.collect(Collectors.toList());
bh.consume(ret.toArray(new MyObj[ret.size()]));
}
测试设置:
public static final int MAX_SIZE = 10_000;
public static final long ELEM_THRESHOLD = MAX_SIZE * 10;
private List<Foo> foos;
@Setup
public void init() throws IOException, IllegalAccessException, InstantiationException {
foos = new ArrayList<>(MAX_SIZE);
for (int i = 0; i < MAX_SIZE; i++) {
foos.add(new Foo(ThreadLocalRandom.current().nextLong(ELEM_THRESHOLD)));
}
}
现在,具有不同大小和阈值的结果:
Size=10_000
Threshold=Size*10
Benchmark Mode Cnt Score Error Units
StreamBenchmark5.enhancedLoop_TreeSet thrpt 2 478,978 ops/s
StreamBenchmark5.simpleArray thrpt 2 591,287 ops/s
StreamBenchmark5.simpleStream thrpt 2 407,556 ops/s
StreamBenchmark5.simpleStream_distinctAfterSort thrpt 2 533,091 ops/s
StreamBenchmark5.treeSetStream thrpt 2 492,709 ops/s
Size=10_000
Threshold=Size/10
StreamBenchmark5.enhancedLoop_TreeSet thrpt 2 937,908 ops/s
StreamBenchmark5.simpleArray thrpt 2 593,983 ops/s
StreamBenchmark5.simpleStream thrpt 2 3344,508 ops/s
StreamBenchmark5.simpleStream_distinctAfterSort thrpt 2 560,652 ops/s
StreamBenchmark5.treeSetStream thrpt 2 1000,585 ops/s
Size=500_000
Threshold=Size*10
Benchmark Mode Cnt Score Error Units
StreamBenchmark5.enhancedLoop_TreeSet thrpt 2 1,809 ops/s
StreamBenchmark5.simpleArray thrpt 2 4,009 ops/s
StreamBenchmark5.simpleStream thrpt 2 2,443 ops/s
StreamBenchmark5.simpleStream_distinctAfterSort thrpt 2 4,141 ops/s
StreamBenchmark5.treeSetStream thrpt 2 2,040 ops/s
Size=500_000
Threshold=Size/10
Benchmark Mode Cnt Score Error Units
StreamBenchmark5.enhancedLoop_TreeSet thrpt 2 5,724 ops/s
StreamBenchmark5.simpleArray thrpt 2 4,567 ops/s
StreamBenchmark5.simpleStream thrpt 2 19,001 ops/s
StreamBenchmark5.simpleStream_distinctAfterSort thrpt 2 4,840 ops/s
StreamBenchmark5.treeSetStream thrpt 2 5,407 ops/s
Size=1_000_000
Threshold=Size/100
Benchmark Mode Cnt Score Error Units
StreamBenchmark5.enhancedLoop_TreeSet thrpt 2 4,529 ops/s
StreamBenchmark5.simpleArray thrpt 2 2,402 ops/s
StreamBenchmark5.simpleStream thrpt 2 35,699 ops/s
StreamBenchmark5.simpleStream_distinctAfterSort thrpt 2 2,232 ops/s
StreamBenchmark5.treeSetStream thrpt 2 4,889 ops/s
如您所见,根据重复的数量,算法会进行更可取的更改。最平衡的方法是(3),但最快的方法几乎总是简单的流(具有和定位对应于输入数据)。TreeSet
order
distinct
如果你愿意玩一下,这里是测试源。您将需要 JMH。