To be honest, I don't trust myself too much either in `JMH` (unless I understand the assembly, which takes lots of time in my case), especially since I've used `@Setup(Level.Invocation)`, but here is a small test (I took the `StringInput` generation from some other test I did, but it should not matter, it's just some data to sort):
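package streamvsLoop;

import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.TearDown;

public class StreamVsLoop {

    @State(Scope.Thread)
    public static class StringInput {
        private String[] letters = { "q", "a", "z", "w", "s", "x", "e", "d", "c", "r", "f", "v", "t", "g", "b",
                "y", "h", "n", "u", "j", "m", "i", "k", "o", "l", "p" };
        public String s = "";
        public List<String> list;

        // Number of elements to generate and sort.
        @Param(value = { "1000", "10000", "100000" })
        int next;

        @TearDown(Level.Invocation)
        public void tearDown() {
            s = null;
        }

        // Build a fresh unsorted list before every invocation, because
        // Collections.sort sorts the list in place.
        @Setup(Level.Invocation)
        public void setUp() {
            // letters[x] is already a String, so no further mapping is needed.
            list = ThreadLocalRandom.current()
                    .ints(next, 0, letters.length)
                    .mapToObj(x -> letters[x])
                    .collect(Collectors.toList());
        }
    }

    @Fork(1)
    @Benchmark
    public List<String> testCollection(StringInput si) {
        Collections.sort(si.list, Comparator.naturalOrder());
        return si.list;
    }

    @Fork(1)
    @Benchmark
    public List<String> testStream(StringInput si) {
        return si.list.stream()
                .sorted(Comparator.naturalOrder())
                .collect(Collectors.toList());
    }
}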
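One thing worth keeping in mind about the two benchmark methods: `Collections.sort` sorts the list it is given in place, while the stream version leaves the source list untouched and collects the sorted elements into a new list. That is presumably why a fresh list per invocation matters here. A minimal sketch outside of JMH that shows the difference (the class name `SortInPlaceVsCopy` and its helper methods are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class SortInPlaceVsCopy {

    // Sorts the given list in place and returns the same instance.
    static List<String> sortInPlace(List<String> input) {
        Collections.sort(input, Comparator.naturalOrder());
        return input;
    }

    // Leaves the source untouched and returns a new, sorted list.
    static List<String> sortedCopy(List<String> input) {
        return input.stream()
                .sorted(Comparator.naturalOrder())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("q", "a", "z"));
        List<String> b = new ArrayList<>(Arrays.asList("q", "a", "z"));

        List<String> sortedA = sortInPlace(a);
        List<String> sortedB = sortedCopy(b);

        System.out.println(sortedA == a); // true: same instance, now sorted
        System.out.println(a);            // [a, q, z]
        System.out.println(sortedB == b); // false: a different list
        System.out.println(b);            // [q, a, z]: source order unchanged
    }
}
```

So if the input were not regenerated, `testCollection` would be handed an already sorted list on every invocation after the first, and the two methods would no longer be sorting comparable data.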
The benchmark results show that `Collections.sort` is faster, but not by a big margin:
Benchmark                                 (next)  Mode  Cnt   Score   Error  Units
streamvsLoop.StreamVsLoop.testCollection    1000  avgt    2   0.038          ms/op
streamvsLoop.StreamVsLoop.testCollection   10000  avgt    2   0.599          ms/op
streamvsLoop.StreamVsLoop.testCollection  100000  avgt    2  12.488          ms/op
streamvsLoop.StreamVsLoop.testStream        1000  avgt    2   0.048          ms/op
streamvsLoop.StreamVsLoop.testStream       10000  avgt    2   0.808          ms/op
streamvsLoop.StreamVsLoop.testStream      100000  avgt    2  15.652          ms/op
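To put the margin into numbers, the scores above can be divided pairwise. This is only a rough sketch: with `Cnt` of 2 and no error column, the ratios should be read as indicative at best. The class name `SlowdownRatio` is made up for illustration, and the numbers are copied from the table.

```java
public class SlowdownRatio {

    // Ratio of the stream score to the Collections.sort score for one input size.
    static double ratio(double collectionMs, double streamMs) {
        return streamMs / collectionMs;
    }

    public static void main(String[] args) {
        int[] sizes = { 1000, 10000, 100000 };
        double[] collectionMs = { 0.038, 0.599, 12.488 }; // testCollection, ms/op
        double[] streamMs = { 0.048, 0.808, 15.652 };     // testStream, ms/op

        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("%6d elements: stream/collection = %.2f%n",
                    sizes[i], ratio(collectionMs[i], streamMs[i]));
        }
    }
}
```

The ratios come out at roughly 1.26, 1.35 and 1.25, so in this run the stream version took about 25 to 35 percent longer per operation.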