I am a big fan of Java's MappedByteBuffer for situations like this. It is blazing fast. Below is a snippet I put together for you that maps a buffer to the file, seeks to the middle, and then searches backwards for a newline character. That should be enough to get you going?
I have similar code in my own application (seek, read, repeat until done), have benchmarked java.io streams against MappedByteBuffer in a production environment, and posted the results on my blog (Geekomatic posts tagged "java.nio") with raw data, graphs, and all.
Two-second summary? My MappedByteBuffer-based implementation was about 275% faster. YMMV.
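The blog post itself isn't reproduced here, but for flavor, a single-pass comparison along these lines is easy to throw together. The sketch below is illustrative only, not the harness behind the numbers above: the file path is hypothetical, there is no JVM warmup or averaging over runs, and it assumes the whole file fits in a single mapping (i.e. under ~2GB):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class ReadBenchmarkSketch {
    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "/tmp/test.dat"; // hypothetical test file

        // Pass 1: plain java.io stream read, byte at a time.
        long t0 = System.nanoTime();
        long sum = 0;
        try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(path))) {
            int b;
            while ((b = in.read()) != -1) sum += b;
        }
        long streamNanos = System.nanoTime() - t0;

        // Pass 2: MappedByteBuffer read of the same file.
        t0 = System.nanoTime();
        long sum2 = 0;
        try (FileChannel fc = new RandomAccessFile(path, "r").getChannel()) {
            // Single mapping: only valid for files under Integer.MAX_VALUE bytes.
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            while (mbb.hasRemaining()) sum2 += mbb.get() & 0xFF;
        }
        long mbbNanos = System.nanoTime() - t0;

        System.out.printf("stream: %d ms, mbb: %d ms (checksums %d/%d)%n",
                streamNanos / 1_000_000, mbbNanos / 1_000_000, sum, sum2);
    }
}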
To handle files larger than ~2GB, which is a problem because of the cast and .position(int pos), I've crafted a paging algorithm backed by an array of MappedByteBuffers. You'll need to be working on a 64-bit system for this to work with files larger than 2-4GB, because MBBs use the OS's virtual memory system to work their magic.
import static java.nio.channels.FileChannel.MapMode.READ_ONLY;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

public class StusMagicLargeFileReader {
    private static final long PAGE_SIZE = Integer.MAX_VALUE;
    private List<MappedByteBuffer> buffers = new ArrayList<MappedByteBuffer>();
    private final byte raw[] = new byte[1];

    public static void main(String[] args) throws IOException {
        File file = new File("/Users/stu/test.txt");
        FileChannel fc = (new FileInputStream(file)).getChannel();
        StusMagicLargeFileReader buffer = new StusMagicLargeFileReader(fc);
        long position = file.length() / 2;
        String candidate = buffer.getString(position--);
        // Scan backwards from the middle until a newline or the start of file.
        while (position >= 0 && !candidate.equals("\n"))
            candidate = buffer.getString(position--);
        //have newline position or start of file...do other stuff
    }

    StusMagicLargeFileReader(FileChannel channel) throws IOException {
        // Map the file as a series of read-only pages of at most
        // Integer.MAX_VALUE bytes each, so long offsets can span the whole file.
        long start = 0, length = 0;
        for (int index = 0; start + length < channel.size(); index++) {
            if ((channel.size() / PAGE_SIZE) == index)
                length = (channel.size() - index * PAGE_SIZE); // final, partial page
            else
                length = PAGE_SIZE;
            start = index * PAGE_SIZE;
            buffers.add(index, channel.map(READ_ONLY, start, length));
        }
    }

    public String getString(long bytePosition) {
        // Translate the long file offset into a (page, offset-within-page) pair.
        int page = (int) (bytePosition / PAGE_SIZE);
        int index = (int) (bytePosition % PAGE_SIZE);
        raw[0] = buffers.get(page).get(index);
        return new String(raw); // assumes a single-byte encoding such as ASCII
    }
}
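As a hypothetical continuation of main() above (not part of the original snippet): once the backwards scan stops, you can read forward again to recover the whole line that straddles the midpoint. candidate tells you whether the scan stopped on a newline or ran off the start of the file:

// Runs right after the while loop in main().
long lineStart = candidate.equals("\n") ? position + 2 : 0; // byte after the found newline, or start of file
StringBuilder line = new StringBuilder();
while (lineStart < file.length()) {
    String ch = buffer.getString(lineStart++);
    if (ch.equals("\n")) break; // reached the end of the line
    line.append(ch);
}
System.out.println("Line at midpoint: " + line);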