How can Java read a very long string efficiently?

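The string is about 1~2 MB, and each read takes more than 1 s. How can this be optimized?
With the methods below, why is NIO even slower than plain IO? What is wrong with the code?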

// Plain IO: takes 1.3 s
    public static String openStringFileIO(String path, String fileName) {
        long time = System.currentTimeMillis();
        String result = null;
        File f = new File(path, fileName);
        try {
            FileInputStream fileInputStream = new FileInputStream(f);
            StringBuilder buffer = new StringBuilder();
            String line;
            BufferedReader in = new BufferedReader(new InputStreamReader(fileInputStream));
            while ((line = in.readLine()) != null) {
                buffer.append(line);
            }
            result = buffer.toString();
        } catch (IOException e) {
            e.printStackTrace();
        }
        Logger.d("openStringFileIO " + (System.currentTimeMillis() - time));
        return result;
    }

    // NIO: takes 2.5 s
    public static String openStringFileNIO(String path, String fileName) {
        long time = System.currentTimeMillis();
        FileInputStream in = null;
        StringBuilder result = new StringBuilder();
        try {
            File f = new File(path, fileName);
            in = new FileInputStream(f);
            FileChannel fc = in.getChannel();
            ByteBuffer byteBuffer = ByteBuffer.allocate(1024); // byte buffer for the raw reads
            CharBuffer charBuffer = CharBuffer.allocate(1024); // buffer for the decoded characters
            CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder(); // decoder for the file's charset

            while ((fc.read(byteBuffer)) != -1) { // read bytes into the buffer
                byteBuffer.flip();
                charBuffer.clear();

                // decode byteBuffer
                if (fc.position() < fc.size()) {
                    decoder.decode(byteBuffer, charBuffer, false);
                } else {
                    // final decode
                    decoder.decode(byteBuffer, charBuffer, true);
                    decoder.flush(charBuffer);
                }

                byteBuffer.compact(); // note: compact() here, not clear()
                charBuffer.flip();
                // append the contents of charBuffer to the result
                char[] chars = new char[charBuffer.remaining()];
                charBuffer.get(chars, 0, charBuffer.remaining());
                result.append(chars);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            CloseUtil.close(in);
        }
        Logger.d("openStringFileNIO " + (System.currentTimeMillis() - time));
        return result.toString();
    }

    // Okio: takes 0.9 s
    public static String openStringFile(String path, String fileName) {
        long time = System.currentTimeMillis();
        BufferedSource bufferedSource = null;
        String result = null;
        try {
            File file = new File(path, fileName);
            Source source = Okio.source(file);
            bufferedSource = Okio.buffer(source);
            result = bufferedSource.readUtf8();

        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            CloseUtil.close(bufferedSource);
        }
        Logger.d("openStringFileOkio " + (System.currentTimeMillis() - time));
        return result;
    }

    // Second NIO variant: takes 2 s
    public static String openStringFileNIO2(String path, String fileName) {

        long time = System.currentTimeMillis();
        File file = new File(path, fileName);
        RandomAccessFile raf;
        StringBuffer sb = new StringBuffer();
        try {
            raf = new RandomAccessFile(file, "rw");
            FileChannel cha = raf.getChannel();
            ByteBuffer buf = ByteBuffer.allocate(1024);
            int size;
            while ((size = cha.read(buf)) != -1) {
                buf.flip();
                byte[] buff = new byte[size];
                buf.get(buff);
                sb.append(new String(buff, 0, size));

            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        Logger.d("openStringFileNIO2 " + (System.currentTimeMillis() - time));

        return sb.toString();
    }
6 answers

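Skip readLine and use read instead; it should be much faster, provided you roughly know how long the string will be and allocate a fixed-size array for it: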

Reader reader = new FileReader("input.txt");
try {
    char[] chars = new char[8192];
    for(int len; (len = reader.read(chars)) > 0;) {
        // process chars[0..len)
    }
} finally {
    reader.close();
}

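Use a larger initial capacity, e.g. new StringBuilder(1000000), ideally an estimate of the length of the string you will store. Fewer resizes of the internal array saves time.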
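
The two suggestions above (read into a char[] instead of calling readLine, and presize the StringBuilder) can be combined. The following is only a minimal sketch, assuming the file is UTF-8; the names PresizedReader and readAll are made up for illustration. It uses the file size in bytes as the capacity estimate, since a UTF-8 file never decodes to more chars than it has bytes.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
final class PresizedReader {

    // Reads the whole file into a String. Assumes the content is UTF-8.
    static String readAll(File file) throws IOException {
        // Size in bytes is an upper bound on the number of chars for UTF-8,
        // so the builder should not need to grow while appending.
        int capacity = (int) Math.min(file.length(), Integer.MAX_VALUE - 8);
        StringBuilder sb = new StringBuilder(capacity);
        char[] chunk = new char[8192];
        try (Reader reader = new InputStreamReader(
                new FileInputStream(file), StandardCharsets.UTF_8)) {
            // read(char[]) returns -1 at end of stream
            for (int len; (len = reader.read(chunk)) != -1; ) {
                sb.append(chunk, 0, len);
            }
        }
        return sb.toString();
    }
}
```

Unlike the readLine loop, this keeps line terminators in the result. For text that is mostly multi-byte (for example Chinese), the byte count overestimates the char count, so the builder is larger than strictly needed.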

You can use RandomAccessFile together with multiple threads, but watch out for synchronization issues.

  1. I suggest trying Okio, Guava, or Apache Commons.
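
Those libraries all boil down to "read everything in one call, decode once". If adding a dependency is not an option, the JDK has a comparable call, Files.readAllBytes. This is a swapped-in JDK technique, not the API of any of the libraries named above. A small sketch, assuming UTF-8 content; WholeFileRead and readUtf8 are hypothetical names. On Android, java.nio.file is only available from API 26.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
final class WholeFileRead {

    // Reads all bytes in one call, then decodes them once. Assumes UTF-8.
    static String readUtf8(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Decoding the whole byte array in one step also avoids splitting a multi-byte character across two chunks.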

1. Split the file into several chunks and read them with multiple threads.
2. Use RandomAccessFile together with multiple threads.
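
A rough sketch of the chunked, multi-threaded idea follows. It uses FileChannel positional reads in place of RandomAccessFile, because read(ByteBuffer, long) does not touch the channel's shared position and so needs no extra locking. Each task fills its own region of one shared byte array, and the bytes are decoded once at the end so that no multi-byte character is cut at a chunk boundary. The names ChunkedParallelRead and read are hypothetical, UTF-8 is an assumption, and whether threads help at all for a 1~2 MB file is something to measure rather than assume.

```java
import java.io.EOFException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of any library.
final class ChunkedParallelRead {

    // Reads the file in roughly equal chunks on `threads` worker threads.
    // Assumes UTF-8 content and a file smaller than 2 GB.
    static String read(Path path, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            int size = (int) ch.size();
            byte[] all = new byte[size];
            int chunk = (size + threads - 1) / threads; // ceiling division
            List<Future<?>> tasks = new ArrayList<>();
            for (int start = 0; start < size; start += chunk) {
                final int from = start;
                final int len = Math.min(chunk, size - from);
                tasks.add(pool.submit(() -> {
                    // Each task writes only into its own region of `all`.
                    ByteBuffer region = ByteBuffer.wrap(all, from, len);
                    long pos = from;
                    while (region.hasRemaining()) {
                        int n = ch.read(region, pos); // positional read
                        if (n < 0) throw new EOFException("file shrank while reading");
                        pos += n;
                    }
                    return null;
                }));
            }
            for (Future<?> task : tasks) {
                task.get(); // wait, and rethrow any failure from the worker
            }
            return new String(all, StandardCharsets.UTF_8);
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as ChunkedParallelRead.read(path, 4) would split the file into four regions.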

Use NIO:


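import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class NIOReadFile
{

    public static void main(String[] args) throws Exception
    {
        long time = System.currentTimeMillis();
        File file = new File("test");
        // Open read-only ("r"); try-with-resources closes the file and its channel.
        try (RandomAccessFile raf = new RandomAccessFile(file, "r");
             FileChannel cha = raf.getChannel())
        {
            ByteBuffer buf = ByteBuffer.allocate(1024);
            // Collect the raw bytes and decode once at the end, so that a
            // multi-byte character split across two reads is not corrupted.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (cha.read(buf) != -1)
            {
                buf.flip();
                out.write(buf.array(), 0, buf.limit());
                buf.clear(); // reset the buffer so the next read has room
            }
            // Assumes the file is UTF-8.
            String text = new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
        System.out.println(System.currentTimeMillis() - time);
    }
}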

For a 20 MB file this takes about 250 ms. You can tune the buffer size yourself.
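
Taking the buffer-size tuning to its limit, the buffer can be made as large as the file so the loop usually runs only once or twice. This is a sketch under assumptions: UTF-8 content, a file that fits comfortably in memory (true for 1~2 MB), and hypothetical names SingleBufferRead and read. That the 1 KB buffers are what makes the NIO versions in the question slow is a plausible guess, not something measured here.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
final class SingleBufferRead {

    // Reads the file into one buffer sized to the file, then decodes once.
    // Assumes UTF-8 content and a file smaller than 2 GB.
    static String read(File file) throws IOException {
        try (FileInputStream in = new FileInputStream(file);
             FileChannel ch = in.getChannel()) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            // A single read may return fewer bytes than requested, so loop
            // until the buffer is full or the end of the file is reached.
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading
            }
            buf.flip();
            return StandardCharsets.UTF_8.decode(buf).toString();
        }
    }
}
```

Compared with a 1 KB buffer, this trades a larger single allocation for far fewer read calls and no per-chunk String objects.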
