Garbled content when reading an Excel file from HDFS

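1. Generate an Excel file with POI and upload it to HDFS.
2. Download the file when it is needed.
3. The download goes through a REST endpoint.
4. The endpoint reads the Excel file from HDFS through the HDFS API, converts its content to a string, and returns it to the client.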

Figure 1: the file fetched with the hdfs get command displays correctly.

Figure 2: the content of the file downloaded through the HDFS Java API is garbled.

The original code:

public static String readFile(Configuration conf, String filePath) throws IOException,
            URISyntaxException {
        String fileContent = null;
        FileSystem fs = getFileSystem(conf);
        Path path = new Path(filePath);
        InputStream inputStream = null;
        ByteArrayOutputStream outputStream = null;
        try {
            inputStream = fs.open(path);
            outputStream = new ByteArrayOutputStream(inputStream.available());
            IOUtils.copyBytes(inputStream, outputStream, conf);
            byte[] lens = outputStream.toByteArray(); // intended to fix garbled Chinese text
            fileContent = new String(lens, "UTF-8");
        } finally {
            IOUtils.closeStream(inputStream);
            IOUtils.closeStream(outputStream);
            fs.close();
        }
        return fileContent;
    }

The modified code:

/**
     * Reads the content of a file.
     *
     * @param conf
     * @param filePath
     * @return
     * @throws IOException
     */
    public static String readFile(Configuration conf, String filePath) throws IOException,
            URISyntaxException {
        String fileContent = null;
        FileSystem fs = getFileSystem(conf);
        Path path = new Path(filePath);
        FSDataInputStream in = null;
        ByteArrayOutputStream outputStream = null;
        try {
            in = fs.open(path);
            outputStream = new ByteArrayOutputStream();
            //IOUtils.copyBytes(in, outputStream, conf);   // this also produces garbled output

            byte[] b = new byte[1024];
            int numBytes = 0;
            while ((numBytes = in.read(b)) > 0) {
                outputStream.write(b, 0, numBytes);
            }
            fileContent = new String(outputStream.toByteArray(), "UTF8");
        } finally {
            IOUtils.closeStream(in);
            IOUtils.closeStream(outputStream);
            fs.close();
        }
        return fileContent;
    }

I have tried pretty much every approach I could find online. I also tried replacing ByteArrayOutputStream with FSDataOutputStream, and the result is still garbled.

public static String readFile(Configuration conf, String filePath) throws IOException,
            URISyntaxException {
        String fileContent = null;
        FileSystem fs = getFileSystem(conf);
        Path path = new Path(filePath);
        FSDataInputStream fin = null;
        InputStreamReader bin = null;
        int line;
        try {
            StringBuffer buffer = new StringBuffer();
            fin = fs.open(path);
            bin = new InputStreamReader(fin, "UTF-8");

            while ((line = bin.read()) > 0) {
                buffer.append(line);
            }
            fileContent = new String(buffer);
        } finally {
            IOUtils.closeStream(fin);
            IOUtils.closeStream(bin);
            fs.close();
        }
        return fileContent;
    }
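
A side note on the snippet above, separate from the encoding question: `buffer.append(line)` receives an `int`, so it appends the decimal number of each character rather than the character itself, and the `> 0` test stops at the first NUL character instead of at end of stream (which `Reader.read()` signals with -1). The sketch below isolates that behavior; the class and method names are hypothetical, and a `StringReader` stands in for the `InputStreamReader` over the HDFS stream.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class AppendIntDemo {
    // Same pattern as the snippet above: the int returned by read() is appended as a number.
    static String appendAsInt(String text) throws IOException {
        Reader reader = new StringReader(text);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append(c); // resolves to append(int): writes decimal digits
        }
        return sb.toString();
    }

    // Casting to char appends the character that was actually read.
    static String appendAsChar(String text) throws IOException {
        Reader reader = new StringReader(text);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(appendAsInt("AB"));  // prints 6566
        System.out.println(appendAsChar("AB")); // prints AB
    }
}
```

Fixing the cast alone would not make an Excel file readable as text; it only removes one extra source of corruption in this particular attempt.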

Writing to System.out also produces garbled output:

public static String readFile(Configuration conf, String filePath) throws IOException,
            URISyntaxException {
        String fileContent = null;
        FileSystem fs = getFileSystem(conf);
        Path path = new Path(filePath);
        FSDataInputStream in = null;
        int line;
        try {
            in = fs.open(path);
            IOUtils.copyBytes(in, System.out, 4096, false);
            fileContent = new String("");
        } finally {
            IOUtils.closeStream(in);
            fs.close();
        }
        return fileContent;
    }

I am not familiar with IO streams, so any help would be much appreciated!

2 answers

fileContent = new String(outputStream.toByteArray(), "UTF8");
What is this charset conversion for?
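
To make the point behind that question concrete: an Excel workbook is binary data, not text, so decoding its bytes as UTF-8 is generally lossy. The following is a minimal sketch under stated assumptions, not the asker's code. `RawBytesDemo` and its methods are hypothetical names, a `ByteArrayInputStream` stands in for the stream returned by `fs.open(path)`, and the sample bytes are the 8-byte signature of an OLE2 (.xls) file, assuming the workbook is in that format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RawBytesDemo {
    // Copies a stream into a byte array with no charset conversion.
    static byte[] readBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) { // -1 marks end of stream
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Mimics the conversion in readFile: decode as UTF-8 text, then encode again.
    static byte[] throughString(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // OLE2 compound file signature (assumed sample input, not read from HDFS).
        byte[] header = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                         (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};

        byte[] copied = readBytes(new ByteArrayInputStream(header));
        System.out.println(Arrays.equals(header, copied));                // prints true
        System.out.println(Arrays.equals(header, throughString(header))); // prints false
    }
}
```

Byte sequences that are not valid UTF-8 are replaced during decoding, so the original bytes cannot be recovered from the string. Under these assumptions, a likely fix is for the REST endpoint to return the `byte[]` (or stream it) with a binary content type instead of a `String`.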
