Java: the error "write beyond end of stream" when writing a zip file from multiple threads

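I recently had a requirement to compress a large number of small files directly into a single zip. Every entry in a zip is independent and never needs to be appended to: one entry per file, written once.
So I went straight to multiple threads, and it blew up. The code reported this error: write beyond end of stream!
The following reproduces the code as it was at the time: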
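    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    // JUnit 4 is assumed here; with JUnit 5 import org.junit.jupiter.api.Test instead
    import org.junit.Test;

    public class MultiThreadWriteZipFile {

        private static ExecutorService executorService = Executors.newFixedThreadPool(50);

        private static CountDownLatch countDownLatch = new CountDownLatch(50);

        @Test
        public void multiThreadWriteZip() throws IOException, InterruptedException {
            File file = new File("D:\\Gis开发\\数据\\影像数据\\china_tms\\2\\6\\2.jpeg");
            // Create the zip
            ZipOutputStream zipOutputStream =
                    new ZipOutputStream(new FileOutputStream(new File("E:\\java\\test\\test.zip")));

            for (int i = 0; i < 50; i++) {
                // Zip entry names use '/' as the separator on every platform
                String entryName = i + "/" + i + "/" + i + ".jpeg";
                executorService.submit(() -> {
                    try (InputStream in = new FileInputStream(file)) {
                        writeSource2ZipFile(in, entryName, zipOutputStream);
                    } catch (IOException e) {
                        // Print the failure instead of silently dropping it
                        e.printStackTrace();
                    } finally {
                        // Always count down so the main thread cannot block forever
                        countDownLatch.countDown();
                    }
                });
            }
            // Block the main thread until every task has finished
            countDownLatch.await();
            // Close the stream
            zipOutputStream.close();
        }

        public void writeSource2ZipFile(InputStream inputStream,
                                        String zipEntryName,
                                        ZipOutputStream zipOutputStream) throws IOException {
            // Create a new entry
            zipOutputStream.putNextEntry(new ZipEntry(zipEntryName));
            byte[] buf = new byte[1024];
            int length;
            // Write data into the entry, limited to the bytes actually read
            while ((length = inputStream.read(buf)) != -1) {
                zipOutputStream.write(buf, 0, length);
            }
            zipOutputStream.closeEntry();
            zipOutputStream.flush();
        }
    }

Running the code above as-is fails with: write beyond end of stream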
Change

    private static ExecutorService executorService = Executors.newFixedThreadPool(50);

to

    private static ExecutorService executorService = Executors.newSingleThreadExecutor();

and the code runs normally!
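If a thread pool is still wanted, for example so that the source files can be read in parallel, one option is to keep the pool and serialize only the entry writes. The following is a minimal sketch under stated assumptions, not the original project's code: the class `SynchronizedZipWriteSketch` and its methods are hypothetical, and small in-memory payloads stand in for the real files. Each task holds the stream's monitor for a whole putNextEntry / write / closeEntry sequence, so no other thread can switch entries in the middle.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of the original project.
final class SynchronizedZipWriteSketch {

    // Writes one complete entry while holding the stream's monitor, so no
    // other thread can call putNextEntry/closeEntry in the middle of it.
    static void writeEntry(ZipOutputStream zip, String name, byte[] data) {
        synchronized (zip) {
            try {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(data, 0, data.length);
                zip.closeEntry();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Builds a zip in memory using a pool of worker threads and returns its bytes.
    static byte[] buildZip(int entries, int threads) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < entries; i++) {
                final int id = i;
                pending.add(pool.submit(() -> {
                    // Stand-in for reading a source file; this part runs in parallel.
                    byte[] data = ("payload-" + id).getBytes(StandardCharsets.UTF_8);
                    writeEntry(zip, id + "/" + id + ".txt", data);
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for each task and surface any failure
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return out.toByteArray();
    }

    // Reads the archive back and counts its entries.
    static int countEntries(byte[] zipBytes) {
        int count = 0;
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            while (in.getNextEntry() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countEntries(buildZip(50, 8))); // prints 50
    }
}
```

Note that compression itself still happens one entry at a time in this sketch; only the work done before taking the lock runs concurrently.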
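As for the cause, tracing through the code makes it clear. Let's first look at where the error is thrown:
it is at line 201 of DeflaterOutputStream in the java.util.zip package (JDK 1.8; other versions may differ). Here is the code: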
    public void write(byte[] b, int off, int len) throws IOException {
        if (def.finished()) {
            throw new IOException("write beyond end of stream");
        }
        if ((off | len | (off + len) | (b.length - (off + len))) < 0) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return;
        }
        if (!def.finished()) {
            def.setInput(b, off, len);
            while (!def.needsInput()) {
                deflate();
            }
        }
    }

The key is the state reported by def.finished(). That state is defined in the Deflater class, a compression utility class that Java implements on top of the ZLIB compression library.
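The guard at the top of write can be triggered deterministically, with no threads involved, by writing to a DeflaterOutputStream after its deflater has finished. This is a small sketch to show where the message comes from; the class name `WriteBeyondEndSketch` and its method are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical demo class; not part of the original project.
final class WriteBeyondEndSketch {

    // Returns the message of the IOException raised by writing after finish(),
    // or null if no exception was raised.
    static String writeAfterFinish() {
        Deflater def = new Deflater();
        DeflaterOutputStream out = new DeflaterOutputStream(new ByteArrayOutputStream(), def);
        try {
            out.write(new byte[] {1, 2, 3}, 0, 3);
            out.finish();                     // drives the deflater until finished() is true
            out.write(new byte[] {4}, 0, 1);  // hits the def.finished() guard
            return null;
        } catch (IOException e) {
            return e.getMessage();
        } finally {
            def.end(); // release the native zlib resources
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAfterFinish()); // prints: write beyond end of stream
    }
}
```

In the multithreaded case the same guard fires, except that the finished state was set by a different thread closing its own entry.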
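The following code is what changes that state:

    public void finish() {
        synchronized (zsRef) {
            finish = true;
        }
    }

The ultimate origin of the call to this method is the line zipOutputStream.putNextEntry(new ZipEntry(zipEntryName)); in our code above.
The idea behind the implementation is that every time a new entry is added, the previous entry has to be closed first. In the JDK 1.8 source, putNextEntry calls closeEntry() when an entry is still open, and closeEntry() calls def.finish() before resetting the deflater. A write from another thread that arrives in that window sees def.finished() return true and throws. And this state is not thread-private, as the following code shows: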
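    public class Deflater {
        private final ZStreamRef zsRef;
        private byte[] buf = new byte[0];
        private int off, len;
        private int level, strategy;
        private boolean setParams;
        private boolean finish, finished;
        private long bytesRead;
        private long bytesWritten;
        // ... rest of the class omitted

The finish and finished flags are plain instance fields, and every thread goes through the same ZipOutputStream and therefore the same Deflater. So under multiple threads, this state is definitely not thread-safe!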
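How one Deflater instance carries this state can be observed directly through its public API. This sketch uses a hypothetical class `DeflaterStateSketch` and only the documented methods finish(), deflate(byte[]), finished(), reset() and end(). It samples finished() on a fresh deflater, after finishing it, and after resetting it.

```java
import java.util.Arrays;
import java.util.zip.Deflater;

// Hypothetical demo class; not part of the original project.
final class DeflaterStateSketch {

    // Samples finished() at three points: fresh, after finish() plus deflate(),
    // and after reset().
    static boolean[] observeFinishedFlag() {
        Deflater def = new Deflater();
        boolean[] seen = new boolean[3];
        try {
            seen[0] = def.finished();

            def.finish();                    // no more input will be supplied
            byte[] scratch = new byte[64];
            while (!def.finished()) {
                def.deflate(scratch);        // drain the end-of-stream bytes
            }
            seen[1] = def.finished();

            def.reset();                     // prepare the same instance for new input
            seen[2] = def.finished();
        } finally {
            def.end(); // release the native zlib resources
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observeFinishedFlag())); // prints [false, true, false]
    }
}
```

Any caller holding the same instance sees whichever of these states happens to be current, which is why interleaved entries from several threads collide.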
That wraps up this look at the error raised when writing a zip file from multiple threads!