A better zip bomb (2019)
Posted by kekqqq 1 day ago
Comments
Comment by arjie 1 day ago
Comment by st_goliath 23 hours ago
No, not totally. The directory at the end of the archive points backwards to local headers, which in turn include all the necessary information: the compressed size inside the archive, the compression method, the filename, and even a checksum.
If the archive isn't some recursive/polyglot nonsense as in the article, it's essentially just a tightly packed list of compressed blobs, each with a neat local header in front (that even includes a magic number!); the directory at the end is really just for quick access.
If your extraction program supports it (or you are sufficiently motivated to cobble together a small C program with zlib...), you can salvage what you have by linearly scanning and extracting the archive, somewhat like a fancy tarball.
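In the same spirit, here's a quick Python sketch of that linear salvage scan (illustrative only: it ignores encryption, zip64, data-descriptor entries whose local header stores a zero size, and false-positive magic bytes inside compressed data):
import struct, zlib
def salvage(path):
    data = open(path, "rb").read()
    pos = 0
    while (pos := data.find(b"PK\x03\x04", pos)) != -1:
        # fixed 30-byte local file header
        (sig, ver, flags, method, mtime, mdate, crc,
         csize, usize, nlen, elen) = struct.unpack_from("<4s5H3I2H", data, pos)
        name = data[pos + 30:pos + 30 + nlen].decode("cp437")
        start = pos + 30 + nlen + elen
        blob = data[start:start + csize]
        if method == 8:                    # deflate
            content = zlib.decompress(blob, wbits=-15)
        elif method == 0:                  # stored
            content = blob
        else:                              # unsupported method, skip this magic
            pos += 4
            continue
        ok = zlib.crc32(content) == crc
        print(name, len(content), "bytes, crc", "ok" if ok else "BAD")
        pos = start + csize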
Comment by nwallin 22 hours ago
This worked great on campus, but when everyone went remote during COVID it no longer did. It went from three minutes to something like twenty minutes.
However, most files change only rarely. I don't need all the files, just the ones that are different. So I wrote a scanner that compares each zip entry's file size and checksum against those of the local file. If they match, we skip it; otherwise, we decompress it out of the zip file. This cut the time to get the daily build from 20 minutes to 4 minutes.
Obviously this isn't resilient against an attacker, since crc32 is not cryptographically secure, but as an internal tool it's awesome.
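Not the commenter's actual tool, but a minimal sketch of that check with Python's zipfile module (paths are illustrative):
import os, zipfile, zlib
def unchanged(info, local_path):
    # cheap size comparison first, then CRC-32 of the local file
    if not os.path.exists(local_path):
        return False
    if os.path.getsize(local_path) != info.file_size:
        return False
    crc = 0
    with open(local_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            crc = zlib.crc32(chunk, crc)
    return crc == info.CRC
with zipfile.ZipFile("daily_build.zip") as zf:
    for info in zf.infolist():
        if not unchanged(info, info.filename):
            zf.extract(info)   # only entries that actually changed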
Comment by tonyedgecombe 18 hours ago
Comment by brabel 15 hours ago
No, its purpose was to allow archives spanning multiple floppy disks. You would insert the last disk, then the other ones, one by one…
Comment by st_goliath 13 hours ago
If the archive is on a hard disk, the program reads the directory at the end and then seeks to the local header, rather than doing a linear scan. The same goes for the floppy motor, if it is a small archive on a single floppy.
If you have multiple floppies, you insert the last one, the program reads the directory, and then it tells you which floppy to insert, rather than you having to go through them one by one, which, you know, would be slower.
In one case a hard disk arm, or the floppy motor, does the seeking; in the other case your hands do the seeking. But it's still the same algorithm, doing the same thing, for the same reason.
Comment by Karliss 23 hours ago
This redundant information has led to multiple vulnerabilities over the years, since a maliciously crafted zip file with conflicting headers can be given two different interpretations by two different parsers.
Comment by EvanAnderson 22 hours ago
The PKZIP tools came with PKZIPFIX.EXE, which would scan the file from the beginning and rebuild a missing central directory. You could extract every file up to the point where your truncated download stopped.
Comment by halapro 21 hours ago
[1]: https://forum.videohelp.com/threads/393096-Fixing-Partially-Download-MP4-Files
Comment by thunderfork 4 hours ago
Comment by cat_plus_plus 23 hours ago
Comment by danudey 1 day ago
unzip zbsm.zip
Archive: zbsm.zip
inflating: 0
error: invalid zip file with overlapped components (possible zip bomb)
This seems to have been done in a patch to address https://nvd.nist.gov/vuln/detail/cve-2019-13232: https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...
Comment by layer8 1 day ago
Comment by Retr0id 19 hours ago
Comment by Twirrim 1 day ago
Someone shared a link to that site in a conversation earlier this year on HN. For a long time now, I've had a gzip bomb sitting on my server that I serve to anyone making certain categories of malicious calls, such as attempts to log in to WordPress on a site that doesn't use WordPress. That post got me thinking about alternative types of bombs, particularly as newer compression standards have become ubiquitous and supported in browsers and HTTP clients.
I spent some time experimenting with brotli as a compression bomb to serve to malicious actors: https://paulgraydon.co.uk/posts/2025-07-28-compression-bomb/
Unfortunately, as best as I can see, malicious actors are all using clients that only accept gzip, rather than brotli'd contents, and I'm the only one to have ever triggered the bomb when I was doing the initial setup!
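For reference, one common way to build such a gzip payload (not necessarily theirs; figures are approximate, as deflate tops out around 1000:1 on runs of zeros, so ~10 GiB of zeros yields a ~10 MiB .gz):
import gzip
with open("bomb.gz", "wb") as f:
    with gzip.GzipFile(fileobj=f, mode="wb", compresslevel=9) as gz:
        chunk = b"\0" * (1 << 20)       # 1 MiB of zeros
        for _ in range(10 * 1024):      # ~10 GiB total before compression
            gz.write(chunk)
You then serve the file's raw bytes with a Content-Encoding: gzip header and let the client inflate it.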
Comment by est 1 day ago
Like bomb the CPU time instead of memory.
Comment by nwallin 22 hours ago
That's how self-extracting archives and installers work, and they are also valid zip files. The extractor part is just a regular executable containing a zip decompressor, which decompresses the archive appended to it.
This is specific to zip files, not the deflate algorithm.
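You can see the "also a valid zip file" part with Python's zipfile, which, like most readers, locates the central directory from the end of the file and tolerates data prepended before the archive. A toy sketch with hypothetical stub.exe and archive.zip inputs:
import zipfile
with open("sfx.exe", "wb") as out:
    out.write(open("stub.exe", "rb").read())      # the extractor executable
    out.write(open("archive.zip", "rb").read())   # the archive appended to it
print(zipfile.ZipFile("sfx.exe").namelist())      # still reads as a zip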
Comment by Retr0id 19 hours ago
import zlib
# 1000 empty stored blocks (5 bytes each), then a final fixed-Huffman
# block containing only the end-of-block symbol:
zlib.decompress(b"\x00\x00\x00\xff\xff" * 1000 + b"\x03\x00", wbits=-15)
If you want to spin more CPU, you'd probably want to define random huffman trees and then never use them.
Comment by Retr0id 17 hours ago
The minimal version boils down to:
bytes.fromhex("04c001090000008020ffaf96") * 1000000 + b"\x03\x00"
Comment by ks2048 22 hours ago
Comment by zipping1549 23 hours ago
Comment by hayley-patton 16 hours ago
Comment by hdjrudni 23 hours ago
Comment by RGamma 10 hours ago
> A final plea
> It's time to put an end to Facebook. Working there is not ethically neutral: every day that you go into work, you are doing something wrong. If you have a Facebook account, delete it. If you work at Facebook, quit.
> And let us not forget that the National Security Agency must be destroyed.
Comment by kleiba 1 day ago
Comment by Computer0 1 day ago
Comment by colechristensen 1 day ago
Comment by lossyalgo 1 day ago
Comment by dpifke 1 day ago
But you probably don't want to be investigated for either.
Comment by colechristensen 20 hours ago
Comment by jclarkcom 18 hours ago
Comment by fragmede 1 day ago
Comment by colechristensen 1 day ago
Comment by drob518 1 day ago
Comment by tjpnz 18 hours ago
Comment by 542458 1 day ago
Comment by danudey 1 day ago
https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...
The detection maintains a list of covered spans of the zip files
so far, where the central directory to the end of the file and any
bytes preceding the first entry at zip file offset zero are
considered covered initially. Then as each entry is decompressed
or tested, it is considered covered. When a new entry is about to
be processed, its initial offset is checked to see if it is
contained by a covered span. If so, the zip file is rejected as
invalid.
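An illustrative Python version of that bookkeeping (the real patch is C inside unzip; spans here are (start, end) byte offsets, and merging of adjacent spans is omitted):
covered = []  # spans of the zip file already used, as (start, end)
def is_covered(offset):
    return any(start <= offset < end for start, end in covered)
def cover(start, end):
    covered.append((start, end))
# e.g. after extracting an entry whose local header begins at byte 0
# and whose compressed data ends at byte 100:
cover(0, 100)
print(is_covered(50))   # True -> a second entry starting here is rejected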
So effectively it seems as though it just keeps track of which parts of the zip file have already been 'used', and if a new entry in the zip file starts in a 'used' section then it fails.
Comment by necovek 17 hours ago
I.e. an advanced compressor could abuse the zip file format to share base data between files that differ only incrementally (files that get appended to, for instance).
And this patch would then disallow that practice.
Comment by 10000truths 1 day ago
1. A exceeds some unreasonable threshold
2. A/B exceeds some unreasonable threshold
Comment by integralid 15 hours ago
On the other hand, the zip bomb described in this blog post relies on decompressing the same data multiple times - so it wouldn't necessarily trigger your A/B heuristics.
Finally, A just means "you can't compress more than X bytes with my file format", right? Not a desirable property to have. If the deflate authors had this idea when they designed the algorithm, I bet files larger than an "unreasonable" 16MB would be forbidden.
Comment by 10000truths 13 hours ago
Sure, if you expect to decompress files with high compression ratios, then you'll want to adjust your knobs accordingly.
> On the other hand, the zip bomb described in this blog post relies on decompressing the same data multiple times - so it wouldn't necessarily trigger your A/B heuristics.
If you decompress the same data multiple times, then you increment A multiple times. The accounting still works regardless of whether the data is same or different. Perhaps a better description of A and B in my post would be {number of decompressed bytes written} and {number of compressed bytes read}, respectively.
> Finally, A just means "you can't compress more than X bytes with my file format", right? Not a desirable property to have. If the deflate authors had this idea when they designed the algorithm, I bet files larger than an "unreasonable" 16MB would be forbidden.
The limitation is imposed by the application, not by the codec itself. The application doing the decompression is supposed to process the input incrementally (in the case of DEFLATE, reading one block at a time and inflating it), updating A and B on each iteration, and aborting if a threshold is violated.
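A minimal sketch of that incremental accounting with Python's zlib, where the thresholds are made-up knobs:
import zlib
def safe_inflate(compressed, max_out=1 << 30, max_ratio=1000):
    d = zlib.decompressobj(wbits=-15)   # raw DEFLATE
    produced = consumed = 0             # A and B, respectively
    out = []
    for i in range(0, len(compressed), 4096):
        chunk = compressed[i:i + 4096]
        consumed += len(chunk)
        data = d.decompress(chunk, 4096)    # cap output per call
        while data:
            produced += len(data)
            if produced > max_out or produced > max_ratio * consumed:
                raise ValueError("possible decompression bomb")
            out.append(data)
            data = d.decompress(d.unconsumed_tail, 4096)
    return b"".join(out)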
Comment by nrhrjrjrjtntbt 1 day ago
Comment by dang 20 hours ago
A better zip bomb [WOOT '19 Paper] [pdf] - https://news.ycombinator.com/item?id=20685588 - Aug 2019 (2 comments)
A better zip bomb - https://news.ycombinator.com/item?id=20352439 - July 2019 (131 comments)
Comment by NaOH 18 hours ago
I use zip bombs to protect my server - https://news.ycombinator.com/item?id=43826798 - April 2025 (452 comments)
How to defend your website with ZIP bombs (2017) - https://news.ycombinator.com/item?id=38937101 - Jan 2024 (75 comments)
The Most Clever 'Zip Bomb' Ever Made Explodes a 46MB File to 4.5 Petabytes - https://news.ycombinator.com/item?id=20410681 - July 2019 (5 comments)
Defending a website with Zip bombs - https://news.ycombinator.com/item?id=14707674 - July 2017 (183 comments)
Zip Bomb - https://news.ycombinator.com/item?id=4616081 - Oct 2012 (108 comments)
Comment by measurablefunc 1 day ago
Comment by shakna 17 hours ago
It is a much easier problem to solve than you would expect. No need to drag in a data centre when heuristics can get you close enough.
[0] https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...
Comment by measurablefunc 6 hours ago
Comment by bikeshaving 1 day ago
Comment by machinationu 1 day ago
Comment by cuechan 1 day ago
Comment by dontdoxxme 1 day ago
Comment by moreati 1 day ago
Comment by chupasaurus 1 day ago
Comment by dang 20 hours ago
Comment by KingLancelot 7 hours ago
Comment by gazabbqparty 10 hours ago