We had an issue at work. Our dev. team manages a PHP server, which sends and receives binary KET files of constant size (say, 5 KB) to and from production servers. I was rewriting some of the production servers and I noticed that sometimes I was getting TLE files instead of KET files, and that they were 11 bytes too big.
I checked it out. Apparently the dev. team was compressing the KET files (via PHP's gzcompress). They were very much surprised to learn that the output files were larger than the original KETs. “Apparently”, they sighed, “gzip is not a very good compression. Let’s find another one”.
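To see the effect for yourself, here is a minimal sketch in Python, whose zlib module writes the same zlib format that PHP's gzcompress does. I don't have a KET file to share, so the input is an assumption: 5120 pseudo-random bytes stand in for a dense binary file, and the names SIZE, random_block and overhead are mine, not anything from the real system.

```python
import random
import zlib

SIZE = 5120  # assumed stand-in for "say, 5 KB"; the real KET size is not given


def overhead(data: bytes) -> int:
    """Return how many bytes zlib adds to (positive) or saves from (negative) data."""
    return len(zlib.compress(data)) - len(data)


# Pseudo-random bytes: a hypothetical substitute for a dense binary file.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_block = bytes(rng.getrandbits(8) for _ in range(SIZE))

# Highly redundant input of the same size, for contrast.
zero_block = bytes(SIZE)

if __name__ == "__main__":
    print("random block overhead:", overhead(random_block))
    print("zero block overhead:  ", overhead(zero_block))
```

The first number should come out positive (the "compressed" output is bigger than the input) and the second strongly negative. Same compressor, same input size, opposite results: what matters is how much redundancy the data has.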
That is the wrong conclusion. They have a meagre chance of finding a compression engine that wouldn’t expand the files. And this is a good excuse to explain why to the rest of the world.
An introduction to data compression
Pigeon hole principle → no general compression scheme possible.
Range limitation → Data redundancy → range-specific compression possible.
Prima facie, compression is an impossible task.
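The counting behind the pigeonhole argument is short enough to check by machine. The sketch below is only an illustration, with function names of my own choosing: it counts the bit strings of length exactly n and the bit strings strictly shorter than n, and shows that there is always one too few of the shorter ones.

```python
def strings_of_length(n: int) -> int:
    """Number of distinct bit strings of length exactly n."""
    return 2 ** n


def strings_shorter_than(n: int) -> int:
    """Number of distinct bit strings of length 0, 1, ..., n - 1."""
    return sum(2 ** k for k in range(n))


if __name__ == "__main__":
    for n in range(1, 9):
        inputs = strings_of_length(n)
        shorter = strings_shorter_than(n)
        # A lossless compressor must map different inputs to different outputs,
        # so it needs at least as many possible outputs as there are inputs.
        print(f"n={n}: {inputs} inputs, only {shorter} shorter outputs")
```

Since there are always fewer shorter strings than inputs, any lossless scheme that shrinks some inputs of length n must leave at least one other input the same size or larger.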