compression algorithm

Brian23

Banned
Dec 28, 1999
1,655
1
0
I'm trying to write a C++ program that compresses files. I figured that while I'm doing this, I might as well make the compressed files compatible with WinZip. However I can't find any information on the internet about the structure of zip files, or the algorithm that's used to compress them. Can someone point me in the right direction?
 

DaveSimmons

Elite Member
Aug 12, 2001
40,730
670
126
Searching google on "gzip" will get you some links to free compression libraries and I believe some of the official gzip pages have links to info on creating zip-compatible files.

If you don't mind spending some money, www.inner-media.com sells various flavors of DynaZip that will do all of the work for you. The C++ library edition was really easy to use for adding zip creation to a program at work.
 

m0ti

Senior member
Jul 6, 2001
975
0
0
OK, the standard zip methods not bzip, but gzip) use some type of Lempel-Ziv Compression.

Here's how you do it (I've done the code in Java, though we were more interested in the tree for prediction than the compression):

Basically, you're building up a dictionary. adding on 1 word at a time. Each word consists of the number of a previous word in the dictionary and an additional character (suffix).

Decide on your alphabet (binary, ascii, or whatever).


Create a root node.

Go over your sequence letter by letter, moving down the tree appropriately as you do so, until you come to a leaf. Make the next entry in the dictionary as follows: the word number (stored in the leaf) and the next letter. Create a descendent for that leaf for the next letter and save its words number inside it. Reset back to the root, and repeat while you still have letters in the sequence.

I'm not sure if this will make it 100% compatable with WinZip, take a look at GZip to be absolutely sure. In Java, you can always use a GZipOutputStream to compress files (like I said, we were interested in the tree, and not the compression).