-
Apr 16th, 2004, 04:17 PM
#1
Thread Starter
Frenzied Member
winzip compressions (and general compressions I guess)
I've noticed that making a .zip file self-extracting only makes the file ever so slightly bigger. This means the file contains all the info it contained earlier, but obviously compressed.
If I were to make a compression algorithm (at least for text), I would replace certain words with a variable. The meaning of each variable would not be stored in the zip file, but in WinZip itself.
You see, WinZip can be almost as large as it wants (well, maybe not 50+ MB), as long as it compresses well.
Doing this would surely help, right? What am I missing?
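The idea of a word list that lives in the program rather than in the file can be tried with zlib's preset-dictionary support. This is only a minimal sketch: `SHARED_DICT` and the two helper names are made up for illustration, and it uses Python's `zlib` module, not anything WinZip is known to do.

```python
import zlib

# Hypothetical shared word list. The compressor and the decompressor must both
# ship with exactly these bytes; the list itself is never stored in the output.
SHARED_DICT = b"compression algorithm encryption dictionary variable winzip file "

def compress_with_dict(data: bytes) -> bytes:
    """Deflate `data`, letting it refer back to words in SHARED_DICT."""
    comp = zlib.compressobj(zdict=SHARED_DICT)
    return comp.compress(data) + comp.flush()

def decompress_with_dict(blob: bytes) -> bytes:
    """Inverse of compress_with_dict; fails without the same dictionary."""
    decomp = zlib.decompressobj(zdict=SHARED_DICT)
    return decomp.decompress(blob) + decomp.flush()

text = b"winzip compression algorithm: dictionary variable file encryption"
plain = zlib.compress(text)          # no shared dictionary
primed = compress_with_dict(text)    # words come from the shared dictionary

print(len(text), len(plain), len(primed))
```

The saving is real but bounded: a built-in list mostly helps short inputs, because on a long file the compressor learns the file's own repeated words after seeing each of them once.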
Also, how does it compress images? I've never figured out how.
On a slightly separate note, do most encryption algorithms increase, decrease, or keep constant the size of the file? My encryption doubles to triples the file length.
Have I helped you? Please Rate my posts. 
-
Apr 16th, 2004, 04:26 PM
#2
Hyperactive Member
There are multiple ways of compression.
RLE (Run-Length Encoding), which replaces each run of repeated bytes with a count and the byte value. It needs no table, and it only shrinks files that contain long runs of the same byte.
LZW and LZX, which are dictionary methods: they remember byte strings seen earlier and write short references to them instead of repeating the strings. LZW keeps the strings in a table and writes table codes that are wider than a byte (up to 12 bits in GIF, instead of 8), but fewer bytes are used overall because the repeated strings are removed. This is more efficient than RLE but also slower.
JPG and MPEG, which are lossy compressions for images and video. They do not use a table of repeated bytes; instead they throw away detail that is hard to notice, so the decompressed data is close to the original but not identical.
There are many others.
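A minimal sketch of run-length encoding, the simplest of these methods. It assumes one common layout, (count, value) byte pairs with runs capped at 255; real formats differ in detail, and the function names are just for illustration.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (count, value) pairs; each run is capped at 255 bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte matches and the count fits in a byte.
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

def rle_decode(blob: bytes) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for k in range(0, len(blob), 2):
        count, value = blob[k], blob[k + 1]
        out += bytes([value]) * count
    return bytes(out)

print(list(rle_encode(b"aaaabbbc")))   # [4, 97, 3, 98, 1, 99]
print(len(rle_encode(b"abc")))         # 6: data without runs gets bigger
```

The second line of output is the catch with RLE: input with no repeated runs doubles in size, which is why it is only used where long runs are expected.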
53323737 15 743 313402 05 740313063. 17 15 4150 743 313402 05 140393403437 5203 743 30210.

-
Apr 16th, 2004, 04:29 PM
#3
Thread Starter
Frenzied Member
OK, I sorta get it, thanks. Basically they note similar groups of letters, and yeah, sorta got it.
-
Apr 17th, 2004, 07:50 AM
#4
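LZ-style compression looks at repeated blocks of data, for example
"blabla" -> "bla[-3,3]" meaning: start 3 characters back, continue for 3 characters. (Note that this is of course a simplification. Strictly, this back-reference form is LZ77, which the Deflate method in zip files uses; LZW writes table codes instead.)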
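The "bla[-3,3]" notation can be turned into a small working sketch. The token format here (literal `bytes` or a `(distance, length)` tuple) and the brute-force match search are assumptions for illustration; real compressors use hash tables and pack the tokens into bits.

```python
def lz_decode(tokens):
    """Rebuild data from literals (bytes) and (distance, length) back-references."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, length = tok
            # Copy one byte at a time so a reference may overlap its own output.
            for _ in range(length):
                out.append(out[-distance])
        else:
            out += tok
    return bytes(out)

def lz_encode(data, min_match=3):
    """Greedy encoder: emit a back-reference for the longest earlier match."""
    tokens, literal, i = [], bytearray(), 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for start in range(i):                      # try every earlier position
            length = 0
            while i + length < len(data) and data[start + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - start
        if best_len >= min_match:
            if literal:
                tokens.append(bytes(literal))
                literal = bytearray()
            tokens.append((best_dist, best_len))
            i += best_len
        else:
            literal.append(data[i])
            i += 1
    if literal:
        tokens.append(bytes(literal))
    return tokens

print(lz_encode(b"blabla"))             # [b'bla', (3, 3)]
print(lz_decode([b"ab", (2, 6)]))       # b'abababab'
```

The second example shows why the copy goes byte by byte: a reference can be longer than its distance, which is how a short pattern expands into a long repeat.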
Image compression and audio compression can work differently. PNG and GIF use lossless schemes of the same family (Deflate and LZW respectively), but JPEG and MPEG do something completely different. They transform the signal using variations of the Fourier transform (JPEG uses the discrete cosine transform). This is a mathematical procedure. It is simplest to understand for sound. If you have a constant tone of, say, 50 Hz, the data will look like a sine wave. The transform takes that data and turns it into a table of frequencies, so that sound wave could be stored as "50 Hz for 10 seconds" instead of the entire wave.
JPEG does something similar, only in 2 dimensions.
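To see the 50 Hz example in numbers, here is a rough sketch using a plain discrete Fourier transform written out by hand. The sample rate of 400 samples per second and the one-second length are assumptions chosen so that bin k corresponds to k Hz; real codecs use faster and more refined transforms.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of each frequency bin below half the sample rate (naive DFT)."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags

SAMPLE_RATE = 400  # assumed samples per second; one second of audio below

# One second of a pure 50 Hz tone.
tone = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]

mags = dft_magnitudes(tone)
peak = max(range(len(mags)), key=mags.__getitem__)
print(peak, round(mags[peak], 3))   # 50 0.5
```

All 400 samples collapse into one non-zero entry in the frequency table, which is the "50 Hz for 1 second" description. Real sound has many frequencies, and the lossy step is deciding which of the small ones to drop.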
-
Apr 17th, 2004, 03:23 PM
#5
So Unbanned
It depends on the type of encryption. Ideally the encryption should not make the data noticeably larger (that would be pointless). At most it should become slightly larger (due to related encryption information), or smaller if the data is compressed before it is encrypted.
For example, XOR encryption keeps the data the same size. The key to XOR encryption is in the pattern. You'll want something that won't repeat (no real pattern). This is achieved through a good random number generator (key-based).
To make a simple encryption routine you could use Randomize and Rnd in VB (you'll have to set the seed), then loop: Int(Rnd * 256) Xor char(x). When you Xor the result again with the same value it becomes the original character. To make the routine more secure you can make it dependent on the other data in the file (the previous character). The problem with XOR is that even multiple XORs can be simplified to one XOR (which is why you need a good random factor to deter brute force). Say you want to overlay a password on the random numbers. The random numbers could be found out, and then if an attacker can figure out the additional scheme (XORing the password chars, for example) the data is revealed. So if you can use multiple random factors, the data will be much harder to decrypt. Making the data larger will probably only make the pattern(s) easier to detect.
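Here is the same seeded-generator XOR idea as a short sketch, written in Python rather than VB. The function name and the seed value are made up, and `random.Random` is an ordinary (Mersenne Twister) generator, not a cryptographic one, so treat this as a demonstration of the size and round-trip behaviour only.

```python
import random

def xor_crypt(data: bytes, seed: int) -> bytes:
    """XOR every byte with a pseudo-random byte from a generator seeded by `seed`.

    Running it twice with the same seed restores the original data.
    """
    rng = random.Random(seed)          # the seed plays the role of the key
    return bytes(b ^ rng.randrange(256) for b in data)

message = b"My encryption doubles to triples the file length."
secret = xor_crypt(message, seed=2004)
restored = xor_crypt(secret, seed=2004)

print(len(message), len(secret))       # same length in and out
print(restored == message)             # True
```

Because each input byte maps to exactly one output byte, the file cannot grow, which is the point about XOR keeping the size constant.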
-
Apr 21st, 2004, 05:16 AM
#6
-
Apr 21st, 2004, 10:22 AM
#7
Hyperactive Member
Re: hmm...can I ask too?
Originally posted by sql_lall
OK, seeing as this is close to my question, I have a few things to ask:
1) What types of binary strings does WinZip compress well?
2) What types of binary strings do zip files end up as?
3) What types of binary strings do .mp3 files end up as?
By 'what types', I mean such descriptions as 'contain lots of 0s', or 'will on average contain 1's and 0's continually swapping', or 'will have approximately the same distribution of each string of length n bits'.
Thanks
1) Depends, it's a matter of context. If the same bytes or byte sequences keep turning up throughout the file, it can compress it well. If the 1's and 0's (or the byte values) are far from evenly distributed, it can compress it well too.
2) Think of it as taking the air out of a balloon: all that is left is a really dense file with a roughly even distribution of data (bytes, strings, or bits).
3) MP3 loses some of its data; that's why it can be compressed to such a small size. It keeps only what the human ear can hear, which is still a lot. So it doesn't really matter how dense the original is.
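Answers 1 and 2 can be checked with a quick experiment. This is a sketch using Python's `zlib` (the same Deflate family as zip, though not WinZip itself); the test data, the weights, and the helper name are assumptions picked for illustration.

```python
import math
import random
import zlib
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte values: 8.0 means perfectly even."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

rng = random.Random(1)  # fixed seed so the run is repeatable

# Uneven data: four byte values, one of them far more common than the rest.
skewed = bytes(rng.choices(b"abcd", weights=[70, 20, 7, 3], k=20000))
# Even data: every byte value equally likely.
noise = bytes(rng.randrange(256) for _ in range(20000))

packed_skewed = zlib.compress(skewed)
packed_noise = zlib.compress(noise)

print(len(skewed), len(packed_skewed), len(packed_noise))
print(entropy_bits_per_byte(skewed), entropy_bits_per_byte(packed_skewed))
```

The uneven input shrinks a lot and its compressed form has a much more even byte distribution, while the already even input does not shrink at all. That is the balloon picture: once the air is out, squeezing again gains nothing.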