-
Bare (or is it bear like the big furry things) with me here.
If I save a text file, what are the smallest units that make up that text file? (Or any other file for that matter.) Is it 0s and 1s? If so I guess it would be bits, right?
Is it just 1s and 0s though, or do characters in some other number system (or ASCII characters) each correspond to a certain combination of 1s and 0s? I would really appreciate someone explaining this to me clearly.
But why do I ask?
Well, and I think this is not right, but let's say...
File x.txt is made up of numbers in base 8 (octal) like this-
3343472
If I made a program that makes certain combinations equal a single character in base 16 (hex), thereby making two or three characters equal one character like so-
D = 33
A = 47
D43A2
would it be possible to use this as a compression utility?
If this worked would it be possible to go even further and use all the ascii characters?
Can you get data from a file bit by bit or byte by byte (question is can you get it)?
This was just a thought and it seems like it could make sense, but then again I could be wrong.
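On the "bit by bit or byte by byte" question, here is a minimal Python sketch, not a definitive answer. It assumes the text is plain ASCII, and the helper name `bytes_and_bits` is made up for illustration. A file on disk is a sequence of bytes, each byte is 8 bits, and the same loop works on the result of `open(path, "rb").read()`.

```python
def bytes_and_bits(data):
    """Return (byte value, 8-character bit string) for each byte in data."""
    return [(value, format(value, "08b")) for value in data]


# Assumption: a short ASCII string stands in for the contents of a text file.
raw = "Hi!".encode("ascii")

for value, bits in bytes_and_bits(raw):
    # The same byte shown as a number, as its bits, and as an ASCII character.
    print(value, bits, chr(value))
```

Programs normally read whole bytes; individual bits are then picked out of each byte with shifts and masks (or, as here, by formatting the byte in base 2).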
-
I don't know very much about compression, but I'll have a go.
You are saying you have a text file, which you read in as a string of numbers in base 8.
You then pair these numbers off, so you get a list of pairs of octal characters.
You then map every combination of 2 octal characters to a single hex character (I couldn't figure out the algorithm you were using to do this).
You then store the string of hex characters as the compressed file.
The big problem I see with this is that there are 64 combinations of 2 octal characters (octal 77 = decimal 63, remember, so the pairs run from 00 to 77), while there are only 16 hex characters. If you map these onto hex characters, each hex character has to represent 4 different combinations, so you can't get the data back.
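The counting argument can be checked with a rough Python sketch. The round-robin `mapping` below is a made-up assignment purely for illustration (the original post only gave D = 33 and A = 47), but any assignment of 64 pairs to 16 symbols runs into the same collisions.

```python
from itertools import product

OCTAL = "01234567"
HEX = "0123456789ABCDEF"

# Every possible pair of octal digits: "00", "01", ..., "77".
pairs = ["".join(p) for p in product(OCTAL, repeat=2)]
print(len(pairs), "pairs,", len(HEX), "hex symbols")

# Hypothetical mapping: hand out hex symbols to pairs in round-robin order.
mapping = {pair: HEX[i % len(HEX)] for i, pair in enumerate(pairs)}

# Group the pairs by the hex symbol they were mapped to.
collisions = {}
for pair, symbol in mapping.items():
    collisions.setdefault(symbol, []).append(pair)

# Each hex symbol stands for 4 different pairs, so decoding is ambiguous.
print(collisions["0"])
```

Put in bits: two octal digits carry 6 bits of information and one hex digit carries only 4, so squeezing the first into the second has to lose something.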
-
One of the most common compression algorithms is Lempel-Ziv-Welch (LZW), used in GIF and compressed TIFF files. Zip and GZip use DEFLATE, a related Lempel-Ziv method combined with Huffman coding, which is what zlib implements. Go to: ftp://ftp.freesoftware.com/pub/infozip/zlib/index.html
This has C source code (sorry), but there is a good description of how it works.
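To give a feel for the idea, here is a minimal LZW sketch in Python. It is not the code from the link above: the function names are made up, and it returns a plain list of integer codes instead of packing them into bits the way a real file format would.

```python
def lzw_compress(data):
    """Turn bytes into a list of integer codes using a growing dictionary."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate              # keep extending the match
        else:
            codes.append(table[current])     # emit code for the longest match
            table[candidate] = len(table)    # remember the new sequence
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes by regrowing the same dictionary."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]  # code defined by this very step
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
packed = lzw_compress(sample)
print(len(sample), "bytes ->", len(packed), "codes")
```

The gain comes from repeated sequences: once "TO" or "BEOR" is in the dictionary, a single code replaces the whole run, and the decompressor can rebuild the same dictionary without it ever being stored.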
-
cool thanks for the link :)
I need as much C/C++ source as I can get... because I suck at it, and I want to learn it before I start taking classes on it (so I can be the best in my class :D)