Thread: Subject hard to explain, come in and read about it.

  1. #1

    Thread Starter
    Addicted Member krah's Avatar
    Join Date
    Jan 1999
    Location
    Arkansas, her hyuck!
    Posts
    163

    Question

    Bare (or is it bear like the big furry things) with me here.

    If I save a text file, what are the smallest units that make up that text file (or any other file, for that matter)? Is it 0s and 1s? If so, I guess those would be bits, right?
    Is it just 1s and 0s, though, or do characters in some other number system (or ASCII characters) stand for certain combinations of 1s and 0s? I would really appreciate someone explaining this to me clearly.

    But why do I ask?
    Well, and I think this is not right, but let's say...
    File x.txt is made up of numbers in base 8 (octal) like this-

    3343472

    If I made a program that makes certain combinations equal a single character in base 16 (hex), thereby making two or three characters equal one character like so-

    D = 33
    A = 47

    D43A2

    would it be possible to use this as a compression utility?
    If this worked, would it be possible to go even further and use all the ASCII characters?
    Can you read data from a file bit by bit or byte by byte? (The question is whether you can get at it at all.)

    This was just a thought and it seems like it could make sense, but then again I could be wrong.
    Is it tired in here or is it just me?

    Ryan Williams
    -Using Vb6-

  2. #2
    Frenzied Member
    Join Date
    Mar 2000
    Posts
    1,089
    I don't know very much about compression, but I'll have a go.

    You are saying you have a text file, which you read in as a string of numbers in base 8.

    You then pair these numbers off so you get a list of pairs of octal characters.

    You then map every combination of 2 octal characters to a single hex character (I couldn't figure out the algorithm you were using to do this).

    You then store the string of hex characters as the compressed file.


    The big problem I see with this is that there are 64 combinations of 2 octal characters (octal 77 = decimal 63, remember, so the pairs run from 0 to 63). If you are to map these onto the 16 hex characters, each hex character would have to represent 4 different combinations, so you can't get the data back.
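
    To make the counting concrete, here is a small Python sketch (not VB6, and not krah's actual scheme, which isn't spelled out in the thread). The `lossy_map` function is a made-up mapping, assumed only for illustration; any mapping from 64 pairs onto 16 hex characters runs into the same collisions.

```python
from itertools import product

OCTAL_DIGITS = "01234567"
HEX_DIGITS = "0123456789ABCDEF"

# Every possible pair of octal digits, from "00" up to "77".
pairs = ["".join(p) for p in product(OCTAL_DIGITS, repeat=2)]


def lossy_map(pair):
    """Hypothetical mapping: squeeze a two-digit octal pair into one hex digit."""
    return HEX_DIGITS[int(pair, 8) % 16]


# Group the pairs by the hex digit they land on.
groups = {}
for pair in pairs:
    groups.setdefault(lossy_map(pair), []).append(pair)

print(len(pairs))      # number of distinct octal pairs
print(int("77", 8))    # the largest pair, as a decimal number
print(groups["0"])     # all the pairs that collapse onto hex digit "0"
```

    Every hex digit ends up standing for four different pairs, so when you read a "0" back you have no way of knowing which of the four it was.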

    If it wasn't for this sentence I wouldn't have a signature at all.

  3. #3

  4. #4
    Monday Morning Lunatic parksie's Avatar
    Join Date
    Mar 2000
    Location
    Mashin' on the motorway
    Posts
    8,169
    One of the most common compression algorithms is the Lempel-Ziv-Welch (LZW) system, used in GIF images, compressed TIFFs, etc. Zip files and GZip use a related Lempel-Ziv method called DEFLATE, which is what zlib implements. Go to: ftp://ftp.freesoftware.com/pub/infozip/zlib/index.html

    This has C source code (sorry), but there is a good description of how it works.
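
    For a rough feel of how LZW works without wading through C, here is a minimal Python sketch. It is a teaching version only: it returns a plain list of integer codes rather than packing them into bits, and the function names `lzw_compress` and `lzw_decompress` are my own, not taken from zlib or any other library.

```python
def lzw_compress(data):
    """Turn bytes into a list of integer codes using a growing dictionary."""
    table = {bytes([i]): i for i in range(256)}  # start with every single byte
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate          # keep extending the known sequence
        else:
            codes.append(table[current])  # emit code for the longest match
            table[candidate] = next_code  # remember the new, longer sequence
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes by regrowing the same dictionary."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(sample)
print(len(sample), "bytes in,", len(codes), "codes out")
print(lzw_decompress(codes) == sample)
```

    The point to notice is that the decompressor rebuilds the exact same table from the codes alone, so nothing is lost. Repeated sequences get replaced by a single code, which is where the saving comes from.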
    I refuse to tie my hands behind my back and hear somebody say "Bend Over, Boy, Because You Have It Coming To You".
    -- Linus Torvalds

  5. #5
    Guest
    Cool, thanks for the link.

    I need as much C/C++ source as I can get, because I suck at it and I want to learn it before I start taking classes on it (so I can be the best in my class).
