Thread: Data-Compression Algorithms.

  1. #1

    Thread Starter
    Registered User Lior's Avatar
    Join Date
    Jan 2000
    Posts
    307

    Question

    Hi,
    How can I teach the computer base 65535 when there are only 256 different ASCII codes?

  2. #2
    Guest

    Question

    What do you mean by base 65535? Do you mean binary, octal, hexadecimal, or something else?

    I have a function that converts Long values into any base between 2 and 36, if you need it.
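
    The function mentioned above is not shown in the thread. As a minimal sketch of the idea, written in Python rather than VB and with the hypothetical names `DIGITS` and `to_base`, repeated division by the base yields the digits from least to most significant:

```python
import string

# Digit symbols 0-9 then A-Z: 36 symbols, which is why this scheme stops at base 36.
DIGITS = string.digits + string.ascii_uppercase


def to_base(value: int, base: int) -> str:
    """Render a non-negative integer in any base from 2 to 36."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return "0"
    out = []
    while value:
        # divmod peels off the least significant digit each time.
        value, remainder = divmod(value, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))
```

    The limit of 36 comes only from running out of printable digit symbols, which is the same wall the original question runs into at 256.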

  3. #3
    transcendental analytic kedaman's Avatar
    Join Date
    Mar 2000
    Location
    0x002F2EA8
    Posts
    7,221
    Use an Integer array instead of a string.
    Use
    writing software in C++ is like driving rivets into steel beam with a toothpick.
    writing haskell makes your life easier:
    reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
    To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.

  4. #4

    Thread Starter
    Registered User Lior's Avatar
    Join Date
    Jan 2000
    Posts
    307

    I meant...

    Hi,
    When I said "base 65535" I meant this: in base 10 we have 10 digits (0-9), right?
    In octal we have 8 digits (0-7), and in hex 16 digits (0-F).
    So I need to find a way to represent, and to "teach" the computer, a new base: base 65535, a base with 65535 different digits.
    Since I have just 255 different ASCII codes, how can I represent 65535 different digits when each of them may take only 1 byte?

    Kedaman, what did you mean about the Integer array?


  5. #5
    transcendental analytic kedaman's Avatar
    Join Date
    Mar 2000
    Location
    0x002F2EA8
    Posts
    7,221
    I mean that you can store the base 65535 values in an Integer array instead of a string, because an Integer is not limited to 256 different values the way a single character code is. Do you need to know how to convert between bases?
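
    A minimal sketch of that idea, in Python rather than VB: each base 65535 digit is kept as a plain integer in a list instead of as a character. The names `BASE`, `to_digits` and `from_digits` are made up for illustration.

```python
from typing import List

BASE = 65535  # the base discussed in this thread


def to_digits(value: int, base: int = BASE) -> List[int]:
    """Split a non-negative integer into digits, most significant first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = []
    while True:
        value, digit = divmod(value, base)
        digits.append(digit)
        if value == 0:
            break
    return digits[::-1]


def from_digits(digits: List[int], base: int = BASE) -> int:
    """Rebuild the integer from its digits (inverse of to_digits)."""
    value = 0
    for digit in digits:
        value = value * base + digit
    return value
```

    Each list element can hold any digit from 0 to 65534, but that also means each one needs more than one byte once it is written to disk.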

  6. #6

    Thread Starter
    Registered User Lior's Avatar
    Join Date
    Jan 2000
    Posts
    307
    Hi.
    Well, 2 things:
    1) Thanks, but I already know how to convert between bases.
    2) With your method of representing the digits, how would you store, for example, the 30000th digit in a file using only 1 byte? I need every digit to be represented by exactly 1 byte.

    Let me demonstrate the problem with base 65535 (or any base greater than 255):

    1st digit = chr(1)
    2nd digit = chr(2)
    3rd digit = chr(3)
    .
    .
    .
    255th digit = chr(255)
    256th digit = ????? ' no more available ASCII codes.

    Any idea?

  7. #7
    Hyperactive Member
    Join Date
    Mar 2000
    Location
    Pittsburgh, PA
    Posts
    329
    I don't think you can. 1 byte is 8 bits, and 2^8 = 256, so one byte can't hold more than 256 different values. You need to do the compression mathematically.


  8. #8
    Hyperactive Member
    Join Date
    Jun 2000
    Location
    Auckland, NZ
    Posts
    411
    "Since I have just 255 different ASCII codes, how can I
    represent 65535 different digits when each of them may
    take only 1 byte?"
    HAVocINCARNATE29 said it. If you can find a way to store
    65535 unique ANYTHINGS in 8 bits (one byte), then you will
    make millions or billions of dollars.

    So clearly, you must use 2 bytes to store each
    "Character". Interestingly, VB does this already because
    it stores each character in a string as 2 bytes (Unicode).

    If you try:

    Code:
    Debug.Print LenB(someString)
    You will find that the byte length is of course twice the
    number of characters.
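
    The same two-bytes-per-character effect can be reproduced outside VB. This is a small Python sketch, assuming the text is encoded as UTF-16 the way VB strings are held in memory; `byte_length_utf16` is a made-up helper name.

```python
def byte_length_utf16(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-16 (no byte-order mark)."""
    return len(text.encode("utf-16-le"))
```

    For ordinary characters the result is twice the character count, for example 10 bytes for "hello". Characters outside the Basic Multilingual Plane take 4 bytes each in UTF-16.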


    Of course, if you could make a processor in which each bit
    could hold up to 4 states (i.e. on, off, almost on, almost
    off), then you could do it in a single byte too, since
    4^8 = 65,536.

    If you tell us why you want to have a way of storing 65535
    unique values in a single byte, maybe we can find a way of
    achieving your goal that is in the realm of possibility.

    Cheers
    Paul Lewis

  9. #9
    transcendental analytic kedaman's Avatar
    Join Date
    Mar 2000
    Location
    0x002F2EA8
    Posts
    7,221
    2) With your method of representing the digits, how would you store, for example, the 30000th digit in a file using only 1 byte? I need every digit to be represented by exactly 1 byte.

    Let me demonstrate the problem with base 65535 (or any base greater than 255):

    1st digit = chr(1)
    2nd digit = chr(2)
    3rd digit = chr(3)
    .
    .
    .
    255th digit = chr(255)
    256th digit = ????? ' no more available ASCII codes.

    Any idea?
    Yep, you're still trying to put it in a string instead of an Integer array. The Integer type is just perfect for this, since it ranges from -32,768 to 32,767. That means you store the value in the Integer as follows:
    Code:
    IntArray(decimalplace) = digit - 32768
    where digit can range from 0 to 65535.

    To store it in a file, open the file in binary mode:
    Code:
    Open file For Binary As #1
    Put #1, , IntArray
    Close #1
    So it's actually very easy to get around this problem.
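
    The offset trick and the binary write can be sketched in Python as well. This assumes little-endian signed 16-bit values, which is how VB lays out an Integer on a PC; `OFFSET`, `pack_digits` and `unpack_digits` are hypothetical names.

```python
import struct
from typing import List

OFFSET = 32768  # shifts 0..65535 into the signed 16-bit range -32768..32767


def pack_digits(digits: List[int]) -> bytes:
    """Pack each digit (0..65535) as one little-endian signed 16-bit integer."""
    for digit in digits:
        if not 0 <= digit <= 65535:
            raise ValueError("digit out of range: %d" % digit)
    shifted = [digit - OFFSET for digit in digits]
    return struct.pack("<%dh" % len(shifted), *shifted)


def unpack_digits(data: bytes) -> List[int]:
    """Inverse of pack_digits: read 16-bit values and undo the offset."""
    count = len(data) // 2
    return [value + OFFSET for value in struct.unpack("<%dh" % count, data)]
```

    Note that the packed output is always 2 bytes per digit, so three digits produce 6 bytes.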




  10. #10

    Thread Starter
    Registered User Lior's Avatar
    Join Date
    Jan 2000
    Posts
    307
    Hi kedaman,
    When I tried your code with an example array indexed from 1 to 3, with values like:
    Code:
    Dim IntArray(1 To 3) As Integer
    IntArray(1) = 20000
    IntArray(2) = 20001
    IntArray(3) = 20002

    ' Then I saved it with:
    Open file For Binary As #1
    Put #1, , IntArray
    Close #1
    it gave me a file 6 bytes long.
    I want it to be only 3 bytes (because there are only 3 numbers).

    Thanks in advance.
    Lior Asher.

  11. #11
    transcendental analytic kedaman's Avatar
    Join Date
    Mar 2000
    Location
    0x002F2EA8
    Posts
    7,221
    Consider it impossible: each Integer takes up 2 bytes. Even if you converted it back to base 10 or even base 256, you wouldn't get down to 3 bytes, because 3 bytes give only 2^24 combinations, while three 16-bit digits need 2^48.
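
    The counting argument can be checked with a few lines of Python. This is only a sketch; `min_bytes` is a made-up helper that counts how many whole bytes are needed to give every possible digit sequence its own bit pattern.

```python
def min_bytes(digit_count: int, base: int) -> int:
    """Fewest whole bytes that can distinguish every sequence of digit_count digits."""
    if digit_count < 1 or base < 2:
        raise ValueError("need at least one digit and a base of 2 or more")
    combinations = base ** digit_count
    # Bits needed to number the combinations 0 .. combinations - 1.
    bits = (combinations - 1).bit_length()
    return max(1, (bits + 7) // 8)
```

    With this helper, three digits of base 65535 or 65536 need 6 bytes, and only three digits of base 256 fit into 3 bytes.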

  12. #12
    Hyperactive Member
    Join Date
    Jun 2000
    Location
    Auckland, NZ
    Posts
    411

    Angry hello?

    Just making certain I am not invisible...

    Paul Lewis

  13. #13

    Thread Starter
    Registered User Lior's Avatar
    Join Date
    Jan 2000
    Posts
    307
    I have realized that you were right, Paul, and so were you, kedaman.
    Thanks.
    So, can we summarize it as: "There is no chance at all of representing a digit of a base 65535 system in 1 byte, or in anything less than the 2 bytes that can be used to accomplish that"?
    If so, my compression algorithm is dead.

  14. #14
    Hyperactive Member
    Join Date
    Jun 2000
    Location
    Auckland, NZ
    Posts
    411

    R.I.P

    If so, my compression algorithm is dead.
    Sorry, and in case in your country this is an insult, it is not meant to be.

    R.I.P = Rest In Peace:

    Regards
    Paul Lewis

  15. #15
    transcendental analytic kedaman's Avatar
    Join Date
    Mar 2000
    Location
    0x002F2EA8
    Posts
    7,221
    BTW, there are millions of compression algorithms out there, and nobody said you couldn't even try to work out your own one. This was just not the right way to do it.

  16. #16

    Thread Starter
    Registered User Lior's Avatar
    Join Date
    Jan 2000
    Posts
    307

    Hmmm....

    Well keda,
    Actually, I've already been through this issue.
    I mean, I didn't wake up in the morning and say: "What a nice day, let's make my own compression algorithm."
    I have read a lot of info on the net about these algorithms.
    I already understand Huffman coding, Ziv-Lempel coding (the one used by the popular WinZip) and some simpler ones like the RLE algorithm.
    All of these are lossless algorithms, of course: ones where you extract the compressed file back into the original file exactly, 100%, with not even a byte changed.
    Therefore, I didn't try to understand the MP3, JPEG or MPEG algorithms, because they harm the quality of the file a bit.
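
    What "lossless" means can be shown with a short Python sketch. It uses the standard `zlib` module, whose DEFLATE format combines Lempel-Ziv matching with Huffman coding; `round_trip` is a made-up helper name, not part of any algorithm discussed here.

```python
import zlib


def round_trip(data: bytes) -> bytes:
    """Compress and then decompress; a lossless codec must return the input unchanged."""
    packed = zlib.compress(data, 9)
    return zlib.decompress(packed)
```

    For repetitive input such as `b"abc" * 1000` the compressed form is much shorter than the original, and decompressing it gives back every byte exactly.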

    So, to the point, what do you think IS the right way to do it ?

    Thanks, Lior.

  17. #17
    New Member
    Join Date
    Oct 2000
    Posts
    2
    I'm assuming that you have at least 16-bit bytes; if you only have 8-bit bytes, then each digit must be represented by two bytes.

    There is a way, although it may not be very efficient and is probably difficult to code. Each digit in the base is converted to the form (x,y).

    x ranges from 0 to 255
    y ranges from 1 to 256


    The program converts this by the following formula:-

    decimal value of digit = x+ (y-1)*256


    0,1 is 0
    1,1 is 1
    2,1 is 2
    3,1 is 3
    4,1 is 4
    5,1 is 5
    ...
    255,1 is 255
    0,2 is 256
    1,2 is 257
    ....
    255,2 is 511
    0,3 is 512
    etc etc

    The program must first convert the 16 bit number into an (x,y) format. (using the above formula should be easy enough). The program must then interpret each digit to print a character.

    You need to have 256 different character sets, for example:-

    255, 1 would be the 255th character in character set English
    255, 2 would be the 255th character in character set German.

    etc etc

    Example

    Stored as 16-bit values:-
    255 512 - - as a two-digit base 65535 number this is 255*65535 + 512 = 16,711,937 in decimal

    converted to
    (255,1) (0,3)
    The computer would then convert (255,1) and (0,3) into ASCII characters from different character sets,
    where x is the ASCII code and y is the character set.
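
    The (x,y) conversion above can be sketched in Python as follows. The names `to_pair` and `from_pair` are made up, and y is allowed to run up to 256 so that every 16-bit value is covered.

```python
from typing import Tuple


def to_pair(value: int) -> Tuple[int, int]:
    """Split a 16-bit value into (x, y) with value = x + (y - 1) * 256."""
    if not 0 <= value <= 65535:
        raise ValueError("value must fit in 16 bits")
    high, x = divmod(value, 256)
    return x, high + 1


def from_pair(x: int, y: int) -> int:
    """Apply the formula from the post: x + (y - 1) * 256."""
    return x + (y - 1) * 256
```

    Here x is simply the low byte and y - 1 the high byte of the value, so the pair still occupies two bytes of storage.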

    Hope this works!

    The system therefore stores the base as numbers but converts to display....

    Hope this makes sense!

  18. #18
    transcendental analytic kedaman's Avatar
    Join Date
    Mar 2000
    Location
    0x002F2EA8
    Posts
    7,221
    Hmm, actually I know just plain nothing about compression algorithms, at least those that keep quality, like pkzip and Ace...
    On the other hand, I always have ideas. So maybe you could listen to them, but you'd better listen to professionals instead.

    Find general patterns: if you have 11111111111111, you could compress that to /14/1 or something like that, or if you find 12345678 or 13579, you could try putting an arithmetical formula between the tags. If something repeats, like 123123123123, you could have /R123/4.
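
    The first of these ideas is essentially run-length encoding. Here is a minimal Python sketch, using (character, count) pairs in place of the /14/1 notation; `rle_encode` and `rle_decode` are made-up names.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(text: str) -> List[Tuple[str, int]]:
    """Collapse each run of identical characters into a (character, run length) pair."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs: List[Tuple[str, int]]) -> str:
    """Expand the pairs back into the original text."""
    return "".join(char * count for char, count in pairs)
```

    Fourteen 1s collapse into a single pair, while a string with no runs such as 12345678 turns into eight pairs, which is larger than the input. That is why this scheme only pays off on data with long runs.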

  19. #19

    Thread Starter
    Registered User Lior's Avatar
    Join Date
    Jan 2000
    Posts
    307
    Well...

    Kedaman, your idea involves AI (Artificial Intelligence). It is hard to find a formula between the chars. The computer would have to look at the file and think like a human being who sees the file. Pretty hard to implement. Maybe you have another idea?

    SimonCook, in any case, there are only 255 different ASCII codes in EVERY computer. Your idea would work only if each computer had these 255 different character sets (the different languages), each containing 255 different chars.
    There are no computers with 255 different character sets.
    Now that I think about it again, it seems just impossible to accomplish, and the reason is very simple:
    number 1 = chr(1)
    number 2 = chr(2)
    number 3 = chr(3)
    .
    .
    .
    number 500 = can't be represented.
    There are just not enough different ASCII codes, and maybe that's it.
    There is no way to create something from nothing. You must have the material first. I very much hope I'm wrong.
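
    The "something from nothing" limit can be observed with an existing lossless compressor. This Python sketch uses the standard `zlib` module and seeded pseudo-random bytes as a stand-in for data with no pattern; `make_noise` and `compressed_size` are made-up helper names.

```python
import random
import zlib


def make_noise(size: int, seed: int = 0) -> bytes:
    """Pseudo-random bytes with no exploitable pattern (seeded so runs are repeatable)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed form at the highest compression level."""
    return len(zlib.compress(data, 9))
```

    Patterned input such as `b"123" * 3000` shrinks a lot, while the noise from `make_noise(10000)` does not get smaller. Compression only removes redundancy that is already in the data.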


