Data-Compression Algorithms.

**Lior** · Oct 22nd, 2000, 11:14 AM

Hi,
How can I teach the computer base 65535 with only 256 different ASCII codes ?!

Oct 22nd, 2000, 12:04 PM

What do you mean by base 65535? Do you mean binary/octal/hexadecimal or what?

I have a function that converts Long values into any base between 2 and 36, if you need it.

**kedaman** · Oct 22nd, 2000, 12:30 PM

Use an integer array instead of a string.

**Lior** · Oct 22nd, 2000, 12:43 PM

Hi,
When I said "base 65535" I meant this: in base 10 we have 10 digits (0-9), right?
In octal we have 8 digits (0-7), and in hex 16 digits (0-F).
So I need to find a way to represent, and to "teach" the computer, a new base: base 65535, a base with 65535 different digits.
Since I have just 255 different ASCII codes, how can I represent 65535 different digits when each of them may take only 1 byte?

Kedaman, what did you mean about the integer array?

**kedaman** · Oct 22nd, 2000, 12:53 PM

I mean that you can store the base 65535 values in an integer array instead of a string, because an integer is not limited to 256 different values the way a single ASCII code is. Do you need to know how to convert between bases?
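To make the integer-array idea concrete, here is a minimal sketch in Python rather than VB. The function names `to_digits` and `from_digits` are made up for illustration and are not from anyone's post. Each digit is an ordinary integer in a list, so nothing restricts it to 256 values:

```python
def to_digits(n, base):
    """Return the digits of a non-negative integer in the given base,
    most significant digit first, as a list of plain integers."""
    if base < 2:
        raise ValueError("base must be at least 2")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, d = divmod(n, base)  # peel off the least significant digit
        digits.append(d)
    return digits[::-1]


def from_digits(digits, base):
    """Rebuild the integer from its digit list (most significant first)."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


print(to_digits(1000000, 65535))  # [15, 16975]
```

Note that the digit 16975 does not fit in one byte; the list just holds it as a number.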

**Lior** · Oct 22nd, 2000, 02:51 PM

Hi.
Well, 2 things:
1) Thanks, but I already know how to convert between bases.
2) With your method of representing the digits, how will you, for example, store the 30000th digit in a file using only 1 byte? I need every digit to be represented as exactly 1 byte.

Let me demonstrate the problem with base 65535 (or any base greater than 255):

1st digit = chr(1)
2nd digit = chr(2)
3rd digit = chr(3)
.
.
.
255th digit = chr(255)
256th digit = ????? ' no more available ASCII codes.

Any idea?

**HAVocINCARNATE29** · Oct 22nd, 2000, 03:36 PM

I don't think you can. 1 byte is 8 bits. 2^8 = 256. You can't have more than 256 different values in it. You need to do the compression mathimatically. (poorly spelled)

**PaulLewis** · Oct 22nd, 2000, 04:01 PM

since I have just 255 different ASCII codes, how can I
represent different 65535 digits, when each of them may
take only 1 byte

HAVocINCARNATE29 said it. If you can find a way to store
65535 unique ANYTHINGS in 8 bits (one byte), then you will
make millions or billions of dollars

So clearly, you must use 2 bytes to store each
"Character". Interestingly, VB does this already because
it stores each character in a string as 2 bytes (Unicode).

If you try:

Code:

Debug.Print LenB(someString)

You will find that the byte length is of course twice the
number of characters.
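
For anyone following along outside VB, the same two-bytes-per-character effect can be seen in a small Python sketch. The helper name `utf16_byte_length` and the sample string are only illustrations, and the exact doubling assumes ordinary characters (those in the Basic Multilingual Plane):

```python
def utf16_byte_length(text):
    """Number of bytes the text occupies when encoded as UTF-16
    (little-endian, without a byte-order mark)."""
    return len(text.encode("utf-16-le"))


sample = "someString"
print(len(sample), utf16_byte_length(sample))  # 10 20
```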

Of course, if you could make a processor in which each bit
could hold up to 4 states (i.e. on, off, almost on, almost
off) then you could do it in a single byte too

If you tell us why you want to have a way of storing 65535
unique values in a single byte, maybe we can find a way of
achieving your goal that is in the realm of possibility.

Cheers

**kedaman** · Oct 22nd, 2000, 04:10 PM

2) With your method of representing the digits, how will you, for example, store the 30000th digit only in 1 byte into a file? I need any digit to be represented as 1 byte exactly.

let me demonstrate the problem with base 65535, (or any base greater than 255):

1st digit = chr(1)
2nd digit = chr(2)
3rd digit = chr(3)
.
.
.
255th digit = chr(255)
256th digit = ????? ' no more available ASCII codes.

Any idea?

Yep, you're still trying to put it in a string instead of an integer array. The Integer type is just perfect for this issue, since it ranges from -32,768 to 32,767. That means you enter the value into the integer as follows:

Code:

IntArray(decimalplace)=digit-32768

where digit can range from 0 to 65535

To store it in a file, open it binary:

Code:

Open file For Binary As #1   ' file holds the path of the output file
Put #1, , IntArray           ' writes 2 bytes per Integer element
Close #1

So, it's actually very easy to get around this prob.
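
As a rough cross-check of what ends up in the file, here is a Python sketch of the same shift-and-store idea. It assumes 16-bit signed little-endian integers, which is how a VB Integer is laid out on a PC; the names `pack_digits` and `unpack_digits` are hypothetical:

```python
import struct


def pack_digits(digits):
    """Shift each digit (0..65535) into the signed 16-bit range and
    pack the result as little-endian 2-byte integers."""
    shifted = [d - 32768 for d in digits]
    return struct.pack("<%dh" % len(shifted), *shifted)


def unpack_digits(data):
    """Reverse pack_digits: read 2-byte integers and undo the shift."""
    count = len(data) // 2
    return [v + 32768 for v in struct.unpack("<%dh" % count, data)]


data = pack_digits([0, 30000, 65535])
print(len(data))            # 6
print(unpack_digits(data))  # [0, 30000, 65535]
```

Three digits come out as six bytes: the shift makes every digit fit an Integer, but each one still costs two bytes.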

**Lior** · Oct 22nd, 2000, 05:41 PM

Hi kedaman,
When I used your code, with an example of array ranged from 1 to 3, with values like:

Code:

Dim IntArray(1 To 3) As Integer
IntArray(1) = 20000
IntArray(2) = 20001
IntArray(3) = 20002

' Then I saved it with:
Open file For Binary As #1
Put #1, , IntArray
Close #1

It made me a file with a length of 6 bytes.
I want it to be 3 bytes only (because of only 3 numbers).

Thanks in advance.
Lior Asher.

**kedaman** · Oct 22nd, 2000, 06:13 PM

Consider it impossible: each integer takes up 2 bytes. Even if you converted it back to base 10 or even base 256 you won't get down to 3 bytes, because 3 bytes give only 2^24 combinations, while three base 65535 digits need almost 2^48.
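
The counting argument can be checked with a few lines of Python. This is only a sketch, and `bytes_needed` is a made-up helper name: it counts how many different sequences of digits exist and how many whole bytes are needed to give each sequence its own bit pattern.

```python
def bytes_needed(base, digit_count):
    """Smallest number of whole bytes that can distinguish every
    possible sequence of digit_count digits in the given base."""
    combinations = base ** digit_count
    bits = (combinations - 1).bit_length()  # bits to number 0..combinations-1
    return (bits + 7) // 8                  # round up to whole bytes


print(bytes_needed(65535, 1))  # 2
print(bytes_needed(65535, 3))  # 6
print(bytes_needed(256, 3))    # 3
```

Changing the base only regroups the same information; it does not reduce the number of combinations that have to be told apart.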

**PaulLewis** · Oct 22nd, 2000, 06:21 PM

Just making certain I am not invisible...

**Lior** · Oct 22nd, 2000, 06:28 PM

I have noticed you were right, Paul, and so were you, kedaman.
Thanks.
So, can we summarize it as: "No chance at all of representing a digit of a base 65535 system in 1 byte (or in anything less than the 2 bytes that can be used to accomplish that)"?
If so, my compression algorithm is dead.

**PaulLewis** · Oct 22nd, 2000, 06:35 PM

If so, my compression algorithm is dead.

Sorry, and in case in your country this is an insult, it is not meant to be.

R.I.P = Rest In Peace:

Regards

**kedaman** · Oct 23rd, 2000, 03:24 AM

BTW, there are millions of compression algorithms out there; nobody said you could even try to work out your own one

This was just not the right way to do it

**Lior** · Oct 26th, 2000, 02:04 PM

Well keda,
Actually I've been through this issue already.
I mean, I didn't wake up in the morning and say: "What a nice day, let's make my own compression algorithm."
I have read a lot of info on the net about these algorithms.
I already understand Huffman coding, Ziv-Lempel coding (the family used by the popular WinZip) and some simpler ones like the RLE algorithm.
All of these are lossless algorithms of course, ones where the compressed file extracts back to the original file exactly, 100%, with not a single byte changed.
Therefore, I didn't try to understand the MP3, JPEG or MPEG algorithms, because they harm the quality of the file a bit.

So, to the point, what do you think IS the right way to do it ?

Thanks, Lior.

**SimonCook** · Oct 26th, 2000, 03:41 PM

I'm assuming that you have 16 bit bytes minimum. If you only have 8 bit bytes, then each digit must be represented by two bytes.

There is a way, although it may not be very efficient and is probably difficult to code. Each digit in the base is converted to the form (x,y).

x ranges from 0 to 255
y ranges from 1 to 256

The program converts this by the following formula:-

decimal value of digit = x+ (y-1)*256

0,1 is 0
1,1 is 1
2,1 is 2
3,1 is 3
4,1 is 4
5,1 is 5
...
255,1 is 255
0,2 is 256
1,2 is 257
....
255,2 is 511
0,3 is 512
etc etc

The program must first convert the 16 bit number into an (x,y) format. (using the above formula should be easy enough). The program must then interpret each digit to print a character.
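
A minimal Python sketch of that conversion, using the formula above in both directions. The names `to_xy` and `from_xy` are invented for this example:

```python
def to_xy(digit):
    """Split a digit (0..65535) into the (x, y) pair described above:
    x is the position within a block of 256, y is the 1-based block number."""
    if not 0 <= digit <= 65535:
        raise ValueError("digit must be between 0 and 65535")
    block, x = divmod(digit, 256)
    return x, block + 1


def from_xy(x, y):
    """Apply the formula: decimal value of digit = x + (y - 1) * 256."""
    return x + (y - 1) * 256


print(to_xy(255))     # (255, 1)
print(to_xy(512))     # (0, 3)
print(from_xy(1, 2))  # 257
```

In this sketch y reaches 256 for digits of 65280 and above, and x and y together still occupy two 8 bit bytes, so the pair is a different way of writing the 16 bit number rather than a smaller one.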

You need to have 256 different character sets, for example:-

255, 1 would be the 255th character in character set English
255, 2 would be the 255th character in character set German.

etc etc

Example

Stored as bytes:-
255 512, which in decimal is 16,711,937 (255 x 65535 + 512)

converted to
(255,1) (0,3)
The computer would then convert (255,1) and (0,3) into ascii characters from different character sets!!!
where x is the ascii code and y is the character set.

Hope this works!

The system therefore stores the base as numbers but converts to display....

Hope this makes sense!

**kedaman** · Oct 26th, 2000, 06:01 PM

Hmm, actually I know just plain nothing about compression algorithms, at least those that keep quality, like pkzip and Ace...
On the other hand i always have ideas

So maybe you could listen to them but you'd better listen to professionals instead.

Find general patterns: if you have 11111111111111, then you could compress that to /14/1 or something like that, or if you find 12345678 or 13579, you could try having an arithmetical formula between the tags. If something repeats, like 123123123123, then you could have /R123/4.
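
Two of these patterns are easy to sketch without any AI. The Python below is only an illustration: the function names are made up, and the output is a list of pairs rather than the /14/1 tag syntax. The first part is plain run-length encoding (RLE); the second looks for the shortest unit that repeats to form the whole text.

```python
from itertools import groupby


def rle_encode(text):
    """Run-length encode: collapse each run of equal characters
    into a (character, run length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (character, run length) pairs back into the text."""
    return "".join(ch * count for ch, count in pairs)


def repeated_unit(text):
    """Find the shortest prefix that, repeated a whole number of times,
    gives the text. Returns (unit, repeat count)."""
    for size in range(1, len(text) // 2 + 1):
        if len(text) % size == 0:
            times = len(text) // size
            if text[:size] * times == text:
                return text[:size], times
    return text, 1


print(rle_encode("1" * 14))           # [('1', 14)]
print(repeated_unit("123123123123"))  # ('123', 4)
```

Both only help when the data really contains such runs or repeats; on data without them the encoded form can come out larger than the input.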

**Lior** · Oct 27th, 2000, 04:35 AM

Well...

Kedaman, your idea involves AI (Artificial Intelligence). It is hard to find a formula between the chars. The computer would have to look at the file and think like a human being who sees the file. Pretty hard to implement. Maybe you have another idea?

SimonCook, in any case, there are only 255 different ASCII codes in EVERY computer. Your idea would work only if every computer had these 255 different character sets (the different languages), each of which includes 255 different chars.
There are no computers with 255 different character sets.
Now that I'm thinking about it again, it seems just impossible to accomplish, and the reason is very simple:
number 1 = chr(1)
number 2 = chr(2)
number 3 = chr(3)
.
.
.
number 500 = cant be represented.
There are just not enough different ASCII codes, and maybe that's it.
There is no way to create something from nothing. You must have the material first. I really hope I'm wrong.
