-
Oct 4th, 2011, 04:46 PM
#1
Thread Starter
New Member
Reading bytes from files and encoding
Hello
I need to do some operations that require converting an array of bytes to a string and back. After struggling with it for a few hours I found the core of my problem. Look at this piece of code:
Code:
Dim encoding As New System.Text.ASCIIEncoding()
Dim bytes() As Byte = File.ReadAllBytes("C:\WINDOWS\notepad.exe")
Dim str As String = encoding.GetString(bytes)
Dim rebytes() As Byte = encoding.GetBytes(str)
It turns out that rebytes isn't equal to bytes. With some encodings even the lengths differ; with others, some characters are messed up. I have tried ASCII, UTF8 and Unicode and none of them gets the job done right.
What is the proper way of doing this?
PS. I cannot stick to just bytes or just strings; I need to use both.
-
Oct 4th, 2011, 05:20 PM
#2
Re: Reading bytes from files and encoding
Normally, one doesn't create an instance of an encoder for this, though it should work fine, so that isn't the issue.
The code you posted is clearly sample code, but have you actually run that very code? If so, have you stepped through it, checking the values after each step? It has been a while since I have done that, but the steps look right. I would also look at str.Length, since an ASCII string should be the same length as its bytes.
-
Oct 4th, 2011, 05:46 PM
#3
Thread Starter
New Member
Re: Reading bytes from files and encoding
Yes, I did run this code and yes I did check it in a debugger. I just wrote some code to simply test what I said, take a look:
Code:
Dim encoding1 As New System.Text.ASCIIEncoding()
Dim bytes() As Byte = File.ReadAllBytes("C:\WINDOWS\notepad.exe")
Dim str As String = encoding1.GetString(bytes)
Dim rebytes() As Byte = encoding1.GetBytes(str)
Dim same As String = "no"
If Convert.ToBase64String(bytes) = Convert.ToBase64String(rebytes) Then
same = "yes"
End If
Console.WriteLine("ASCII - bytes.length == " & bytes.Length & ", rebytes.length == " & rebytes.Length & ", same? - " & same.ToString())
' etc. for all other available encoding types
and the result is:
Code:
ASCII - bytes.length == 70144, rebytes.length == 70144, same? - no
Unicode - bytes.length == 70144, rebytes.length == 70144, same? - no
UTF8 - bytes.length == 70144, rebytes.length == 107789, same? - no
UTF7 - bytes.length == 70144, rebytes.length == 182818, same? - no
UTF32 - bytes.length == 70144, rebytes.length == 70144, same? - no
What should I do?
-
Oct 4th, 2011, 06:06 PM
#4
Re: Reading bytes from files and encoding
Try
Code:
Dim encoding1 As System.Text.Encoding = System.Text.Encoding.Default
But MSDN warns against using the Default encoding.
I don't think it's going to solve your underlying problem, though. It's going to depend on the source of your bytes and what their original encoding was (assuming they represent text to begin with).
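To illustrate the point about bytes that aren't text, here is a minimal sketch (the four sample bytes are made up, not taken from notepad.exe). As far as I know, ASCII only defines values 0 to 127, so the decoder substitutes "?" for anything above that and the original value can't be recovered:

```vbnet
Imports System
Imports System.Text

Module AsciiLossDemo
    Sub Main()
        ' Hypothetical sample data: "MZ" followed by one byte above 127 and a zero byte.
        Dim bytes() As Byte = {&H4D, &H5A, &H90, &H0}

        ' Decode to a string and encode straight back, as in the first post.
        Dim str As String = Encoding.ASCII.GetString(bytes)
        Dim rebytes() As Byte = Encoding.ASCII.GetBytes(str)

        Console.WriteLine(BitConverter.ToString(bytes))   ' 4D-5A-90-00
        Console.WriteLine(BitConverter.ToString(rebytes)) ' 4D-5A-3F-00 (&H90 became "?")
    End Sub
End Module
```

The lengths match but the contents don't, which is the same pattern as the ASCII line in the results above.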
-
Oct 4th, 2011, 06:21 PM
#5
Thread Starter
New Member
Re: Reading bytes from files and encoding
Wow, it actually solved my problem. Thank you very much for that.
I'm unsure whether the way I'm doing things is correct, though. What I'm doing is getting the bytes of an assembly file, converting them to a string and then encrypting it with AES. The program then decrypts the string, converts it back to bytes and loads the assembly. Considering that the default encoding differs between locales, will the assembly load on every machine? I mean, will Assembly.Load work with this specific assembly taken from my system?
EDIT:
I forgot to add that the decrypting code (which uses the default encoding) will also be run on different machines. Because of that, I think I need to find the specific encoding used on my system and force the application to use it no matter what the default is. How would I go about checking that?
Last edited by danw8; Oct 4th, 2011 at 06:27 PM.
-
Oct 4th, 2011, 06:52 PM
#6
Re: Reading bytes from files and encoding
I was just thinking the same. Code Page 1252 on mine.
Code:
Dim encoding As System.Text.Encoding = System.Text.Encoding.Default
Dim page As Integer = encoding.CodePage
' gives page = 1252
Code:
Dim encoding1 As System.Text.Encoding = System.Text.Encoding.GetEncoding(1252)
Docs here.
Don't ask me if it will work on all PCs. What you are doing is beyond me.
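For what it's worth, here is a rough sketch of two conversions that shouldn't depend on any machine's default code page. It assumes the string only has to carry the bytes around and doesn't need to be readable. The 256-byte test array is a stand-in for real file contents, and I haven't tried either option with the AES step:

```vbnet
Imports System
Imports System.Linq
Imports System.Text

Module RoundTripDemo
    Sub Main()
        ' Stand-in for file contents: every possible byte value, 0 through 255.
        Dim bytes(255) As Byte
        For i As Integer = 0 To 255
            bytes(i) = CByte(i)
        Next

        ' Option 1: Base64. Meant for carrying arbitrary binary data in a string.
        Dim b64 As String = Convert.ToBase64String(bytes)
        Dim fromB64() As Byte = Convert.FromBase64String(b64)
        Console.WriteLine("Base64 round trip equal: " & bytes.SequenceEqual(fromB64))

        ' Option 2: ISO-8859-1 (code page 28591), requested by number so that
        ' the machine's default code page plays no part in the conversion.
        Dim latin1 As Encoding = Encoding.GetEncoding(28591)
        Dim str As String = latin1.GetString(bytes)
        Dim fromLatin1() As Byte = latin1.GetBytes(str)
        Console.WriteLine("Latin-1 round trip equal: " & bytes.SequenceEqual(fromLatin1))
    End Sub
End Module
```

Base64 makes the string about a third longer than the data, but it only uses plain characters, so it is probably the safer of the two if the string gets stored or sent anywhere.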