Thread: Reading bytes from files and encoding

  1. #1

    Thread Starter
    New Member
    Join Date
    Oct 2011
    Posts
    3

    Reading bytes from files and encoding

    Hello
    I need to do some operations that require converting an array of bytes to a string and back. After struggling with it for a few hours, I found the core of my problem. Look at this piece of code:
    Code:
    Dim encoding As New System.Text.ASCIIEncoding()
    Dim bytes() As Byte = File.ReadAllBytes("C:\WINDOWS\notepad.exe")
    Dim str As String = encoding.GetString(bytes)
    Dim rebytes() As Byte = encoding.GetBytes(str)
    It turns out that rebytes isn't equal to bytes. With some encodings even the lengths differ; with others, some bytes come back changed. I have tried ASCII, UTF8 and Unicode, and none of them gets the job done.
    What is the proper way of doing this?

    PS. I cannot stick to just bytes or just strings; I need to use both.

  2. #2
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    39,044

    Re: Reading bytes from files and encoding

    Normally, one doesn't create an instance of an encoder for this, though it should work fine, so that isn't the issue.

    The code you posted is clearly sample code, but have you actually run that very code? If so, have you stepped through it, checking the values after each step? It has been a while since I have done that, but the steps look right. I would also look at str.Length, since an ASCII string should be the same length as its bytes.

  3. #3

    Thread Starter
    New Member
    Join Date
    Oct 2011
    Posts
    3

    Re: Reading bytes from files and encoding

    Yes, I did run this code, and yes, I did check it in a debugger. I just wrote some code to test what I said. Take a look:
    Code:
    Dim encoding1 As New System.Text.ASCIIEncoding()
    Dim bytes() As Byte = File.ReadAllBytes("C:\WINDOWS\notepad.exe")
    ' Decode the bytes to a string, then encode the string back to bytes.
    Dim str As String = encoding1.GetString(bytes)
    Dim rebytes() As Byte = encoding1.GetBytes(str)
    ' Compare the two arrays through their Base64 text.
    Dim same As String = "no"
    If Convert.ToBase64String(bytes) = Convert.ToBase64String(rebytes) Then
        same = "yes"
    End If
    Console.WriteLine("ASCII - bytes.length == " & bytes.Length & ", rebytes.length == " & rebytes.Length & ", same? - " & same)
    ' etc. for all other available encoding types
    and the result is:
    Code:
    ASCII - bytes.length == 70144, rebytes.length == 70144, same? - no
    Unicode - bytes.length == 70144, rebytes.length == 70144, same? - no
    UTF8 - bytes.length == 70144, rebytes.length == 107789, same? - no
    UTF7 - bytes.length == 70144, rebytes.length == 182818, same? - no
    UTF32 - bytes.length == 70144, rebytes.length == 70144, same? - no
    What should I do?
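For what it's worth, the ASCII line in that output is expected behaviour rather than a bug in the test: ASCII only defines byte values 0 to 127, so any larger byte cannot be represented and the round trip hands back 0x3F ("?") in its place. A minimal sketch of that, using a made-up four-byte array instead of notepad.exe (the module name and sample bytes are purely illustrative):

```vbnet
Imports System
Imports System.Text

Module AsciiRoundTrip
    Sub Main()
        ' Hypothetical sample data: two ASCII bytes followed by two bytes above 127.
        Dim bytes() As Byte = {&H4D, &H5A, &H90, &HFF}

        ' Same decode/encode round trip as in the post above.
        Dim str As String = Encoding.ASCII.GetString(bytes)
        Dim rebytes() As Byte = Encoding.ASCII.GetBytes(str)

        ' The lengths match, but the two high bytes do not survive.
        Console.WriteLine(BitConverter.ToString(bytes))   ' 4D-5A-90-FF
        Console.WriteLine(BitConverter.ToString(rebytes)) ' 4D-5A-3F-3F
    End Sub
End Module
```

That would explain "same length, not the same" for ASCII. An executable is full of bytes above 127, so this hits almost immediately.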

  4. #4
    Frenzied Member
    Join Date
    Jul 2011
    Location
    UK
    Posts
    1,335

    Re: Reading bytes from files and encoding

    Try
    Code:
    Dim encoding1 As System.Text.Encoding = System.Text.Encoding.Default
    But MSDN gives a warning about using Encoding.Default.

    I don't think it's going to solve your underlying problem, though. It's going to depend on the source of your bytes and what their original encoding was (assuming they represent text to begin with).

  5. #5

    Thread Starter
    New Member
    Join Date
    Oct 2011
    Posts
    3

    Re: Reading bytes from files and encoding

    Wow, it actually solved my problem. Thank you very much for that.

    I'm unsure whether the way I'm doing things is correct, though. What I'm doing is getting the bytes of an assembly file, converting them to a string and then encrypting it with AES. The program then decrypts the string, converts it back to bytes and loads the assembly. Considering that the default encoding differs between locales, will the assembly load on every machine? I mean, will Assembly.Load work with this specific assembly taken from my system?

    EDIT:
    I forgot to add that the decrypting code (which uses the default encoding) will also be run on different machines. Because of that, I think I need to find the specific encoding used on my system and force the application to use it no matter what the default is. How would I go about checking that?
    Last edited by danw8; Oct 4th, 2011 at 06:27 PM.
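One way to sidestep the code page question altogether, sketched here on the assumption that the string only has to carry the bytes and does not need to be readable: Convert.ToBase64String (already used for the comparison earlier in the thread) and its counterpart Convert.FromBase64String round-trip any byte array, independent of the machine's default encoding. The module name and sample bytes below are made up for illustration.

```vbnet
Imports System
Imports System.Linq

Module Base64RoundTrip
    Sub Main()
        ' Hypothetical sample data standing in for the assembly's bytes.
        Dim bytes() As Byte = {&H4D, &H5A, &H90, &H0, &HFF}

        ' Base64 text consists only of ASCII characters, so no code page is involved.
        Dim str As String = Convert.ToBase64String(bytes)
        Dim rebytes() As Byte = Convert.FromBase64String(str)

        Console.WriteLine(str)
        Console.WriteLine(bytes.SequenceEqual(rebytes)) ' True
    End Sub
End Module
```

The trade-off is size: the Base64 string is roughly a third longer than the byte array it represents.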

  6. #6
    Frenzied Member
    Join Date
    Jul 2011
    Location
    UK
    Posts
    1,335

    Re: Reading bytes from files and encoding

    I was just thinking the same. Code Page 1252 on mine.

    Code:
    Dim encoding As System.Text.Encoding = System.Text.Encoding.Default
    Dim page As Integer = encoding.CodePage
    ' gives page = 1252
    Code:
    Dim encoding1 As System.Text.Encoding = System.Text.Encoding.GetEncoding(1252)
    Docs here.

    Don't ask me if it will work on all PCs. What you are doing is beyond me.
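Whether a given code page round-trips every byte value is easy to check rather than guess. The sketch below pushes all 256 byte values through an explicitly named encoding. It uses code page 28591 (ISO-8859-1) as an assumption of mine, not something from the posts above, because that code page maps each byte 0 to 255 to the character with the same code point. Swapping in 1252 would run the same check for that code page on a particular machine.

```vbnet
Imports System
Imports System.Linq
Imports System.Text

Module FixedCodePageRoundTrip
    Sub Main()
        ' Every possible byte value, 0 through 255.
        Dim bytes() As Byte = Enumerable.Range(0, 256).Select(Function(i) CByte(i)).ToArray()

        ' 28591 = ISO-8859-1, requested explicitly instead of relying on Encoding.Default.
        Dim latin1 As Encoding = Encoding.GetEncoding(28591)
        Dim str As String = latin1.GetString(bytes)
        Dim rebytes() As Byte = latin1.GetBytes(str)

        Console.WriteLine(bytes.SequenceEqual(rebytes)) ' True
    End Sub
End Module
```

Naming the code page in both the encrypting and the decrypting program means the result no longer depends on whatever Encoding.Default happens to be on the machine running it.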
