
Thread: [RESOLVED] Best way to extract string from byte array or memorystream

  1. #1

    Thread Starter
    Frenzied Member
    Join Date
    Nov 2005
    Posts
    1,834

    [RESOLVED] Best way to extract string from byte array or memorystream

    Hi, I'm using WinPcap to capture incoming data. Basically I'm dealing with a continuous stream of incoming data, either as a byte array or as a MemoryStream (pieces of up to 64 KB).

    I need to check this data to see if there's a certain url and then extract it.

    The url I need to find looks like this:

    Code:
    http://201.122.38.5/data/today/798572987-589571?139805890582
    The length of the 'code' after "/today/" is always the same. The IP address changes.

    What's the best/fastest/most efficient way to continuously check these byte arrays or memorystreams and extract the urls without hogging the CPU too much?

    I'm using Framework 4.0.


    vb.net Code:
    Private Sub PacketHandler(ByVal packet As Packet)

        ' The TCP payload as a MemoryStream...
        Dim stream As System.IO.MemoryStream = packet.Ethernet.IpV4.Tcp.Payload.ToMemoryStream()

        ' ...or as a Byte array
        Dim bytes As Byte() = packet.Ethernet.IpV4.Tcp.Payload.ToArray()
    End Sub

  2. #2
    Super Moderator jmcilhinney
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    111,221

    Re: Best way to extract string from byte array or memorystream

    Encoding.GetString will convert a Byte array to a String. You just have to pick the appropriate Encoding object, e.g. Encoding.ASCII.
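
    A minimal sketch of that call, assuming the payload really is ASCII text. The module name and the sample bytes are made up for illustration; in the handler the array would come from Payload.ToArray().

```vbnet
Imports System
Imports System.Text

Module DecodeDemo
    Sub Main()
        ' Hypothetical sample payload standing in for Payload.ToArray().
        Dim payload As Byte() = Encoding.ASCII.GetBytes("GET /data/today/798572987-589571?139805890582 HTTP/1.1")

        ' Decode the whole buffer as ASCII text.
        Dim decoded As String = Encoding.ASCII.GetString(payload)

        ' Decode only a slice of the buffer: GetString(bytes, index, count).
        Dim firstWord As String = Encoding.ASCII.GetString(payload, 0, 3)

        Console.WriteLine(decoded)
        Console.WriteLine(firstWord)
    End Sub
End Module
```

    The three-argument overload is the useful one here, since it decodes only a chosen range of the array rather than the whole buffer.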

  3. #3

    Thread Starter
    Frenzied Member
    Join Date
    Nov 2005
    Posts
    1,834

    Re: Best way to extract string from byte array or memorystream

    Thanks. I'm wondering if it wouldn't be better to loop through the byte array and search for a pattern of bytes and only convert the required bytes to a String?

    The data might be coming in at a few megabytes per second, and converting all of that data to a String will probably hog the CPU quite a lot.
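
    The idea can be sketched roughly as follows: a plain byte-by-byte search for the fixed part of the url, followed by decoding only the matching slice. The helper name IndexOfBytes, the sample buffer and the CodeLength constant (taken from the length of the sample url in post #1) are assumptions for illustration.

```vbnet
Imports System
Imports System.Text

Module ByteSearchDemo
    ' Returns the index of the first occurrence of pattern in data, or -1 if absent.
    Function IndexOfBytes(data As Byte(), pattern As Byte()) As Integer
        If pattern.Length = 0 Then Return 0
        For i As Integer = 0 To data.Length - pattern.Length
            Dim j As Integer = 0
            While j < pattern.Length AndAlso data(i + j) = pattern(j)
                j += 1
            End While
            If j = pattern.Length Then Return i
        Next
        Return -1
    End Function

    Sub Main()
        ' Hypothetical sample buffer; real data would come from Payload.ToArray().
        Dim data As Byte() = Encoding.ASCII.GetBytes("xx http://201.122.38.5/data/today/798572987-589571?139805890582 yy")
        Dim marker As Byte() = Encoding.ASCII.GetBytes("/data/today/")
        Const CodeLength As Integer = 29 ' assumption: length of the sample code in post #1

        Dim pos As Integer = IndexOfBytes(data, marker)
        If pos >= 0 Then
            ' Decode only the marker plus the fixed-length code, not the whole buffer.
            Dim count As Integer = Math.Min(marker.Length + CodeLength, data.Length - pos)
            Console.WriteLine(Encoding.ASCII.GetString(data, pos, count))
        End If
    End Sub
End Module
```

    Recovering the IP address as well would mean scanning backwards from the match to the preceding "http://", which is left out of this sketch.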

  4. #4
    Angel of Code Niya
    Join Date
    Nov 2011
    Posts
    9,017

    Re: Best way to extract string from byte array or memorystream

    If it's an ANSI string (basically 1 byte per character) then that would be more efficient. However, if it's a Unicode string then it may not be so simple, since Unicode has some nuances that may get in the way (e.g. in UTF-8 a character can take anywhere from 1 to 4 bytes, and in UTF-16 either 2 or 4, all in the same string!). Evil_Giraffe seems to know a lot about Unicode strings and strings in general in the .NET Framework. He may be able to provide a more solid answer.
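
    To make the variable-width point concrete, here is a small sketch that prints how many bytes a few characters occupy in UTF-8 and UTF-16. The module name and the three sample characters are picked purely for illustration.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module EncodingWidthDemo
    Sub Main()
        ' "A", e-acute (U+00E9) and the euro sign (U+20AC).
        Dim samples As String() = {"A", ChrW(&HE9), ChrW(&H20AC)}

        For Each s As String In samples
            ' GetByteCount reports the encoded size without allocating the byte array.
            Console.WriteLine("U+{0:X4}: UTF-8 = {1} byte(s), UTF-16 = {2} byte(s)",
                              AscW(s), Encoding.UTF8.GetByteCount(s), Encoding.Unicode.GetByteCount(s))
        Next
    End Sub
End Module
```

    In UTF-8 the three samples take 1, 2 and 3 bytes respectively, while in UTF-16 each takes 2. A fixed "bytes per character" assumption therefore only holds for single-byte encodings such as ASCII.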

  5. #5

    Thread Starter
    Frenzied Member
    Join Date
    Nov 2005
    Posts
    1,834

    Re: Best way to extract string from byte array or memorystream

    The best solution seems to be the Boyer–Moore string search algorithm. It's also used in the ngrep command-line tool.

    http://en.wikipedia.org/wiki/Boyer%E...arch_algorithm
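
    For reference, a minimal sketch of the Boyer-Moore-Horspool variant over byte arrays. This is a simplified form of Boyer-Moore that keeps only the bad-character shift table, not the full algorithm and not necessarily what ngrep does. The function names and the sample buffer are assumptions for illustration.

```vbnet
Imports System
Imports System.Text

Module HorspoolDemo
    ' Builds the bad-character table: how far the window may shift
    ' when a given byte is aligned with the last pattern position.
    Function BuildShiftTable(pattern As Byte()) As Integer()
        Dim shift(255) As Integer
        For i As Integer = 0 To 255
            shift(i) = pattern.Length
        Next
        For i As Integer = 0 To pattern.Length - 2
            shift(pattern(i)) = pattern.Length - 1 - i
        Next
        Return shift
    End Function

    ' Returns the index of the first occurrence of pattern in data, or -1 if absent.
    Function HorspoolIndexOf(data As Byte(), pattern As Byte(), shift As Integer()) As Integer
        Dim m As Integer = pattern.Length
        If m = 0 Then Return 0
        Dim pos As Integer = 0
        While pos <= data.Length - m
            ' Compare the window against the pattern from right to left.
            Dim j As Integer = m - 1
            While j >= 0 AndAlso data(pos + j) = pattern(j)
                j -= 1
            End While
            If j < 0 Then Return pos
            ' Skip ahead based on the byte under the window's last position.
            pos += shift(data(pos + m - 1))
        End While
        Return -1
    End Function

    Sub Main()
        Dim marker As Byte() = Encoding.ASCII.GetBytes("/data/today/")
        Dim shift As Integer() = BuildShiftTable(marker) ' build once, reuse for every packet

        ' Hypothetical sample buffer; real data would come from Payload.ToArray().
        Dim data As Byte() = Encoding.ASCII.GetBytes("xx http://201.122.38.5/data/today/798572987-589571?139805890582 yy")
        Console.WriteLine(HorspoolIndexOf(data, marker, shift))
    End Sub
End Module
```

    Because the search pattern is fixed, the shift table only needs to be built once and can be shared by every call to the packet handler.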

  6. #6
    PowerPoster Evil_Giraffe
    Join Date
    Aug 2002
    Location
    Suffolk, UK
    Posts
    2,555

    Re: Best way to extract string from byte array or memorystream

    Quote Originally Posted by Niya
    Evil_Giraffe seems to know a lot about Unicode strings and strings in general in the .NET Framework. He may be able to provide a more solid answer.
    The only thing I would have said is that you must know the encoding used at the source. But jmc already said that:

    Quote Originally Posted by jmcilhinney
    You just have to pick the appropriate Encoding object, e.g. Encoding.ASCII.
    If you don't know the encoding used, then you don't have an encoded string, you have a bunch of bytes.

    The other issue to deal with is that the stream is not going to be one long string; you will have 'messages' that you need to extract and then find the string in. It may be tempting to do a byte search down at the low-level stream, but I think you'll stay saner if you use the message protocol to decompose the stream into messages, grab the bytes that relate to the address directly, and then decode those into a string.
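
    As a rough illustration of the protocol-based approach, the sketch below assumes the messages are plain HTTP GET requests and that a whole request head arrives in one payload. Neither assumption is confirmed anywhere in this thread, and a real capture may split a request across several TCP segments. The function name TryGetUrl and the sample request are made up for illustration.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module HttpRequestDemo
    ' Rebuilds the requested url from the request line and the Host header.
    ' Returns Nothing if the payload does not look like an HTTP GET request.
    Function TryGetUrl(payload As Byte()) As String
        ' Cheap byte-level check for "GET " before decoding anything.
        If payload.Length < 4 OrElse payload(0) <> &H47 OrElse payload(1) <> &H45 OrElse
           payload(2) <> &H54 OrElse payload(3) <> &H20 Then Return Nothing

        Dim head As String = Encoding.ASCII.GetString(payload)
        Dim lines As String() = head.Split(New String() {vbCrLf}, StringSplitOptions.None)

        ' Request line: METHOD SP request-target SP HTTP-version
        Dim parts As String() = lines(0).Split(" "c)
        If parts.Length <> 3 OrElse Not parts(2).StartsWith("HTTP/", StringComparison.Ordinal) Then Return Nothing

        For i As Integer = 1 To lines.Length - 1
            If lines(i).Length = 0 Then Exit For ' a blank line ends the headers
            If lines(i).StartsWith("Host:", StringComparison.OrdinalIgnoreCase) Then
                Return "http://" & lines(i).Substring(5).Trim() & parts(1)
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        ' Hypothetical request matching the url format from post #1.
        Dim request As String = "GET /data/today/798572987-589571?139805890582 HTTP/1.1" & vbCrLf &
                                "Host: 201.122.38.5" & vbCrLf & vbCrLf
        Console.WriteLine(TryGetUrl(Encoding.ASCII.GetBytes(request)))
    End Sub
End Module
```

    For the sample request this rebuilds the url shown in post #1. Payloads that do not start with "GET " are rejected after comparing four bytes, so most of the traffic is never decoded at all.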
