
Thread: Extract text from a binary file

  1. #1

    Thread Starter
    PowerPoster MidgetsBro's Avatar
    Join Date
    Oct 2000
    Location
    Apparently, Internet.com
    Posts
    3,125
    I have a file with a .bsp extension (a Quake 2 map). I know that there is some text I would like to extract from the file, because I looked at it with a hex editor. Before the text in every one of these files is the word "message", then a space, then "<the text I want to extract>". The maximum length of text that can fit in there is 50 characters, including the quotes. How can I search the binary file, then extract the text and put it in a text box? The text that would be extracted is the title of the map, so I want it to be pretty quick. Thanks.

  2. #2
    Frenzied Member HarryW's Avatar
    Join Date
    Jan 2000
    Location
    Heiho no michi
    Posts
    1,827
    How big is the file? Is the whole string on one line? I'm wondering because it's easiest to read the whole file into a string, but that may not be practical if it's a huge file. If the message is always on one line you can read one line at a time. Otherwise I guess you could take samples of a certain number of characters, overlapping by the size of the string "message" (7 characters obviously). Or just read it character by character, and keep track of the last 7 characters. When they make up "message" then you know you've hit the spot.

    Okay, using the last method, try this:

    Code:
    Dim Ch As String * 1        'One character read from the file
    Dim LastChars As String     'The last 7 characters read so far
    Dim Found As Boolean
    Dim Buffer As String
    
    Found = False
    Buffer = ""
    LastChars = ""
    
    Open YourFileName For Binary Access Read As #1
        'Read one character at a time until the last 7 spell "message"
        Do Until Found Or Loc(1) >= LOF(1)
            Get #1, , Ch
            LastChars = Right$(LastChars & Ch, 7)
            If LastChars = "message" Then Found = True
        Loop
        If Found Then    'Found the string "message"
            Get #1, , Ch    'Let the space go
            Get #1, , Ch    'And the first speech mark
            Do Until Loc(1) >= LOF(1)
                Get #1, , Ch
                If Ch = Chr$(34) Then Exit Do    'Chr$(34) is a speech mark
                Buffer = Buffer & Ch
            Loop
            MsgBox Chr$(34) & "Message" & Chr$(34) & " = " & Buffer
        Else
            MsgBox "Message not found"
        End If
    Close #1
    I haven't tested it (as usual) cos it's 5 am and I ought to bloody well go to bed but I think that should work with a little tweaking.



    [Edited by HarryW on 11-27-2000 at 12:02 AM]
    Harry.

    "From one thing, know ten thousand things."

  3. #3

    Thread Starter
    PowerPoster MidgetsBro's Avatar
    The files that I am opening are usually about a meg. They can be smaller or larger, but that's the average. Will this code find the message in a reasonable amount of time? I just want the user to be able to click on the file name in the file box, and then the title of the map will load into another textbox. The filename isn't the same as the Title, so I can't use that. I will test this code to see if it works in a reasonable time. I think this code would be similar to reading the ID3 tag of an MP3 file. I tried getting code from that, but I couldn't figure out how to tweak it the way I wanted. I will try your code.

    Thanks

  4. #4
    transcendental analytic kedaman's Avatar
    Join Date
    Mar 2000
    Location
    0x002F2EA8
    Posts
    7,221
    Does the text exist at a fixed location in the file? Or do you know roughly where the text is located? Can you find any information in the file that points to where the text is? If so, you could speed it up to no time at all; otherwise you may experience a quarter of a second per click. Harry's method is good but a bit slow if you want speed. You could read the whole file and then go through it with InStr...
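    The "read the whole file and search it with InStr" idea could look roughly like the sketch below. It is untested and only an illustration: the function name TextAfterMarker and its parameters are made up for this example, not taken from the thread. It returns the (up to) 50 characters that follow the marker and leaves any trimming to the caller.

    ```vb
    ' Hypothetical helper (name and parameters are assumptions, not from the thread).
    ' Returns up to MaxLen characters following the first occurrence of Marker
    ' in the file, or an empty string if Marker is not found.
    Function TextAfterMarker(ByVal FilePath As String, ByVal Marker As String, _
                             Optional ByVal MaxLen As Long = 50) As String
        Dim f As Integer
        Dim contents As String
        Dim pos As Long

        f = FreeFile
        Open FilePath For Binary Access Read As #f
        contents = Space$(LOF(f))   ' size the string to hold the whole file
        Get #f, , contents          ' one read for the entire file
        Close #f

        ' Binary comparison so the search is case-sensitive
        pos = InStr(1, contents, Marker, vbBinaryCompare)
        If pos > 0 Then
            TextAfterMarker = Mid$(contents, pos + Len(Marker), MaxLen)
        End If
    End Function
    ```

    It would be called with something like Text1.Text = TextAfterMarker(SomeFullPath, "message"), where Text1 and SomeFullPath are placeholders. Holding a file of about a meg in one string is usually fine; for much larger files a chunked read may be the better choice.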
    Use
    writing software in C++ is like driving rivets into steel beam with a toothpick.
    writing haskell makes your life easier:
    reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
    To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.

  5. #5

    Thread Starter
    PowerPoster MidgetsBro's Avatar
    There is no fixed position for the text. I know that the word "message" always comes before it. That's why I thought that if I found the word "message", then got the next 50 characters, it would work. I know that the max characters is 50, but I don't think it works the same as an ID3 tag, because it doesn't take up all the 50 bytes of space. If the text is only 10 characters long, then the 40 free bytes are filled in by the rest of the file. It doesn't leave blank space. How could I use InStr to get the info, if I can use InStr, that is?

  6. #6
    transcendental analytic kedaman's Avatar
    I haven't done any tests with this code, but it should work generally. There might be some offset problems, but you could easily fix them. It reads your file in 65535-byte chunks and searches through them for "message". After each chunk is checked, it seeks 7 steps backwards in case the word "message" is split between two chunks. If found, it either takes the next 50 characters directly from the chunk or reads them from the file if the match is at the end of the chunk.
    Code:
    Dim buffer As String * 65535, pos, message As String
    Open file For Binary As 1
        Do While Loc(1) < LOF(1)
            Get #1, , buffer
            Seek #1, Seek(1) - 7
            pos = InStr(buffer, "message")
            If pos Then
                If pos > 65478 Then
                    message = Space(50)
                    Get #1, Seek(1) - 65535 + pos, message
                Else
                    message = Mid(buffer, pos, 50)
                End If
                'and here you strip of the characters you don't need
                Exit Do
            End If
        Loop
    Close 1

  7. #7

    Thread Starter
    PowerPoster MidgetsBro's Avatar
    I get an error from this line:
    Code:
    Dim buffer As String * 65535, pos, message As String
    The error says this:
    Invalid length for fixed-length string

    Does this mean that the file that it is trying to load needs to be under 65535 bytes? The code doesn't really make that much sense to me. Maybe you can comment it more and I can get the gist of it.

    Thanks

  8. #8
    transcendental analytic kedaman's Avatar
    Nope, that was something stupid by me. Fixed-length strings can't be longer than 65535 bytes, but I forgot that the string descriptor itself takes up 20 bytes. What confuses me more is that fixed-length strings don't need this descriptor, and furthermore you actually need 131052 bytes, since strings in VB are Unicode. That is about 2^17, and not 2^16, which would make more sense.

    OK, if that confused you even more, just forget it; strings in VB are horrible. Just decrease all string lengths by 10 and it should work.

    I have a file tutorial on my homepage that explains almost everything about accessing files in VB, including how to open files in binary mode. For fast reference:
    Seek statement: moves the "cursor" in the file to a position.
    Seek function: returns the "cursor" position in the file.
    Get statement: reads data into a variable. You can also specify the position (the middle argument); otherwise it reads at the current position, which moves forward as you read.
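    As a small, untested illustration of those three, here is a sketch; the file path is a made-up placeholder and the 16-byte chunk size is arbitrary.

    ```vb
    Dim f As Integer
    Dim chunk As String * 16    ' fixed-length, so each Get asks for 16 bytes

    f = FreeFile
    ' Hypothetical path, substitute a real file
    Open "C:\maps\example.bsp" For Binary Access Read As #f
        Debug.Print Seek(f)     ' Seek function: prints 1, positions are 1-based
        Get #f, , chunk         ' no position given: read at the cursor
        Debug.Print Seek(f)     ' prints 17 if the file holds at least 16 bytes
        Seek #f, 5              ' Seek statement: move the cursor to byte 5
        Get #f, , chunk         ' reads bytes 5 to 20
        Get #f, 101, chunk      ' position given: reads bytes 101 to 116
    Close #f
    ```

    Debug.Print writes to the Immediate window of the IDE, not to the form.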


  9. #9

    Thread Starter
    PowerPoster MidgetsBro's Avatar
    OK. The code doesn't throw errors, and I temporarily inserted code that popped up a message box containing the message, but it only showed a blank message box. How do I get the text that I am extracting?

    Thanks

    [Edited by MidgetsBro on 11-27-2000 at 10:53 PM]

  10. #10
    transcendental analytic kedaman's Avatar
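    Well, this is guaranteed to work. I just tested it on a 4 MB .bsp file and it took less than a quarter of a second to search (the text was at the end of the file).
    I added the 4 extra characters around the word, so the code now searches for "message" " (both quotes, the space and the opening quote of the text), which seems to be the common layout.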
    Code:
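        Dim buffer As String * 65525, pos As Long, message As String
        Open file For Binary As 1
            Do While Loc(1) < LOF(1)
                Get #1, , buffer
                'step back 11 characters so a marker split across two chunks is still found
                Seek #1, Seek(1) - 11
                pos = InStr(buffer, """message"" """)
                If pos Then
                    If pos > 65464 Then
                        'the 50 characters run past the end of the chunk, so read them from the file
                        message = Space(50)
                        Get #1, Seek(1) - 65504 + pos, message
                    Else
                        message = Mid(buffer, pos + 11, 50)
                    End If
                    'keep only the text before the closing quote, if there is one
                    If InStr(message, """") Then message = Left(message, InStr(message, """") - 1)
                    Exit Do
                End If
            Loop
        Close 1
        
        Debug.Print message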

  11. #11

    Thread Starter
    PowerPoster MidgetsBro's Avatar
    I tried the code, but the Debug.Print thing didn't print anything to the form. I tried using a message box instead of Debug.Print, and it just opened a blank message box. How am I supposed to get the text out of the binary file? Thank you kedaman for your patience, I know this is probably getting annoying by now. I just want to find this out so that I can release my program that I made. Thanks for all the help you have given me so far.

  12. #12
    Frenzied Member HarryW's Avatar
    MidgetsBro, why don't you try to debug it yourself? I mean, I'm not getting at ya or anything, but it would be much easier and quicker if you were to put in a few breakpoints and test it yourself to see why it isn't working for you. Kedaman seems to have got it to work on his computer, so the problem should logically be in your implementation of the code.

    Make sure you are opening the right file for one thing. I know it might sound patronising, but it's probably just a silly little mistake like that that's causing you all the grief. Make sure you have specified the full path for the file; if the file isn't found, Open ... For Binary will quietly create a new, empty file with that name in the current directory, and the search will come back blank. You can use App.Path for the directory that contains the executable if you need to.
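    One way to rule out a wrong path is to build the full path and check that the file exists before opening it. The sketch below is untested and assumes a FileListBox named File1 on the form (the thread only mentions file1_click, so the control type is a guess).

    ```vb
    Private Sub File1_Click()
        Dim FullPath As String

        If Len(File1.FileName) = 0 Then Exit Sub    ' nothing selected

        ' File1.Path has no trailing backslash except for a root folder like C:\
        FullPath = File1.Path
        If Right$(FullPath, 1) <> "\" Then FullPath = FullPath & "\"
        FullPath = FullPath & File1.FileName

        ' Dir$ returns an empty string when no file matches
        If Len(Dir$(FullPath)) = 0 Then
            MsgBox "File not found: " & FullPath
            Exit Sub
        End If

        ' FullPath exists, so it can be opened For Binary and searched from here
    End Sub
    ```

    Showing FullPath in the message box also makes it obvious when only the bare file name, without its folder, was being passed to Open.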

  13. #13

    Thread Starter
    PowerPoster MidgetsBro's Avatar
    I have tried implementing this code in lots of different ways. I've tried using a command button, and I've tried using file1_click and file1_dblClick. None of these worked, and all I did was copy and paste the code directly from his post. I just can't figure it out. I don't have to set any breakpoints, because I knew where the code had an error. The error is fixed, but the variable that is supposed to contain the text is blank. It's supposed to have the text in it, but when I do a MsgBox message, nothing happens. Well, not nothing. The message box is blank. I will probably find another way to do this, or not do it at all, because this is proving to be too difficult for me. I'll go back to making simple (useless) projects.

  14. #14

    Thread Starter
    PowerPoster MidgetsBro's Avatar
    Thank you guys soooo much. I finally got it to work. I wasn't getting the path right. I tried it with a fixed file, instead of a file that the user would choose, and it worked perfectly. You are now my favorite two people on this message board. Now all I have to do is fix it so that the user can choose what map to see what the title of the map is. Once again, thanks for all your efforts to help me.

    MidgetsBro
