Thread: Parsing concatenated string

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Apr 2007
    Location
    The Netherlands
    Posts
    5,070

    Parsing concatenated string

    Hi,

    I am writing a tool that logs stats for a first person shooter game. The game writes its information (including who killed who with which weapon) to a log file, which I read every so many seconds and parse so that I can write that information to a database.

    I am now having trouble parsing the weapon out of that string of information.


    Some additional information is required: a weapon in this game can contain attachments (such as a scope, a grenade launcher, etc). Weapons can have either 1 or 2 attachments. Some weapons however cannot have any attachments at all.

    Each log file entry that describes a kill contains a code that describes the weapon that was used. This code is a concatenation of either 3 or 4 parts:
    Code:
    <weapon>_<attachment>_mp
    <weapon>_<attachment1>_<attachment2>_mp
    where <weapon> is a code that describes a weapon and <attachment> is another code describing the attachment.


    This should be easy to parse by just splitting along the underscore characters, but there are a few catches:

    1. Some attachments are able to kill players as well. Specifically: grenade launchers, flame throwers and underbarrel shotguns. In this case, the attachment is listed before the weapon:
    Code:
    <attachment>_<weapon>_mp
    Note also that in this case there is always only 1 attachment.

    2. The biggest catch: some weapon names have an underscore in them (some even have 3 underscores)! So simply splitting along the underscore won't work in all cases; if the weapon name contains an underscore I'm splitting the name of the weapon...
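
    To make the second catch concrete, here is a minimal sketch that splits two of the codes on the underscore. The module name SplitDemo is made up for the illustration; the two codes come from the weapon and attachment lists in this thread.

    ```vbnet
    Imports System

    Module SplitDemo
        Sub Main()
            'A weapon with one attachment splits cleanly into weapon, attachment and "mp"
            Dim simple() As String = "ak47_silencer_mp".Split("_"c)
            Console.WriteLine(simple.Length)              'prints 3

            'A weapon whose name contains underscores is cut into pieces
            Dim broken() As String = "hind_minigun_pilot_firstperson_mp".Split("_"c)
            Console.WriteLine(broken.Length)              'prints 5
            Console.WriteLine(String.Join(" | ", broken)) 'prints hind | minigun | pilot | firstperson | mp
        End Sub
    End Module
    ```

    The array length alone therefore cannot say whether a part is a weapon, an attachment or a fragment of a weapon name.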


    This makes the list of possible combinations a lot longer. The ones I can think of (I think these are all):
    Code:
    Weapon names without underscores:
    
    1. <weapon>_mp                              - No attachments
    2. <weapon>_<attachment>_mp                 - One attachment
    3. <attachment>_<weapon>_mp                 - Attachment kill
    4. <weapon>_<attachment>_<attachment>_mp    - Two attachments
    
    Weapon names with underscores:
    
    5. <weaponpart1>_<weaponpart2>_mp                                - No attachments
    6. <weaponpart1>_<weaponpart2>_<weaponpart3>_mp                  - No attachments
    7. <weaponpart1>_<weaponpart2>_<weaponpart3>_<weaponpart4>_mp    - No attachments
    Luckily, all weapons with an underscore in their name cannot have any attachments. Probably a coincidence, but that makes the list a lot shorter (otherwise the first 4 options would be repeated for option 5, 6 and 7, making a total of 19 options if I counted correctly).


    My question now, I guess: can anyone see an easy way to parse this weapon code so that I can extract the weapon name and the attachments used separately? I started out by splitting along the underscore and checking the length of the resulting array. If the length is 2 then it's always option 1. If the length is 3 then there are already 3 options (2, 3 and 5). This adds up really fast; my code was already way too long at this point and impossible to understand. This approach requires a lot of "trial and error", where I parse out the first part and basically do:
    1. Check if it represents a weapon
    2. If not, check if it's an attachment
    3. If not, check if it's part of a weapon

    As you can imagine this gets really ugly. There must be a better way to parse this stuff?

    If anyone can see an easy way please let me know!


    Finally if you need it here's the list of weapons and attachments:

    Weapons (code, full name, bitmask of possible attachments):
    Code:
    ak47   AK47   37627
    ak74u   AK74u   49595
    asp   ASP   262144
    aug   AUG   37627
    knife_ballistic   Ballistic Knife   0
    china_lake   China Lake   0
    m1911   M1911   295968
    commando   Commando   37627
    crossbow_explosive   Crossbow   0
    cz75   CZ75   820256
    dragunov   Dragunov   164385
    enfield   Enfield   37627
    famas   Famas   37627
    fnfal   FN FAL   37627
    g11   G11   133120
    galil   Galil   37627
    hk21   HK21   563
    hs10   HS10   262144
    ithaca   Stakeout   256
    kiparis   Kiparis   311603
    knife   Knife   0
    l96a1   L96A1   164385
    m14   M14   37875
    m16   M16   37627
    m60   M60   819
    m72_law   M72 LAW   0
    mac11   MAC11   311602
    makarov   Makarov   295968
    mp5k   MP5K   49203
    mpl   MPL   49435
    pm63   PM63   278816
    psg1   PSG1   164385
    python   Python   335873
    rottweil72   Olympia   0
    rpg   RPG   0
    rpk   RPK   571
    skorpion   Skorpion   311584
    spas   SPAS-12   32768
    spectre   Spectre   49459
    stoner63   Stoner63   563
    strela   Strela-3   0
    uzi   Uzi   49459
    wa2000   WA2000   164385
    concussion_grenade   Concussion Grenade   0
    flash_grenade   Flash Grenade   0
    frag_grenade   Frag Grenade   0
    sticky_grenade   Semtex   0
    tabun_gas   Nova Gas   0
    willy_pete   Willy Pete   0
    ft   Flamethrower   0
    gl   Grenade Launcher   0
    mk   Masterkey Shotgun   0
    airstrike   Rolling Thunder   0
    auto_gun_turret   Sentry Gun   0
    cobra_20mm_comlink   Attack Helicopter   0
    dog_bite   Attack Dogs   0
    hind_minigun_pilot_firstperson   Gunship [Minigun]   0
    huey_minigun_gunner   Chopper Gunner   0
    m220_tow   Valkyrie Rockets   0
    mortar   Mortar Team   0
    napalm   Napalm Strike   0
    rcbomb   RC-XD   0
    claymore   Claymore   0
    satchel_charge   C4   0
    hatchet   Tomahawk   0
    explosive_bolt   Explosive Bolt   0
    explodable_barrel   Explodable Barrel   0
    hind_rockets_firstperson   Gunship [Rockets]   0
    minigun_mp   Death Machine   0
    m202_flash   Grim Reaper   0
    Attachments (code, full name, bitmask):
    Code:
    acog   ACOG Sight   1
    reflex   Reflex Sight   2
    drum   Drum Mag   4
    dualclip   Dual Mag   8
    elbit   Red Dot Sight   16
    extclip   Extended Mag   32
    ft   Flamethrower   64
    gl   Grenade Launcher   128
    grip   Grip   256
    ir   Infrared Scope   512
    upgradesight   Upgraded Iron Sights   1024
    lps   Low Power Scope   2048
    mk   Masterkey   4096
    speed   Speed Reloader   8192
    rf   Rapid Fire   16384
    silencer   Suppressor   32768
    snub   Snub Nose   65536
    vzoom   Variable Zoom   131072
    dw   Dual Wield   262144
    auto   Full Auto Upgrade   524288
    The code is the name of the weapon as it gets written to the log file (so this is the name I am parsing out, the full name is irrelevant). As you can also see, there's some weapons with underscores in the name, but all of them have an attachments bitmask of 0 meaning they cannot have any attachments.
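
    For reference, the bitmask column can be tested with a bitwise And. The sketch below is only an illustration: AllowsAttachment is a hypothetical helper name, and the three numbers are copied from the ak47, silencer and grip rows above.

    ```vbnet
    Imports System

    Module BitmaskDemo
        'Hypothetical helper: True when the weapon's bitmask has the attachment's bit set
        Function AllowsAttachment(ByVal weaponMask As Integer, ByVal attachmentBit As Integer) As Boolean
            Return (weaponMask And attachmentBit) <> 0
        End Function

        Sub Main()
            Const Ak47Mask As Integer = 37627   'ak47 row of the weapons list
            Const Silencer As Integer = 32768   'silencer row of the attachments list
            Const Grip As Integer = 256         'grip row of the attachments list

            Console.WriteLine(AllowsAttachment(Ak47Mask, Silencer)) 'prints True
            Console.WriteLine(AllowsAttachment(Ak47Mask, Grip))     'prints False
        End Sub
    End Module
    ```

    A weapon with a bitmask of 0 fails this test for every attachment, which is why the weapons with underscores in their names never carry one.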

  2. #2
    Hyperactive Member
    Join Date
    Apr 2011
    Location
    England
    Posts
    421

    Re: Parsing concatenated string

    Hi Nick,

    I don't think there is any way you can write a function that will always return the weapon name from what you have described.

    Since the weapons and attachments are all wrapped using <>'s the only way to tell one apart from the other would be using some form of lookup.

    I was thinking a regex pattern would be the most practical way to get the information you need. You could also write an IndexOf function to achieve the same results, but as far as I can see the results are always going to contain an array of strings. And that array is always going to be either:

    WeaponName, Attachment etc.
    Or
    Attachment, WeaponName

    With those results you are never going to be able to tell the Weapon apart from the attachment, if the resulting array contains 2 strings, without a precompiled list to compare each result to.

    Can you create a list of strings containing all of the weapon values, and the same for the attachments? Then, when you have narrowed down the input from your log file, you can loop through your list(s) to see which weapon/attachment(s) were used.
    Last edited by JayJayson; Jun 25th, 2011 at 09:16 AM.

  3. #3

    Thread Starter
    PowerPoster
    Join Date
    Apr 2007
    Location
    The Netherlands
    Posts
    5,070

    Re: Parsing concatenated string

    Sorry, the names are not wrapped in braces <>, that was just to indicate a 'place holder'.

    A few data examples:
    Code:
    ak47_mp       (normal ak47)
    ak47_silencer_mp     (ak47 with silencer)
    ft_ak47_mp     (flamethrower under ak47, killed by fire)
    ak47_ft_mp     (ak47 with flamethrower, killed by bullets)
    hind_minigun_pilot_firstperson_mp    (helicopter bullets)
    And I do have a list of all possible weapon/attachment names, that's the two lists at the end of my post.

    I can of course just try if there's a match in those lists, but that would get quite ugly, so I was hoping someone could come up with a better way.
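
    For what it's worth, the lookup route does not have to get ugly if the whole code is tried as a weapon name first, since weapons with underscores never have attachments. Below is a rough sketch under that assumption. The names WeaponParser, ParsedWeapon and Parse are invented for the example, and the two sets are cut down to a handful of entries from the lists in the first post.

    ```vbnet
    Imports System
    Imports System.Collections.Generic

    Module WeaponParser
        'Abbreviated lists for the demo; a real tool would load every code from the two lists
        Private ReadOnly KnownWeapons As New HashSet(Of String)(
            New String() {"ak47", "knife", "knife_ballistic", "hind_minigun_pilot_firstperson"})
        Private ReadOnly KnownAttachments As New HashSet(Of String)(
            New String() {"silencer", "acog", "ft", "gl", "mk"})

        'Hypothetical result type
        Public Class ParsedWeapon
            Public Weapon As String = ""
            Public Attachments As New List(Of String)
            Public AttachmentKill As Boolean = False
        End Class

        'Returns Nothing when the code does not match any known combination
        Public Function Parse(ByVal code As String) As ParsedWeapon
            Dim result As New ParsedWeapon()

            'Strip the trailing "_mp"
            Dim body As String = code
            If body.EndsWith("_mp") Then body = body.Substring(0, body.Length - 3)

            'Whole-string match first: covers every weapon with underscores in its name
            If KnownWeapons.Contains(body) Then
                result.Weapon = body
                Return result
            End If

            'What is left is a weapon plus 1 or 2 attachments, or an attachment kill
            Dim parts() As String = body.Split("_"c)
            If parts.Length < 2 OrElse parts.Length > 3 Then Return Nothing

            If KnownWeapons.Contains(parts(0)) Then
                result.Weapon = parts(0)
                For i As Integer = 1 To parts.Length - 1
                    If Not KnownAttachments.Contains(parts(i)) Then Return Nothing
                    result.Attachments.Add(parts(i))
                Next
                Return result
            End If

            'Attachment kill: <attachment>_<weapon>
            If parts.Length = 2 AndAlso KnownAttachments.Contains(parts(0)) AndAlso KnownWeapons.Contains(parts(1)) Then
                result.Weapon = parts(1)
                result.Attachments.Add(parts(0))
                result.AttachmentKill = True
                Return result
            End If

            Return Nothing
        End Function

        Sub Main()
            Dim p As ParsedWeapon = Parse("ft_ak47_mp")
            Console.WriteLine(p.Weapon & " / " & String.Join(",", p.Attachments.ToArray()) & " / " & p.AttachmentKill.ToString())
        End Sub
    End Module
    ```

    With these sample sets, "ft_ak47_mp" comes back as weapon ak47 with the attachment ft and the attachment-kill flag set, while "knife_ballistic_mp" is matched as a whole before any splitting happens.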

  4. #4
    New Member
    Join Date
    Jun 2011
    Posts
    4

    Re: Parsing concatenated string

    You can try working this from a different direction. I am assuming that _mp is the bitmap value.

    Start by first checking the last two characters in your string. If the value is _0, trim the last two characters and you are left with the weapon name. (Conditions 5, 6 or 7)

    If not you can split the string by the _.
    Result 2 = Condition 1
    Result 4 = condition 4

    Result 3
    The final 2 conditions can be parsed out by grabbing the mp value and converting it to binary.
    If the result has only one "1" in it, it is an attachment: Condition 3.
    If not, you are left with Condition 2.


    If I am wrong about the MP value then of course this post will be no help at all.

  5. #5

  6. #6
    Hyperactive Member
    Join Date
    Apr 2011
    Location
    England
    Posts
    421

    Re: Parsing concatenated string

    Hi Nick,

    Have you made any progress? If not, I have managed to write a function that can retrieve the weapons/attachments from the log string. I used a couple of HashSets to accomplish it, with the weapon and attachment codes themselves as the entries; that avoids relying on GetHashCode values, which are not guaranteed to be unique or to stay the same across .NET versions.

    I don't know if it's by any means the best approach or how quickly it will perform within a loop, but maybe it will help you find a better method if it doesn't perform all that well:

    VB.NET Code:
    Imports System.IO
    Imports System.Collections.Generic

    Public Class Form1
        'Known weapon and attachment codes, one per line in the text files below
        Private WeaponCodes As HashSet(Of String)
        Private AttachmentCodes As HashSet(Of String)

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load

            'Load both lookup sets
            WeaponCodes = New HashSet(Of String)(File.ReadAllLines("Path\Weapons.txt"))
            AttachmentCodes = New HashSet(Of String)(File.ReadAllLines("Path\Attachments.txt"))

            'Get the part of the log file to check (assumes the _mp was stripped from the end)
            Dim LogString As String = "ak47_ft"

            'Get a list of the weapons/attachments in order of appearance within the log string
            Dim Results As List(Of String) = SearchLog(LogString, "_"c)

            For Each s As String In Results
                MsgBox(s)
                's will either be wrapped in <Weapon></Weapon> or <Attachment></Attachment> tags now so you can easily figure out the order etc.
            Next
        End Sub

        Private Function SearchLog(ByVal LogInput As String, ByVal SplitChar As Char) As List(Of String)
            Dim Results As New List(Of String)

            'Try the whole string first. Weapons with underscores never have attachments,
            'and this stops "knife_ballistic" or "auto_gun_turret" from matching "knife" or "auto" too early
            If WeaponCodes.Contains(LogInput) Then
                Results.Add("<Weapon>" & LogInput & "</Weapon>")
                Return Results
            End If

            Dim StartIndex As Integer = 0
            Dim NextSplitChar As Integer = LogInput.IndexOf(SplitChar)
            'No underscores and not a known weapon: nothing to report
            If NextSplitChar = -1 Then Return Results

            Do While True
                'As the maximum number of underscores in any weapon/attachment is 3
                'control up to how many underscores to look past when creating a substring
                For i As Integer = 0 To 3
                    Dim Part As String = LogInput.Substring(StartIndex, NextSplitChar - StartIndex)
                    If WeaponCodes.Contains(Part) Then
                        'Weapon found in log
                        Results.Add("<Weapon>" & Part & "</Weapon>")
                        Exit For
                    ElseIf AttachmentCodes.Contains(Part) Then
                        'Attachment found in log
                        Results.Add("<Attachment>" & Part & "</Attachment>")
                        Exit For
                    Else
                        'Nothing found, extend substring to include chars up to the next splitchar occurrence
                        If NextSplitChar >= LogInput.Length Then Exit Do
                        NextSplitChar = LogInput.IndexOf(SplitChar, NextSplitChar + 1)
                        If NextSplitChar = -1 Then NextSplitChar = LogInput.Length
                    End If
                Next i
                'When a result is found, reset the start index so we can look for the next weapon/attachment in the string
                If NextSplitChar >= LogInput.Length Then Exit Do
                StartIndex = NextSplitChar + 1
                NextSplitChar = LogInput.IndexOf(SplitChar, StartIndex)
                If NextSplitChar = -1 Then NextSplitChar = LogInput.Length
            Loop

            Return Results

        End Function

    End Class
    And here's the contents of the files

    Attachments.txt:
    VB Code:
    acog
    reflex
    drum
    dualclip
    elbit
    extclip
    ft
    gl
    grip
    ir
    upgradesight
    lps
    mk
    speed
    rf
    silencer
    snub
    vzoom
    dw
    auto
    Weapons.txt (There were 3 attachments in the original list that I have removed):
    VB Code:
    ak47
    ak74u
    asp
    aug
    knife_ballistic
    china_lake
    m1911
    commando
    crossbow_explosive
    cz75
    dragunov
    enfield
    famas
    fnfal
    g11
    galil
    hk21
    hs10
    ithaca
    kiparis
    knife
    l96a1
    m14
    m16
    m60
    m72_law
    mac11
    makarov
    mp5k
    mpl
    pm63
    psg1
    python
    rottweil72
    rpg
    rpk
    skorpion
    spas
    spectre
    stoner63
    strela
    uzi
    wa2000
    concussion_grenade
    flash_grenade
    frag_grenade
    sticky_grenade
    tabun_gas
    willy_pete
    airstrike
    auto_gun_turret
    cobra_20mm_comlink
    dog_bite
    hind_minigun_pilot_firstperson
    huey_minigun_gunner
    m220_tow
    mortar
    napalm
    rcbomb
    claymore
    satchel_charge
    hatchet
    explosive_bolt
    explodable_barrel
    hind_rockets_firstperson
    minigun_mp
    m202_flash
    Hope this helps
    Last edited by JayJayson; Jun 25th, 2011 at 03:05 PM.

  7. #7
    New Member
    Join Date
    Jun 2011
    Posts
    4

    Re: Parsing concatenated string

    You can possibly clean up your code if you change the order you check things.

    Split the string and check whether the first element is a weapon part (about 20 possible strings: hind, m202, m72, etc.).
    If it is a weapon part, the whole string is a weapon with no attachments.
    Condition 5, 6 or 7.

    Then it becomes easy to deal with the rest.

    Check the Array length.
    Result 2 = Condition 1
    Result 4 = condition 4

    Result 3
    Check first string in the array for attachment. (2 possible Strings(ft, gl))
    If attachment then condition 3

    else condition 2.

    You could speed this up a little by checking Condition 1 first, then Condition 3 (the length of array(0) is 2), then Conditions 5, 6 and 7, and finally Conditions 4 and 2.
    Last edited by Qwert00; Jun 25th, 2011 at 03:17 PM.

  8. #8
    New Member
    Join Date
    Jun 2011
    Posts
    4

    Re: Parsing concatenated string

    I wrote a quick sub that I think will meet your needs. If you call Get_Weapon with your log file weapon string, you can use WeaponStr, Attach1Str and Attach2Str as the values for your database. I added AttachKill as an attachment kill flag if you want to use it.

    It appears that FT and GL are your only attachment kills, so this should work. If this is not the case we can modify this slightly to deal with other attachment kills.

    I wrote the WeapPart check as a string for simplification. You can modify this to read the weap part strings from your database or a text file if you like.

    Code:
        Dim WeaponStr As String
        Dim Attach1Str As String
        Dim Attach2Str As String
        Dim AttachKill As Boolean
    
        Public Sub Get_Weapon(ByVal ReadStr As String)
    
            Dim ReadArr() As String
            Dim WeapPart As String
            Dim WeapPartArr() As String
    
            'Set Variables
            WeaponStr = ""
            Attach1Str = ""
            Attach2Str = ""
            AttachKill = False
            'First part of every weapon code that contains an underscore
            WeapPart = "auto,china,cobra,concussion,crossbow,dog,explodable,explosive,flash,frag,huey,hind,knife,m202,m220,m72,minigun,satchel,sticky,tabun,willy"
    
            ' Read Log String
    
            ReadArr = Split(ReadStr, "_")
    
            ' Condition 1: <weapon>_mp
    
            If ReadArr.Length = 2 Then
                WeaponStr = ReadArr(0)
                Exit Sub
            End If
    
            ' Condition 3: attachment kill, the only case where the first part is 2 characters long
    
            If ReadArr(0).Length = 2 Then
                WeaponStr = ReadArr(1)
                Attach1Str = ReadArr(0)
                AttachKill = True
                Exit Sub
            End If
    
            ' Condition 5, 6, 7: weapon name with underscores, no attachments
            WeapPartArr = Split(WeapPart, ",")
            For Each Part As String In WeapPartArr
                If ReadArr(0) = Part Then
                    'Drop the trailing "_mp" and keep the underscores so the result still matches the weapon code
                    WeaponStr = ReadStr.Substring(0, ReadStr.Length - 3)
                    Exit Sub
                End If
            Next
    
            'Condition 2: <weapon>_<attachment>_mp
    
            If ReadArr.Length = 3 Then
                WeaponStr = ReadArr(0)
                Attach1Str = ReadArr(1)
                Exit Sub
            End If
    
            'Condition 4: <weapon>_<attachment1>_<attachment2>_mp
    
            If ReadArr.Length = 4 Then
                WeaponStr = ReadArr(0)
                Attach1Str = ReadArr(1)
                Attach2Str = ReadArr(2)
                Exit Sub
            End If
    
        End Sub
    Last edited by Qwert00; Jun 26th, 2011 at 08:42 AM.

  9. #9

  10. #10
    New Member
    Join Date
    Jun 2011
    Posts
    4

    Re: Parsing concatenated string

    That code should be able to handle MK with no changes.

    Fortunately, all three of your attachment kills start with two-character strings.

  11. #11
