-
Jun 25th, 2011, 06:06 AM
#1
Parsing concatenated string
Hi,
I am writing a tool that logs stats for a first person shooter game. The game writes its information (including who killed who with which weapon) to a log file, which I read every so many seconds and parse so that I can write that information to a database.
I am now having trouble parsing the weapon out of that string of information.
Some additional information is required: a weapon in this game can contain attachments (such as a scope, a grenade launcher, etc). Weapons can have either 1 or 2 attachments. Some weapons however cannot have any attachments at all.
Each log file entry that describes a kill contains a code that describes the weapon that was used. For a weapon with attachments, this code is a concatenation of either 3 or 4 parts:
Code:
<weapon>_<attachment>_mp
<weapon>_<attachment1>_<attachment2>_mp
where <weapon> is a code that describes a weapon and <attachment> is another code describing the attachment.
This should be easy to parse by just splitting along the underscore characters, but there are a few catches:
1. Some attachments are able to kill players as well. Specifically: grenade launchers, flame throwers and underbarrel shotguns. In this case, the attachment is listed before the weapon:
Code:
<attachment>_<weapon>_mp
Note also that in this case there is always only 1 attachment.
2. The biggest catch: some weapon names have an underscore in them (some even have 3 underscores)! So simply splitting along the underscore won't work in all cases; if the weapon name contains an underscore I'm splitting the name of the weapon...
This makes the list of possible combinations a lot longer. The ones I can think of (I think these are all):
Code:
Weapon names without underscores:
1. <weapon>_mp - No attachments
2. <weapon>_<attachment>_mp - One attachment
3. <attachment>_<weapon>_mp - Attachment kill
4. <weapon>_<attachment>_<attachment>_mp - Two attachments
Weapon names with underscores:
5. <weaponpart1>_<weaponpart2>_mp - No attachments
6. <weaponpart1>_<weaponpart2>_<weaponpart3>_mp - No attachments
7. <weaponpart1>_<weaponpart2>_<weaponpart3>_<weaponpart4>_mp - No attachments
Luckily, all weapons with an underscore in their name cannot have any attachments. Probably a coincidence, but that makes the list a lot shorter (otherwise the first 4 options would be repeated for options 5, 6 and 7, making a total of 16 options if I counted correctly).
My question now, I guess: can anyone see an easy way to parse this weapon code so that I can extract the weapon name and the attachments used separately? I started out by splitting along the underscore and checking the length of the resulting array. If the length is 2 then it's always option 1. If the length is 3 then there are already 3 options. This adds up really fast; my code was already way too long at this point and impossible to understand. This approach requires a lot of "trial and error", where I parse out the first part and basically do:
1. Check if it represents a weapon
2. If not, check if it's an attachment
3. If not, check if it's part of a weapon
As you can imagine this gets really ugly. There must be a better way to parse this stuff?
If anyone can see an easy way please let me know!
Finally if you need it here's the list of weapons and attachments:
Weapons (code, full name, bitmask of possible attachments):
Code:
ak47                            AK47                37627
ak74u                           AK74u               49595
asp                             ASP                 262144
aug                             AUG                 37627
knife_ballistic                 Ballistic Knife     0
china_lake                      China Lake          0
m1911                           M1911               295968
commando                        Commando            37627
crossbow_explosive              Crossbow            0
cz75                            CZ75                820256
dragunov                        Dragunov            164385
enfield                         Enfield             37627
famas                           Famas               37627
fnfal                           FN FAL              37627
g11                             G11                 133120
galil                           Galil               37627
hk21                            HK21                563
hs10                            HS10                262144
ithaca                          Stakeout            256
kiparis                         Kiparis             311603
knife                           Knife               0
l96a1                           L96A1               164385
m14                             M14                 37875
m16                             M16                 37627
m60                             M60                 819
m72_law                         M72 LAW             0
mac11                           MAC11               311602
makarov                         Makarov             295968
mp5k                            MP5K                49203
mpl                             MPL                 49435
pm63                            PM63                278816
psg1                            PSG1                164385
python                          Python              335873
rottweil72                      Olympia             0
rpg                             RPG                 0
rpk                             RPK                 571
skorpion                        Skorpion            311584
spas                            SPAS-12             32768
spectre                         Spectre             49459
stoner63                        Stoner63            563
strela                          Strela-3            0
uzi                             Uzi                 49459
wa2000                          WA2000              164385
concussion_grenade              Concussion Grenade  0
flash_grenade                   Flash Grenade       0
frag_grenade                    Frag Grenade        0
sticky_grenade                  Semtex              0
tabun_gas                       Nova Gas            0
willy_pete                      Willy Pete          0
ft                              Flamethrower        0
gl                              Grenade Launcher    0
mk                              Masterkey Shotgun   0
airstrike                       Rolling Thunder     0
auto_gun_turret                 Sentry Gun          0
cobra_20mm_comlink              Attack Helicopter   0
dog_bite                        Attack Dogs         0
hind_minigun_pilot_firstperson  Gunship [Minigun]   0
huey_minigun_gunner             Chopper Gunner      0
m220_tow                        Valkyrie Rockets    0
mortar                          Mortar Team         0
napalm                          Napalm Strike       0
rcbomb                          RC-XD               0
claymore                        Claymore            0
satchel_charge                  C4                  0
hatchet                         Tomahawk            0
explosive_bolt                  Explosive Bolt      0
explodable_barrel               Explodable Barrel   0
hind_rockets_firstperson        Gunship [Rockets]   0
minigun_mp                      Death Machine       0
m202_flash                      Grim Reaper         0
Attachments (code, full name, bitmask):
Code:
acog          ACOG Sight            1
reflex        Reflex Sight          2
drum          Drum Mag              4
dualclip      Dual Mag              8
elbit         Red Dot Sight         16
extclip       Extended Mag          32
ft            Flamethrower          64
gl            Grenade Launcher      128
grip          Grip                  256
ir            Infrared Scope        512
upgradesight  Upgraded Iron Sights  1024
lps           Low Power Scope       2048
mk            Masterkey             4096
speed         Speed Reloader        8192
rf            Rapid Fire            16384
silencer      Suppressor            32768
snub          Snub Nose             65536
vzoom         Variable Zoom         131072
dw            Dual Wield            262144
auto          Full Auto Upgrade     524288
The code is the name of the weapon as it gets written to the log file (so this is the name I am parsing out; the full name is irrelevant). As you can also see, there are some weapons with underscores in the name, but all of them have an attachments bitmask of 0, meaning they cannot have any attachments.
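To illustrate how the bitmask column can be read: assuming each attachment value is a single bit and a weapon's bitmask is the bitwise OR of the attachments it accepts (which is what the two lists suggest), a check could look like the sketch below. The module and function names are made up for this example.

```vbnet
Imports System

Module BitmaskSketch
    ' Hypothetical helper: True when the weapon's bitmask contains the attachment's bit.
    Function SupportsAttachment(ByVal weaponMask As Integer, ByVal attachmentBit As Integer) As Boolean
        Return (weaponMask And attachmentBit) <> 0
    End Function

    Sub Main()
        ' Values taken from the lists above: ak47 = 37627, silencer = 32768, grip = 256
        Console.WriteLine(SupportsAttachment(37627, 32768)) ' ak47 + silencer
        Console.WriteLine(SupportsAttachment(37627, 256))   ' ak47 + grip
        ' Every weapon with an underscore in its code has mask 0, so no bit can match
        Console.WriteLine(SupportsAttachment(0, 32768))
    End Sub
End Module
```

With the values from the lists, the ak47 accepts the silencer but not the grip, and a mask of 0 never accepts anything.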
-
Jun 25th, 2011, 09:06 AM
#2
Hyperactive Member
Re: Parsing concatenated string
Hi Nick,
I don't think there is any way you can write a function that will always return the weapon name from what you have described.
Since the weapons and attachments are all wrapped using <>'s the only way to tell one apart from the other would be using some form of lookup.
I was thinking a regex pattern would be the most practical way to get the information you need. You could also write an IndexOf function to achieve the same results, but as far as I can see the results are always going to be an array of strings. And that array is always going to be either:
WeaponName, Attachment etc.
Or
Attachment, WeaponName
With those results you are never going to be able to tell the Weapon apart from the attachment, if the resulting array contains 2 strings, without a precompiled list to compare each result to.
Can you create a list of strings containing all of the weapon values, and the same for the attachments? Then, when you have narrowed down the input from your log file, you can loop through your list(s) to see which weapon/attachment(s) were used.
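As a rough illustration of combining the regex idea with such lists, the pattern can be generated from the known codes instead of being written by hand. This is only a sketch under assumptions: the lists are abbreviated, the names RegexSketch, Alternation and BuildPattern are invented here, and the pattern is slightly looser than the real format (it would also accept an attachment kill followed by further attachments).

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module RegexSketch
    ' Abbreviated, illustrative lists; the real ones would come from the full tables
    Private ReadOnly WeaponCodes() As String = {"ak47", "knife", "knife_ballistic", "hind_minigun_pilot_firstperson"}
    Private ReadOnly AttachmentCodes() As String = {"acog", "silencer", "ft", "gl", "mk"}

    ' Joins the codes into a regex alternation, longest first,
    ' so that "knife_ballistic" is tried before "knife"
    Private Function Alternation(ByVal codes() As String) As String
        Dim ordered() As String = codes.OrderByDescending(Function(c) c.Length).Select(Function(c) Regex.Escape(c)).ToArray()
        Return String.Join("|", ordered)
    End Function

    ' Optional attachment-kill prefix, then the weapon, then up to two attachments, then "_mp"
    Function BuildPattern() As String
        Return "^(?:(?<kill>ft|gl|mk)_)?" & _
               "(?<weapon>" & Alternation(WeaponCodes) & ")" & _
               "(?:_(?<att>" & Alternation(AttachmentCodes) & ")){0,2}_mp$"
    End Function

    Sub Main()
        Dim pattern As New Regex(BuildPattern())
        For Each code As String In New String() {"ak47_silencer_mp", "ft_ak47_mp", "knife_ballistic_mp"}
            Dim m As Match = pattern.Match(code)
            Console.WriteLine(code & " -> weapon=" & m.Groups("weapon").Value & _
                              ", kill=" & m.Groups("kill").Value & _
                              ", attachments=" & m.Groups("att").Captures.Count.ToString())
        Next
    End Sub
End Module
```

The named groups keep the weapon, the killing attachment and the regular attachments apart, so the caller does not have to guess from the position in an array.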
Last edited by JayJayson; Jun 25th, 2011 at 09:16 AM.
-
Jun 25th, 2011, 10:18 AM
#3
Re: Parsing concatenated string
Sorry, the names are not wrapped in angle brackets <>, that was just to indicate a 'placeholder'.
A few data examples:
Code:
ak47_mp (normal ak47)
ak47_silencer_mp (ak47 with silencer)
ft_ak47_mp (flamethrower under ak47, killed by fire)
ak47_ft_mp (ak47 with flamethrower, killed by bullets)
hind_minigun_pilot_firstperson_mp (helicopter bullets)
And I do have a list of all possible weapon/attachment names, that's the two lists at the end of my post.
I can of course just try if there's a match in those lists, but that would get quite ugly, so I was hoping someone could come up with a better way.
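For reference, matching against the lists does not have to get very ugly if the whole string (minus _mp) is looked up first, since every weapon with underscores in its name has no attachments. A minimal sketch under assumptions: the lists are abbreviated, and LookupSketch, ParsedWeapon and ParseWeaponCode are names invented for this example.

```vbnet
Imports System
Imports System.Collections.Generic

Module LookupSketch
    ' Abbreviated, illustrative lists; load the full tables in practice
    Private ReadOnly Weapons As New HashSet(Of String)(New String() {"ak47", "knife", "knife_ballistic", "hind_minigun_pilot_firstperson"})
    Private ReadOnly Attachments As New HashSet(Of String)(New String() {"acog", "silencer", "ft", "gl", "mk"})
    Private ReadOnly KillAttachments As New HashSet(Of String)(New String() {"ft", "gl", "mk"})

    ' Hypothetical result type
    Public Class ParsedWeapon
        Public Weapon As String = ""
        Public Attachments As New List(Of String)
        Public AttachmentKill As Boolean = False
    End Class

    ' Returns Nothing when the code does not fit any known shape
    Function ParseWeaponCode(ByVal code As String) As ParsedWeapon
        If Not code.EndsWith("_mp", StringComparison.Ordinal) Then Return Nothing
        Dim body As String = code.Substring(0, code.Length - 3)
        Dim result As New ParsedWeapon

        ' 1. The whole body is a weapon: covers all names with underscores (options 1, 5, 6, 7)
        If Weapons.Contains(body) Then
            result.Weapon = body
            Return result
        End If

        Dim parts() As String = body.Split("_"c)

        ' 2. Attachment kill: <attachment>_<weapon> (option 3)
        If parts.Length = 2 AndAlso KillAttachments.Contains(parts(0)) AndAlso Weapons.Contains(parts(1)) Then
            result.Weapon = parts(1)
            result.Attachments.Add(parts(0))
            result.AttachmentKill = True
            Return result
        End If

        ' 3. <weapon>_<attachment>[_<attachment>] (options 2 and 4)
        If (parts.Length = 2 OrElse parts.Length = 3) AndAlso Weapons.Contains(parts(0)) Then
            result.Weapon = parts(0)
            For i As Integer = 1 To parts.Length - 1
                If Not Attachments.Contains(parts(i)) Then Return Nothing
                result.Attachments.Add(parts(i))
            Next
            Return result
        End If

        Return Nothing
    End Function

    Sub Main()
        For Each code As String In New String() {"ak47_mp", "ak47_silencer_mp", "ft_ak47_mp", "ak47_ft_mp", "hind_minigun_pilot_firstperson_mp"}
            Dim p As ParsedWeapon = ParseWeaponCode(code)
            Console.WriteLine(code & " -> " & p.Weapon & " [" & String.Join(",", p.Attachments.ToArray()) & "] kill=" & p.AttachmentKill.ToString())
        Next
    End Sub
End Module
```

The order of the three checks does the work: once the full-string lookup has handled the underscore names, the remaining cases can be told apart by the first part alone.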
-
Jun 25th, 2011, 10:20 AM
#4
New Member
Re: Parsing concatenated string
You can try working this from a different direction. I am assuming that _mp is the bitmask value.
Start by checking the last two characters of your string. If the value is _0, trim the last two characters and you are left with the weapon name (conditions 5, 6 or 7).
If not, you can split the string by the _.
Result 2 = Condition 1
Result 4 = Condition 4
Result 3 = Condition 2 or 3
The final 2 conditions can be told apart by grabbing the mp value and converting it to binary.
If the result has only one "1" in it, it is an attachment (condition 3).
If not, you are left with condition 2.
If I am wrong about the MP value then of course this post will be no help at all.
-
Jun 25th, 2011, 10:28 AM
#5
Re: Parsing concatenated string
No, mp is just mp, always. It just means 'multiplayer'. It's kind of redundant, I know, but hey, I did not decide how the log files are written.
-
Jun 25th, 2011, 02:54 PM
#6
Hyperactive Member
Re: Parsing concatenated string
Hi Nick,
Have you made any progress? If not, I have managed to make a function that can retrieve the weapons/attachments from the log string. I used a couple of HashTables to accomplish it, with the weapon and attachment codes themselves as the Keys. The search tries the longest possible name first, so that a code such as knife_ballistic is not cut short at knife.
I don't know if it's by any means the best approach or how quickly it will perform within a loop, but maybe it will help you find a better method if it doesn't perform all that well:
VB.NET Code:
Imports System
Imports System.Collections
Imports System.Collections.Generic
Imports System.IO

Public Class Form1
    Dim HTWeapons As New Hashtable
    Dim HTAttachments As New Hashtable

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        'Weapons HashTable, keyed by the weapon code itself
        Dim Weapons() As String = File.ReadAllLines("Path\Weapons.txt")
        For Each w As String In Weapons
            HTWeapons.Add(w, w)
        Next w
        'Attachments HashTable, keyed by the attachment code itself
        Dim Attachments() As String = File.ReadAllLines("Path\Attachments.txt")
        For Each a As String In Attachments
            HTAttachments.Add(a, a)
        Next a
        'Get the part of the Logfile to check (assumes the _mp was stripped from the end)
        Dim LogString As String = "ak47_ft"
        'Get a list of the weapons/attachments in order of appearance within the log string
        Dim Results As List(Of String) = SearchLog(LogString, "_"c)
        For Each s As String In Results
            MsgBox(s)
            's will either be wrapped in <Weapon></Weapon> or <Attachment></Attachment> tags now so you can easily figure out the order etc.
        Next
    End Sub

    Private Function SearchLog(ByVal LogInput As String, ByVal SplitChar As Char) As List(Of String)
        Dim Results As New List(Of String)
        Dim Parts() As String = LogInput.Split(SplitChar)
        Dim StartIndex As Integer = 0
        Do While StartIndex < Parts.Length
            Dim Matched As Boolean = False
            'As the maximum number of underscores in any weapon/attachment is 3, a name spans at most 4 parts.
            'Try the longest candidate first so that "knife_ballistic" is not matched as "knife"
            'and "auto_gun_turret" is not matched as the attachment "auto".
            For PartCount As Integer = Math.Min(4, Parts.Length - StartIndex) To 1 Step -1
                Dim Candidate As String = String.Join(SplitChar.ToString(), Parts, StartIndex, PartCount)
                If HTWeapons.ContainsKey(Candidate) Then
                    'Weapon found in log
                    Results.Add("<Weapon>" & CStr(HTWeapons.Item(Candidate)) & "</Weapon>")
                    Matched = True
                ElseIf HTAttachments.ContainsKey(Candidate) Then
                    'Attachment found in log
                    Results.Add("<Attachment>" & CStr(HTAttachments.Item(Candidate)) & "</Attachment>")
                    Matched = True
                End If
                If Matched Then
                    'Continue with the part that follows the name we just found
                    StartIndex += PartCount
                    Exit For
                End If
            Next PartCount
            'Nothing matched at this position, skip the unknown part
            If Not Matched Then StartIndex += 1
        Loop
        Return Results
    End Function
End Class
And here's the contents of the files
Attachments.txt:
VB Code:
acog
reflex
drum
dualclip
elbit
extclip
ft
gl
grip
ir
upgradesight
lps
mk
speed
rf
silencer
snub
vzoom
dw
auto
Weapons.txt (There were 3 attachments in the original list that I have removed):
VB Code:
ak47
ak74u
asp
aug
knife_ballistic
china_lake
m1911
commando
crossbow_explosive
cz75
dragunov
enfield
famas
fnfal
g11
galil
hk21
hs10
ithaca
kiparis
knife
l96a1
m14
m16
m60
m72_law
mac11
makarov
mp5k
mpl
pm63
psg1
python
rottweil72
rpg
rpk
skorpion
spas
spectre
stoner63
strela
uzi
wa2000
concussion_grenade
flash_grenade
frag_grenade
sticky_grenade
tabun_gas
willy_pete
airstrike
auto_gun_turret
cobra_20mm_comlink
dog_bite
hind_minigun_pilot_firstperson
huey_minigun_gunner
m220_tow
mortar
napalm
rcbomb
claymore
satchel_charge
hatchet
explosive_bolt
explodable_barrel
hind_rockets_firstperson
minigun_mp
m202_flash
Hope this helps
Last edited by JayJayson; Jun 25th, 2011 at 03:05 PM.
-
Jun 25th, 2011, 03:05 PM
#7
New Member
Re: Parsing concatenated string
You can possibly clean up your code if you change the order you check things.
Split the string into an array and check whether the first string is a weapon part (20 possible strings: hind, m202, m72, etc.).
If it is a weapon part, the whole string is a weapon with no attachments:
Condition 5, 6 or 7.
Then it becomes easy to deal with the rest.
Check the array length.
Result 2 = Condition 1
Result 4 = Condition 4
Result 3:
Check the first string in the array for an attachment (2 possible strings: ft, gl).
If it is an attachment then condition 3,
else condition 2.
You could speed this up a little by checking condition 1 first, then condition 3 (the length of array(0) is 2), then 5, 6 and 7. Then pick up conditions 4 and 2.
Last edited by Qwert00; Jun 25th, 2011 at 03:17 PM.
-
Jun 26th, 2011, 08:39 AM
#8
New Member
Re: Parsing concatenated string
I wrote a quick sub that I think will meet your needs. If you call Get_Weapon with your log file weapon string, you can use WeaponStr, Attach1Str and Attach2Str as the values for your database. I added AttachKill as an attachment kill flag if you want to use it.
It appears that FT and GL are your only attachment kills, so this should work. If this is not the case, we can modify this slightly to deal with other attachment kills.
I wrote the WeapPart check as a string for simplification. You can modify this to read the weapon part strings from your database or a text file if you like.
Code:
'Place these declarations and the sub inside your class or module
Dim WeaponStr As String
Dim Attach1Str As String
Dim Attach2Str As String
Dim AttachKill As Boolean

Public Sub Get_Weapon(ByVal ReadStr As String)
    Dim ReadArr() As String
    Dim WeapPart As String
    Dim WeapPartArr() As String
    'Set Variables
    WeaponStr = ""
    Attach1Str = ""
    Attach2Str = ""
    AttachKill = False
    'First part of every weapon code that contains an underscore
    WeapPart = "auto,china,cobra,concussion,crossbow,dog,explodable,explosive,flash,frag,huey,hind,knife,m202,m220,m72,minigun,satchel,sticky,tabun,willy"
    ' Read Log String
    ReadArr = Split(ReadStr, "_")
    ' Condition 1
    If ReadArr.Length = 2 Then
        WeaponStr = ReadArr(0)
        Exit Sub
    End If
    ' Condition 3 (the attachment kill codes are the only first parts that are 2 characters long)
    If ReadArr(0).Length = 2 Then
        WeaponStr = ReadArr(1)
        Attach1Str = ReadArr(0)
        AttachKill = True
        Exit Sub
    End If
    ' Condition 5, 6, 7
    WeapPartArr = Split(WeapPart, ",")
    For Each Part As String In WeapPartArr
        If ReadArr(0) = Part Then
            'Strip the trailing "_mp" and keep the rest, underscores included, as the weapon code
            WeaponStr = Microsoft.VisualBasic.Left(ReadStr, Len(ReadStr) - 3)
            Exit Sub
        End If
    Next
    'Condition 2
    If ReadArr.Length = 3 Then
        WeaponStr = ReadArr(0)
        Attach1Str = ReadArr(1)
        Exit Sub
    End If
    'Condition 4
    If ReadArr.Length = 4 Then
        WeaponStr = ReadArr(0)
        Attach1Str = ReadArr(1)
        Attach2Str = ReadArr(2)
        Exit Sub
    End If
End Sub
Last edited by Qwert00; Jun 26th, 2011 at 08:42 AM.
-
Jun 26th, 2011, 10:49 AM
#9
Re: Parsing concatenated string
Thanks, I'll take a look at these, they look better than my ideas. There is one other attachment kill: mk (masterkey, which is an underbarrel shotgun).
-
Jun 26th, 2011, 11:21 AM
#10
New Member
Re: Parsing concatenated string
That code should be able to handle MK with no changes.
Fortunately, all three of your attachment kills start with 2-character strings.
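Since the length check only holds while no real weapon code starts with a 2-character part, it may be worth verifying that against the weapons list whenever the list changes. A small sketch, with HeuristicCheck and FindConflicts as made-up names and a made-up code "xy_gun" included only to show what a conflict looks like:

```vbnet
Imports System
Imports System.Collections.Generic

Module HeuristicCheck
    ' Returns the weapon codes whose first "_"-separated part is exactly 2 characters
    ' long, i.e. the ones the length check would mistake for an attachment kill.
    Function FindConflicts(ByVal weaponCodes As IEnumerable(Of String)) As List(Of String)
        Dim conflicts As New List(Of String)
        For Each code As String In weaponCodes
            If code.Split("_"c)(0).Length = 2 Then conflicts.Add(code)
        Next
        Return conflicts
    End Function

    Sub Main()
        ' A few codes from the weapons list, plus the hypothetical "xy_gun"
        Dim sample() As String = {"ak47", "g11", "m72_law", "uzi", "xy_gun"}
        Console.WriteLine(String.Join(", ", FindConflicts(sample).ToArray()))
    End Sub
End Module
```

Run against the sample, only the made-up "xy_gun" is reported. Note that ft, gl and mk also appear in the original weapons list, so they would be reported too if that list is fed in unfiltered.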
-
Jul 7th, 2011, 06:02 AM
#11
Re: Parsing concatenated string
Thanks guys. There were a few technicalities that required fixing, but that was easy enough. So far I have not encountered any problems with it, so it seems to work fine.