-
Dec 31st, 2006, 02:11 AM
#1
Thread Starter
Hyperactive Member
[2005] Parsing Text
I believe I asked this type of question before, but in a different manner. I'm trying to write a way of reading data in this format:
Code:
"SECTION_NAME"
{
"ControlName" "Button"
"fieldName" "ButtonItem1"
"xpos" "489"
"ypos" "400"
"zpos" "290"
"wide" "400"
"tall" "180"
"autoResize" "0"
"pinCorner" "0"
"visible" "1"
"enabled" "1"
"tabPosition" "0"
"settitlebarvisible" "1"
"title" "Hellow World!"
"sizable" "0"
}
How would I first look for the section of data called SECTION_NAME (which is enclosed in {})?
Then, the more important part: how would I go about reading, let's say, the value for "title"?
I'm trying to write a function like so:
VB Code:
Public Function Data_Read(ByVal section As String, ByVal variable As String) As String
VB Code:
Public Function Data_Write(ByVal section As String, ByVal variable As String, ByVal value As String)
Thanks guys. I know I can bother some with my redundant questions.
-
Dec 31st, 2006, 10:23 AM
#2
Fanatic Member
Re: [2005] Parsing Text
You haven't mentioned the size of the file that contains this text, but I believe it should be reasonable, as it seems to be the source for part of a program. The way I would implement my solution would be as follows:
- Define a class that would have the properties SectionName, PropertyName and PropertyValue
- Define a Collection class to hold a collection of my user defined class.
- Sequentially read the text file and detect a Section name
- Until you detect a closing }, create objects of your class having the section name that you detected
- Implement method to receive sectionname, property to return value. Overload this method as required to receive lesser or more arguments to return the values.
This is a perfect candidate to try out the Object Oriented features of VB.NET.
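The steps above can be sketched roughly as follows. This is only a minimal sketch: ResourceEntry, ResourceReader, Parse and GetValue are made-up names, and it assumes each brace sits on its own line and that values never contain a double quote.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

' Hypothetical class: one object per "name" "value" pair.
Public Class ResourceEntry
    Public SectionName As String
    Public PropertyName As String
    Public PropertyValue As String
End Class

Public Module ResourceReader
    ' Reads the text line by line and returns one entry per property.
    Public Function Parse(ByVal text As String) As List(Of ResourceEntry)
        Dim entries As New List(Of ResourceEntry)
        Dim currentSection As String = Nothing
        Dim insideBraces As Boolean = False
        Dim reader As New StringReader(text)
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing
            line = line.Trim()
            If line = "{" Then
                insideBraces = True
            ElseIf line = "}" Then
                insideBraces = False
            ElseIf line.StartsWith("""") Then
                ' Splitting on the quote character leaves the quoted
                ' strings at the odd indexes: 1, 3, ...
                Dim parts() As String = line.Split(""""c)
                If Not insideBraces Then
                    currentSection = parts(1)
                ElseIf parts.Length >= 4 Then
                    Dim entry As New ResourceEntry()
                    entry.SectionName = currentSection
                    entry.PropertyName = parts(1)
                    entry.PropertyValue = parts(3)
                    entries.Add(entry)
                End If
            End If
            line = reader.ReadLine()
        End While
        Return entries
    End Function

    ' Returns the value for section/property, or Nothing when not found.
    Public Function GetValue(ByVal entries As List(Of ResourceEntry), _
                             ByVal section As String, _
                             ByVal propertyName As String) As String
        For Each entry As ResourceEntry In entries
            If entry.SectionName = section AndAlso entry.PropertyName = propertyName Then
                Return entry.PropertyValue
            End If
        Next
        Return Nothing
    End Function
End Module
```

Something like GetValue(Parse(fileText), "SECTION_NAME", "title") would then give back the title. A Dictionary keyed on section and property name would be faster for big files, but a plain list stays closest to the steps as described.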
Using VB.NET 2003/.NET 1.1/C# 2.0
http://del.icio.us/rajoo
Blow your mind, smoke gunpowder
Ashes to ashes, dust to dust
If God won't have you, the devil will. - Author unknown
Don't follow me, I'm lost too ...
-
Dec 31st, 2006, 11:58 AM
#3
Re: [2005] Parsing Text
Or you can just create your own class and make it serializable. Then you can just serialize to an XML file and deserialize that back into an object with very little code...
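A rough sketch of that idea, assuming you are free to change the file format to XML (it will not read the quoted/brace format as it stands). ResourceProperty, ResourceSection and ResourceXml are hypothetical names; XmlSerializer and the attributes are the standard ones from System.Xml.Serialization.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical types for illustration only.
Public Class ResourceProperty
    <XmlAttribute("name")> Public Name As String
    <XmlText()> Public Value As String
End Class

Public Class ResourceSection
    <XmlAttribute("name")> Public Name As String
    <XmlElement("Property")> Public Properties As New List(Of ResourceProperty)
End Class

Public Module ResourceXml
    ' Turns a section object into an XML string.
    Public Function ToXml(ByVal section As ResourceSection) As String
        Dim serializer As New XmlSerializer(GetType(ResourceSection))
        Dim writer As New StringWriter()
        serializer.Serialize(writer, section)
        Return writer.ToString()
    End Function

    ' Rebuilds the section object from XML produced by ToXml.
    Public Function FromXml(ByVal xml As String) As ResourceSection
        Dim serializer As New XmlSerializer(GetType(ResourceSection))
        Dim reader As New StringReader(xml)
        Return DirectCast(serializer.Deserialize(reader), ResourceSection)
    End Function
End Module
```

The trade-off is that the serializer decides the layout of the file, so this only helps if nothing else needs to read the original format.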
-
Dec 31st, 2006, 08:52 PM
#4
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
What would be a good way, at least for now, to just separate the sections, since sections are separated like so:
Code:
"SECTION_1"
{
}
"SECTION_2"
{
}
-
Jan 1st, 2007, 12:19 AM
#5
Re: [2005] Parsing Text
Crudely: you could read it into a string, split on '}' and then split each part of that on '{'. Then you would have the name and the members.
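A minimal sketch of that double split, with the quotes and line breaks trimmed off the section name. SectionSplitter and SplitSections are made-up names, and it assumes braces never appear inside a value.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Public Module SectionSplitter
    ' Returns section name -> raw body text (the lines between the braces).
    Public Function SplitSections(ByVal text As String) As Dictionary(Of String, String)
        Dim sections As New Dictionary(Of String, String)
        ' Characters to strip from the name: whitespace, line breaks and quotes.
        Dim junk() As Char = New Char() {" "c, ControlChars.Tab, ControlChars.Cr, _
                                         ControlChars.Lf, """"c}
        For Each chunk As String In text.Split("}"c)
            Dim parts() As String = chunk.Split("{"c)
            ' The leftover text after the last "}" has no "{" and is skipped.
            If parts.Length = 2 Then
                Dim name As String = parts(0).Trim(junk)
                sections(name) = parts(1).Trim()
            End If
        Next
        Return sections
    End Function
End Module
```

Each body is still one block of text at this point; the members would need a further split per line.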
-
Jan 1st, 2007, 01:34 AM
#6
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
I did a quick and dirty way of loading the section names. I just need to figure out how to strip any line breaks and quote characters.
VB Code:
lb_Sections.Items.Clear()
Dim ResourceStream As String = ResText.Text
Dim SplitsArray() As String = Split(ResourceStream, "}")
Dim i As Integer
For i = 0 To SplitsArray.GetUpperBound(0)
    ' Everything before the "{" is the section name (still with quotes and line breaks).
    Dim FinalArray() As String = SplitsArray(i).Split("{"c)
    lb_Sections.Items.Add(FinalArray(0))
Next
Any ideas?
I guess I'll just start with each step of the process and work it out slowly.
-
Jan 1st, 2007, 04:20 AM
#7
Fanatic Member
Re: [2005] Parsing Text
I'd use regular expressions.
I've amended your code a little bit
VB Code:
Imports System.Text.RegularExpressions
Sub YourMethod()
    lb_Sections.Items.Clear()
    Dim ResourceStream As String = ResText.Text
    Dim SplitsArray() As String = Regex.Split(ResourceStream, "{", RegexOptions.Multiline)
    Dim Lines() As String
    Dim i As Integer
    For i = 0 To SplitsArray.GetUpperBound(0)
        Lines = Regex.Split(SplitsArray(i), Environment.NewLine, RegexOptions.Multiline)
        For Each LineItem As String In Lines
            LineItem = LineItem.Trim()
            If LineItem.StartsWith(""") Then
                ' add code to detect the next double quote after
                ' the first one and then the next etc.
            End If
        Next
        ' lb_Sections.Items.Add(FinalArray(0))
    Next
End Sub
-
Jan 1st, 2007, 07:04 AM
#8
Re: [2005] Parsing Text
I don't see any reason to use regular expression split instead of regular split. You're not even supplying a regular expression.
-
Jan 1st, 2007, 08:57 AM
#9
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
I knew I'd run into this problem, but I get an error because there are three " characters and you need 2 to make a pair, so 4 in total. Solutions?
-
Jan 1st, 2007, 08:58 AM
#10
Re: [2005] Parsing Text
 Originally Posted by tylerm
I knew I'd run into this problem, but I get an error because there are three " characters and you need 2 to make a pair, so 4 in total. Solutions?
Not sure what you mean, can you give us an example?
-
Jan 1st, 2007, 09:07 AM
#11
Fanatic Member
Re: [2005] Parsing Text
If you get more than 4 double quotes in a line, it means that "" actually stands for a double quote within the value, i.e. the first double quote acts as the escape character for the second. I'd suggest you first replace "" with, say, ~, split the line into key and value, and then replace the ~ with one double quote.
-
Jan 1st, 2007, 09:08 AM
#12
Fanatic Member
Re: [2005] Parsing Text
I don't see any reason to use regular expression split instead of regular split. You're not even supplying a regular expression.
Well I propose regular expressions mainly because if you're an expert in Regular expressions you can easily split these texts by writing the right regular expressions. And secondly I don't like using the Split function on its own, looks too VB6 to me.
-
Jan 1st, 2007, 09:25 AM
#13
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
 Originally Posted by penagate
Not sure what you mean, can you give us an example?
If I have 3 quotes, Visual Studio throws an error, because it's expecting a closing quote.
VB Code:
If LineItem.StartsWith(""") Then
-
Jan 1st, 2007, 09:53 AM
#14
Re: [2005] Parsing Text
Use a doubled quote within a string literal to represent a single quote character:
VB Code:
If LineItem.StartsWith("""") Then
Or, better, you could use a character literal, but you have the same problem:
VB Code:
If LineItem.StartsWith(""""c) Then
Finally, you could use Convert.ToChar with the character code for a double quote (34):
VB Code:
If LineItem.StartsWith(Convert.ToChar(34)) Then
-
Jan 1st, 2007, 12:57 PM
#15
Fanatic Member
Re: [2005] Parsing Text
I guess I misunderstood Tylerm's issue with the three double quotes; I didn't realise it was my sample code that was wrong.
Well Tylerm I'd use
VB Code:
If LineItem.StartsWith(Convert.ToChar(34)) Then
as suggested by Penagate
-
Jan 1st, 2007, 01:00 PM
#16
Re: [2005] Parsing Text
 Originally Posted by Mr.No
Well I propose regular expressions mainly because if you're an expert in Regular expressions you can easily split these texts by writing the right regular expressions. And secondly I don't like using the Split function on its own, looks too VB6 to me.
I'm sorry, I completely missed this post in my haste to reply to the quotes issue.
You can indeed use regular expressions, though IMO splitting is easier for those unfamiliar with regex, if less powerful of course.
You're also right about the global Split(); it should be
VB Code:
Dim SplitsArray() As String = ResourceStream.Split("}"c)
-
Jan 2nd, 2007, 03:01 AM
#17
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
I can't help but think that the structure of the format really looks like XML:
Code:
<section name="SECTION_1">
<Property name="ControlName">Button</Property>
<Property name="fieldName">ButtonItem1</Property>
<Property name="xpos">489</Property>
<Property name="ypos">400</Property>
<Property name="zpos">290</Property>
<Property name="wide">400</Property>
<Property name="tall">180</Property>
<Property name="autoResize">0</Property>
<Property name="pinCorner">0</Property>
<Property name="visible">1</Property>
<Property name="enabled">1</Property>
<Property name="tabPosition">0</Property>
<Property name="settitlebarvisible">1</Property>
<Property name="title">Hellow World!</Property>
<Property name="sizable">0</Property>
</section>
Code:
"SECTION_1"
{
"ControlName" "Button"
"fieldName" "ButtonItem1"
"xpos" "489"
"ypos" "400"
"zpos" "290"
"wide" "400"
"tall" "180"
"autoResize" "0"
"pinCorner" "0"
"visible" "1"
"enabled" "1"
"tabPosition" "0"
"settitlebarvisible" "1"
"title" "Hellow World!"
"sizable" "0"
}
-
Jan 2nd, 2007, 03:17 AM
#18
Fanatic Member
Re: [2005] Parsing Text
It does indeed look like XML, but I guess the designer wanted to avoid redundant closing tags and was geared more towards JSON. This format is sometimes used as a substitute for XML in AJAX. But its proper format has a colon between the name and the value and a comma between pairs, roughly like below
Code:
"SECTION_1": {
"ControlName": "Button",
"fieldName": "ButtonItem1",
"xpos": "489",
"ypos": "400",
"zpos": "290",
"wide": "400",
"tall": "180",
"autoResize": "0",
"pinCorner": "0",
"visible": "1",
"enabled": "1",
"tabPosition": "0",
"settitlebarvisible": "1",
"title": "Hellow World!",
"sizable": "0"
}
By the way, it would seem your key and value are separated by a TAB character. If that's the case, splitting them is much easier: just split each line on TAB and remove the double quotes.
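A small sketch of the tab split, under the assumption that the separator really is one or more TAB characters. LineSplitter and SplitPair are hypothetical names. StringSplitOptions.RemoveEmptyEntries (available from .NET 2.0) makes a run of several tabs behave like a single one, and lines that do not split into exactly two parts, such as section names and braces, come back as Nothing instead of causing an index error.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Public Module LineSplitter
    ' Returns {name, value} for a property line, or Nothing for anything
    ' else (section names, braces, blank lines).
    Public Function SplitPair(ByVal line As String) As String()
        Dim separators() As Char = New Char() {ControlChars.Tab}
        ' RemoveEmptyEntries collapses runs of tabs into a single separator.
        Dim parts() As String = line.Trim().Split(separators, _
                                                  StringSplitOptions.RemoveEmptyEntries)
        If parts.Length <> 2 Then
            Return Nothing
        End If
        ' Strip stray spaces first, then the surrounding quotes.
        Return New String() {parts(0).Trim().Trim(""""c), parts(1).Trim().Trim(""""c)}
    End Function
End Module
```

A value that itself contains a tab would break this, so it is only as safe as the assumption about the file.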
-
Jan 2nd, 2007, 03:19 AM
#19
Re: [2005] Parsing Text
JSON wouldn't have quotes around names, only values.
-
Jan 2nd, 2007, 03:35 AM
#20
Fanatic Member
Re: [2005] Parsing Text
JSON wouldn't have quotes around names, only values.
Not to get sidetracked from the main thread, but it can. Here's a sample from http://www.json.org/js.html
Code:
var myJSONObject = {"bindings": [
{"ircEvent": "PRIVMSG", "method": "newURI", "regex": "^http://.*"},
{"ircEvent": "PRIVMSG", "method": "deleteURI", "regex": "^delete.*"},
{"ircEvent": "PRIVMSG", "method": "randomURI", "regex": "^random.*"}
]
};
-
Jan 2nd, 2007, 03:51 AM
#21
Re: [2005] Parsing Text
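Ah... yes, it can. (Strictly speaking, JSON requires the quotes around names; it's in plain JavaScript object literals that they're optional.)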
-
Jan 6th, 2007, 11:28 PM
#22
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
Okay, I'm able to parse PropertyName and PropertyValue, but I run into errors when I leave the section names in, because I'm using tab as a delimiter.
Also, is it possible to have .Split treat any number of tabs as one separator? The source file has only 1 tab between name and value at the moment; could it contain 3 or 4 or 5 tabs and still be read?
Here's what I have for code so far:
VB Code:
' Requires: Imports System.Text.RegularExpressions
ListView1.Items.Clear()
Dim ResourceStream As String = resData.Text
Dim SplitsArray() As String = Regex.Split(ResourceStream, "{", RegexOptions.Multiline)
Dim Lines() As String
Dim i As Integer
For i = 0 To SplitsArray.GetUpperBound(0)
    Lines = Regex.Split(SplitsArray(i), Environment.NewLine, RegexOptions.Multiline)
    For Each LineItem As String In Lines
        LineItem = LineItem.Trim()
        If LineItem.StartsWith(Convert.ToChar(34)) Then
            ' Split the line into name and value on the tab character.
            ' (A section name line has no tab, so FinalDude(1) fails for it.)
            Dim FinalDude() As String = LineItem.Split(ControlChars.Tab)
            Dim newItem As ListViewItem = ListView1.Items.Add(FinalDude(0))
            newItem.Name = FinalDude(0)
            newItem.SubItems.Add(FinalDude(1))
        End If
    Next
Next
-
Jan 7th, 2007, 09:13 AM
#23
Hyperactive Member
Re: [2005] Parsing Text
I'd use InStr to find the section title first, before splitting ANYTHING. Otherwise, if your document is large in size and has several sections, using split without first narrowing down your search will bog your code down a bit.
Use InStr to find the title of the section, then use Mid to narrow it down to the point where that section begins, then use InStr and Mid again to find the subsection("title" in your case, as you said, correct?), and THEN use Split(thestring, """")(3) to find what is in the second set of quotations beginning from the point of "title".
-
Jan 7th, 2007, 09:22 AM
#24
Hyperactive Member
Re: [2005] Parsing Text
VB Code:
Dim str, sect, subsect As String
sect = "SECTION_NAME2" 'Section you are looking for
subsect = """title""" 'Subitem you are looking to find
str = Split(Mid(Mid(Text1.Text, InStr(Text1.Text, sect)), _
InStr(Mid(Text1.Text, InStr(Text1.Text, sect)), subsect)), """")(3)
Debug.Print(str) 'Prints Hellow World!
..for example. Text1.Text represents a textbox with the following text(as you provided, modified to duplicate the section accordingly):
Code:
"SECTION_NAME1"
{
"ControlName" "Button"
"fieldName" "ButtonItem1"
"xpos" "489"
"ypos" "400"
"zpos" "290"
"wide" "400"
"tall" "180"
"autoResize" "0"
"pinCorner" "0"
"visible" "1"
"enabled" "1"
"tabPosition" "0"
"settitlebarvisible" "1"
"title" "Hellow World!"
"sizable" "0"
}
"SECTION_NAME2"
{
"ControlName" "Button"
"fieldName" "ButtonItem1"
"xpos" "489"
"ypos" "400"
"zpos" "290"
"wide" "400"
"tall" "180"
"autoResize" "0"
"pinCorner" "0"
"visible" "1"
"enabled" "1"
"tabPosition" "0"
"settitlebarvisible" "1"
"title" "Hellow World!"
"sizable" "0"
}
Last edited by BrendanDavis; Jan 7th, 2007 at 09:26 AM.
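One thing to watch with the one-liner above: if the section you want has no "title" line, the search carries on into the next section and returns that one's title. Here is a sketch of the same find-then-narrow idea with the search cut off at the section's closing brace. ResourceLookup is a made-up name, and passing the whole text in as a first parameter is an assumption; the Data_Read in the first post takes only section and variable.

```vbnet
Imports System

Public Module ResourceLookup
    ' Returns the value of variable inside section, or Nothing if either is missing.
    Public Function Data_Read(ByVal text As String, ByVal section As String, _
                              ByVal variable As String) As String
        Dim sectionStart As Integer = text.IndexOf("""" & section & """", StringComparison.Ordinal)
        If sectionStart < 0 Then Return Nothing
        Dim braceOpen As Integer = text.IndexOf("{"c, sectionStart)
        If braceOpen < 0 Then Return Nothing
        Dim braceClose As Integer = text.IndexOf("}"c, braceOpen)
        If braceClose < 0 Then Return Nothing

        ' Only look between this section's braces.
        Dim body As String = text.Substring(braceOpen + 1, braceClose - braceOpen - 1)
        Dim keyStart As Integer = body.IndexOf("""" & variable & """", StringComparison.Ordinal)
        If keyStart < 0 Then Return Nothing
        ' From the key onward, splitting on quotes gives: "", key, separator, value, ...
        Dim parts() As String = body.Substring(keyStart).Split(""""c)
        If parts.Length < 4 Then Return Nothing
        Return parts(3)
    End Function
End Module
```

It still has a blind spot: a value that happens to equal the name you are looking for (say "fieldName" "title") would be found first, so treat it as a starting point only.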
-
Jan 7th, 2007, 10:44 AM
#25
Re: [2005] Parsing Text
An example of displaying each section of data using Regex is below. Using the pasted text above, the below code should display each respective section in a messagebox...
VB Code:
Dim MyText As String = IO.File.ReadAllText("c:\test.txt") 'file with pasted text
Dim RegexPattern As String = """\w*?"".*?\{.*?\}"
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = _
System.Text.RegularExpressions.Regex.Matches( _
MyText, RegexPattern, System.Text.RegularExpressions.RegexOptions.Singleline)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
MessageBox.Show(Match.Value)
Next
-
Jan 7th, 2007, 01:43 PM
#26
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
Okay, I'm starting to get this down now. I guess next is how to specify the section and propertyname and then set the property value.
For example, specifying:
VB Code:
ResourceParser.Property_SetValue(resData.text, "VGUI_Display", "title", "DUDE")
This would change the value from "Hellow World!" to "DUDE" for the title property inside the section "VGUI_Display".
Last edited by tylerm; Jan 7th, 2007 at 01:48 PM.
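One possible shape for such a function, as a sketch only. It assumes the function returns the modified text rather than changing a file, that name and value sit on one line separated by tabs or spaces, and that the new value contains no double quote. ResourceWriter is a made-up module name.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module ResourceWriter
    ' Returns a copy of text with the property's value replaced; returns
    ' text unchanged when the section or property is not found.
    Public Function Property_SetValue(ByVal text As String, ByVal section As String, _
                                      ByVal propertyName As String, _
                                      ByVal newValue As String) As String
        Dim sectionStart As Integer = text.IndexOf("""" & section & """", StringComparison.Ordinal)
        If sectionStart < 0 Then Return text
        Dim braceOpen As Integer = text.IndexOf("{"c, sectionStart)
        If braceOpen < 0 Then Return text
        Dim braceClose As Integer = text.IndexOf("}"c, braceOpen)
        If braceClose < 0 Then Return text

        ' Only search between this section's braces.
        Dim body As String = text.Substring(braceOpen + 1, braceClose - braceOpen - 1)
        ' Group 1 covers everything up to and including the value's opening quote.
        Dim pattern As String = "^([ \t]*""" & Regex.Escape(propertyName) & _
                                """[ \t]+"")[^""]*"""
        Dim m As Match = Regex.Match(body, pattern, RegexOptions.Multiline)
        If Not m.Success Then Return text

        ' Translate positions in body back to positions in text.
        Dim valueStart As Integer = braceOpen + 1 + m.Groups(1).Index + m.Groups(1).Length
        Dim valueLength As Integer = m.Length - m.Groups(1).Length - 1
        Return text.Substring(0, valueStart) & newValue & _
               text.Substring(valueStart + valueLength)
    End Function
End Module
```

With that, resData.Text = Property_SetValue(resData.Text, "VGUI_Display", "title", "DUDE") would update the textbox, and everything outside the one value is left byte for byte as it was.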
-
Jan 24th, 2007, 03:45 PM
#27
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
 Originally Posted by gigemboy
An example of displaying each section of data using Regex is below. Using the pasted text above, the below code should display each respective section in a messagebox...
VB Code:
Dim MyText As String = IO.File.ReadAllText("c:\test.txt") 'file with pasted text
Dim RegexPattern As String = """\w*?"".*?\{.*?\}"
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = _
System.Text.RegularExpressions.Regex.Matches( _
MyText, RegexPattern, System.Text.RegularExpressions.RegexOptions.Singleline)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
MessageBox.Show(Match.Value)
Next
How could I refine that so Match.Value would only output the name of the section, so instead of:
Code:
"vgui_hello"
{
"ControlName" "Frame"
"fieldName" "$PROJECT_NAME$"
"xpos" "489"
"ypos" "400"
"zpos" "290"
"wide" "400"
"tall" "180"
"autoResize" "0"
"pinCorner" "0"
"visible" "1"
"enabled" "1"
"tabPosition" "0"
"settitlebarvisible" "1"
"title" "Hellow World!"
"sizable" "0"
}
It would output: vgui_hello
Thanks
-
Jan 24th, 2007, 04:00 PM
#28
Re: [2005] Parsing Text
The below code is a sample of how to do it. Notice the regex pattern, it is modified to allow each part of the pattern to be put in a named group, which you can access from the match result to get that part of the info. Each match now has a group for the section name, and a group for the section body, and each group of each section is displayed in a messagebox...
VB Code:
Dim MyText As String = IO.File.ReadAllText("c:\test.txt") 'file with pasted text
Dim RegexPattern As String = """(?<SectionName>\w*?)""(?<SectionBody>.*?\{.*?\})"
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = _
System.Text.RegularExpressions.Regex.Matches( _
MyText, RegexPattern, System.Text.RegularExpressions.RegexOptions.Singleline)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
MessageBox.Show(Match.Groups("SectionName").Value)
MessageBox.Show(Match.Groups("SectionBody").Value)
Next
-
Jan 24th, 2007, 04:47 PM
#29
Thread Starter
Hyperactive Member
Re: [2005] Parsing Text
Thank you so much Gigemboy!
I just have one last question and I think all my problems are solved.
How about being able to get PropertyName and PropertyValue from inside the SectionBody?
Again, thank you so much. This problem has been stumping me since Christmas.
-
Jan 24th, 2007, 11:21 PM
#30
Re: [2005] Parsing Text
I am not too sure, I tried to mess with it but ran into a snag and ran out of time to work on it. I tried to set up groups for property name and property value, and it still matches the full text with the modified pattern, however I cannot pull any values for the two named property groups and I am not too sure why. I believe it has to do something with the multiple subgroups that are in the pattern, since there are multiple property name/values. I don't think regex knows how to deal with it, so it does nothing for those two groups, yet still captures the correct text. If anyone else wants to take a stab at it, feel free. Test code is below with the modified pattern...
VB Code:
Dim MyText As String = IO.File.ReadAllText("c:\test.txt") 'file with pasted text
Dim RegexPattern As String = """(?<SectionName>\w*?)"".*?\{.*?(""(?<PropertyName>.*?)"".*?""(?<PropertyValue>.*?)"")*.*?\}"
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = _
System.Text.RegularExpressions.Regex.Matches( _
MyText, RegexPattern, System.Text.RegularExpressions.RegexOptions.Singleline)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
For I As Integer = 0 To Match.Groups.Count - 1
Console.WriteLine(Match.Groups(I).Captures.Count.ToString & " capture(s)")
Console.WriteLine(Match.Groups(I).Value & "- Value" & Environment.NewLine & "end line" & Environment.NewLine)
Next
Next
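A likely reason the two property groups stay empty: the lazy .*? right after the opening brace lets the engine skip the optional (...)* group entirely and still reach the closing brace, so the group repeats zero times. One way around it, sketched below, is to do it in two passes: match the sections first, then run a second, simpler pattern over each section body. TwoPassRegex and ReadProperties are made-up names, and the patterns assume values contain neither double quotes nor braces.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Public Module TwoPassRegex
    ' Returns one "Section/Name=Value" string per property found.
    Public Function ReadProperties(ByVal text As String) As List(Of String)
        Dim results As New List(Of String)
        ' Pass 1: one match per section, capturing the text between the braces.
        Dim sectionPattern As String = _
            """(?<SectionName>\w+)""\s*\{(?<SectionBody>[^}]*)\}"
        ' Pass 2: one match per "name" "value" pair inside a section body.
        Dim pairPattern As String = _
            """(?<PropertyName>[^""]*)""[ \t]+""(?<PropertyValue>[^""]*)"""

        For Each section As Match In Regex.Matches(text, sectionPattern)
            Dim body As String = section.Groups("SectionBody").Value
            For Each pair As Match In Regex.Matches(body, pairPattern)
                results.Add(section.Groups("SectionName").Value & "/" & _
                            pair.Groups("PropertyName").Value & "=" & _
                            pair.Groups("PropertyValue").Value)
            Next
        Next
        Return results
    End Function
End Module
```

Each pair comes back as its own Match, so there is no need to dig through the Captures collection of a repeated group.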