Yes, I understood that. But my question was: why store the data in that particular format when you don't know how to read it back? That makes no sense. There are many cases where you have no control over the syntax of your data (either it comes from some third party, or some third party needs to read it), and then you must comply with it. But in this case you decide how the data is stored, so if I were you I would store it in a way that makes it easy for yourself to read.
For example, I'm sure you know the String.Split function. Then why not just store each entry as the number and the path separated by a | character, and split it along the | character?
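To make the Split idea concrete, here is a minimal sketch. It assumes each entry is stored as "number|path" (for instance 01|C:\test.txt); that layout and the ParseRecord helper are only illustrations, not something from your code.

```vbnet
Imports System

Module SplitExample
    ' Hypothetical helper: splits one "number|path" record into its two fields.
    Function ParseRecord(line As String) As String()
        ' The count of 2 means only the first "|" separates the fields.
        Return line.Split(New Char() {"|"c}, 2)
    End Function

    Sub Main()
        Dim parts = ParseRecord("01|C:\test.txt")
        Console.WriteLine(parts(0)) ' the number field
        Console.WriteLine(parts(1)) ' the path field
    End Sub
End Module
```

The | character cannot appear in a Windows file name, which is what makes it a reasonably safe separator here.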
If you know how to read/write XML, then why not use that:
Code:
<value id="1">C:\text.txt</value>
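Reading that back could look roughly like the sketch below, using LINQ to XML. It assumes the value elements sit inside a single root element (called values here, a made-up name) and that the id attribute is quoted, which XML requires. ReadValues is a hypothetical helper.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml.Linq

Module XmlExample
    ' Hypothetical helper: maps each <value> element's id attribute to its text.
    ' Assumes a single root element wraps the <value> elements.
    Function ReadValues(xml As String) As Dictionary(Of Integer, String)
        Dim result As New Dictionary(Of Integer, String)
        Dim root = XElement.Parse(xml)
        For Each v As XElement In root.Elements("value")
            result(Integer.Parse(v.Attribute("id").Value)) = v.Value
        Next
        Return result
    End Function

    Sub Main()
        Dim xml = "<values><value id=""1"">C:\text.txt</value></values>"
        For Each pair In ReadValues(xml)
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value)
        Next
    End Sub
End Module
```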
It just seems silly to create a custom syntax when you don't know how to read it. That's almost like creating your own language, but not knowing how to speak it.
Anyway, if you insist on using this format, as I said, regular expressions are probably the easiest way. I'm not very good at them, but my first try would be this pattern:
Code:
<#(\d{1,4}) "(.*)">
That's probably wrong though.
I'll break it up for you to explain:
Code:
<# ( \d{1,4} ) " (.*) ">
The <# part simply matches those characters. The \d{1,4} part should match any number of 1 to 4 digits (I think). I chose 4 arbitrarily. The parentheses around it indicate that this is a capture group, so that you can retrieve what's inside them (the number) later. The .* is inside another capture group and it basically means 'any amount of any character'. The dot matches any character (except a line break) and the asterisk indicates indefinite repetition. Note that .* is greedy, so if several entries can appear on one line it will run on to the last "> on that line; in that case use .*? instead.
Using this in code would look something like this:
Code:
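' Requires Imports System.Text.RegularExpressions at the top of the file
Dim pattern = "<#(\d{1,4}) ""(.*)"">" 'note use of double " chars inside the string
' Regex.Matches returns every match; Regex.Match would return only the first
For Each m As Match In Regex.Matches(sourceText, pattern)
    Dim entireMatch = m.Groups(0).Value ' the whole matched string
    Dim number = m.Groups(1).Value      ' first capture group
    Dim text = m.Groups(2).Value        ' second capture group
Next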
Completely untested and from memory, but it should be close. The Groups collection of the Match object includes the capture groups (anything between parentheses in the pattern), but if I remember correctly the first group (index 0) always contains the entire matched string. So the second group (index 1) contains the first capture group ("01" in your example), while the third group (index 2) contains the text ("C:\test.txt" in your example).
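If you want something you can paste into a console project and try, here is a small self-contained sketch of the same idea. It assumes the entries look like <#01 "C:\test.txt">; the second entry in the sample string and the ParseEntries helper are made up for illustration. I used the lazy .*? here so that two entries on one line are matched separately.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexExample
    ' Assumed entry format: <#NN "some text">. The lazy .*? stops at the
    ' first "> so several entries on one line are matched separately.
    Private Const Pattern As String = "<#(\d{1,4}) ""(.*?)"">"

    ' Hypothetical helper: returns every (number, text) pair found in the source.
    Function ParseEntries(source As String) As List(Of KeyValuePair(Of Integer, String))
        Dim result As New List(Of KeyValuePair(Of Integer, String))
        For Each m As Match In Regex.Matches(source, Pattern)
            Dim number = Integer.Parse(m.Groups(1).Value) ' first capture group
            Dim text = m.Groups(2).Value                  ' second capture group
            result.Add(New KeyValuePair(Of Integer, String)(number, text))
        Next
        Return result
    End Function

    Sub Main()
        ' Sample input; the second entry is invented for the demo.
        Dim sample = "<#01 ""C:\test.txt""><#02 ""C:\other.txt"">"
        For Each entry In ParseEntries(sample)
            Console.WriteLine("{0} -> {1}", entry.Key, entry.Value)
        Next
    End Sub
End Module
```

Parsing the number with Integer.Parse drops the leading zero, so keep m.Groups(1).Value as a string if you need "01" exactly as written.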