Thread: Reading custom syntax

  1. #1

    Thread Starter
    Frenzied Member Pc_Not_Mac
    Join Date
    Oct 2009
    Location
    localhost
    Posts
    1,206

    Reading custom syntax

    Hello, I'm trying to read my custom made syntax from a text file.
    For example

    Code:
     <#01 "C:\text.txt">
    Okay from that line all I'm trying to read is "01" and "C:\text.txt"

    Any ideas?
    Thank you in advance.
    My Codebank:
    Windows Vista & 7 Glass Effect & Limit the amount of times your application could be opened.
    Pause Your Code & Check the OS name

    The question of whether computers can think is like the question of whether submarines can swim.

    Currently learning: Java

    Coding can be a learning experience or

  2. #2
    PowerPoster
    Join Date
    Apr 2007
    Location
    The Netherlands
    Posts
    5,070

    Re: Reading custom syntax

    If you made this syntax, why didn't you make it so that you knew how to read it? Seems a bit strange to create a custom syntax while you don't know how to read it back.

    Anyway, you'd probably want to use regular expressions.

  3. #3

    Thread Starter
    Frenzied Member Pc_Not_Mac
    Join Date
    Oct 2009
    Location
    localhost
    Posts
    1,206

    Re: Reading custom syntax

    I'm very sorry, NickThissen, I should have given a clearer explanation.
    This so-called custom syntax which I created is nothing more than custom text which will be stored in a text document.
    Is it possible for me to read that line inside a text document and extract only those specific values?

    Quote Originally Posted by Pc_Not_Mac
    Okay from that line all I'm trying to read is "01" and "C:\text.txt"

  4. #4
    PowerPoster
    Join Date
    Apr 2007
    Location
    The Netherlands
    Posts
    5,070

    Re: Reading custom syntax

    Yes, I understood that. But my question was: why store the data in that particular way when you don't know how to read it back? That makes no sense. There are many cases where you don't have control over the syntax of your data (either it comes from some third party, or some third party needs to read it), in which case you must comply with it. In this case, though, you decide how to store the data, so if I were you I would store it in a way that makes it easy for you to read.

    For example, I'm sure you know the String.Split function. Then why not just store the data as
    Code:
    01|"C:\text.txt"
    and split it along the | character.
    If you know how to read/write XML, then why not use that:
    Code:
    <value id="1">C:\text.txt</value>
    It just seems silly to create a custom syntax when you don't know how to read it. That's almost like creating your own language, but not knowing how to speak it.
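
    The pipe-delimited idea above can be sketched roughly as follows. This is only an illustration: the module name SplitSketch and the helper ParseLine are made up for the example, and it assumes each line holds exactly one id and one path.

```vbnet
Imports System

Module SplitSketch
    ' Hypothetical helper: parses one line such as 01|"C:\text.txt"
    Function ParseLine(line As String) As String()
        ' Split on the first | only, so the result always has at most two parts
        Dim parts = line.Split(New Char() {"|"c}, 2)
        If parts.Length <> 2 Then
            Throw New FormatException("Expected <id>|<path>: " & line)
        End If
        ' Trim whitespace, then strip the optional double quotes around the path
        Return New String() {parts(0).Trim(), parts(1).Trim().Trim(""""c)}
    End Function

    Sub Main()
        Dim fields = ParseLine("01|""C:\text.txt""")
        Console.WriteLine(fields(0))   ' prints 01
        Console.WriteLine(fields(1))   ' prints C:\text.txt
    End Sub
End Module
```

    Limiting the split to two parts and checking the length keeps a malformed line from silently producing the wrong fields.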


    Anyway, if you insist on using this format, as I said, regular expressions are probably easiest. I'm not very good at them, but my first try would be this pattern:
    Code:
    <#(d{1,4}) "(.*)">
    That's probably wrong though.
    I'll break it up for you to explain:
    Code:
    <#  (  d{1,4}   )        "  (.*)  ">
    The <# part simply matches those characters. The d{1,4} part should match any number of 1 to 4 digits (I think). I chose 4 arbitrarily. The parentheses around it indicate that this is a capture group, so that you can retrieve what's inside them (the number) later. The .* is inside another capture group and it basically means 'any amount of any character'. The dot matches any character and the asterisk indicates indefinite repetition.

    To use this in code would be something like
    Code:
    Dim pattern = "<#(d{1,4}) ""(.*)"">"   'note use of double " chars inside the string
    For Each m As Match In Regex.Matches(sourceText, pattern)   'Matches (plural) returns every match; Match returns only one
       Dim entireMatch = m.Groups(0).Value
       Dim number = m.Groups(1).Value
       Dim text = m.Groups(2).Value
    Next
    Completely untested and from memory, but it should be close. The Groups in the Match object include the capture groups (anything between parentheses in the pattern), but if I remember correctly the first group always contains the entire matched string. So the second group (index 1) contains the first capture group ("01" in your example), while the third group contains the text ("C:\text.txt" in your example).

  5. #5

    Thread Starter
    Frenzied Member Pc_Not_Mac
    Join Date
    Oct 2009
    Location
    localhost
    Posts
    1,206

    Re: Reading custom syntax

    Thank you for the reply Nick.
    I will use a database for my project and avoid using custom syntax.
    But I was just wondering: how can one read this line in XML?

    <value id="1">C:\text.txt</value>
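
    One way to read that line, as a minimal sketch: it assumes LINQ to XML (System.Xml.Linq, .NET 3.5 or later) is available, and the module and variable names are made up for the example. XML requires attribute values to be quoted, so the sketch uses id="1" rather than id=1.

```vbnet
Imports System
Imports System.Xml.Linq

Module XmlSketch
    Sub Main()
        ' Attribute values must be quoted for the XML to be well-formed
        Dim element = XElement.Parse("<value id=""1"">C:\text.txt</value>")

        ' The id attribute and the element's text content
        Dim id = element.Attribute("id").Value
        Dim path = element.Value

        Console.WriteLine(id)     ' prints 1
        Console.WriteLine(path)   ' prints C:\text.txt
    End Sub
End Module
```

    For a whole file of such elements under one root element, XDocument.Load and a loop over Root.Elements("value") should work along the same lines.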

  6. #6
    Fanatic Member
    Join Date
    Aug 2010
    Posts
    624

    Re: Reading custom syntax

    Pretty good Regex from memory, Nick! There are only a few things I'd change: a digit is written with a preceding "\", so it becomes \d. Also, I'd use a lookbehind for the <#

    Something like this:

    Code:
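            Dim RetrievePattern As String = "(?<=<#)\d+ .+(?=>)"
            ' Split on the first space only, so a path containing spaces stays in one piece
            Dim reg() As String = Regex.Match(TextBox1.Text, RetrievePattern).Value.Split(New Char() {" "c}, 2)
            MsgBox(reg(0))   ' the number, e.g. 01
            MsgBox(reg(1))   ' the path, still wrapped in its double quotes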
    But as Nick said earlier, if you have the freedom to create your own save format, make an easier one. This one still needs to be split on the space (unless you want to use two regular expressions per line...), so you may as well use a single delimiter as Nick suggested, split on that, and read the appropriate index.
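
    As an alternative to splitting, capture groups can pull both values out with a single expression. The following is only a sketch: the sample input and the names RegexSketch, sourceText, number and path are assumptions for the example, and it combines Nick's pattern with \d for the digits and [^"]* in place of .* so that the match stops at the closing quote.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexSketch
    Sub Main()
        ' Sample input; in practice this would be read from the text file
        Dim sourceText = "<#01 ""C:\text.txt"">" & Environment.NewLine & _
                         "<#02 ""C:\my files\other.txt"">"

        ' Group 1: the digits after <#   Group 2: everything between the double quotes
        Dim pattern = "<#(\d+) ""([^""]*)"">"

        ' Regex.Matches returns every match in the input, one per line here
        For Each m As Match In Regex.Matches(sourceText, pattern)
            Dim number = m.Groups(1).Value
            Dim path = m.Groups(2).Value
            Console.WriteLine(number & " -> " & path)
        Next
        ' prints:
        ' 01 -> C:\text.txt
        ' 02 -> C:\my files\other.txt
    End Sub
End Module
```

    Because the path comes from its own group, spaces inside it need no special handling and the surrounding quotes are already stripped.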
    If I helped you out, please take the time to rate me
