Results 1 to 12 of 12

Thread: Pratt parsers and the building blocks of a scripting language.

  1. #1

    Thread Starter
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    6,661

    Pratt parsers and the building blocks of a scripting language.



    This sample application is an implementation of a Pratt parser written in pure VB6 code. It demonstrates the use of the Parser to validate syntax and construct an expression tree from a text expression. Now I want to preface this by saying that if your intention is to simply evaluate expressions, then this is not what you would use. There are far simpler and faster methods for evaluating expressions. For example:-
    Code:
    Option Explicit
    
    Public Function Eval(ByVal Expr As String)
    Dim L As String, R As String
      Do While HandleParentheses(Expr): Loop
    
      If 0 Then
        ElseIf Spl(Expr, "Or", L, R) Then:   Eval = Eval(L) Or Eval(R)
        ElseIf Spl(Expr, "And", L, R) Then:  Eval = Eval(L) And Eval(R)
        ElseIf Spl(Expr, ">=", L, R) Then:   Eval = Eval(L) >= Eval(R)
        ElseIf Spl(Expr, "<=", L, R) Then:   Eval = Eval(L) <= Eval(R)
        ElseIf Spl(Expr, "=", L, R) Then:    Eval = Eval(L) = Eval(R)
        ElseIf Spl(Expr, ">", L, R) Then:    Eval = Eval(L) > Eval(R)
        ElseIf Spl(Expr, "<", L, R) Then:    Eval = Eval(L) < Eval(R)
        ElseIf Spl(Expr, "Like", L, R) Then: Eval = Eval(L) Like Eval(R)
        ElseIf Spl(Expr, "&", L, R) Then:    Eval = Eval(L) & Eval(R)
        ElseIf Spl(Expr, "-", L, R) Then:    Eval = Eval(L) - Eval(R)
        ElseIf Spl(Expr, "+", L, R) Then:    Eval = Eval(L) + Eval(R)
        ElseIf Spl(Expr, "Mod", L, R) Then:  Eval = Eval(L) Mod Eval(R)
        ElseIf Spl(Expr, "\", L, R) Then:    Eval = Eval(L) \ Eval(R)
        ElseIf Spl(Expr, "*", L, R) Then:    Eval = Eval(L) * Eval(R)
        ElseIf Spl(Expr, "/", L, R) Then:    Eval = Eval(L) / Eval(R)
        ElseIf Spl(Expr, "^", L, R) Then:    Eval = Eval(L) ^ Eval(R)
        ElseIf Trim(Expr) >= "A" Then:       Eval = Fnc(Expr)
        ElseIf Len(Expr) Then:               Eval = IIf(InStr(Expr, "'"), _
                                Replace(Trim(Expr), "'", ""), Val(Expr))
      End If
    End Function
    
    Private Function HandleParentheses(Expr As String) As Boolean
    Dim P As Long, i As Long, C As Long
      P = InStr(Expr, "(")
      If P Then HandleParentheses = True Else Exit Function
    
      For i = P To Len(Expr)
        If Mid(Expr, i, 1) = "(" Then C = C + 1
        If Mid(Expr, i, 1) = ")" Then C = C - 1
        If C = 0 Then Exit For
      Next i
    
      Expr = Left(Expr, P - 1) & Str(Eval(Mid(Expr, P + 1, i - P - 1))) & Mid(Expr, i + 1)
    End Function
    
    Private Function Spl(Expr As String, Op$, L$, R$) As Boolean
    Dim P As Long
      P = InStrRev(Expr, Op, , 1)
      If P Then Spl = True Else Exit Function
      If P < InStrRev(Expr, "'") And InStr("*-", Op) Then P = InStrRev(Expr, "'", P) - 1
    
      R = Mid(Expr, P + Len(Op))
      L = Trim(Left$(Expr, IIf(P > 0, P - 1, 0)))
    
      Select Case Right(L, 1)
        Case "", "+", "*", "/", "A" To "z": Spl = False
        Case "-": R = "-" & R
      End Select
    End Function
    
    Private Function Fnc(Expr As String)
      Expr = LCase(Trim(Expr))
    
      Select Case Left(Expr, 3)
        Case "abs": Fnc = Abs(Val(Mid$(Expr, 4)))
        Case "sin": Fnc = Sin(Val(Mid$(Expr, 4)))
        Case "cos": Fnc = Cos(Val(Mid$(Expr, 4)))
        Case "atn": Fnc = Atn(Val(Mid$(Expr, 4)))
        Case "log": Fnc = Log(Val(Mid$(Expr, 4)))
        Case "exp": Fnc = Exp(Val(Mid$(Expr, 4)))
        'etc...
      End Select
    End Function
    The above is a very fast bottom up recursive parser written by Olaf Schmidt which is suitable for evaluating expressions. You can find more about it in this thread.

    This is not what Pratt parsers are about. You could use them to evaluate expressions but that is like killing a mosquito with a nuke.

    So what is a Pratt parser?

    A Pratt parser belongs to a family of left to right recursive parsing algorithms called shift reduce parsers. Olaf's algorithm is also a kind of shift reduce parser though I have no idea what his one is called. These parsers specialize in parsing expressions and programming language grammars by breaking down or reducing the input into a series of simpler elements. The Pratt parser in particular was invented by a man named Vaughan Pratt. It is particularly specialized in allowing programmers to very easily control operator precedence in whatever scripting or programming language they are creating. It is extremely easy to implement and the algorithm itself can be modeled quite easily to be modular to the point where you can design the parser once and just plug in classes to it to extend whatever language it is you're writing. This is what makes Pratt parsers so appealing.

    The typical approach for writing a scripting language is to use parser generators because programming language grammars can be quite complicated and it can be very tricky to get them right if they are written by hand. However, there are algorithms suited to hand written parsers. The one most often recommended is recursive descent parsing which is very easy to write by hand. However, it is extremely difficult to control operator precedence with a recursive descent parser if you're writing it by hand. This is where you want a Pratt parser because like a recursive descent parser, it is also suitable for hand written parsers and what's more, you can combine it with a recursive descent parser to parse elements of a language that don't require taking operator precedence into account. For example, If...Then statements or While loops.

    So what is this project about?

    It started as a simple implementation of a Pratt parser capable of parsing simple expressions into expression trees. I got carried away and went a bit beyond that. The core of it is a Pratt parser that converts an expression into an expression tree which can then be evaluated to produce a result. There are two methods of evaluation, a simpler method which involves simply traversing the tree to calculate the result. The more complicated method involves traversing the tree to produce a series of instructions that when executed will produce the result. In other words, what I made is a very basic compiler.

    The instructions are meant for a virtual machine not unlike VB6's own P-code engine or Java's bytecode virtual machine. In this case the virtual machine is one of my own invention. It is far simpler than a real world virtual machine and it is no where near as fast. It also has a far simpler instruction set with no support for jumping which means it cannot support looping or conditional branching. It is not meant to be a fully fledged and capable virtual machine to be used in real world scenarios. It is meant to demonstrate the some of the important principles behind virtual CPUs and should be taken as such.

    This demo is capable of parsing highly complex expressions which includes being able to handle multiple variables, deeply nested function calls, and it can detect any and all syntax errors and report them. All of these abilities can be used as a very solid foundation for building a scripting language. All it would take to implement a scripting language is adding parslets to the core Pratt parser to parse assignment statements, If...Then statements, loops like For..Next or Do..While loops, Goto statement and whatever else there is. Conditional jump instructions would also need to be added to the virtual machine to support most of these. I intend to release a more sophisticated version with all of this but this would be done in VB.Net.

    So what kind of input can this parser evaluate?

    I will give a sample list of some of the more complicated expressions it can parse but before I do that I want to give a clear list of the functions implemented in this project. They are as follows:-
    • Sin
    • Cos
    • Tan
    • Sqr
    • Round
    • Abs
    • Max
    • Min
    • Avg
    • Random


    Here are examples of use of the functions:-
    Sin(0.7) : Sine of 0.7
    Cos(0.7) : Cosine of 0.7
    Tan(0.7) : Tangent of 0.7
    Sqr(16) : Square root of 16
    Round(3.4) : Will round to 3.0
    Abs(-90) : Absolute value of -90 which is 90
    Max(7,10,23,100) : Returns the largest value which is 100. This function can take 0 to unlimited arguments
    Min(7,10,23,100) : Returns the smallest value which is 7. This function can take 0 to unlimited arguments.
    Avg(7,10,23,100) : Returns the average of all the numbers. This function can take 0 to unlimited arguments.
    Random(10,20) : Returns a random number between 10 and 20 inclusive of both 10 and 20.

    Here are some examples of expressions that can be parsed:-
    • sqr(Random(100,500))
    • sqr(sqr(sqr(sqr(sqr(10000)))))
    • min(10,89,max(9,2,3,12,sqr(round(22.5))),45)
    • round(abs(-200+-544)/19)


    It also supports implicit multiplication with variables or bracketed sub expressions. For example 2a+3a+3b or 3(90+2*3).

    There are many more samples in the project itself. Also note that some of you might find a mismatch between what your calculator or the VB6 debug window produces and what my program produces for the result of certain expressions. The standard operators addition, subtraction, division and multiplication all the have precedence relationships you'd expect and using only those operators, you should get the same results regardless of where you calculated the result. However, there might be some differences when it comes to other operators. The MOD operator for instance, I gave it the same precedence as multiplication and division. The exponent operator(^) has the highest precedence, even higher than the unary minus operator (-). Also, it is right associative. These things mean that -2^3 is evaluated as -(2^3) and 2^3^4 would be evaluated as 2^(3^4). These may differ from how other evaluators like Excel or VB6 would evaluate them. But this is extremely easy to change. It would only take the changing of one or two values to give them whatever precedence and associativity you want. The Pratt parser was invented for this very reason of controlling operator precedence easily in hand written parsers.

    Final thoughts

    I'm very new to this kind of thing so do not take me for an expert on this subject. Rather, what I want you to take away from this is that writing a programming or scripting language is very approachable and if I can get this far towards it then anyone can do it. I'm no one special.

    I will do a full blown scripting language implementation in the future in pure VB.Net code and if the gods are willing to grant me the insight, that one would even be able to compile into x86 native code and not just into a instructions for virtual machine. I look forward to doing that one.

    Lastly, here is a little video of me demonstrating this app by plugging a few expressions into it:-



    Anyways, good day to you all.

    UPDATE: Oct, 11, 2021

    Changed the way the Dictionary class checks for the existence of a key. The project can now work with the Break in Class Module option set.
    Additional credit to wqweto for this method which can be found in this post.

    UPDATE: Oct, 13, 2021

    Made a simple UI change to make the application close when the main form is closed regardless of how many other forms are open. See this post.

    A version number has also been added to the caption of the main form. Credit to georgekar for this suggestion.
    Attached Files Attached Files
    Last edited by Niya; Oct 13th, 2021 at 09:09 AM.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  2. #2
    Frenzied Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    1,025

    Re: Pratt parsers and the building blocks of a scripting language.

    I found a problem when I run the code. If we set Break in Class Modules, then to avoid the error we have to insert an On Error Resume Next statement in property Item, in class Dictionary.

    Code:
    Public Property Get Item(ByVal key As String) As Variant
    On Error Resume Next
        Assign g_items.Item(key), Item
    End Property

  3. #3

    Thread Starter
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    6,661

    Re: Pratt parsers and the building blocks of a scripting language.

    Quote Originally Posted by georgekar View Post
    I found a problem when I run the code. If we set Break in Class Modules, then to avoid the error we have to insert an On Error Resume Next statement in property Item, in class Dictionary.

    Code:
    Public Property Get Item(ByVal key As String) As Variant
    On Error Resume Next
        Assign g_items.Item(key), Item
    End Property
    I found the problem you're talking about though I'm not sure I'd classify it as a bug. I deliberately didn't put an error trap in Item because in the Dictionary class, I put two ways to query the existence of a key before trying to obtain the item associated with it. It is supposed to break if you try to obtain an item with an invalid key.

    However, when I set Break in Class Modules, I've found that the break originates from in here:-
    Code:
    Public Function TryGetItem(ByVal key As String, ByRef value As Variant) As Boolean
        On Error Resume Next
            Assign Me.Item(key), value
            
            If Err.Number Then TryGetItem = False Else TryGetItem = True
        On Error GoTo 0
    End Function
    The line in red causes it but as you can see, it does so within an active error trap which is meant to be triggered when a key doesn't exist. In this case, Err.Number would be non-zero but the program would not break. This was by design. I'd suggest not using Break in Class Modules to induce the correct behavior. However, I am open hearing about alternate ways to handle this situation.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  4. #4

  5. #5

    Thread Starter
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    6,661

    Re: Pratt parsers and the building blocks of a scripting language.

    Quote Originally Posted by wqweto View Post
    Take a look at TryGetValue here.
    Thank you. It works well.

    I should also point out that that it took a little work around to make it work for a ByRef parameter:-
    Code:
    Public Function TryGetItem(ByVal key As String, ByRef value As Variant) As Boolean
    
        'We cant call TryGetValue on the ByRef value parameter directly. For some
        'strange reason beyond my understanding, this doesn't work. The value
        'doesn't propagate to the calling function. We must use a local variable
        'for the TryGetValue call and then assign it to the ByRef value
        'parameter.
        Dim v As Variant
        
        If modCollectionPatch.TryGetValue(g_items, key, v) Then
            TryGetItem = False
        Else
            Assign v, value
            
            TryGetItem = True
        End If
    End Function
    See the explanation in the comments. Aside from that it works well.

    Also, this is the implementation of Assign:-
    Code:
    Private Sub Assign(ByVal value As Variant, ByRef var As Variant)
        'This sub is needed for this Dictionary class to work with both
        'object types and non object types.
        '*****************************************************************
        If IsObject(value) Then
            Set var = value
        Else
            var = value
        End If
        
    End Sub
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  6. #6
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Posts
    3,422

    Re: Pratt parsers and the building blocks of a scripting language.

    You probably had troubles when passing a VT_BYREF indirected Variant for the last [out, retval] parameter of Item method of the VBA.Collection which is unfortunate but expected.

    Btw, this would never happen if TryGetValue signature was Function TryGetValue(ByVal oCol As Collection, Index As Variant, Optional hResult As Long) As Variant i.e. the hResult was an out parameter while the result was returned as a retval like the original Collection.Item method.

    Unfortunately this deviates from .Net design pattern on TryXxx methods and is pretty ugly IMO.

    cheers,
    </wqw>

  7. #7

    Thread Starter
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    6,661

    Re: Pratt parsers and the building blocks of a scripting language.

    Quote Originally Posted by wqweto View Post
    You probably had troubles when passing a VT_BYREF indirected Variant for the last [out, retval] parameter of Item method of the VBA.Collection which is unfortunate but expected.
    I suspected something like this.

    Quote Originally Posted by wqweto View Post
    Btw, this would never happen if TryGetValue signature was Function TryGetValue(ByVal oCol As Collection, Index As Variant, Optional hResult As Long) As Variant i.e. the hResult was an out parameter while the result was returned as a retval like the original Collection.Item method.
    Ok, I'm confused. The last parameter of TryGetValue is where the item is returned according to what I observed when I tested this. It didn't return an HRESULT.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  8. #8
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Posts
    3,422

    Re: Pratt parsers and the building blocks of a scripting language.

    Quote Originally Posted by Niya View Post
    It didn't return an HRESULT.
    The hResult was returned as retval so testing for negative result here If TryGetValue(...) < 0 Then actually tests HRESULT's failed bit.

    cheers,
    </wqw>

  9. #9
    Frenzied Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    1,025

    Re: Pratt parsers and the building blocks of a scripting language.

    @Niya
    I check the 11 oct version (a use of a revision number will be better).
    Handling of two forms was bad, so you can get an idea to fix it:

    You should update with these:

    In Private Sub cmdShowStackEval_Click() in frmMain
    put f.Show , me and not just f.Show. This keep the basic form at a lower zorder always.

    Also, when we have to close the main form, we need to close all input forms except the main form, before main form close:

    Private Sub Form_QueryUnload(Cancel As Integer, UnloadMode As Integer)
    Dim i As Long
    For i = Forms.Count - 1 To 0 Step -1
    If Not Forms(i) Is Me Then Unload Forms(i)
    Next
    End Sub
    GK

  10. #10

    Thread Starter
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    6,661

    Re: Pratt parsers and the building blocks of a scripting language.

    Quote Originally Posted by wqweto View Post
    The hResult was returned as retval so testing for negative result here If TryGetValue(...) < 0 Then actually tests HRESULT's failed bit.

    cheers,
    </wqw>
    Ah, I think I get what you're saying now.

    Quote Originally Posted by georgekar View Post
    @Niya
    I check the 11 oct version (a use of a revision number will be better).
    Handling of two forms was bad, so you can get an idea to fix it:

    You should update with these:

    In Private Sub cmdShowStackEval_Click() in frmMain
    put f.Show , me and not just f.Show. This keep the basic form at a lower zorder always.

    Also, when we have to close the main form, we need to close all input forms except the main form, before main form close:

    GK
    Yea, I didn't spend anytime thinking about the UI. All my thoughts were focused on the parser, the VM and the collection classes. However, I'm not opposed to adopting your suggestions. I will update it again.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  11. #11

    Thread Starter
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    6,661

    Re: Pratt parsers and the building blocks of a scripting language.

    Quote Originally Posted by georgekar View Post
    @Niya

    Private Sub Form_QueryUnload(Cancel As Integer, UnloadMode As Integer)
    Dim i As Long
    For i = Forms.Count - 1 To 0 Step -1
    If Not Forms(i) Is Me Then Unload Forms(i)
    Next
    End Sub
    Also, this is unnecessary in this case. Just setting the owner is enough since there are only two forms in this application.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  12. #12

    Thread Starter
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    6,661

    Re: Pratt parsers and the building blocks of a scripting language.

    Ok it's been updated. The application will now close when the main form closes.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width