
Thread: [RESOLVED] Regular expression to find Tokens in HTML page

  1. #1

    Thread Starter
    Fanatic Member Strider's Avatar
    Join Date
    Sep 2004
    Location
    Dublin, Ireland
    Posts
    612

    [RESOLVED] Regular expression to find Tokens in HTML page

    Hey there,

    I'm trying to create a regular expression to replace tokens on an HTML page.

    Each token begins with |~ and ends with ~|
    eg1: |~TOKEN1~|
    eg2: |~MY TOKEN~|
    eg3: |~Another_Token~|

    These can lie anywhere in the HTML page, so all I care about is what's between the start and end delimiters.

    I'm struggling with creating a pattern.
    What I have tried is "|~[A-Za-z0-9_]~|" but it doesn't work for me.

    Any help is appreciated.
    Barry


    Visual Studio .NET 2008/Visual Studio .NET 2005/Visual Studio .NET 2003
    .NET Framework 3.0 2.0 1.1/ASP.Net 3.0 2.0 1.1/Compact Framework 1.0

    SQL Server 2005/2000/SQL Server CE 2.0


    If you like, rate this post

    Compact Framework for Beginners

  2. #2
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    26,424

    Re: Regular expression to find Tokens in HTML page

    try this:

    vb Code:
    Dim rx As New Regex("(?<=\|~)(\d|\w| )*(?=~\|)")
    Last edited by .paul.; Oct 9th, 2008 at 10:18 PM.

  3. #3
    I'm about to be a PowerPoster! mendhak's Avatar
    Join Date
    Feb 2002
    Location
    Ulaan Baator GooGoo: Frog
    Posts
    38,170

    Re: Regular expression to find Tokens in HTML page

    Slight modification to paul's. Specifying the whitespace with \s lets tabs and line breaks match as well as plain spaces.

    I've also added a named capture.

    (?<=\|~)(?<tokengesture>(\d|\w|\s)*)(?=~\|)
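
    A minimal sketch of reading that named capture from VB.NET, assuming a console app. The module name, function name and sample HTML are made up for illustration; only the pattern comes from the post above.

    ```vb
    Imports System
    Imports System.Collections.Generic
    Imports System.Text.RegularExpressions

    Module TokenNames
        ' Returns the text between each |~ and ~| pair, without the delimiters.
        Function FindTokenNames(ByVal html As String) As List(Of String)
            Dim rx As New Regex("(?<=\|~)(?<tokengesture>(\d|\w|\s)*)(?=~\|)")
            Dim names As New List(Of String)
            For Each m As Match In rx.Matches(html)
                ' Read the capture by its group name rather than by index
                names.Add(m.Groups("tokengesture").Value)
            Next
            Return names
        End Function

        Sub Main()
            ' Hypothetical sample input
            For Each tokenName As String In FindTokenNames("<p>|~TOKEN1~| and |~MY TOKEN~|</p>")
                Console.WriteLine(tokenName)
            Next
        End Sub
    End Module
    ```

    With the sample string this should list TOKEN1 and MY TOKEN, one per line.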

  4. #4

    Thread Starter
    Fanatic Member Strider's Avatar
    Join Date
    Sep 2004
    Location
    Dublin, Ireland
    Posts
    612

    Re: Regular expression to find Tokens in HTML page

    Can you please explain the (?<=\|~) part of the pattern?

    I need to replace the entire token, delimiters included, so |~TOKEN~| -> token replacement
    Barry



  5. #5
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    26,424

    Re: Regular expression to find Tokens in HTML page

    try this:

    vb Code:
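    ' Requires: Imports System.Text.RegularExpressions
    Dim testStr As String = "eg2: |~Another_Token~|"
    ' No lookarounds here, so the |~ and ~| delimiters are part of the match and are replaced too
    MsgBox(Regex.Replace(testStr, "\|~(\d|\w|\s)*~\|", "new String value"))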
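
    On the earlier question about (?<=\|~): it is a lookbehind, and (?=~\|) is the matching lookahead. Both only check that the delimiter is present and do not make it part of the match, so a Replace with that pattern leaves |~ and ~| in the text. A small sketch of the difference, assuming a console app; the module and function names are made up for illustration.

    ```vb
    Imports System
    Imports System.Text.RegularExpressions

    Module LookaroundDemo
        ' Lookbehind/lookahead only check for the delimiters; they are not part of the match.
        Function ReplaceInside(ByVal source As String, ByVal value As String) As String
            Return Regex.Replace(source, "(?<=\|~)(\d|\w|\s)*(?=~\|)", value)
        End Function

        ' Delimiters are matched directly, so they are replaced along with the token name.
        Function ReplaceWhole(ByVal source As String, ByVal value As String) As String
            Return Regex.Replace(source, "\|~(\d|\w|\s)*~\|", value)
        End Function

        Sub Main()
            Dim testStr As String = "eg2: |~Another_Token~|"
            Console.WriteLine(ReplaceInside(testStr, "X"))  ' eg2: |~X~|
            Console.WriteLine(ReplaceWhole(testStr, "X"))   ' eg2: X
        End Sub
    End Module
    ```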

  6. #6

    Thread Starter
    Fanatic Member Strider's Avatar
    Join Date
    Sep 2004
    Location
    Dublin, Ireland
    Posts
    612

    Re: Regular expression to find Tokens in HTML page

    Yeah paul, I had changed it to that and it worked, sweet, cheers mate...

    But one last question: why do you put the \ between the ~ and the |? When I first changed it I wrote \~| instead. What does this backslash mean?
    Barry



  7. #7
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    26,424

    Re: Regular expression to find Tokens in HTML page

    It's an escape character. It means to treat the | as a literal character. The | character has special meaning in regex (it means "or"), so \~| escapes the ~, which doesn't need it, and leaves the | unescaped.
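
    To see what the backslash changes, here is a small sketch, assuming a console app. The module name, function names and sample text are made up for illustration. Regex.Escape is the framework's helper for building the escaped form of a literal string.

    ```vb
    Imports System
    Imports System.Text.RegularExpressions

    Module EscapeDemo
        ' Escaped: \| is a literal pipe, so this only matches where "|~" really occurs.
        Function HasOpeningDelimiter(ByVal source As String) As Boolean
            Return Regex.IsMatch(source, "\|~")
        End Function

        ' Unescaped: | means "or", so the pattern reads "empty string OR ~"
        ' and the empty alternative matches any input.
        Function UnescapedMatches(ByVal source As String) As Boolean
            Return Regex.IsMatch(source, "|~")
        End Function

        Sub Main()
            Dim plain As String = "no delimiters here"
            Console.WriteLine(HasOpeningDelimiter(plain))  ' False
            Console.WriteLine(UnescapedMatches(plain))     ' True
            ' Let the framework add the backslashes for you
            Console.WriteLine(Regex.Escape("|~"))          ' \|~
        End Sub
    End Module
    ```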
