Thread: regex multiline argh

  1. #1

    Thread Starter
    Hyperactive Member
    Join Date
    Aug 2002
    Location
    Norwich, UK
    Posts
    405

    regex multiline argh

    I'm trying to parse Google results to extract each individual result.

    The regex I'm using should get every string that begins with
    class=l and ends with nobr.

    But it only gets 2 results from the HTML below. The reason is that all the other
    results contain \n or \r.

    When creating the Regex object I set Multiline, which I thought would sort the problem, but obviously not.

    Code:
    Regex reg = new Regex("class=l.*?nobr", RegexOptions.Multiline);
    I've done a bit of googling and found mentions that . matches anything except a newline, along with suggestions to use \s, but when I try to use it within my expression it errors and says "unrecognised escape sequence".

    So what I'd like to know is how to amend my regular expression so that I get back all strings that start with class=l and end with nobr, regardless of whether there are newlines or not.

    Code:
    <html><head><meta HTTP-EQUIV=\"content-type\" CONTENT=\"text/html; charset=ISO-8859-1\"><title>thom yorke - Google Search</title><style><!--\nbody,td,div,.p,a{font-family:arial,sans-serif }\ndiv,td{color:#000}\n.f{color:#6f6f6f}\n.flc,.fl:link{color:#77c}\na:link,.w,a.w:link,.w a:link{color:#00c}\na:visited,.fl:visited{color:#551a8b}\na:active,.fl:active{color:#f00}\n.t a:link,.t a:active,.t a:visited,.t{color:#000}\n.t{background-color:#e5ecf9}\n.k{background-color:#36c}\n.j{width:34em}\n.h{color:#36c}\n.i,.i:link{color:#a90a08}\n.a,.a:link{color:#008000}\n.z{display:none}\ndiv.n{margin-top:1ex}\n.n a{font-size:10pt;color:#000}\n.n .i{font-size:10pt;font-weight:bold}\n.q:visited,.q:link,.q:active,.q{color:#00c;}\n.b a{font-size:12pt;color:#00c;font-weight:bold}\n.ch{cursor:pointer;cursor:hand}\n.e{margin-top:.75em;margin-bottom:.75em}\n.g{margin-top:1em;margin-bottom:1em}\n.sm{display:block;margin-top:0px;margin-bottom:0px;margin-left:40px}\n-->\n</style>\n<script>\n<!--\nfunction ss(w,id){window.status=w;return true;}\nfunction cs(){window.status='';}\nfunction ga(o,e) {return true;}\n//-->\n</script>\n</head><body bgcolor=#ffffff topmargin=3 marginheight=3><table border=0 cellspacing=0 cellpadding=0 width=100%><tr><td align=right nowrap><font size=-1><a href=\"https://www.google.com/accounts/Login?continue=http://www.google.co.uk/search%3Fhl%3Den%26q%3Dthom%2520yorke&hl=en\">Sign in</a></font></td></tr><tr height=4><td><img alt=\"\" width=1 height=1></td></tr></table><table border=0 cellpadding=0 cellspacing=0 width=100%><tr><form name=gs method=GET action=/search><td valign=top><a href=\"http://www.google.co.uk/webhp?hl=en\"><img src=\"/images/logo_sm.gif\" width=150 height=55 alt=\"Go to Google Home\" border=0 vspace=12></a></td><td>&nbsp;&nbsp;</td><td valign=top width=100% style=\"padding-top:0px\"><table cellpadding=0 cellspacing=0 border=0><tr><td height=14 valign=bottom><table border=0 cellpadding=4 cellspacing=0><tr><td nowrap><font 
size=-1><b>Web</b>&nbsp;&nbsp;&nbsp;&nbsp;<a id=t1a class=q href=\"http://images.google.co.uk/images?hl=en&q=thom+yorke&sa=N&tab=wi\">Images</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=t2a class=q href=\"http://groups.google.co.uk/groups?hl=en&q=thom+yorke&sa=N&tab=wg\">Groups</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=t4a class=q href=\"http://news.google.co.uk/news?hl=en&q=thom+yorke&sa=N&tab=wn\">News</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=t5a class=q href=\"http://froogle.google.co.uk/froogle?hl=en&q=thom+yorke&sa=N&tab=wf\">Froogle</a>&nbsp;&nbsp;&nbsp;&nbsp;<b><a href=\"/intl/en/options/\" class=q>more&nbsp;&raquo;</a></b></font></td></tr></table></td></tr><tr><td><table border=0 cellpadding=0 cellspacing=0><tr><td nowrap><input type=hidden name=hl value=\"en\"><input type=hidden name=ie value=\"ISO-8859-1\"><input type=text name=q size=41 maxlength=2048 value=\"thom yorke\" title=\"Search\"><font size=-1> <input type=submit name=\"btnG\" value=\"Search\"><span id=hf></span></font></td><td nowrap><font size=-2>&nbsp;&nbsp;<a href=/advanced_search?q=thom+yorke&hl=en&lr=&ie=UTF-8>Advanced Search</a><br>&nbsp;&nbsp;<a href=/preferences?q=thom+yorke&hl=en&lr=&ie=UTF-8>Preferences</a>&nbsp;&nbsp;&nbsp;&nbsp;</font></td></tr></table></td></tr></table><table cellpadding=0 cellspacing=0 border=0><tr><td><font size=-1>Search: <input id=all type=radio name=meta value=\"\" checked><label for=all> the web </label><input id=cty type=radio name=meta value=\"cr=countryUK|countryGB\" ><label for=cty> pages from the UK </label></font></td></tr><tr><td height=7><img width=1 height=1 alt=\"\"></td></tr></table></td></form></tr></table><table width=100% border=0 cellpadding=0 cellspacing=0><tr><td bgcolor=#3366cc><img  Google</font></center></body></html>\r\n

  2. #2
    C# Aficionado Lord_Rat's Avatar
    Join Date
    Sep 2001
    Location
    Cave
    Posts
    2,497

    Re: regex multiline argh

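    . matches anything but the newline character (\n), unless RegexOptions.Singleline is set.

    You are looking for

    class=l[\s\S]*?nobr

    Note that a dot inside a character class, as in [.\r\n], is a literal dot, so it will not work as a wildcard there. RegexOptions.Multiline does not help either: it only changes how ^ and $ match. Alternatively, keep your original pattern and pass RegexOptions.Singleline instead.

    The "unrecognised escape sequence" error comes from the C# compiler, not from the regex engine: \s is not a valid escape in a regular string literal. Write the pattern as a verbatim string (@"...") or double the backslash ("\\s").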
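
    A minimal sketch of the difference between the options, for anyone who wants to check it. The class name, the CountMatches helper and the two-result Sample string are made up for illustration (the real Google page is much longer); only Regex, RegexOptions and Matches come from System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineRegexDemo
{
    // Hypothetical input: the first result sits on one line,
    // the second has a \r\n between "class=l" and "nobr".
    public const string Sample =
        "<a class=l href=one>one</a><nobr>\r\n<a class=l\r\nhref=two>two</a><nobr>";

    // Counts how many times the pattern matches in Sample.
    public static int CountMatches(string pattern, RegexOptions options)
    {
        return new Regex(pattern, options).Matches(Sample).Count;
    }

    public static void Main()
    {
        // Multiline only affects ^ and $, so '.' still stops at \n.
        Console.WriteLine(CountMatches(@"class=l.*?nobr", RegexOptions.Multiline));  // 1

        // Singleline makes '.' match \n as well.
        Console.WriteLine(CountMatches(@"class=l.*?nobr", RegexOptions.Singleline)); // 2

        // [\s\S] matches any character, whatever options are set.
        Console.WriteLine(CountMatches(@"class=l[\s\S]*?nobr", RegexOptions.None));  // 2
    }
}
```

    The @ prefix makes each pattern a verbatim string, so \s reaches the regex engine untouched. The lazy *? matters too: a greedy * would run from the first class=l to the last nobr and return one big match.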
