
Thread: Regex: Find ampersands but ignore HTML entities

  1. #1

    Thread Starter
    Fanatic Member aconybeare's Avatar
    Join Date
    Oct 2001
    Location
    UK
    Posts
    772

    Regex: Find ampersands but ignore HTML entities

    Hi,

    I'm trying to write/find a regular expression that finds ampersands but not HTML entities.

    I have this, which finds entities, but I can't quite figure out how to skip the entities and match only the bare "&":

    Code:
    &[^\s]*;
    Test string:
    Code:
    This is sample test containing a bunch of & and entities. Do you shop at: M&S? &x#1234;
    I want to HTML-encode the non-entity ampersands, e.g.
    "bunch of & and" --> "bunch of &amp; and"

    Any help will be greatly appreciated

    Cheers Al

  2. #2
    Frenzied Member
    Join Date
    Apr 2009
    Location
    CA, USA
    Posts
    1,516

    Re: Regex: Find ampersands but ignore HTML entities

    Could you just look for ampersands with whitespace after them? E.g. pattern = "& " and replace with "&amp; ".
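
    A rough sketch of that idea in Python (the language is an assumption, since the thread doesn't name one, and the function name is made up for illustration). It replaces only an ampersand that is directly followed by a space, so it is simple, but it leaves something like "M&S" untouched:

```python
def encode_amp_before_space(text: str) -> str:
    # Hypothetical helper: encode only "&" that is directly followed by a space.
    return text.replace("& ", "&amp; ")

sample = "This is a bunch of & and entities. Do you shop at: M&S? &#1234;"
result = encode_amp_before_space(sample)
# The standalone "&" is encoded; "M&S" and "&#1234;" are left exactly as they were.
```

    Whether missing "M&S" matters depends on the data being encoded.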

  3. #3

    Thread Starter
    Fanatic Member aconybeare's Avatar
    Join Date
    Oct 2001
    Location
    UK
    Posts
    772

    Re: Regex: Find ampersands but ignore HTML entities

    Samba,

    Thanks for your reply. Yes, I guess that's an option, but I was looking for something a little more bulletproof.
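
    One way this could be made more robust is a negative lookahead: match "&" only when it is NOT followed by something shaped like an entity. This is a minimal sketch in Python (an assumption, as are the names BARE_AMP and encode_bare_ampersands). It only checks the shape of what follows, not whether a named entity really exists:

```python
import re

# Match "&" unless it starts something shaped like an entity:
#   &name;    e.g. &amp; or &nbsp;
#   &#123;    decimal character reference
#   &#x1F;    hexadecimal character reference
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")

def encode_bare_ampersands(text: str) -> str:
    # Replace only the ampersands that do not begin an entity-shaped token.
    return BARE_AMP.sub("&amp;", text)

sample = "bunch of & and &nbsp; entities. Do you shop at: M&S? &#1234; &x#1234;"
result = encode_bare_ampersands(sample)
```

    With this pattern "M&S" is encoded as well, and running the function a second time changes nothing because "&amp;" is itself entity-shaped. Note that "&x#1234;" from the test string is not entity-shaped (the "#" comes after the "x"), so its ampersand gets encoded too, and text such as "M&S;" would be mistaken for an entity and left alone.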

    An option might be to wrap any fields that might contain non-standard characters or & in a
    Code:
    <![CDATA[ ... ]]>
    section?

    Al
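
    For what it's worth, a small sketch of the CDATA idea, assuming the output is XML and using Python's standard xml.etree.ElementTree only to check the round trip (wrap_cdata and the <field> element are made-up names). The one catch is that the text itself must not contain "]]>", so the helper splits that sequence across two sections:

```python
import xml.etree.ElementTree as ET

def wrap_cdata(text: str) -> str:
    # "]]>" would close the section early, so split it across two CDATA sections.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

raw = "Do you shop at M&S & co? ]]> done"
xml_doc = "<field>" + wrap_cdata(raw) + "</field>"

# The parser hands back the original text, with the ampersands untouched.
value = ET.fromstring(xml_doc).text
```

    This avoids touching the ampersands at all, but it only helps where the consumer is an XML parser; HTML parsers do not treat CDATA sections the same way in ordinary HTML content.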
