Dec 8th, 2005, 12:12 AM
Regular expression, find text between two strings, excluding the two strings
I've been trying to create a regexp to find the text between, but not including, two strings. Finding the text PLUS the two strings was easy, but finding a way to leave the two strings out of the match has been harder.
What I'm doing now is just getting the entire match and doing a replace afterwards, but this seems a bit silly really.
What I want is to find whatever text (letters, numbers, everything really) is between 'display>' and '</a>' and somehow exclude 'display>' and '</a>' from the result.
An example:
<table class=..... display>the text i want</a> ......</table>
What I'm doing now:
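Java Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String content = "<table class=....display>the text i want</a>....</table>";
StringBuffer text = new StringBuffer();
// Lazily match everything from 'display>' up to the next '</a>'
Pattern pattern = Pattern.compile("display>.*?</a>");
Matcher m = pattern.matcher(content);
while (m.find()) {
    // m.group() still contains both delimiters, so strip them by hand
    text.append(m.group().replace("display>", "").replace("</a>", "").trim() + "\r\n");
}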
Without the replace calls, m.group() returns the text AND the two strings, like this:
display>the text i want</a>
and not the way I actually want it, like this:
the text i want
And then I have to, as I said, do a replace on the string afterwards to remove the 'display>' and '</a>' parts.
SO! The question is, how can I exclude them from the result using regexp?
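For reference, here is a minimal sketch of two common ways to do this with java.util.regex, assuming the delimiters are exactly 'display>' and '</a>' as in the example above. The class name BetweenDelimiters and the method names withCapturingGroup and withLookaround are made up for illustration. The first wraps the wanted text in a capturing group and reads group(1); the second uses lookbehind and lookahead so the delimiters are never part of the match at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
public class BetweenDelimiters {

    // Capturing group: the delimiters are matched, but only group(1) is kept.
    static List<String> withCapturingGroup(String content) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile("display>(.*?)</a>").matcher(content);
        while (m.find()) {
            results.add(m.group(1).trim());
        }
        return results;
    }

    // Lookaround: the delimiters are asserted but never consumed by the match.
    static List<String> withLookaround(String content) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile("(?<=display>).*?(?=</a>)").matcher(content);
        while (m.find()) {
            results.add(m.group().trim());
        }
        return results;
    }

    public static void main(String[] args) {
        String content = "<table class=....display>the text i want</a>....</table>";
        System.out.println(withCapturingGroup(content));
        System.out.println(withLookaround(content));
    }
}
```

Either variant removes the need for the replace step. Note that '.' does not match line breaks by default, so if the wanted text can span several lines, compiling the pattern with Pattern.DOTALL is probably needed.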