[RegEx][Preg] Find text between two tags
I need a regular expression to find the text between two XML tags.
I have tried the following expressions:
Code:
/<w:p[^Pr][^>]*>(([^(<w:p>)])|([^(<w:p>)]))*<\/w:p>/
/<w:p[^Pr][^>]*>[^<\/w:p>]*<\/w:p>/
The problem is that the entire XML file is being matched, so I need something between the tags that stops the regex from going past the first </w:p> it encounters.
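One common way to stop at the first closing tag is a lazy quantifier. The snippet below is a minimal sketch in Python, used only for illustration; the `.*?` syntax is also part of PCRE, which PHP's preg functions use. The input string is a made-up two-paragraph example, not the real file.

```python
import re

# Hypothetical input with two paragraphs (not taken from the real XML file).
xml = "<w:p>one</w:p> filler <w:p>two</w:p>"

# Greedy: .* runs on to the LAST </w:p>, so both paragraphs land in one match.
greedy = re.findall(r"<w:p>.*</w:p>", xml, flags=re.DOTALL)

# Lazy: .*? stops at the FIRST </w:p> that follows each opening tag.
lazy = re.findall(r"<w:p>.*?</w:p>", xml, flags=re.DOTALL)

print(greedy)
print(lazy)
```

The greedy version returns a single match spanning the whole string, while the lazy version returns each paragraph separately.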
An example of the XML file is as follows:
Quote:
Life is Good.
<w:p>
<w:pPr>
<w:b />
Random text here.
</w:pPr>
Okay...
</w:p>
More text.
Yo..
<w:p src=".." href=".." style="..">Yoyo</w:p>
<w:p src=".." href=".." style=".."/>Yoyo2</w:p>
What I want to match is each pair of w:p tags, together with everything between them. In the above example there should be 3 matches. The output I want is the following:
Quote:
Match 1:
<w:p>
<w:pPr>
<w:b />
Random text here.
</w:pPr>
Okay...
</w:p>
Match 2:
<w:p src=".." href=".." style="..">Yoyo</w:p>
Match 3:
<w:p src=".." href=".." style=".."/>Yoyo2</w:p>
Can anyone offer some guidance/advice?
Re: [RegEx][Preg] Find text between two tags
I've improved the expression to this:
Code:
<\s*(w:p)(\s+[^>]*>)|(\s*>)([^[<\s*\/\s*(w:p)\s*>]]*)<\s*\/\s*(w:p)\s*>
This expression gives me the full opening w:p tag (e.g. <w:p id=23 class="..." etc="ugotthepoint">) in one match and the closing </w:p> in another. However, I am not getting the content between the two tags; it shows up in my match as an empty string (""). This is with the preg functions in PHP.
EDIT: Just for clarification, the (\s+[^>]*>)|(\s*>) part is meant to prevent tags similar to w:p, such as w:pPr, from being matched.
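A possible explanation for the split matches: an ungrouped `|` divides the whole expression into two alternatives, not just the two bracketed pieces next to it. The sketch below uses Python for illustration, with simplified patterns and a made-up one-line input. It compares the ungrouped form with one where the alternation is wrapped in a non-capturing group `(?:...)`.

```python
import re

# Hypothetical one-paragraph input.
text = '<w:p id="1">body</w:p>'

# Ungrouped: reads as "<w:p plus attributes" OR "> body </w:p>".
ungrouped = r"<w:p(\s+[^>]*>)|(\s*>)([^<]*)</w:p>"

# Grouped: the alternation only decides how the opening tag ends.
grouped = r"<w:p(?:\s+[^>]*>|\s*>)([^<]*)</w:p>"

open_only = [m.group(0) for m in re.finditer(ungrouped, text)]
whole = [m.group(0) for m in re.finditer(grouped, text)]

# The grouped form still rejects look-alike tags such as w:pPr.
lookalike = re.search(grouped, "<w:pPr>x</w:p>")

print(open_only)
print(whole)
print(lookalike)
```

With the ungrouped pattern only the opening tag comes back, whereas the grouped pattern returns the tag pair with its content in capture group 1.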
EDIT: I don't think anyone is actually reading this but I've modified my expression to this:
Code:
#<\s*(w:p)\s*[^(\/>)]*>([^(<\/\1>)]*)<\s*\/\1\s*>#is
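Note that a bracket expression such as `[^(<\/\1>)]` matches one character from a set, so it excludes individual characters and not the string `</w:p>`. A lookahead plus a lazy quantifier is one alternative. The following is a minimal sketch in Python (for illustration only; the same constructs exist in PCRE), run against the sample from the first post. The name `W_P` is just a label chosen here, and the approach assumes w:p elements are never nested inside one another.

```python
import re

# The sample XML from the first post.
sample = """Life is Good.
<w:p>
<w:pPr>
<w:b />
Random text here.
</w:pPr>
Okay...
</w:p>
More text.
Yo..
<w:p src=".." href=".." style="..">Yoyo</w:p>
<w:p src=".." href=".." style=".."/>Yoyo2</w:p>
"""

# (?=[\s/>]) demands whitespace, "/" or ">" right after "w:p",
# so <w:pPr> is not treated as an opening tag.
# (.*?) is lazy and, with DOTALL, spans newlines up to the first </w:p>.
W_P = re.compile(r"<\s*w:p(?=[\s/>])[^>]*>(.*?)<\s*/\s*w:p\s*>", re.DOTALL)

matches = [m.group(0) for m in W_P.finditer(sample)]
for number, match in enumerate(matches, start=1):
    print(f"Match {number}:\n{match}")
```

On this sample the sketch yields the three matches listed in the first post. If w:p elements can nest, or the markup gets more complicated, an XML parser is likely to be more robust than a regex.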