need help with regular expressions

I'm trying to make the webcrawler that indexes my website a little better.
It allows me to do meta and title rewrites in the index using regular expressions.

My experience with them is minimal.

Basically, all I need to know is how to extract the text between two tags.

Example:
<h1 class="headline">herty smerty flip flip</h1>

The syntax to do the rewrite is easy, but I don't know how to extract 'herty smerty flip flip' from the code.
 

Tracer

Member
Oct 9, 1999
This should work.

/<[^>]+>([^<]+)<[^>]+>/

This only works if everything is on the same line, and if the text you want to extract doesn't contain any < or > characters.
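
To see what that pattern does and doesn't catch, here is a rough sketch in Python. Python is only an assumption for illustration; the crawler's own regex flavor may behave a little differently, and the sample strings and variable names are made up.

```python
import re

# The generic pattern from above: a tag, then text with no '<', then another tag.
generic = re.compile(r"<[^>]+>([^<]+)<[^>]+>")

# Hypothetical sample inputs, made up for illustration.
simple = '<h1 class="headline">herty smerty flip flip</h1>'
nested = '<h1 class="headline">herty <b>smerty</b> flip flip</h1>'
page = '<title>My site</title><h1 class="headline">herty smerty flip flip</h1>'

simple_text = generic.search(simple).group(1)
nested_text = generic.search(nested).group(1)
page_text = generic.search(page).group(1)

print(repr(simple_text))  # the whole headline
print(repr(nested_text))  # cut short at the inner <b> tag
print(repr(page_text))    # text of the first tag pair on the page, not the h1
```

The last case is the one to watch: the pattern doesn't name any particular tag, so on a full page it grabs whatever tag pair comes first.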
 

Let me clarify.

The source will be full of lots of other stuff, but there will only be one <h1 class="headline"> tag (and its closing tag).
I want to get the phrase that's between that h1 tag and its closing /h1 tag.

Can I modify what you wrote to do that?
 

Entity

Lifer
Oct 11, 1999
/<h1 class\=\"headline\">(.*?)<\/h1>/

That might be closer to what you're looking for. Do you then want to save the part you extracted (the (.*?) group) to a variable, or do something else with it?
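
If saving it to a variable is the goal, something along these lines might do it. This is only a sketch in Python, not the crawler's own rewrite syntax. The function name `extract_headline` and the sample page are made up for illustration. The backslashes before = and " aren't needed in Python, so they're dropped here, and `re.DOTALL` is added on the assumption that the headline might wrap across lines.

```python
import re

# Same idea as the pattern above, without the optional backslashes.
# re.DOTALL lets "." match newlines in case the headline spans several lines.
headline_re = re.compile(r'<h1 class="headline">(.*?)</h1>', re.DOTALL)

def extract_headline(html):
    """Return the text inside the headline h1, or None if there is no such tag."""
    match = headline_re.search(html)
    return match.group(1).strip() if match else None

# Hypothetical page source, made up for illustration.
page = """<html><head><title>Some page</title></head>
<body><p>lots of other stuff</p>
<h1 class="headline">herty smerty flip flip</h1>
<p>more stuff</p></body></html>"""

headline = extract_headline(page)
print(headline)
```

The (.*?) is non-greedy, so it stops at the first closing /h1 tag instead of running on to a later one.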

Rob