regex help - find replace issue

cirrrocco

Golden Member
Hey guys, I need help with some regex. I have tried some advanced find and replace, but it doesn't seem to help.

So I have this:

<html>
content
</html>
<doctype>
<html>
Crap content
</html>

I want to be able to remove everything below </html> for a bunch of files.

I tried BBEdit and Notepad++, and I am no longer sure a regular find and replace can do this.

I checked online and found that regex could possibly help. Is there a search pattern that finds

</html>
<doctype>
to End of file

and replace with
</html>

Thanks a lot for your help
 
Should be as simple as a properly escaped </html><doctype> followed by .* The dot (any character) will start eating, and since the repeat operator (*) is greedy, it will eat up the entire rest of the file. Then just replace that match with </html>.


Wait, no: the dot doesn't match newlines, so you'll need to group it with a newline character, as in (.|\n)*, and star-repeat the group rather than the bare dot.
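
To illustrate the point about the dot and newlines, here is a small sketch in Python (my choice of language, not something used in the thread). The sample string and variable names are made up, and the \s* between the two tags is my assumption to allow for the line break shown in the question.

```python
import re

# Hypothetical sample mirroring the layout in the question.
text = "<html>\ncontent\n</html>\n<doctype>\n<html>\nCrap content\n</html>\n"

# A plain dot stops at the first newline, so only the <doctype> tag is removed.
plain = re.sub(r"</html>\s*<doctype>.*", "</html>", text)

# Grouping the dot with a newline lets the greedy star run to the end of the text.
grouped = re.sub(r"</html>\s*<doctype>(?:.|\n)*", "</html>", text)

# Alternative: the DOTALL flag makes the dot itself match newlines.
dotall = re.sub(r"</html>\s*<doctype>.*", "</html>", text, flags=re.DOTALL)

print(repr(plain))
print(repr(grouped))
```

Most editors with regex search offer an equivalent of the DOTALL switch, but the exact option name varies, so check your editor's documentation.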
 
Why use a regex for this? I'm curious. Looks like a simple line-oriented file scan to me. Read from one file, scan each line for the tags you need, and write to another file. When you find the last tag, close the second file, delete the first, and rename the second.
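
A rough sketch of that line-oriented scan, written in Python as an illustration only. The function name and the ".tmp" suffix are hypothetical, and it assumes the files are UTF-8 and that no file with the temporary name already exists.

```python
import os

def truncate_after_closing_html(path):
    """Keep everything up to and including the first line containing </html>."""
    tmp_path = path + ".tmp"  # hypothetical temp name; assumed not to exist yet
    # newline="" keeps the original line endings untouched on read and write.
    with open(path, encoding="utf-8", newline="") as src, \
         open(tmp_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
            if "</html>" in line.lower():
                break  # stop copying once the closing tag has been written
    # Swap the truncated copy in for the original (delete + rename in one step).
    os.replace(tmp_path, path)
```

Anything that shares a line with the closing tag is kept, so this only fits files where </html> sits on its own line, as in the example in the question.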
 
Code:
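perl -n -i~ -e "print unless $ended; $ended=1 if /<\/html>/i; $ended=0 if eof;" [files]

Replace the double quotes with single quotes on Linux. The flag is reset at the end of each file (eof), so every file in the list gets processed, and -i~ keeps a backup of each original with a ~ suffix.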
 
Sorry about the bump, but I was looking through some old questions and saw this one was incomplete.

I tested this regex in PHP, and it does exactly what cirrrocco requested:
Code:
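<?php
// Sample input: junk follows the closing </html> tag.
$sub = 'blah blah blah</html><doctype>blah';
// (?s) lets the dot match newlines; \W* skips non-word characters (such as newlines) between the two tags.
$sub = preg_replace('%(?s)</html>\W*<doctype>.*%', '</html>', $sub);
echo $sub . "<br /><br />";
?>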
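
Since the original question was about a bunch of files, here is a hedged sketch of the same pattern applied to a whole folder, in Python rather than PHP. The function name, the folder argument and the "*.html" glob are all assumptions for illustration. It works on raw bytes so that encodings and line endings are left alone, and it overwrites files in place, so try it on copies first.

```python
import re
from pathlib import Path

# Same pattern as the PHP snippet; (?s) lets the dot match newlines.
TRAILING_JUNK = re.compile(rb"(?s)</html>\W*<doctype>.*")

def strip_trailing_junk(folder, glob="*.html"):
    """Rewrite each matching file in place; return the names of files that changed."""
    changed = []
    for path in sorted(Path(folder).glob(glob)):
        original = path.read_bytes()
        cleaned = TRAILING_JUNK.sub(b"</html>", original)
        if cleaned != original:
            path.write_bytes(cleaned)
            changed.append(path.name)
    return changed
```

Files without a <doctype> after the closing tag do not match the pattern and are left untouched.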
 