Using Regular Expressions to find keywords across N number of lines

rpping

Junior Member
First, I am extremely NEW to regular expressions, so forgive my lack of knowledge.

My situation: I have a source directory with roughly 1000 program listings. I have a search tool that allows the use of straight text or regular expressions to search across the directory. A straight search won't work here.

I want to search across all of the files and pick up only those that contain both "word1" and "word2" (where "word1" could be a field name and "word2" could be a record name, etc.). The two words can be on the same line of code, or "word2" can appear 1 to N lines after "word1" (I'd like N to be configurable so that I can look for specific constructs).

I have found an example that uses a regular expression to search the source modules for 2 words/phrases on a single line separated by 1 to N words, but so far I cannot work out, or find an example of, a regular expression that will let me search across N lines.

Can someone help me out here?

Thanks!
 
Not great with regular expressions either, but something like this maybe?
word1.*(\r?\n.*){0,N}word2

(where you'd substitute word1, word2, and N)
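To make the substitution concrete, here is a minimal sketch in Python that builds the same kind of pattern from the two words and a configurable N. The function name `build_pattern` and the sample text are made up for illustration, and this assumes Python's `re` module rather than whatever regex engine your search tool uses.

```python
import re

def build_pattern(word1: str, word2: str, n: int) -> re.Pattern:
    """Match word1 followed by word2 on the same line or up to n lines later.

    Hypothetical helper for illustration only.
    """
    # re.escape keeps characters such as '.' or '-' in the words literal.
    # '.' matches everything except '\n' in Python's re (including '\r'),
    # so '(?:\n.*)' consumes one line break plus the text of the next line.
    return re.compile(
        re.escape(word1) + r".*(?:\n.*){0,%d}" % n + re.escape(word2)
    )

sample = "MOVE word1 TO X\nline2\nline3 word2\n"

# word2 is two line breaks after word1, so N must be at least 2.
print(bool(build_pattern("word1", "word2", 2).search(sample)))
print(bool(build_pattern("word1", "word2", 1).search(sample)))
```

Since you only need to know whether a file contains the construct, `search` is enough here; greedy versus non-greedy matching only matters if you want to extract the matched text.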
 
Perl:

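#!/usr/bin/perl
use strict;
use warnings;

# Slurp the whole file so the regex can match across line breaks.
open(my $fh, '<', 'test.txt') or die "Cannot open test.txt: $!";
my $text = do { local $/; <$fh> };
close($fh);

my $N = 10;    # maximum number of line breaks allowed between word1 and word2

# (?:.*\n){0,$N}? lazily skips up to $N line breaks;
# .*? then lets word2 appear anywhere on the last line.
my @matches = ($text =~ /(word1(?:.*\n){0,$N}?.*?word2)/g);

print join("\n\n", @matches);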

----------------

word1 is the first word to match, word2 the second word, $N is the "N" value you are talking about.

I assumed you wanted non-greedy matching (the question mark after the {0,$N}), which means that if you have something like:

word1word2
blah
blah
word1
blah
word2

And N >= 5, you want the regex to return both word1word2 and word1\nblah\nword2 rather than the whole thing (which is what the greedy match would return).
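The difference can be checked on that exact example. Below is a small sketch in Python (not Perl, and not your search tool's engine, so treat it as an illustration only) that runs a non-greedy and a greedy version of the pattern over the six lines above with N = 5. The variable names are made up for the example.

```python
import re

# The six example lines from above.
text = "word1word2\nblah\nblah\nword1\nblah\nword2\n"

# Non-greedy: use as few line breaks and as little text as possible.
lazy = re.compile(r"word1(?:.*\n){0,5}?.*?word2")

# Greedy: use as many line breaks (up to 5) as still allow a match.
greedy = re.compile(r"word1(?:.*\n){0,5}.*word2")

lazy_matches = lazy.findall(text)      # two separate, short matches
greedy_matches = greedy.findall(text)  # one match spanning all six lines

print(lazy_matches)
print(greedy_matches)
```

The non-greedy version returns the two short matches, while the greedy version returns a single match that runs from the first word1 to the last word2.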

Anyway, there are no guarantees that your search tool supports all of Perl's regex features, but if it does, this is how you'd do it.
 