Search for a string in an HTML file (C#)

Jun 2, 2008
Is it possible to search for a string with a known start and end and grab everything in between?

It starts with

/content/

and ends with

/NAME.tmp.x.xls


So I want to grab it out of something like window.open('/content/RANDOMSTUFF/NAME.tmp.x.xls',

This is from an HTML file, and there can be anywhere from 1 to 30 entries that I need to grab.

Thanks
 

Gunslinger08

Lifer
Nov 18, 2001
Regex.Matches(input, "pattern", RegexOptions.Multiline);

Research regular expressions and you'll find what you need. Write the correct pattern (something like @"/content/[^']*/NAME\.tmp\.x\.xls") and plug it into a Regex.Matches call. You'll get an enumerable collection of Match objects, one per entry.
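To make that concrete, here is a minimal sketch of the Regex.Matches approach. The class name ContentLinkFinder, the helper FindPaths, and the sample HTML are made up for illustration; only the /content/ prefix and the /NAME.tmp.x.xls suffix come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ContentLinkFinder
{
    // Hypothetical helper: returns every "/content/.../NAME.tmp.x.xls" path in the input.
    public static List<string> FindPaths(string html)
    {
        // [^']* keeps each match inside one single-quoted string;
        // the dots are escaped so they match literal periods only.
        const string pattern = @"/content/[^']*/NAME\.tmp\.x\.xls";

        var paths = new List<string>();
        foreach (Match m in Regex.Matches(html, pattern))
        {
            paths.Add(m.Value);
        }
        return paths;
    }

    public static void Main()
    {
        // Sample input invented for this sketch.
        string html =
            "<a onclick=\"window.open('/content/abc123/NAME.tmp.x.xls', 'w')\">one</a>\n" +
            "<a onclick=\"window.open('/content/xyz/789/NAME.tmp.x.xls', 'w')\">two</a>";

        foreach (string path in FindPaths(html))
        {
            Console.WriteLine(path);
        }
        // Prints:
        // /content/abc123/NAME.tmp.x.xls
        // /content/xyz/789/NAME.tmp.x.xls
    }
}
```

For the real file you would presumably read the text first, for example with File.ReadAllText, and pass it to the helper.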
 

Descartes

Lifer
Oct 10, 1999
Regular expressions are the way to go, as josh said, though I'd prefer the following expression:

/content/(.*?)'
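A rough sketch of how that expression could be used: the lazy group (.*?) stops at the first single quote, and Groups[1] holds only the part after /content/. The class and method names and the sample input are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LazyCaptureDemo
{
    // Hypothetical helper: returns whatever follows "/content/" up to the next single quote.
    public static List<string> CaptureAfterContent(string html)
    {
        var captured = new List<string>();
        foreach (Match m in Regex.Matches(html, @"/content/(.*?)'"))
        {
            // Groups[0] is the whole match; Groups[1] is the parenthesized part.
            captured.Add(m.Groups[1].Value);
        }
        return captured;
    }

    public static void Main()
    {
        // Sample input invented for this sketch.
        string html =
            "window.open('/content/abc123/NAME.tmp.x.xls', 'w');\n" +
            "window.open('/content/xyz/789/NAME.tmp.x.xls', 'w');";

        foreach (string tail in CaptureAfterContent(html))
        {
            Console.WriteLine(tail);
        }
        // Prints:
        // abc123/NAME.tmp.x.xls
        // xyz/789/NAME.tmp.x.xls
    }
}
```

Note that this pattern accepts any quoted /content/ link, not only the ones ending in /NAME.tmp.x.xls, so it is looser than a pattern that spells out the suffix.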
 

QuixoticOne

Golden Member
Nov 4, 2005
Regular expressions are a crude way to do it, though often effective if you naively presume there won't be false matches or markup interfering with the true matches.
If you want to match more specifically, in context relative to the markup and other elements of the HTML, you'd use a DOM- or schema-aware pattern-matching setup: SAX, XML binding, ANTLR, XQuery, XSLT, whatever.
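As one possible illustration of the markup-aware idea, the sketch below uses LINQ to XML (XDocument), which is not one of the tools named above but works in the same DOM style. It assumes the page is well-formed XHTML (XDocument.Parse throws on typical tag-soup HTML) and that the links live in onclick attributes; both are assumptions, as are the class and method names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class MarkupAwareFinder
{
    // Hypothetical helper: only looks inside onclick attribute values,
    // so text in comments or element bodies cannot produce false matches.
    public static List<string> FindPathsInOnclick(string xhtml)
    {
        XDocument doc = XDocument.Parse(xhtml); // assumes well-formed XHTML
        var pattern = new Regex(@"/content/[^']*/NAME\.tmp\.x\.xls");

        return doc.Descendants()
                  .Attributes("onclick")
                  .SelectMany(a => pattern.Matches(a.Value).Cast<Match>())
                  .Select(m => m.Value)
                  .ToList();
    }

    public static void Main()
    {
        // Sample input invented for this sketch; the commented-out link is a decoy.
        string xhtml =
            "<html><body>" +
            "<!-- window.open('/content/old/NAME.tmp.x.xls', 'w') -->" +
            "<a href=\"#\" onclick=\"window.open('/content/abc123/NAME.tmp.x.xls', 'w')\">report</a>" +
            "</body></html>";

        foreach (string path in FindPathsInOnclick(xhtml))
        {
            Console.WriteLine(path);
        }
        // Prints only:
        // /content/abc123/NAME.tmp.x.xls
    }
}
```

A plain regex over the whole file would also have picked up the link inside the comment, which is the kind of false match being warned about.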