Requesting some perl expression help

chuck2002

Senior member
I am working on locking down which web sites some computers on my domain can access. The software I'm using (IEURLLock) takes Perl-compatible regular expressions for URLs, and I am lost. The author says to look at this web page for syntax, but that isn't helping.

http://www.pcre.org/

Here are his examples and an explanation of what's going on:

Locations that the user navigates to will get checked against the regular expressions in this list, permitting the navigation as soon as it finds a regular expression that matches the location. If none of the regular expressions match, IEURLLock blocks access to that location.

IEURLLock uses Perl-Compatible Regular Expressions through the PCRE library. More information on this library and how to construct regular expressions exists at http://www.pcre.org/ and http://gnuwin32.sourceforge.net/packages/pcre.htm.

Put descriptive names for each regular expression into the Value Name field and put each corresponding regular expression into the Value field.

Example:

Value Name: Microsoft

Value: ^http://(www\.)?microsoft\.com(/|$)



Case-Insensitive Example:

Value Name: Sourceforge.net Project Web Sites

Value: (?i)^http://(\w)+\.(sf|sourceforge)\.net(/|$)
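For anyone who wants to try these patterns outside IEURLLock, here is a minimal Python sketch of the allow-list behaviour described above. It assumes Python's `re` module treats these two patterns the same way PCRE does (that holds for syntax this simple, but not for every PCRE feature). The `ALLOW_LIST` dict and the `is_allowed` function are made-up names for illustration, not part of IEURLLock.

```python
import re

# Value Name -> Value, copied from the two documentation examples.
# The dict layout itself is an assumption made for this sketch.
ALLOW_LIST = {
    "Microsoft": r"^http://(www\.)?microsoft\.com(/|$)",
    "Sourceforge.net Project Web Sites": r"(?i)^http://(\w)+\.(sf|sourceforge)\.net(/|$)",
}


def is_allowed(url):
    """Return the name of the first rule that matches, or None if blocked."""
    for name, pattern in ALLOW_LIST.items():
        if re.search(pattern, url):
            return name  # permitted as soon as one rule matches
    return None  # no rule matched, so the navigation would be blocked


if __name__ == "__main__":
    for url in (
        "http://www.microsoft.com/windows",
        "http://something.microsoft.com",
        "HTTP://MyProject.SF.net",
    ):
        print(url, "->", is_allowed(url))
```

Note that the "Microsoft" rule as written only covers microsoft.com and www.microsoft.com, which is why a URL like http://something.microsoft.com falls through to "blocked".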


Can someone expand his examples to cover URLs like http://something.microsoft.com, please?

Thank you.
-Chuck
 
^http://.*\.microsoft\.com(/|$)

. matches any character (which is why you have to escape it with \ when you want a literal period), and * tells it to match the preceding expression zero or more times, as many as possible, so .* will match any string of any length.
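A quick way to see what this pattern does and does not cover is to run it through Python's `re` module, which handles this particular syntax like PCRE as far as I know. This is only a sketch; the `matches` helper is a hypothetical name, and IEURLLock itself may behave differently.

```python
import re

# The pattern from this post: anything, then a literal ".microsoft.com".
PATTERN = re.compile(r"^http://.*\.microsoft\.com(/|$)")


def matches(url):
    """True if the pattern is found in the URL (a search, not a full match)."""
    return PATTERN.search(url) is not None


if __name__ == "__main__":
    print(matches("http://something.microsoft.com"))  # has a subdomain
    print(matches("http://microsoft.com/"))           # no "." before "microsoft"

    # An unescaped "." matches any single character, not only a period.
    print(re.search(r"microsoft.com", "microsoftXcom") is not None)
    print(re.search(r"microsoft\.com", "microsoftXcom") is not None)
```

The second call is the interesting one: because the pattern insists on a literal period before "microsoft", the bare http://microsoft.com/ does not match.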

If you're going to be using regexps you should probably get one of the O'Reilly books on them; it's a pretty complicated subject.
 
Actually, I spoke too soon. I had the url filter app disabled.
It still is not working. It blocks everything to microsoft.com now.
Thanks.
 
Originally posted by: chuck2002
It still is not working. It blocks everything to microsoft.com now.
You mean to any server in the microsoft.com domain, or to "http://microsoft.com" itself? You've got to be specific when you're talking regular expressions. ^http://.*\.microsoft\.com(/|$) will match (in layman's terms) http://*.microsoft.com, but not http://microsoft.com. If you also want to match (allow) http://microsoft.com, change the expression to ^http://.*\.?microsoft\.com(/|$) - the added ? makes the "\." optional. That expression could be improved somewhat, because it will also allow http://ilovemicrosoft.com, which doesn't seem to be your intention. I leave the fix for that as an exercise for the reader.

Originally posted by: chuck2002
Eh. I don't know either. I guess I will try some other software.
That's right... when something is difficult, give up. Come on - regular expressions are not supposed to be easy, but they're really useful once you learn them. You can do it.

edit: Reread the OP and fixed up my use of "match" and "allow". From the OP, if the site request matches the expression, it's allowed. If it doesn't match, it's denied.
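For readers who want to check the "ilovemicrosoft.com" problem for themselves, here is a small Python sketch comparing the optional-dot expression with one possible tighter version. The tighter pattern, `^http://([\w-]+\.)*microsoft\.com(/|$)`, is just one way to do it and is my own suggestion, not something from the IEURLLock docs; it only allows zero or more "label." groups between the scheme and the domain. The names `LOOSE`, `TIGHT` and `allowed` are made up for the example.

```python
import re

# Optional-dot expression from the post above (with the dot in ".com" escaped).
LOOSE = re.compile(r"^http://.*\.?microsoft\.com(/|$)")

# One possible tighter alternative (an assumption, not from the IEURLLock docs):
# zero or more "label." groups, then the domain itself.
TIGHT = re.compile(r"^http://([\w-]+\.)*microsoft\.com(/|$)")

URLS = [
    "http://microsoft.com",
    "http://something.microsoft.com",
    "http://ilovemicrosoft.com",
    "http://example.com/?next=microsoft.com",
]


def allowed(pattern, url):
    """True if the pattern is found somewhere in the URL."""
    return pattern.search(url) is not None


if __name__ == "__main__":
    for url in URLS:
        print(f"{url:42} loose={allowed(LOOSE, url)} tight={allowed(TIGHT, url)}")
```

The last URL shows a second hole in the loose version: since .* can cross a "/", any URL that merely ends in "microsoft.com" gets through.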
 
Ahh. I will try that.
I am not just giving up on the first hint of trouble. I have been trying to get this to work for 2 days now. I'm just really frustrated at this point.....
 
What precisely are you trying to allow and deny? Give some examples. Also, double check your typing - it's really easy to enter a regular expression and miss a period or put a slash in the wrong direction. Regexps are unforgiving biatches.
 
I am using microsoft as an example, but I am wanting to allow all urls for a given site.
I want to allow:
http://www.microsoft.com/something
http://microsoft.com/something
http://something.microsoft.com
http://www.something.microsoft.com

Also https:// URLs, but I figured that isn't anything more than putting https in where http is.
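Rather than keeping two rules per site, a single expression can cover both schemes by making the "s" optional with `s?`. The sketch below checks the four URLs listed above in both forms using Python's `re` module. The subdomain part, `([\w-]+\.)*`, is an assumption of mine about what "all URLs for a given site" should mean, and `PATTERN` and `allowed` are hypothetical names; whether IEURLLock behaves identically would need testing.

```python
import re

# "s?" makes the "s" optional, so one rule covers http:// and https://.
# "([\w-]+\.)*" allows zero or more subdomain labels (an assumption).
PATTERN = re.compile(r"^https?://([\w-]+\.)*microsoft\.com(/|$)")

WANTED = [
    "http://www.microsoft.com/something",
    "http://microsoft.com/something",
    "http://something.microsoft.com",
    "http://www.something.microsoft.com",
]


def allowed(url):
    """True if the URL matches the single combined rule."""
    return PATTERN.search(url) is not None


if __name__ == "__main__":
    for url in WANTED:
        secure = url.replace("http://", "https://", 1)
        print(url, allowed(url))
        print(secure, allowed(secure))
```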

I am using this software to block all web sites and then allow only a handful.
Since we started working with microsoft as the example, I have been copying and pasting the text examples as given and testing it with microsoft urls.
 
As I read the description in the OP, the expression I gave should work...

^http://.*\.?microsoft\.com(/|$)

...though, as mentioned, it can be improved on.

It's also possible that the filter is looking to match the entire site request (i.e. the site plus everything after the /), though it doesn't look that way from the examples. You might try...

^http://.*\.?microsoft\.com(/.*|$)

...as well. Try to see what the difference is from what's been said so far.

If neither of those works, then the problem isn't in the regexp. Maybe you need to restart the program or do something else to get it to reread its config. Check the docs.
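To make the difference between the two endings concrete, here is a small Python sketch. It uses `re.search` to stand in for a filter that only needs the pattern to be found, and `re.fullmatch` for one that insists the pattern cover the entire request. Which of the two IEURLLock actually does is not known from this thread, so treat this purely as an illustration; `SHORT`, `LONG` and `URL` are names made up for the example.

```python
import re

SHORT = r"^http://.*\.?microsoft\.com(/|$)"    # stops at the first "/" after the host
LONG = r"^http://.*\.?microsoft\.com(/.*|$)"   # also consumes everything after the "/"

URL = "http://www.microsoft.com/something"

if __name__ == "__main__":
    # "Found somewhere in the string": both expressions succeed.
    print(re.search(SHORT, URL) is not None)
    print(re.search(LONG, URL) is not None)

    # "Must cover the whole string": only the longer expression succeeds,
    # because SHORT has nothing that can match "something".
    print(re.fullmatch(SHORT, URL) is not None)
    print(re.fullmatch(LONG, URL) is not None)
```

If the filter searches, the two expressions are interchangeable; if it requires a full match, only the second one allows URLs with a path.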
 
That didn't work either. I appreciate your help. I think I will change gears and try a web proxy to accomplish the blocking goal.
 