
Web service to grab html contents from website

theinsen1

Senior member
hi,
My requirement is to create a web service that scans certain predefined websites for certain keywords. On success, I should save the result to a database.
Here we are not only taking the link; we should also be able to pull the data behind it.
then we should be able to pick the data and display according to our own site design.

We have tried predefined HTML fetchers like Teleport.
The only problem is that it does not update the sub links properly; otherwise we could use it.
I am thinking of using ASP.NET 2.0 and VB.NET 2.0 with SQL Server 2005 as the backend.
Please, I need help on this as soon as possible.

Any kind of help is welcome,
especially if there is any web service (producer) we can use for the data capturing.

thanks in advance
 
1. What you're describing is not a web service, it is a web crawler or spider or what have you. A web service is an entirely different concept.
2. What is a sub link? 😕
3. If you know Perl or could learn Perl, consider using it. It has excellent text parsing capabilities. In particular, its WWW::Mechanize package (or the Win32 equivalent) should work well for your purposes.
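To make the crawler idea in point 1 a bit more concrete, here is a rough sketch of what the core of one might look like: take the HTML of a page, check its visible text for keywords, and collect the links on it so they can be visited next. It is written in Python with only the standard library rather than Perl or VB.NET, purely for illustration. The names `PageScanner` and `scan_page` are made up for this example, and the plain substring match is an assumption about what "scan for keywords" means here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageScanner(HTMLParser):
    """Collects visible text and absolute link targets from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script and style bodies; keep everything else as page text.
        if not self._skip_depth:
            self.text_parts.append(data)


def scan_page(html, base_url, keywords):
    """Return (keywords found in the page text, links found on the page)."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    scanner.close()
    text = " ".join(scanner.text_parts).lower()
    found = [kw for kw in keywords if kw.lower() in text]
    return found, scanner.links
```

A crawler would call `scan_page` on each downloaded page and queue the returned links for later visits, keeping a set of already visited URLs so it does not loop forever.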
 
Originally posted by: sumguy1
And this is not copyright infringement or intellectual property theft why exactly?
How do you know he doesn't have permission to grab those pages? Just shut the hell up.
 
Originally posted by: BoberFett
How do you know he doesn't have permission to grab those pages? Just shut the hell up.

Wow that was uncalled for. I don't know who wizzed in your corn flakes this morning but I find your passive-aggressive e-Peen flexing a bit childish and inappropriate.
 
Originally posted by: sumguy1
Wow that was uncalled for. I don't know who wizzed in your corn flakes this morning but I find your passive-aggressive e-Peen flexing a bit childish and inappropriate.
Who "wizzed" in my corn flakes? The moron who throws around phrases and concepts they don't really understand so they can pretend to know what they're talking about. In this thread that would be you.

To the OP, I've written lots of web scrapers and any language which has the ability to use an HTTP library or has built in HTTP functionality will do the trick. As mugs stated, you're not looking for a web service. A web service is the data provider, not the data consumer. VB and SQL Server will work fine, there's no need to get ASP involved as that's server technology.
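As a rough illustration of the "HTTP library plus a database" approach, here is a minimal sketch in Python using only the standard library. It is not the OP's VB/SQL Server stack: `sqlite3` stands in for SQL Server, and the table name `hits`, the function names, and the simple substring match are all assumptions made up for this example.

```python
import sqlite3
import urllib.request


def fetch(url, timeout=10):
    """Download a page and return its body as text (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) hits table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        " url TEXT NOT NULL,"
        " keyword TEXT NOT NULL,"
        " content TEXT NOT NULL,"
        " PRIMARY KEY (url, keyword))"
    )
    return conn


def scan_sites(conn, urls, keywords, fetch_page=fetch):
    """Fetch each URL, save one row per keyword found, return the row count."""
    saved = 0
    for url in urls:
        try:
            body = fetch_page(url)
        except OSError:
            # urllib's URLError is an OSError; skip pages that cannot be read.
            continue
        lowered = body.lower()
        for kw in keywords:
            if kw.lower() in lowered:
                # Re-scanning a page overwrites the row for that url/keyword.
                conn.execute(
                    "INSERT OR REPLACE INTO hits (url, keyword, content)"
                    " VALUES (?, ?, ?)",
                    (url, kw, body),
                )
                saved += 1
    conn.commit()
    return saved
```

The `fetch_page` parameter is there so the download step can be swapped out, for example for a stub when testing without network access. The stored `content` column is what the display side would later read and restyle.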
 
Originally posted by: BoberFett
Who "wizzed" in my corn flakes? The moron who throws around phrases and concepts they don't really understand so they can pretend to know what they're talking about. In this thread that would be you.

So you just make the assumption that I don't know what I'm talking about? Why the hostility and the personal attacks? So now I'm a moron for wanting to know the purpose that the OP wants to do this so I can make sure I'm not helping some scammer commit copyright infringement before I offer my help? If you were a locksmith and some stranger came up to you on the street and asked you to help him jimmy open the lock on a door to a house you'd never seen before would you just blindly agree to help him without first wanting to know if he was the owner of the house? If you do, then who's the moron in that picture? (I am still sumguy1 posting from another computer)
 
Originally posted by: manicfool
So you just make the assumption that I don't know what I'm talking about? Why the hostility and the personal attacks? So now I'm a moron for wanting to know the purpose that the OP wants to do this so I can make sure I'm not helping some scammer commit copyright infringement before I offer my help? If you were a locksmith and some stranger came up to you on the street and asked you to help him jimmy open the lock on a door to a house you'd never seen before would you just blindly agree to help him without first wanting to know if he was the owner of the house? If you do, then who's the moron in that picture? (I am still sumguy1 posting from another computer)

err...if the OP was a scammer he probably wouldn't let you know. gotta agree with BoberFett on this one...you guys are clueless.
 
Originally posted by: dealmaster00
err...if the OP was a scammer he probably wouldn't let you know. gotta agree with BoberFett on this one...you guys are clueless.

Where did he confirm or deny he was a scammer in the original post? Certainly the OP must realize that without further clarification about what he is trying to do, the motives for his actions can seem a bit shady. He didn't let us know he was a scammer. He didn't specifically say he was NOT a scammer either. He just left too much to the imagination. He doesn't say if his use of the information scraped from other websites will be legitimate and with their blessings or not. Hey, while we're on the topic, anybody want to buy a bridge in Brooklyn? I can also offer you a great deal on some nice beachfront real estate in Florida but you gotta pay me 50% down up front before I give you the details. Who's in?
 
Hey sumguy1, did you realize that your browser makes a copy of every page you view and saves it on your hard drive? You should stop using the web right away, you copyright violating fiend.
 
Originally posted by: BoberFett
Hey sumguy1, did you realize that your browser makes a copy of every page you view and saves it on your hard drive? You should stop using the web right away, you copyright violating fiend.

I, unlike the OP, do not intend to redistribute the information cached in my browser's cache directory over the internet while branding it as my own. I will be deleting the data cached in my browser on a regular basis after I have consumed it. See the difference? By the OP's own admission in his own words:

then we should be able to pick the data and display according to our own site design.

Yet I'm the one who doesn't know what I'm talking about? I'm the moron? Is that the best you can do?
He doesn't say how exactly he's going to display it or redistribute it or anything... on the internet, on an intranet... whatever... he doesn't give enough detail to make me feel comfortable that his use of the data is legitimate enough to offer advice or help. In fact it just seems fishy. I won't be a party to aiding something without knowing that I am not helping to commit a devious or unethical act. Surely you can see how the lack of detail in his original post leaves his true motives somewhat questionable.
 
Facts can't be copyrighted. If all he's doing is ripping facts and republishing them then you're full of crap. Maybe you should do a little research, I'll give you a start: Feist v. Rural Telephone, 499 U.S. 340.
 
Originally posted by: BoberFett
Facts can't be copyrighted. If all he's doing is ripping facts and republishing them then you're full of crap. Maybe you should do a little research, I'll give you a start: Feist v. Rural Telephone, 499 U.S. 340.

You are making assumptions based on too little information. He doesn't say what the data is he's trying to scrape. He doesn't say who the original publisher is. He doesn't say enough about how he plans to use the data. The case you cited is irrelevant without that contextual information. His original post is just vague enough to make it seem fishy. The way it is now, it just seems to me like he is withholding information deliberately. Seems if he were going to use it legitimately he'd have said so. At least I would have if I were requesting help with something like this. It's all moot anyway at this point since we have totally hijacked this guy's thread and he appears to have abandoned it anyhow. And why is it you can't make a single post without some kind of anti-social rhetoric or personal attack? I am finished here. You may continue this debate with yourself if you wish to continue to "hear yourself talk."
 