Question

Link scraper/grabber?

We are looking for software to speed up the removal of pirated content from websites for our clients.

I'm not sure if "scraper" is the right term for what I'm looking for, so here's what I need.

I want to be able to go to a website and grab all the links from a search result.

Pick any site and run a search for a course. When I run that search I get, for example, 542 results. I want to be able to grab all of the links at once instead of copying them one by one.

Is there a program that will do that?

Comments
Answer
That is indeed called screen scraping.

Two hits from Google for companies that can help you:
http://www.scrape-it.nl/Web-scrapen (sorry, that's in Dutch, but Google Translate will help you understand it)
http://www.screen-scraper.com/

A poor man's solution:
Save the web page to a text file, and use a program to extract the links from it into a file. I think the centerpiece of such a program will be a regular expression that matches text starting with http: or https: and ending at one of the characters not allowed in a URL (like a space or a comma). Any second-year IT student should be able to write it.
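
A rough sketch of that poor man's approach in Python might look like the following. It is only an illustration, not a finished tool: the function names and the exact set of "stop" characters (whitespace, comma, quotes, angle brackets) are assumptions, and the file name in the usage note is hypothetical.

```python
import re

# Assumption: a link starts with http:// or https:// and runs until the first
# character that is unlikely to be part of a URL (whitespace, comma, quotes,
# or angle brackets). Adjust the character class for the site in question.
URL_PATTERN = re.compile(r"""https?://[^\s,"'<>]+""")


def extract_links(text):
    """Return the unique links found in text, in order of first appearance."""
    seen = set()
    links = []
    for link in URL_PATTERN.findall(text):
        if link not in seen:
            seen.add(link)
            links.append(link)
    return links


def extract_links_from_file(path):
    """Read a saved web page from path and return the links it contains."""
    with open(path, encoding="utf-8", errors="replace") as page:
        return extract_links(page.read())
```

For a search-result page saved as, say, results.html, calling extract_links_from_file("results.html") would return the list of links, which can then be written to a file one per line. Note that a pattern like this only catches absolute links; relative links (those without the http: or https: prefix) would be missed, and a search that spans several result pages needs each page saved and processed in turn.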

Kees

Answer
Make the links ineffective through Google

CNET Forums