Question

Link scraper/grabber?

Oct 10, 2015 8:48AM PDT

We are looking for software to speed up the removal of pirated content from websites for our clients.

I'm not sure if "scraper" is the right term for what I'm looking for, so here's what I need.

I want to be able to go to a website and grab all the links from a search result.

Pick any site and run a search for a course. When I run that search I get, for example, 542 results. I want to be able to grab all of those links at once instead of copying them one by one.

Is there a program that will do that?

Answer
That's called screen scraping indeed.
Oct 10, 2015 10:59AM PDT

Two hits from Google for companies that can help you:
http://www.scrape-it.nl/Web-scrapen (sorry, that's in Dutch, but Google Translate will help you understand it)
http://www.screen-scraper.com/

A poor man's solution:
Save the webpage to a text file, then use a program to extract the links from it into another file. I think a regular expression that starts with http: or https: and ends at the first character that doesn't normally appear in a URL (like a space or a comma) will be the centerpiece of such a program. Any second-year IT student should be able to write it.
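
To make that concrete, here is a rough Python sketch of such a program. It is only an illustration under a few assumptions: the function names and the exact pattern are made up for this example, the page has already been saved to a file, and only absolute http/https links are wanted. Relative links (such as href="/course/123") would be missed, and a URL that legitimately contains a comma would be cut short.

```python
import re
import sys

# Match http:// or https:// and then everything up to the first character
# that normally ends a URL in saved page text (whitespace, quotes, angle
# brackets, comma). This is an assumed pattern, not a full URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>,]+")


def extract_links(text):
    """Return the unique URLs found in text, in order of first appearance."""
    seen = set()
    links = []
    for match in URL_PATTERN.finditer(text):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".;:!?)")
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


def main(in_path, out_path):
    """Read a saved page from in_path and write one link per line to out_path."""
    with open(in_path, encoding="utf-8", errors="replace") as f:
        links = extract_links(f.read())
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(links) + "\n")
    print(f"Wrote {len(links)} links to {out_path}")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

If this were saved as extract_links.py (a hypothetical file name), it could be run as `python extract_links.py saved_page.html links.txt`. For 542 results spread over several pages, each results page would have to be saved and processed in turn.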

Kees

Answer
Make the links ineffective through Google
Oct 13, 2015 11:53AM PDT