Question

Downloading an entire website.

Nov 15, 2019 3:57PM PST

So I work at an automotive shop, part of which paints vehicles. The painter gets all of his paint codes from an online database. The makers of this database have announced that they are switching over to their new database, and the original one will no longer be accessible by the end of the year. The painter, however, does not like the new database and wishes to keep the original one. All the information in the database is plain text on different HTML pages. He can either find and click links to get the information he wants, or type in paint codes or vehicle specifications to get what he needs. My goal is to make the original database available to him offline, so that all the links work as they do now and all the search functions yield the same results offline as they do online. I know this is a big project, as it's a huge database, but it needs to be done. I have Internet Download Manager if that helps, but even after downloading 30,000 HTML pages I don't know how to make them operate as the website does. Any advice would be great.


Answer
Unlikely to ever happen.
Nov 15, 2019 4:10PM PST

Most websites can't simply be downloaded, because they are driven by a database so that nobody has to create 30,000 web pages by hand. Imagine having to update such a site. You would die of old age keeping that many pages current.

So close to 100% of sites today have a database on the backend and scripts to create a page on the fly based on what the user clicks.

So that's 30,000 clicks or more to create some 30,000 pages, and then you still have to find a way to make them work like the old site.

Bow out fast.

Allow me to elaborate
Nov 15, 2019 8:10PM PST

Thank you very much for your reply, very much appreciated. I do, however, believe that there is a way to achieve my goal.

This is what the site looks like.


This is an example of what results from an inquiry would look like


So what you're saying is that the results yielded from a search on the site would be the result of a script/database implementation and not just a simple HTML link I can save for offline usage?

That's what I say.
Nov 15, 2019 10:09PM PST

And know. In fact, you proved it: the search page has you input search values, then the website queries a database and creates a page on the fly.

All skill levels arrive here so I take it you don't write code for web servers or such.

Now a determined programmer might craft some web scraping script, but that's not me. I see the task as something that, if done manually, would be your near full-time job for a year or so. It certainly wasn't something the company behind the website built in a few days.
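For anyone wondering what such a web scraping script might look like, here is a minimal sketch in Python using only the standard library. It is not taken from the site in question: `crawl`, `fetch_page`, and `LinkCollector` are names made up for illustration, and `http://example.test/` below is a placeholder address. It only follows ordinary links on the same host, so pages that exist only as search results would not be captured, which is exactly the limitation described above.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_page(url):
    """Download one page over HTTP and return its text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=100):
    """Breadth-first crawl of same-host pages; returns {url: html}."""
    host = urlparse(start_url).netloc
    pages = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links and drop any "#fragment" part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            # Stay on the starting host and never queue a page twice.
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

The `fetch` parameter is there so the crawl logic can be tried against a small in-memory dictionary of pages before pointing it at a real server. A real run would also need polite delays between requests, error handling, and permission from the site owner.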

Answer
To give you a legitimate answer
Nov 21, 2019 3:11PM PST

Please contact the guy who has the database. He might sell you a copy. Then you can load the site on an internal web server on your local network.

Post was last edited on November 21, 2019 3:12 PM PST

Answer
Re: database
Jan 27, 2020 5:19AM PST

Another issue: you don't know which entries in the 30,000-entry database they change, or when they add new ones. The changes and the new entries wouldn't be available in your copy of the data unless you refresh it, say, every week.
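One way to see what moved between two weekly refreshes is to fingerprint every saved page and compare the fingerprints. The following is only a rough sketch under that assumption: `fingerprint` and `diff_snapshots` are hypothetical helper names, and each snapshot is assumed to be a dictionary mapping a page name to its saved text.

```python
import hashlib


def fingerprint(pages):
    """Map each page name to a SHA-256 digest of its content."""
    return {
        name: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for name, text in pages.items()
    }


def diff_snapshots(old, new):
    """Compare two fingerprint dicts; return (added, changed, removed)."""
    added = sorted(name for name in new if name not in old)
    removed = sorted(name for name in old if name not in new)
    changed = sorted(
        name for name in new if name in old and new[name] != old[name]
    )
    return added, changed, removed
```

Storing only the digests from last week is enough to tell which pages need to be saved again this week.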

Answer
Well, first of all, website data
Jan 27, 2020 10:44AM PST

is stored in a database separate from the application. The application pulls the data and displays it the way it is programmed to. A database is made up of tables, with keys that link multiple tables together in queries. The application pulls the data using these queries, and the format of the display is determined by the application. Commonly used databases include Microsoft SQL Server and MySQL.
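To make that concrete, here is a small sketch using Python's built-in `sqlite3` module rather than SQL Server or MySQL. The tables, columns, and rows are invented for illustration and have nothing to do with the real paint database. Two tables are linked by a key, a query joins them, and the application decides how the rows are turned into a page.

```python
import sqlite3
from html import escape


def build_demo_db():
    """Create an in-memory database with two tables linked by a key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE vehicles (id INTEGER PRIMARY KEY, make TEXT, model TEXT);
        CREATE TABLE paint_codes (
            code TEXT PRIMARY KEY,
            color_name TEXT,
            vehicle_id INTEGER REFERENCES vehicles(id)
        );
        INSERT INTO vehicles VALUES
            (1, 'ExampleMake', 'ModelA'),
            (2, 'OtherMake', 'ModelB');
        INSERT INTO paint_codes VALUES
            ('X1', 'Demo Red', 1),
            ('X2', 'Demo Blue', 1),
            ('Y9', 'Demo Green', 2);
        """
    )
    return conn


def search_page(conn, make):
    """Query the joined tables and build an HTML list on the fly."""
    rows = conn.execute(
        """
        SELECT v.make, v.model, p.code, p.color_name
        FROM paint_codes AS p
        JOIN vehicles AS v ON v.id = p.vehicle_id
        WHERE v.make = ?
        ORDER BY p.code
        """,
        (make,),
    ).fetchall()
    items = "".join(
        f"<li>{escape(code)}: {escape(color)} ({escape(mk)} {escape(model)})</li>"
        for mk, model, code, color in rows
    )
    return f"<ul>{items}</ul>"
```

Calling `search_page(build_demo_db(), "ExampleMake")` builds a page that never existed as a file, which is why saving the HTML alone does not give you a working search.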

Trying to do the same
Jan 27, 2020 1:26PM PST

Trying to do the same with *reli@ncetek*com

They have perfect descriptions

* Moderator Note: URL disabled so this is not spam or SEO. The link doesn't change the findings and advice above.

Post was last edited on January 27, 2020 1:29 PM PST