Crawling a domain for unknown links

Apr 20, 2008 7:13AM PDT

I have a problem. On a domain, let's call it www.example.com, a lot of files have been uploaded, for example www.example.com/file.MOV. They are all downloadable, but I forgot or don't know some of the links to the files. Is there any program or technique that could "brute-force-try" possible links?

I mean, I could also try out the links myself (e.g. www.example.com/file1.MOV, www.example.com/file2.MOV). Is there any tool that could do this work for me?
Or are there any search engines or anything else that could show me all the subfolders or files on this domain?

Sorry if I put this thread in the wrong section; it would be kind if a mod could then move it to the right place ^^

Not sure I understand.
Apr 20, 2008 7:47PM PDT

When you say a Domain, do you mean a web site?

You say you have forgotten some of the links. Why not re-visit the site and check again?

Most Web Masters would not take kindly to a 'brute force' attempt to download their files, even if they are downloadable. Best to take your time, and download one or two at a time.

Mark

Well...
Apr 21, 2008 3:17AM PDT

Depending on the website, a listing of all hosted files may already be available. If no such listing exists, though, it can be very problematic. For instance, if you wanted to find all .MOV files on a server, you could create a simple application to attempt every possible filename and log the HTTP 200 responses.

However, consider the number of requests you'd be trying. Each filename could contain anywhere from 1 to, say, 30 characters, with each character being one of 40+ permissible characters. Thus, for the 30-character names alone, you're talking about 40^30 requests, which comes out to roughly 10^48, vastly more than 1 septillion (10^24). That's indeed brute force, but certainly would not be appreciated by the website.
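As a rough check of the 40^30 figure above, here is a minimal Python sketch of the arithmetic. The alphabet size (40) and the maximum length (30) are the numbers from the post; the request rate of 10 per second is purely an assumption for illustration.

```python
# Rough size of the brute-force search space.
# ALPHABET_SIZE and MAX_LENGTH come from the post; REQUESTS_PER_SECOND is an assumed rate.

def names_of_length(alphabet_size: int, length: int) -> int:
    """Number of distinct names with exactly `length` characters."""
    return alphabet_size ** length


def names_up_to(alphabet_size: int, max_length: int) -> int:
    """Number of distinct names with 1..max_length characters."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


ALPHABET_SIZE = 40
MAX_LENGTH = 30
REQUESTS_PER_SECOND = 10          # assumption, chosen to be gentle on the server
SECONDS_PER_YEAR = 365 * 24 * 3600

total = names_of_length(ALPHABET_SIZE, MAX_LENGTH)
years = total / REQUESTS_PER_SECOND / SECONDS_PER_YEAR

print(f"30-character names: about {total:.2e}")
print(f"Years needed at {REQUESTS_PER_SECOND} requests/second: about {years:.2e}")
```

Counting the shorter names as well (`names_up_to`) only makes the total slightly larger, since the longest names dominate the sum.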

Perhaps it would be possible to simply contact the webmaster and ask for a listing of the filenames you're looking for?

John

a bit complicated
Apr 21, 2008 4:06AM PDT

Well, asking is a bit complicated, because the webmaster has a big community and wants everybody to find the files they need by themselves if they can; it's a kind of "game"...
Such a program would be useful. I know vaguely what I'm searching for, so I could lower the number of requests (the file names have about 8 characters, I know which folder I have to search in, and they end with .mov or .MOV).

What are such programs called?

I just write them myself...
Apr 21, 2008 4:42AM PDT

I've never found an application with all of the abilities I want, so I just write up one of my own to get the job done. You should note, though, that 40^8 (40 possible characters, with a filename being exactly 8 characters long) is still approximately 6,500,000,000,000 possibilities, and that is before trying both .mov and .MOV.
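For anyone wondering what such a home-made tool might look like, here is a minimal Python sketch, not the program I actually use. It builds candidate URLs from a known folder, a fixed name length, and the two extensions, then asks a checker function whether each one exists. The function names, the `prefix` parameter, and the one-second delay are all assumptions for illustration; the default checker sends an HTTP HEAD request with the standard library and treats only a 200 response as a hit.

```python
import itertools
import time
import urllib.request
from typing import Callable, Iterable, Iterator, List


def candidate_urls(base: str, alphabet: str, length: int,
                   extensions: Iterable[str], prefix: str = "") -> Iterator[str]:
    """Yield base/<name><ext> for every name of `length` chars starting with `prefix`."""
    exts = tuple(extensions)
    remaining = length - len(prefix)  # characters still unknown
    for combo in itertools.product(alphabet, repeat=remaining):
        stem = prefix + "".join(combo)
        for ext in exts:
            yield f"{base}/{stem}{ext}"


def head_returns_200(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request; True only for HTTP 200, False on any error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError (404 etc.) and timeouts
        return False


def probe(urls: Iterable[str],
          exists: Callable[[str], bool] = head_returns_200,
          delay_seconds: float = 1.0) -> List[str]:
    """Check each URL in turn, pausing between requests, and return the hits."""
    found = []
    for url in urls:
        if exists(url):
            found.append(url)
        time.sleep(delay_seconds)  # assumed pause, to avoid hammering the server
    return found
```

A call might look like `probe(candidate_urls("http://www.example.com/videos", "abc123", 8, [".mov", ".MOV"], prefix="file"))`, where the `videos` folder, the six-character alphabet, and the `file` prefix are made up. The more of the name you already know, the smaller `remaining` gets, and that is the only thing that makes this practical.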

Now I'm curious, though; which website is this?

John