Old website down, archive blocked by robot


pratchett
07-02-2010, 9:11 AM
I want to view or download an old website that's no longer up. I visited the archive site www.archive.org (aka the Wayback Machine), and it shows the site is blocked by a robot.

Is there any way I can see the contents of the site that were previously posted to the internet?

CaMakarovnik
07-02-2010, 12:17 PM
robots.txt is a file webmasters put on their site to tell spiders and web crawlers which parts of the site they may or may not crawl.

Probably what you are experiencing is one of two things: either the site was never archived because of the limits put in place by its robots.txt file, or the robots.txt file now disallows the archive's crawler, so the Wayback Machine doesn't permit access to the pages it already archived.
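
To make the mechanism concrete, here is a minimal sketch of how a crawler might evaluate robots.txt rules, using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URLs, and the helper name `is_allowed` are made up for illustration; `ia_archiver` is assumed here to be the user-agent name the archive's crawler has gone by, and the real site's file may look quite different.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks the archive's crawler everywhere,
# and blocks every other crawler from /private/ only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/index.html"
    for agent in ("ia_archiver", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

With rules like these, a crawler identifying itself as `ia_archiver` is refused every page, while other crawlers are only kept out of `/private/`.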

Sinixstar
07-02-2010, 12:59 PM
It's not a matter of "probably" - that is exactly what's going on.
If the site is down, and it's not already in the archive - it's gone.