A couple of years ago, a law firm called McCarter & English, representing a New Jersey company called Healthcare Advocates, sued a Pennsylvania firm called Health Advocate for trademark infringement. Defendant's lawyers — a firm called Harding Earley — used the Internet Archive to pull up plaintiff's old web pages, to help in the defense. It appears that Healthcare Advocates had recently put up a robots.txt file with instructions to block public access to its old pages, but the folks at Harding Earley made a whole bunch of requests, and the pages sometimes displayed anyway.
Healthcare Advocates, represented by McCarter & English, is now suing both the Harding Earley firm — for copyright infringement, violations of the DMCA and the Computer Fraud and Abuse Act, and state-law torts — and the Internet Archive, for breach of contract, promissory estoppel, breach of fiduciary duty, negligence and misrepresentation.
This is silly. The copyright claim against Harding Earley is silly. Setting aside anything else, if there ever were a textbook example of fair use, reproducing a once-publicly available web page because its content was relevant to the proper disposition of a lawsuit would be it. The DMCA claim is, if not silly, at least wrong. It's hardly obvious that sticking a robots.txt file on your server counts as a technological protection measure within the meaning of the DMCA, since web crawlers are free to ignore such markers if they choose. If plaintiff's robots.txt file were a TPM, its instruction to the Internet Archive to withhold the file looks to me like copy protection rather than access protection, which puts defendants in the clear. And finally, as Bill Patry has noted, it's an unworkable reading of the DMCA to say that if you click on a link once and don't get anything, then you're illegally “circumventing” by clicking a bunch more times to see if your luck changes.
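For readers who haven't dealt with robots.txt: it's just a plain-text file of requests that polite crawlers consult before fetching pages. Nothing in the protocol enforces it. Here's a minimal sketch in Python using the standard library's robots.txt parser (the file contents and URL are hypothetical, standing in for a site that blocks everything):

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that asks all crawlers to stay away
# from the entire site -- roughly the posture plaintiff adopted.
robots_txt = """\
User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A well-behaved crawler checks before fetching, and here gets "no":
allowed = rp.can_fetch("ia_archiver", "https://example.com/old-page.html")
print(allowed)  # False

# But the check is entirely on the client's side. The web server
# still serves the page to anyone who requests the URL; a crawler
# that never reads robots.txt sees no barrier at all.
```

The point is that honoring robots.txt is a courtesy the requester extends, which is why calling it a technological measure that "effectively controls access" is a stretch.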
The silliest claims are the ones against the Internet Archive itself. Take it from me: The Internet Archive didn't have an obligation under the relevant laws to make sure that there were no glitches in its implementation of its decision to respect robots.txt.