Surveying the Wreckage

Despite the post's title, this has nothing to do with the election. Rather, I will attempt here to describe the high points of an unfolding, ever-worsening, personal computer disaster brought to you by some eldritch combination of Microsoft, Symantec, Western Digital, Samsung, and (maybe) Diskeeper, with cameos by Mozilla and the idiots who defined the original SATA hardware disk standard. My objective, dear reader, is not to engage your sympathy (that might have had value back before I was a gibbering wreck some days ago, but no longer), but rather to trigger your schadenfreude, in the hopes that the computer disaster you had last week, or will have next week, will not seem so terrible. Let something good come of all this.

(Note: Above does not apply to reader Ed Bott, I'm sure this stuff never happens to him.)

This is, to my eye, a ridiculously complicated story, and even as it is, I'm sure I'll be leaving out parts as the mind dulls pain, and what the mind fails to dull, lack of sleep probably takes care of. The only thing I can promise the reader who perseveres through this long sad geeky tale is that things only get worse until the end, at which point they are very very bad and remain unresolved.

Let us begin.

I'm running Windows XP, Service Pack 2, on an aging Intel Pentium 4 system. I tried SP3 at work, it hosed my machine, and I've been afraid to try it at home, at least until I got my backups sorted out better.

Recently, the system has been a bit weird, with very slow file access times (Windows Explorer would take forever to open, ditto with file dialogs in programs), and I was also worried that my copy of Firefox was compromised, as it (1) always opens connections to places it shouldn't when I start the program (even in safe mode) and (2) every so often something would apparently get Firefox to try every port number in sequence, trying to make a connection out of the machine. Fortunately, Spybot had modified my hosts file (making it rather suspiciously enormous, in fact), so all these connection attempts ended up at localhost. But it was worrying. I decided I had to do something, or several somethings.
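
For anyone wondering why an enormous hosts file sends those connection attempts to the local machine: each line of the file maps a hostname to an address, and blocklist tools add lines that point unwanted names at the loopback address. Here is a minimal Python sketch of that idea, assuming the standard "address name [name ...]" hosts format. The function name and the sample entries are made up for illustration and are not taken from the actual file on this machine.

```python
# Addresses that effectively send a connection nowhere useful:
# the IPv4/IPv6 loopback addresses, plus the null address some blocklists use.
SINK_ADDRESSES = {"127.0.0.1", "::1", "0.0.0.0"}

def blocked_hosts(hosts_text):
    """Return the hostnames that a hosts file redirects to a sink address."""
    blocked = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if address in SINK_ADDRESSES:
            # "localhost" itself is a normal entry, not a blocked site
            blocked.extend(n for n in names if n != "localhost")
    return blocked

# Hypothetical sample entries
sample = """
127.0.0.1 localhost
127.0.0.1 ads.example.com
127.0.0.1 tracker.example.net  # added by a blocklist tool
192.168.1.10 fileserver
"""
print(blocked_hosts(sample))
```

A file with thousands of such lines is probably just a long blocklist, which would account for the size.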

First, I decided to take the plunge and migrate to a larger disk, and ordered up a “green” WD7500AACS. (Three quarters of a terabyte! Whoohoo!) About three or four weeks ago, I copied my files onto it using XXClone, a nice piece of freeware that basically makes an entire copy of drive A (including operating system) onto drive B. But the cloning program is very slow: 12-16 hours slow for me. It didn't help that I have to jumper my drives to run at SATA 1 speeds instead of SATA 2: my ASUS P4C800-E Deluxe motherboard is old enough that it will not recognize a SATA 2 drive, and without a PCI-E slot there is, as far as I can tell, no point in getting a new SATA drive controller card.

But once that was past, my new environment was much better. I had lots of spare disk space. But things were still slow sometimes. I decided it was time to kill the trojan, or whatever, that seemed to be infesting my system. I also decided that I should go back to hardware RAID, since I don't back up my files enough.

But first things first. I called the help desk about my virus. We get our virus software from the University, which sensibly decided that it would better protect its network if it also protected the computers that most often interact with it: the students' and the staff's. They first upgraded me from our old Symantec software to the new “Symantec Endpoint Protection”. But that didn't seem to do anything. Using a netstat agent I could still see, from time to time, Firefox working its way down the series of ports. So I called back, and the UM help people sent me on to the Symantec help people.
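
The "working its way down the series of ports" pattern is the sort of thing one can check for mechanically. Below is a rough Python sketch, assuming you have already pulled a list of port numbers, in the order they were observed, out of whatever netstat-style tool you use. The function name and the sample numbers are hypothetical, and a long run is only a hint worth investigating, not proof of an infection.

```python
def longest_sequential_run(ports):
    """Length of the longest stretch in which each port differs from the
    previous one by exactly +1 (or exactly -1) throughout the stretch."""
    best = current = 1 if ports else 0
    direction = 0  # +1 ascending run, -1 descending run, 0 no run yet
    for prev, cur in zip(ports, ports[1:]):
        step = cur - prev
        if step in (1, -1) and step == direction:
            current += 1          # run continues in the same direction
        elif step in (1, -1):
            direction = step      # a new run starts with this pair
            current = 2
        else:
            direction = 0         # run broken
            current = 1
        best = max(best, current)
    return best

# Hypothetical observations: four ports in sequence, then unrelated ones
observed = [1025, 1026, 1027, 1028, 80, 443]
print(longest_sequential_run(observed))  # prints 4
```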

Contacting them took a little time, but once done a very competent sounding tech walked me through a few things, then announced I had an old version of the software, and should upgrade — by uninstalling my version and then installing a new one. He guided me to downloading the uninstall tool, and the install tool. These were big files, downloading veeery slowly, and I had to go to a meeting, so we ended the call. He warned me that the uninstall might take a couple of reboots.

When I got back, the files were there, and I ran the first one. It duly called for a reboot and I did it — only to get error messages and a lockup. I called back, and they said to reboot again. I did, it unfroze the machine, and they said to run it again. Which I did, at which point the disk wouldn't boot any more.

But no problem, I had my backup, the 160GB version. Nervously, I copied that version onto another 160GB disk I had spare (the old hardware RAID I used to run), then back onto the 750GB disk. But now that the two disks are in the system, with the 750GB disk on the second pair of SATA ports, which are RAID capable (but were properly set for ordinary non-RAID use in the BIOS), the Windows system on the first 160GB disk decided it needed to be reactivated. And Windows didn't give me a code to input or use when I called. And I couldn't find the Windows media. So that was a disaster, it seemed.

But the 750GB version worked. So that's good. But now I'm nervous, things seemed jinxed. So I order up a second WD7500AACS, and plan to RAID mirror them.

Diskeeper version 9 doesn't work on big disks. I get the 2008 edition of Diskeeper and install it. It says my MFT tables are almost full, I should grow and defrag them, so I tell it to go ahead. Nothing bad seems to happen as a result.

Now, time for extra backups. I'm a little nervous about hardware RAID, in part because I'm a little dyslexic. I have this nightmare that I'll take the real disk and copy the blank onto it and lose my data. I've never actually done this, but the RAID setup isn't a very friendly dialog, and somehow it feels like something I could do.

So I decided to make a software clone onto the new disk with XXClone, so that whichever way I copied the data would be OK. Both disks would have the right data, so whichever gets deleted, it wouldn't matter.

The new WD7500AACS arrived the other day, and this weekend I got around to formatting it preliminary to running XXClone to stuff it full of my data. I installed the disk, started up the format, and went off to do some stuff. When I got back, I found a blue screen of death, a 0024 failure (that I gather means a loose wire, something version one of the SATA hardware standard made all too easy). When I tried to reboot, I got a SMART drive error – the disk is bad. I flip some disks around. One of the 160GB disks won't boot either: “Disk error”. When the dust settles I have some very high-tech paperweights.

  • WD7500AACS #1 – SMART drive monitor says the disk is BAD
  • WD7500AACS #2 – disk error if I try to boot from it (which is hardly surprising, since it's probably not even formatted), but unrecognized by the machine if I put it as a second disk (which I don't understand, and yes I tried a wire I know works).
  • Samsung HD160JJ #1 – disk error.
  • Samsung HD160JJ #2 – boots up just fine (no Microsoft Activation issue perhaps because the other disks are all, from the OS's point of view, not there?).

I've lost 3 weeks or more of personal data, only most of which can be reconstructed. My work files, on the other hand, are either on a Unix server or on a USB stick, which I religiously back up at home and work, so that's OK. My personal financial info, which isn't backed up for the last 3+ weeks, I can recreate: the sad prospect of reclassifying a month of credit card transactions in Quicken will be followed by the fun of reliving the crash of my 403(b). There are a few other miscellaneous notes I've lost; I hope there's nothing really major.

I'm still on the old version of Symantec Endpoint Protection, and SP2. Having gone back in time, hard-disk-wise, I also again have the Flash 9 that hangs all the time instead of Flash 10, which doesn't. And a Quicken update. And various Firefox plugins. And don't let's even talk about when I'm going to install Windows XP Service Pack 3.

The WDs are brand new, so I guess there's warranty replacement, unless I want to schlep a long way to the good computer repair store and see if they can pull my financial records and some other notes off disk #1. I gather if SMART says “BAD” there generally isn't much one can do.

I'm not sure about the warranty status of the Samsungs.

The more important question is what I do next. I'm worried. As it happens, I do have one more large unformatted hard drive in the house, a WD5000AAKS, that I was going to use for a different machine. When I get my courage back, I think I'll try formatting that and cloning this last working drive onto it.

Meanwhile, Diskeeper version 9 (we're back to that) says I'm using 92% of my MFT and this is bad. But I'm afraid to touch it.

This entry was posted in Sufficiently Advanced Technology.

10 Responses to Surveying the Wreckage

  1. Shenton says:

    Wow, looks like a tough situation to be in. My condolences (if that helps).
    Just a few of my thoughts:
    (1) HDDs can often be highly unreliable. I had *two* 500GB Seagates die on me within a week of purchase, and they were barely used. Probably a bad batch... just my luck! QC seems to be highly erratic in this business.
    (2) Clean install is always the way to go, when it comes to Windows. I’ve found that out the hard way. My cloning attempts with ATI10 resulted in XP hosing one of my secondary drives recently (converted it into a dynamic disk :/ ). And yes, I followed the instructions.
    (3) Avoid the resource-hogging Symantec/Norton software like the plague. Always look for alternatives. I do not say this lightly.
    (4) Diskeeper is a fine product. I’ve been using it for 3 years (now on the 2008 pro version) without a single problem. It has always kept my drives fragmentation free. BTW, if you use the automatic defrag mode on V.2008, it automatically resizes the MFT for you as necessary.
    (5) Remove trojans/malware the moment they are detected. Don’t wait a sec longer.
    (6) Good luck!

  2. Ed Bott says:

    Michael, so sorry to hear of these woes. Contrary to what you think, stuff like this does happen to me, every so often. I think you made two profound mistakes here, starting with hardware that was probably defective and then trying to “clean” a bad OS installation instead of just starting fresh.

    If I were you, I certainly would consider picking up a SATA-to-USB converter and trying to hook one of the “bad” disks up to a different computer to see if anything can be recovered. They might not actually be bad; you might just be encountering some weird hardware-related issues with the disk controller.

    Also, I would strongly consider replacing that ancient PC. For under $400 you could get something that would blow the doors off the old 875P-based motherboard.

    And hardware RAID? I still hate it for desktop machines. Way, way, way more trouble than it’s worth.

    If you want any help, you know where to find me.

  3. Ed Bott says:

    PS, the fact that Diskeeper says you’re using 92% of your MFT is mostly irrelevant. The file system resizes the MFT on the fly if you need it when creating new files. So don’t touch it.

  4. wcw says:

    Seconding Shenton’s first two points:
    - modern HDD failure tends either to happen in the first week or so you own it, or not for a while, usually a few years. I have always assumed this pattern reflects the different failure patterns between physical damage (here in transit between manufacture and end user) and wear. Whatever the cause, I have been well served by assuming that all brand-new drives are failures waiting to happen for the first few weeks.
    - clean installs are best. This is true not only of Windows but also of substantially all OSs I have used, from BSD derivatives on down. Software reinstallation can be a huge pain, but it’s usually worth the effort.

    On security, I tend to see an actual infection of any sort as cue to wipe and reinstall. You may never have been hit with a rootkit, but (rather shamefully, if for less than four hours) I have. Don’t dismiss these things as minor little trojans and viruses. PCs are not human beings that get sniffles and recover. Someone else has installed something on your machine. You cannot know what else is on there now, and neither can your antivirus software. The only way to know you have your machine back is to wipe and reinstall.

  5. phb says:

    I will comment on this further maybe if I get my own desktop system back.

    I discovered a similar ‘issue’ with RAID not helping. In fact my machine currently has four drives attached, boots fine but the RAID array declares its state to be ‘rebuilding’ and has done for two weeks now. One of the drives is effectively a replacement that needs to be mirrored to its RAID pair but for some reason the designers of the array neglected to be fitted with brains. Rebuilding the array is a two-step process: first you tell the system to add the drive back into the array, next you tell it which one to ‘rebuild’. No instructions are supplied to tell the user whether rebuild is a command issued to the master or the blank mirror.

    Oh and it is impossible to get Vista to restore a backup created on a non-RAID machine onto a RAID array. The driver has to be mounted and the restore from backup option hasn’t a clue how to do that on the fly.

  6. C.E. Petit says:

    Two potential performance notes:

    (1) Do not let Windows “index” the disk, and turn off FindFast. In XP, open My Computer, right-click on the hard disk, and (depending on your particular install) it will be in either the top tab or another one under Properties. Uncheck the box. Wait (once).

    Windows’s native “indexing” is a huge performance drag, particularly on larger hard disks. Instead, you’re much better off using a good indexer like Copernic.

    (2) Consider dividing the big disks into multiple partitions. This is no panacea, but can be incredibly helpful when one can clearly divide “programs” from “data” (which, unfortunately, is not as easy as it sounds, especially if there’s any encryption involved).

  7. Ed Bott says:

    FindFast? Uh, that was a component in Office 97. Hasn’t been a part of any Microsoft software sold in this century. And yes, it used to be a good idea to turn it off on a Pentium 2 running Windows 95. But on reasonably modern hardware (including the five-year-old system described here) running a 32-bit OS like XP, there is no reason to even think twice about it.

  8. C.E. Petit says:

    Ed, I should have been clearer: FindFast is the internal process name still used by Office, and it’s so buggy that it can’t handle a disk over 256GB. I still refer to it that way because I do a lot of support for people with older machines and/or versions of Office.

    In any event, I should have said “any Windows native indexing service”. If you need an index, DO NOT use one that hooks in at the OS level for large (>200GB today) disks, especially if you’ve got any thought of using RAID.

  9. Sorry to hear your tale of woe. (And I thought I had problems…) I haven’t tried it with any sort of RAID, but FWIW, for partition-related tasks including cloning and backup (as well as the unusual ability to resize all types of partitions without damage to the data on them) I recommend BootIt NG. I’ve been using it for years, and it has always worked flawlessly, even when performing seemingly “dangerous” operations on live data.

  10. Wow that is horrible! I can’t imagine losing all of the data for my business and having to recreate a month’s worth of information. About 6 months ago I was taking kids that I coach on a bus trip and downloading pictures from the weekend onto my laptop. After I downloaded them one of the girls on the team asked if she could see them. She was looking at them and apparently got sidetracked. She ended up knocking my laptop off of her seat and down the stairs of the bus. As I’m sure you can guess… not good. I grabbed my computer and immediately tried to check if I’d lost my photos but the computer was frozen. There was no way that I could get it to unfreeze or turn off properly. My computer was toast. Luckily, after my sister had pointed out to me how devastating it would be if my computer failed and I somehow lost all of my photos, I looked into backup. I decided to get online backup because all of the data is automatically backed up for me. That way when I forget to actually back up my data it is OK because it’s done for me. To make a long story short, thanks to my online backup I was able to recover all of the photos from my backup provider to my brand new laptop. It was quick and the only pain was the fact that I had to buy a new computer. Best move I ever made! Look into it!
