Laptop broken, how do I diagnose what is wrong?

So, one fucked laptop here. Machine is a Toshiba Satellite A300, four-five years old, all drivers up to date, no software or hardware changes between Sunday and today. Updated to Win 8 six months ago, has run passably until today, with occasional crashes on hibernation.

Rather odd symptoms: it will boot fine and let me sign in, all the way to the Win8 Start screen, clicking on the Desktop gets me the desktop, but then no further responsiveness. No response to mouse clicks, key presses, CTRL-ALT-DEL. So then:

  • Boot into safe mode, works fine.
  • Try refresh, can’t make a refresh disk in safe mode, can’t do anything in unsafe mode. No refresh possible.
  • Try reinstalling Win 8. No reinstall disk, running the Win8 setup utility from safe mode fails with no error message.
  • Umm… fucked if I know…

So, lazyweb, what should I do to try to diagnose what the problem is? Or should I just take it to the Toshiba service place in town, who, truth be told, have been great at fixing hardware problems on several Toshiba laptops? Or should I just stump up an irritatingly large chunk of cash to replace a laptop that’s only four years old?

Before anyone says “don’t use Windows”, I’ll just point out that I’ve been using Linux since Redhat 5.2 and the current set of mechanical or electrical design software for Linux is a heap of shite. And I trust you all know me well enough not to suggest I use a Mac.

24 thoughts on “Laptop broken, how do I diagnose what is wrong?”

  1. My psychic diagnosis is that the disk has gone bad in one section, and it happens to be a section that is involved in displaying additional things after the desktop. Since both the disk and Windows will retry reading from the disk many times (think well over 100 between them), with pretty long timeouts, a non-responsive disk can be almost indistinguishable from a frozen machine. (FWIW, my usual in-front-of-keyboard test for “is it completely frozen” is “does capslock toggle the LED”, since that’s one of the few things that doesn’t normally trigger a disk access; remotely my usual test is “ping”: if you can ping it, but not get it to launch new things, it could be a disk issue or a userspace issue.)

    Given the things you’ve already tried, my next move would be to boot a Linux LiveCD and run the SMART tools (eg, “smartctl -d ata -s on /dev/sda; smartctl -d ata -a /dev/sda”) to see if the disk is reporting problems. If nothing is immediately apparent there, I’d run a short and/or long SMART disk check (ie surface scan) (“smartctl -d ata -t short /dev/sda” or “smartctl -d ata -t long /dev/sda”, wait at least the time it tells you, then run “smartctl -d ata -a /dev/sda” again to see the updated report, particularly the error log at the end and the test results near the end). I’d guess that the surface scan will show read errors. (The “short” scan usually takes about 2 minutes and reads a sampling of sectors from the disk; the “long” scan will read everything and usually take an hour or more.)

    Assuming it’s the disk, the routine is (a) copy as much as possible (eg, while running said Linux LiveCD) onto another disk (eg, an external drive/another machine over the network), (b) install a new disk, possibly involving a technician — but beware most service desks will (1) declare success as soon as they’ve reinstalled Windows on the new disk, and (2) probably fail to give you the old disk back, assuming you’ll restore from backup. Then (c) copy the things you managed to rescue/things from your backup onto the new disk. On most disks that have only recently failed you can salvage 50-90% of the contents by mounting the disk read-only and copying folder by folder then file by file to fill in the gaps.

    Ewen

    PS: While I was there I’d probably run the memory tester. But it doesn’t feel like a memory error from what you describe.

    PPS: 4-5 years is about the median reasonable life for a hard drive. I’ve had some fail in under 6 months. And just last week had one fail that was about 2.5 years old.
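
    The SMART check above could be wrapped in a small script, roughly as follows. This is only a sketch: the function names are made up for illustration, /dev/sda is simply the device used in the commands above and may differ on your machine, and the meaning of each exit status bit is taken from the smartctl man page. Bits 0 to 2 mean the command itself had trouble; bits 3 to 7 point at the disk.

    ```shell
    # Sketch only: decode the exit status bitmask documented in the smartctl man page.
    explain_smartctl_status() {
        status=$1
        if [ "$status" -eq 0 ]; then
            echo "no problems flagged"
            return 0
        fi
        if [ $((status & 1)) -ne 0 ]; then echo "command line did not parse"; fi
        if [ $((status & 2)) -ne 0 ]; then echo "device open failed"; fi
        if [ $((status & 4)) -ne 0 ]; then echo "a SMART command failed or a checksum was bad"; fi
        if [ $((status & 8)) -ne 0 ]; then echo "SMART status says DISK FAILING"; fi
        if [ $((status & 16)) -ne 0 ]; then echo "prefail attributes at or below threshold"; fi
        if [ $((status & 32)) -ne 0 ]; then echo "attributes were at or below threshold in the past"; fi
        if [ $((status & 64)) -ne 0 ]; then echo "device error log contains errors"; fi
        if [ $((status & 128)) -ne 0 ]; then echo "self-test log contains errors"; fi
        return 0
    }

    # Hypothetical wrapper: enable SMART, print the full report, then
    # summarise the exit status. Needs root and smartmontools installed.
    smart_report() {
        device=${1:-/dev/sda}   # assumption: the disk shows up as /dev/sda
        smartctl -d ata -s on "$device"
        rc=0
        smartctl -d ata -a "$device" || rc=$?
        explain_smartctl_status "$rc"
    }
    ```

    Run “smart_report /dev/sda” as root from the LiveCD. After starting a short or long self-test and waiting the time it tells you, run it again to see whether the error log or self-test log bits are now set.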

    1. FYI, I’ve used System Rescue CD for that sort of thing before (and there are a bunch of others that I’ve not tried). But any Linux CD you have handy that will boot on the hardware, recognise the disk, and run as a LiveCD should have “smartmontools” or similar on it. Ubuntu and Knoppix CDs tend to work fine, but will take longer to boot as they start a full GUI environment (from a relatively slow CD).

      It’s possible that the SMART log and short test will show no issues, but the long test will turn up something, so it might be worth the time to run the long test too if nothing else shows. But I’ve usually found that by the time the disk gets bad enough to cause noticeable problems in day-to-day usage, it’ll be filling the SMART log with errors and the short test will rapidly find a problem.

      And I would genuinely try to copy as much as possible off the disk to something else, starting with the things you most want to save (eg, “more recent than backups” versions of things). With many of my failing disks I’ve managed to copy almost everything I didn’t have in a backup off them.

      Ewen
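
      The “copy as much as possible” step could look something like this from the LiveCD. It is a rough sketch with assumptions: the function name and the log file name are invented for illustration, the old disk is assumed to be already mounted read-only somewhere, and file names containing newlines are not handled. Each file is copied on its own, so one unreadable file does not stop the rest, and the failures are logged for a later retry.

      ```shell
      # Assumption: the old disk is mounted read-only first, for example
      #   mount -o ro /dev/sda2 /mnt/old
      # (the partition name and mount point here are hypothetical).

      # rescue_copy SRC DEST: copy a tree file by file, carrying on past read errors.
      # Paths that fail to copy are appended to DEST/rescue-failed.log.
      rescue_copy() {
          src=$1
          dest=$2
          mkdir -p "$dest" || return 1
          # Run find from inside SRC so the paths come out relative (./dir/file).
          (cd "$src" && find . -type f) | while IFS= read -r rel; do
              mkdir -p "$dest/$(dirname "$rel")"
              cp -p "$src/$rel" "$dest/$rel" 2>/dev/null ||
                  printf '%s\n' "$rel" >> "$dest/rescue-failed.log"
          done
      }
      ```

      Call it once per folder, most important first, for example “rescue_copy /mnt/old/Users/me/Documents /mnt/rescue/Documents” (both paths are made up), then go back through rescue-failed.log to fill in the gaps.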

      1. FWIW, the caps lock LED is under the control of the keyboard driver in the OS — so being able to toggle the caps lock LED, with the caps lock key, means that parts of the OS are still running (and hence the CPU isn’t completely wedged).

        Ewen

      4. Full removal of battery and power cable, hold the power button down for a while, then release it, put the battery back etc. and power it up?

        Just because it seems like some sort of PCI/USB conflict or something.

  4. Also – if you haven’t tried it yet, try it without the power supply plugged in. It’s less common in laptops for obvious reasons, but sometimes this sort of thing can be caused by a faulty PSU.
