How to debug a Linux failure to resume from suspend?

I’m running a computer with a Gigabyte Z68A-D3H-B3 motherboard, and have never been able to get it to properly resume from suspend to RAM in Linux. It has worked fine on the rare occasion I’ve tried it in Windows 7.

My somewhat limited usual debugging techniques aren’t particularly helpful. The system appears to suspend perfectly fine. It just doesn’t resume. To be more precise, when I push the button to resume, the power comes up (fans whir, HDD spins up, etc.) but nothing else happens. The USB keyboard and mouse don’t respond, Caps Lock doesn’t toggle any LEDs, the machine doesn’t respond on the wired LAN, and the display stays off.

Although it’s a desktop, I’d really like to save power on this thing by suspending it when it’s not in use. There’s no sense in wasting power I don’t need to be consuming.

I’ve tried what I used to try on laptops. I tried running in single-user mode, without X, or even the kernel modules for video acceleration loaded. I tried unloading whatever hardware modules I thought I could without completely destabilizing the system. I updated the BIOS to the latest release. I tried various combinations of video tweaks. I tried using s2ram from uswsusp instead of pm-suspend. Nothing made any difference. They all behaved exactly the same.

Googling showed a lot of resources for people that had trouble getting their machines to go to sleep. And also for people whose machines would wake up but just wouldn’t re-activate the display. But precious little for people with my particular symptoms.

What’s a good place to start looking to fix something like this?

Some details…

CPU is Core i5-2400. Kernel is wheezy’s 3.2.0-2-amd64, though this problem has persisted as long as I’ve had this machine, which was running squeeze at install time. Video is NVidia GeForce GTX 560 (GF114). Hard drives are SATA, Ethernet is integrated RTL8111/8168B. Userland is up-to-date amd64 wheezy.

17 thoughts on “How to debug a Linux failure to resume from suspend?”

  1. How are you suspending? Try switching to the first virtual terminal (Ctrl-Alt-F1) and then running “sudo pm-suspend”. That should do a proper suspend; then try to wake it up.

    1. John Goerzen says:

      I’ve tried that. In fact, I’ve tried it in single-user mode where X never even had the chance to load. And I’ve tried it with a bunch of modules unloaded. That exact pm-suspend command.

  2. jisakiel says:

    Perhaps you might try netconsole and configuring syslog to log to a different computer, as it doesn’t seem X-related…
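
As a rough sketch of the netconsole suggestion: the kernel's netconsole module takes a parameter of the form `[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`. The helper below only assembles that string so it can be pasted onto the kernel command line or passed to `modprobe netconsole`. The function name, the interface name `eth0`, and the addresses are made up for illustration.

```python
def netconsole_param(target_ip, target_mac="", target_port=6666,
                     source_ip="", source_port=6665, device="eth0"):
    """Build a value for the kernel's netconsole= parameter.

    Format, per the kernel's netconsole documentation:
        [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
    An empty target MAC means the messages are sent to the broadcast address.
    """
    if not target_ip:
        raise ValueError("netconsole needs a target IP address")
    source = f"{source_port}@{source_ip}/{device}"
    target = f"{target_port}@{target_ip}/{target_mac}"
    return f"netconsole={source},{target}"


# Hypothetical addresses: the suspending box is .10, the log receiver is .20.
print(netconsole_param("192.168.1.20", source_ip="192.168.1.10"))
```

The receiving machine then needs something listening on the chosen UDP port (netcat or a syslog daemon, for instance). If the hang happens before the network driver has resumed, nothing may arrive, so an empty log is not conclusive.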

  3. Anonymous says:

    Write up what you tested, and post it to LKML with CCs to Rafael Wysocki and Pavel Machek, requesting assistance debugging the resume failure and offering to try any troubleshooting or information-gathering steps they might suggest.

  4. Ben Hutchings says:

    I don’t know whether you’ve gone through this already, but: http://www.kernel.org/doc/Documentation/power/basic-pm-debugging.txt
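
For reference, the core of that document is the `/sys/power/pm_test` file (available when the kernel is built with `CONFIG_PM_DEBUG`): writing a test level to it makes the next suspend stop at that level, wait a few seconds, and resume. A minimal sketch of stepping through the levels follows; the function names are hypothetical, and it assumes root privileges and that the installed kernel offers these levels.

```python
from pathlib import Path

# Test levels listed in basic-pm-debugging.txt, shallowest first.
PM_TEST_LEVELS = ["freezer", "devices", "platform", "processors", "core"]


def available_levels(power_dir):
    """Return the test levels this kernel offers."""
    text = (Path(power_dir) / "pm_test").read_text()
    # The file reads like "[none] core processors platform devices freezer";
    # the brackets mark the currently selected level.
    return [word.strip("[]") for word in text.split()]


def run_test_suspend(level, power_dir="/sys/power", state="mem"):
    """Select a pm_test level, then start a suspend that stops at that level."""
    power = Path(power_dir)
    if level not in available_levels(power):
        raise ValueError(f"kernel does not offer pm_test level {level!r}")
    (power / "pm_test").write_text(level)
    # On a real system this write blocks until the test cycle has finished.
    (power / "state").write_text(state)


def step_through_levels(power_dir="/sys/power"):
    """Run every level in order; the first one that hangs narrows the search."""
    for level in PM_TEST_LEVELS:
        print(f"running pm_test level {level!r}")
        run_test_suspend(level, power_dir)
        print(f"{level}: came back")
    # Restore normal suspend behaviour afterwards.
    (Path(power_dir) / "pm_test").write_text("none")
```

Calling `step_through_levels()` as root from a text console is the intended use. If every level comes back, the failure is likely in the platform firmware or the final wake-up path rather than in a driver, which is itself useful information to include in a bug report.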

  5. John Goerzen says:

    Some good tips there, Ben. Thanks.

    I don’t think the syslog will help; the problem seems to be on the way up, and the network is broken at that point, though I suppose it wouldn’t hurt to validate that assumption.

  6. Paul Hedderly says:

    Do you have a serial port on that machine? You could try setting kernel logging/console to ttyS0 and, obviously, watch that with another machine.

    If that mobo doesn’t have serial… lament!

  7. eMHa says:

    Have you tried to suspend/resume using an up-to-date Live-CD (Debian, Ubuntu, or any other distro)? That’s what I usually “try on laptops” to figure out where the problem might be.

  8. gena2x says:

    The trickiest thing in the kernel is debugging suspend/resume.

    Things to try for ordinary users are:
    1. Try different kernel versions. Downgrade, upgrade.
    2. Try to disable as much hardware as possible (for example, just move kernel modules out of the way).
    A kernel hacker would:
    1. Try a serial console to view the kernel log.
    2. Try to save the kernel log to some storage and replay it on reboot.
    3. Try to disable the framebuffer to see the log.
    4. Try to enable kernel debugging options.

    Hope those options are enough to entertain you for a few days =)
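
A related trick from the kernel's power-management documentation is `pm_trace`. It does not save a full log; instead it stores a hash of the last suspend/resume step in the real-time clock, and the next boot prints which source line and device match that hash. It is enabled by writing `1` to `/sys/power/pm_trace` before suspending, and it overwrites the RTC time, so the clock needs resetting afterwards. The sketch below only filters the relevant lines out of `dmesg` output after the reboot. The function names are hypothetical, and the exact message wording may differ between kernel versions.

```python
import re
import subprocess

# Markers the kernel prints at boot after a pm_trace run.
_PM_TRACE_PATTERN = re.compile(r"Magic number:|hash matches")


def pm_trace_lines(dmesg_text):
    """Return the dmesg lines that carry pm_trace results."""
    return [line.strip() for line in dmesg_text.splitlines()
            if _PM_TRACE_PATTERN.search(line)]


def read_dmesg():
    """Capture the current kernel log (may need root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout
```

After a hung resume and a prompt reboot, `pm_trace_lines(read_dmesg())` should list any matches. A device named in a "hash matches" line is only a suspect, since different devices can hash to the same value.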

  9. Aaron Brooks says:

    Because of the disappearance of legacy serial devices, the USB EHCI standard has an optional specification for EHCI debug ports. A good starting point is this page: http://www.coreboot.org/EHCI_Debug_Port though you’ll probably need to dig through the current kernel docs and, if things are in the same state as the last time I used one of these, the kernel sources. (I also had to do some minor patching to get things to work.)

    The device I have is an Ajays Tech NET20DC: http://www.ajaystech.com/net20dc.htm . Having worked with several of these I’ve found that they can vary in the orientation of the primary and secondary interfaces in a way not observable from the outer case. You may need to turn things around. The only difference that I’ve found between the primary and secondary interface is that the device draws power from the primary interface and, usually, it’s more convenient to have the primary interface connected to the debugging host rather than the debuggee so the device remains powered.

    Not all motherboards support these devices but all the ones that I’ve tried so far do. The main trick is to figure out which USB port is the 0th port as only that port will support the debug interface. On desktops this is likely accomplishable. With notebooks, you are at the mercy of the system designer to wire up and provide a USB connector to the 0th port. Note that most motherboards have additional headers for USB interfaces that may not be connected to the backplate or the chassis. Also, the debug device will not work through a hub.

    I wish I remembered the details of the correct kernel commandline options for EHCI debug support but those are gone from my memory.

    Good luck!

    -A.

  10. Christian says:

    Try adding this to the kernel boot command line; it solved a similar problem for me. No idea what it does but it doesn’t seem to affect performance.

    acpi_osi=linux noapic

  11. Paul Menzel says:

    Were you able to solve this problem?

  12. Tim H says:

    I just “upgraded” from Ubuntu Lucid to Precise. Under Lucid, I had 100% success with suspend and resume. Under Precise it kernel panics on resume every time (blinking keyboard lights, no display). It panics about 2 seconds after pressing the power button.

    Of course I have no serial port on this computer.

    So basically I am screwed, as best I can tell. This is a pretty critical feature for me.

  13. caique_ti says:

    In my case, after resuming, the netbook works normally until it shuts down automatically, with no logs. =/
