Daily Archives: March 3, 2012

Unreported Disk Data Corruption – Kernel Bug?

Well this is new, and I’m utterly baffled. Here’s a file that’s not in use by anything.


$ md5sum xppro.vdi
589cbb5501dcddda047344a3550aaa95 xppro.vdi
$ md5sum xppro.vdi
a69806ec60d39e06473edbb0abd71637 xppro.vdi

Every time I run md5sum on it, I get a different answer. Same story with sha256sum. If I grab just the first 100MB, it gives the same answer each time. dmesg doesn’t show any sort of errors whatsoever during the time I’m running the tools. The file is 13GB, and was copied from one laptop to another (the new one being a Thinkpad T420s). The old laptop gives the same answer every time. The new one doesn’t.

I’ve put the file on different ext4 filesystems on the same machine (one using LUKS encryption, the other not, both under LVM) – same result. This will have also guaranteed different placement on the underlying hard disk.

I verified that nothing is modifying the file by using lsof and inotify. The system is a freshly-installed Debian wheezy running kernel 3.2.0-1-amd64. Any ideas how I go about troubleshooting/fixing this? So far I don’t know if it’s hardware or software, though my gut says software; SMART isn’t showing issues here, and the kernel didn’t log hardware issues, either.