[OCLUG-Tech] kernel retries too much reading a bad disk

I have a bad disk and am trying to copy whatever I can from it to
another disk with "dd conv=noerror ...", but I ran into a problem with
infinite retries.

I start it and it runs for a while, finds a bad spot, reports it, and
moves on, just as I expected.
Then it comes to some spot that seems to be a bit worse, and now it
hangs and the only way out is a reboot.
I did "strace -p pid_of_dd" and left it for some hours, but not a single
call was reported.
/var/log/messages shows how the kernel reads one block/sector 1-8 times,
then moves on to the next, and so on. I left it overnight and the only
thing that happened was that the sector numbers in the log kept counting
higher.
I powered off the disk (external eSATA case) and it still doesn't
release.

What's up with the kernel retrying all the time? Since it kept on
retrying even with the disk off, I'm guessing it would probably keep on
going until the max sector number is reached.

What can I do to have it try a few times and then move on?
Is there some other program than dd that I can read the disk with?
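
To make "try a few times and then move on" concrete, here is a minimal
sketch of the behaviour I am after, written in Python. The function name
copy_skipping_errors and the parameters block_size and max_tries are
made up for illustration. It only limits retries in user space; it
assumes each failed read eventually returns an error to the program, so
it would not by itself stop the retries the kernel does underneath a
single read.

```python
import os


def copy_skipping_errors(src_path, dst_path, block_size=4096, max_tries=3):
    """Copy src to dst block by block, zero-filling unreadable blocks.

    Returns the list of byte offsets that were given up on.
    All names and defaults here are illustrative assumptions.
    """
    bad_offsets = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        # Size of the source, so the loop ends even if every read fails.
        total = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < total:
                want = min(block_size, total - offset)
                data = None
                for _ in range(max_tries):
                    try:
                        data = os.pread(src, want, offset)
                        break
                    except OSError:
                        continue  # read failed; retry up to max_tries
                if data is None:
                    # Give up on this block but keep the output aligned.
                    bad_offsets.append(offset)
                    data = b"\0" * want
                elif not data:
                    break  # unexpected end of input
                dst.write(data)
                offset += len(data)
    finally:
        os.close(src)
    return bad_offsets
```

Usage would be something like copy_skipping_errors("/dev/sdb",
"/mnt/rescue/sdb.img"), where the image path is only an example.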


I'm running OpenSuse 10.1 with kernel 2.6.16.21-0.13

Sample from /var/log/messages:

Sep 17 21:07:08 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:08 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:08 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:08 picard kernel: sd 3:0:0:0: SCSI error: return code = 0x8000002
Sep 17 21:07:08 picard kernel: sdb: Current: sense key: Medium Error
Sep 17 21:07:08 picard kernel:     Additional sense: Unrecovered read error - auto reallocate failed
Sep 17 21:07:08 picard kernel: end_request: I/O error, dev sdb, sector 10290296
Sep 17 21:07:08 picard kernel: Buffer I/O error on device sdb, logical block 1286287
Sep 17 21:07:12 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:12 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:12 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:16 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:16 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:16 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:20 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:20 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:20 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:24 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:24 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:24 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:27 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:27 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:27 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:31 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:31 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:31 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:31 picard kernel: sd 3:0:0:0: SCSI error: return code = 0x8000002
Sep 17 21:07:31 picard kernel: sdb: Current: sense key: Medium Error
Sep 17 21:07:31 picard kernel:     Additional sense: Unrecovered read error - auto reallocate failed
Sep 17 21:07:31 picard kernel: end_request: I/O error, dev sdb, sector 10290304
Sep 17 21:07:31 picard kernel: Buffer I/O error on device sdb, logical block 1286288
Sep 17 21:07:35 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:35 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:35 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:39 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:39 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:39 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:43 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:43 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:43 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:46 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:46 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:46 picard kernel: ata2: error=0x40 { UncorrectableError }
Sep 17 21:07:50 picard kernel: ata2: translated ATA stat/err 0x51/40 to SCSI SK/ASC/ASCQ 0x3/11/04
Sep 17 21:07:50 picard kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Sep 17 21:07:50 picard kernel: ata2: error=0x40 { UncorrectableError }
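
To put numbers on "reads one block/sector 1-8 times", a rough sketch
like the following could count the failed attempts logged before each
"end_request" line. The function name attempts_per_sector and both
regular expressions are my own assumptions, based only on the message
format in the sample above.

```python
import re

# Patterns are guesses based on the log sample, not a documented format.
SECTOR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
ATA_ERR_RE = re.compile(r"ata\d+: error=0x[0-9a-fA-F]+")


def attempts_per_sector(lines):
    """Return ([(dev, sector, failed_attempts), ...], trailing_attempts).

    failed_attempts counts the 'ataN: error=0x..' lines seen since the
    previous end_request line; trailing_attempts counts those seen after
    the last end_request line.
    """
    results = []
    attempts = 0
    for line in lines:
        if ATA_ERR_RE.search(line):
            attempts += 1
            continue
        match = SECTOR_RE.search(line)
        if match:
            results.append((match.group(1), int(match.group(2)), attempts))
            attempts = 0
    return results, attempts
```

It could be fed with open("/var/log/messages"). The count for the first
sector in an excerpt may be too low, since earlier attempts can be cut
off.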


BTW, what's the difference between a sector and a block? I thought I
knew, but looking in the log the numbers don't make sense: sector
10290296 = block 1286287.
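
For what it's worth, both sector/block pairs in the log differ by
exactly a factor of 8. That would fit "sector" meaning a 512-byte unit
and "logical block" meaning a 4096-byte unit, but both sizes are
assumptions on my part; the log does not state them. A small sketch of
that arithmetic (sector_to_block is a made-up helper):

```python
SECTOR_SIZE = 512   # bytes per sector (assumed, not stated in the log)
BLOCK_SIZE = 4096   # bytes per logical block (assumed, not stated in the log)


def sector_to_block(sector, sector_size=SECTOR_SIZE, block_size=BLOCK_SIZE):
    """Convert a sector number to the logical block that contains it."""
    return sector * sector_size // block_size


print(sector_to_block(10290296))  # 1286287
print(sector_to_block(10290304))  # 1286288
```

Both results match the "logical block" values logged next to those
sector numbers.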



-- 
-------------------------------------------------------------------
Techwiz, Peter Sjoberg    PGP key (12F506C8) on keyserver & homepage
Key fingerprint =  3DC2 CEBA 1590 B41A 3780  955A DB42 02BB 12F5 06C8
mailto:peters AT techwiz.ca http://www.techwiz.ca/~peters