[SOLVED] SATA hard drive error

I bought a SATA hard drive in October or November, I can't remember which, along with a SATA controller card (sil3512 chip).

Since then, I have occasionally had "disconnections" of the drive.

At one point I swapped the drive's location with that of my old (PATA) drive, which meant replugging everything, and the errors went away for a while. When they came back, I checked the connections once more and they looked fine.

And today it's doing it again >.<

So I'd like to know where the problem comes from: is it the drive itself, in which case I bought a defective one, or is it the cabling? (I find the fact that SATA cables have no "locking" mechanism really cr*ppy :reflechis: ).

Anyway, here are the latest error messages I pulled from my /var/log/syslog, along with the SMART report (smartctl -a /dev/sda).

PS: for the record, I'm running Debian sid, and the drive in question has a single ext3 partition.

Thanks in advance.

(And I hope my message is clear enough and in the right section; I originally wanted to post it in the Linux section :transpi: )

Syslog extract:

Feb 15 21:40:01 redfox kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x0
Feb 15 21:40:01 redfox kernel: ata1.00: BMDMA2 stat 0x86d2209
Feb 15 21:40:01 redfox kernel: ata1: SError: { 10B8B BadCRC }
Feb 15 21:40:01 redfox kernel: ata1.00: cmd 25/00:00:c7:97:04/00:01:10:00:00/e0 tag 0 dma 131072 in
Feb 15 21:40:01 redfox kernel:		  res 51/04:00:c7:97:04/00:01:10:00:00/e0 Emask 0x1 (device error)
Feb 15 21:40:01 redfox kernel: ata1.00: status: { DRDY ERR }
Feb 15 21:40:01 redfox kernel: ata1.00: error: { ABRT }
Feb 15 21:40:01 redfox kernel: ata1.00: configured for UDMA/100
Feb 15 21:40:01 redfox kernel: ata1: EH complete
Feb 15 21:40:01 redfox kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Feb 15 21:40:01 redfox kernel: sd 0:0:0:0: [sda] Write Protect is off
Feb 15 21:40:01 redfox kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Feb 15 21:40:01 redfox kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb 15 21:40:41 redfox kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x2 frozen
Feb 15 21:40:41 redfox kernel: ata1: SError: { 10B8B BadCRC }
Feb 15 21:40:41 redfox kernel: ata1.00: cmd 35/00:08:4f:00:d0/00:00:21:00:00/e0 tag 0 dma 4096 out
Feb 15 21:40:41 redfox kernel:		  res 40/00:00:c7:97:04/00:01:10:00:00/e0 Emask 0x4 (timeout)
Feb 15 21:40:41 redfox kernel: ata1.00: status: { DRDY }
Feb 15 21:40:46 redfox kernel: ata1: port is slow to respond, please be patient (Status 0xd8)
Feb 15 21:40:51 redfox kernel: ata1: device not ready (errno=-16), forcing hardreset
Feb 15 21:40:51 redfox kernel: ata1: hard resetting link
Feb 15 21:40:51 redfox kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Feb 15 21:40:52 redfox kernel: ata1.00: configured for UDMA/100
Feb 15 21:40:52 redfox kernel: ata1: EH complete
Feb 15 21:40:52 redfox kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Feb 15 21:40:52 redfox kernel: sd 0:0:0:0: [sda] Write Protect is off
Feb 15 21:40:52 redfox kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Feb 15 21:40:52 redfox kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb 15 21:40:52 redfox kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x2 frozen
Feb 15 21:40:52 redfox kernel: ata1: SError: { 10B8B BadCRC }
Feb 15 21:40:52 redfox kernel: ata1.00: cmd 25/00:00:17:4c:77/00:01:2a:00:00/e0 tag 0 dma 131072 in
Feb 15 21:40:52 redfox kernel:		  res ff/ff:ff:ff:ff:ff/ff:ff:ff:ff:ff/ff Emask 0x2 (HSM violation)
Feb 15 21:40:52 redfox kernel: ata1.00: status: { Busy }
Feb 15 21:40:52 redfox kernel: ata1.00: error: { ICRC UNC IDNF ABRT }
Feb 15 21:40:58 redfox kernel: ata1: port is slow to respond, please be patient (Status 0xff)
Feb 15 21:41:02 redfox kernel: ata1: device not ready (errno=-16), forcing hardreset
Feb 15 21:41:02 redfox kernel: ata1: hard resetting link
Feb 15 21:41:03 redfox kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Feb 15 21:41:03 redfox kernel: ata1.00: configured for UDMA/100
Feb 15 21:41:03 redfox kernel: ata1: EH complete
Feb 15 21:41:03 redfox kernel: sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Feb 15 21:41:03 redfox kernel: sd 0:0:0:0: [sda] Write Protect is off
Feb 15 21:41:03 redfox kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Feb 15 21:41:03 redfox kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
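
For anyone wanting to summarize an extract like the one above, here is a minimal sketch that counts the SError flags and the Emask reasons reported by libata. The function name and the regular expressions are illustrative assumptions based only on the line formats visible in this log, not an official parser.

```python
import re
from collections import Counter

# Matches the flag list in lines such as "ata1: SError: { 10B8B BadCRC }".
SERROR_RE = re.compile(r"SError: \{ ([^}]*)\}")
# Matches the reason in lines such as "... Emask 0x4 (timeout)".
EMASK_RE = re.compile(r"Emask 0x[0-9a-f]+ \(([^)]+)\)")


def summarize_ata_errors(lines):
    """Count SError flags and Emask reasons found in kernel log lines."""
    flags = Counter()
    reasons = Counter()
    for line in lines:
        m = SERROR_RE.search(line)
        if m:
            flags.update(m.group(1).split())
        m = EMASK_RE.search(line)
        if m:
            reasons[m.group(1)] += 1
    return flags, reasons


# A few lines copied from the extract above.
sample = [
    "Feb 15 21:40:01 redfox kernel: ata1: SError: { 10B8B BadCRC }",
    "Feb 15 21:40:01 redfox kernel:   res 51/04:00:c7:97:04/00:01:10:00:00/e0 Emask 0x1 (device error)",
    "Feb 15 21:40:41 redfox kernel: ata1: SError: { 10B8B BadCRC }",
    "Feb 15 21:40:41 redfox kernel:   res 40/00:00:c7:97:04/00:01:10:00:00/e0 Emask 0x4 (timeout)",
]
flags, reasons = summarize_ata_errors(sample)
print(flags)
print(reasons)
```

If BadCRC shows up on every exception, as it does here, that points toward errors on the link between controller and drive rather than on the platters, although it does not say which end of the link is at fault.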

smartctl -a /dev/sda

smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is

Model Family:	 Seagate Barracuda 7200.10 family
Device Model:	 ST3500630AS
Serial Number:	9QG47TEH
Firmware Version: 3.AAK
User Capacity:	500 107 862 016 bytes
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:	Fri Feb 15 21:43:56 2008 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
									was completed without error.
									Auto Offline Data Collection: Enabled.
Self-test execution status:	  (   0) The previous self-test routine completed
									without error or no self-test has ever
									been run.
Total time to complete Offline
data collection:				 ( 430) seconds.
Offline data collection
capabilities:					(0x5b) SMART execute Offline immediate.
									Auto Offline data collection on/off support.
									Suspend Offline collection upon new
									Offline surface scan supported.
									Self-test supported.
									No Conveyance Self-test supported.
									Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
									power-saving mode.
									Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
									General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   1) minutes.
Extended self-test routine
recommended polling time:		( 163) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
 1 Raw_Read_Error_Rate	 0x000f   108   094   006	Pre-fail  Always	   -	   216083163
 3 Spin_Up_Time			0x0003   093   093   000	Pre-fail  Always	   -	   0
 4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   178
 5 Reallocated_Sector_Ct   0x0033   100   100   036	Pre-fail  Always	   -	   0
 7 Seek_Error_Rate		 0x000f   076   060   030	Pre-fail  Always	   -	   48334551
 9 Power_On_Hours		  0x0032   099   099   000	Old_age   Always	   -	   1196
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   179
187 Unknown_Attribute	   0x0032   100   100   000	Old_age   Always	   -	   0
189 Unknown_Attribute	   0x003a   100   100   000	Old_age   Always	   -	   0
190 Temperature_Celsius	 0x0022   066   052   045	Old_age   Always	   -	   605552674
194 Temperature_Celsius	 0x0022   034   048   000	Old_age   Always	   -	   34 (Lifetime Min/Max 0/20)
195 Hardware_ECC_Recovered  0x001a   056   051   000	Old_age   Always	   -	   211340190
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0
199 UDMA_CRC_Error_Count	0x003e   200   153   000	Old_age   Always	   -	   238
200 Multi_Zone_Error_Rate   0x0000   100   253   000	Old_age   Offline	  -	   0
202 TA_Increase_Count	   0x0032   100   253   000	Old_age   Always	   -	   0

SMART Error Log Version: 1
ATA Error Count: 204 (device log contains only the most recent five errors)
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 204 occurred at disk power-on lifetime: 1196 hours (49 days + 20 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 -- -- -- -- -- -- --
 84 51 ef 28 4c 77 e0  Error: ICRC, ABRT 239 sectors at LBA = 0x00774c28 = 7818280

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 25 00 00 17 4c 77 e0 00	  02:11:06.963  READ DMA EXT
 25 00 00 17 4b 77 e0 00	  02:11:06.994  READ DMA EXT
 25 00 00 17 4a 77 e0 00	  02:11:06.991  READ DMA EXT
 25 00 00 17 49 77 e0 00	  02:11:06.989  READ DMA EXT
 25 00 00 17 48 77 e0 00	  02:11:06.987  READ DMA EXT

Error 203 occurred at disk power-on lifetime: 1196 hours (49 days + 20 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 -- -- -- -- -- -- --
 84 51 1f a8 98 04 e0  Error: ICRC, ABRT 31 sectors at LBA = 0x000498a8 = 301224

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 25 00 00 c7 97 04 e0 00	  02:10:15.588  READ DMA EXT
 ca 00 08 67 41 00 e0 00	  02:10:15.573  WRITE DMA
 ca 00 40 27 41 00 e0 00	  02:10:15.521  WRITE DMA
 c8 00 08 6f 68 b3 e8 00	  02:10:15.489  READ DMA
 c8 00 10 ef 2e b3 e8 00	  02:10:15.843  READ DMA

Error 202 occurred at disk power-on lifetime: 1192 hours (49 days + 16 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 -- -- -- -- -- -- --
 84 51 00 d6 14 65 e0  Error: ICRC, ABRT at LBA = 0x006514d6 = 6624470

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 25 00 a0 37 13 65 e0 00	  07:17:08.591  READ DMA EXT
 25 00 68 cf 12 65 e0 00	  07:17:08.591  READ DMA EXT
 35 00 08 b7 00 4c e0 00	  07:17:08.591  WRITE DMA EXT
 ca 00 08 e7 3c 00 e0 00	  07:17:02.856  WRITE DMA
 ca 00 08 df 3c 00 e0 00	  07:17:02.834  WRITE DMA

Error 201 occurred at disk power-on lifetime: 1189 hours (49 days + 13 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 -- -- -- -- -- -- --
 84 51 8f a8 f1 62 e0  Error: ICRC, ABRT 143 sectors at LBA = 0x0062f1a8 = 6484392

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 25 00 00 37 f1 62 e0 00	  04:24:14.104  READ DMA EXT
 25 00 a0 97 f0 62 e0 00	  04:24:14.104  READ DMA EXT
 25 00 00 ff f0 62 e0 00	  04:24:08.735  READ DMA EXT
 35 00 08 b7 00 4c e0 00	  04:23:57.096  WRITE DMA EXT
 ca 00 08 3f 35 00 e0 00	  04:23:57.095  WRITE DMA

Error 200 occurred at disk power-on lifetime: 1183 hours (49 days + 7 hours)
 When the command that caused the error occurred, the device was active or idle.

 After command completion occurred, registers were:
 -- -- -- -- -- -- --
 84 51 7f a8 63 65 e0  Error: ICRC, ABRT 127 sectors at LBA = 0x006563a8 = 6644648

 Commands leading to the command that caused the error were:
 CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
 -- -- -- -- -- -- -- --  ----------------  --------------------
 25 00 a0 87 63 65 e0 00	  08:19:37.189  READ DMA EXT
 25 00 60 27 63 65 e0 00	  08:19:37.189  READ DMA EXT
 25 00 08 1f 63 65 e0 00	  08:19:31.719  READ DMA EXT
 25 00 08 bf 00 4c e0 00	  08:16:05.022  READ DMA EXT
 35 00 08 67 00 4c e0 00	  08:15:33.809  WRITE DMA EXT

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline	   Completed without error	   00%	  1054		 -
# 2  Short offline	   Completed without error	   00%	   954		 -
# 3  Short offline	   Completed without error	   00%	   696		 -
# 4  Short offline	   Completed without error	   00%	   695		 -
# 5  Short offline	   Completed without error	   00%	   593		 -
# 6  Extended offline	Completed without error	   00%	   518		 -
# 7  Short offline	   Completed without error	   00%	   447		 -

SMART Selective self-test log data structure revision number 1
1		0		0  Not_testing
2		0		0  Not_testing
3		0		0  Not_testing
4		0		0  Not_testing
5		0		0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
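
To read a report like this one more quickly, a rough sketch such as the following pulls the raw values of a few attributes out of the table and compares the media-related counters (5, 197, 198) with the interface CRC counter (199). The helper names and the classification rule are assumptions made for illustration; the heuristic is a hint, not a diagnosis.

```python
def parse_smart_attributes(text):
    """Return {attribute_id: (name, raw_value_string)} from smartctl attribute rows."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split(None, 9)
        # An attribute row has 10 columns: a numeric ID first, a hex flag third.
        if len(parts) == 10 and parts[0].isdigit() and parts[2].startswith("0x"):
            attrs[int(parts[0])] = (parts[1], parts[9].strip())
    return attrs


def classify(attrs):
    """Rough heuristic: media-related counters versus the interface CRC counter."""
    def raw(attr_id):
        entry = attrs.get(attr_id)
        return int(entry[1].split()[0]) if entry else 0

    media = raw(5) + raw(197) + raw(198)
    crc = raw(199)
    if media > 0:
        return "media suspected (reallocated or pending sectors)"
    if crc > 0:
        return "link suspected (cable, connector or controller)"
    return "nothing conclusive"


# Rows copied from the report above (whitespace normalized).
report = """
  5 Reallocated_Sector_Ct   0x0033   100   100   036  Pre-fail  Always   -   0
197 Current_Pending_Sector  0x0012   100   100   000  Old_age   Always   -   0
198 Offline_Uncorrectable   0x0010   100   100   000  Old_age   Offline  -   0
199 UDMA_CRC_Error_Count    0x003e   200   153   000  Old_age   Always   -   238
"""
attrs = parse_smart_attributes(report)
print(classify(attrs))
```

With the values from this report (no reallocated or pending sectors, 238 CRC errors, and ICRC in every logged error), the sketch leans toward the link rather than the disk surface.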

Maybe it's your power supply that isn't delivering enough juice anymore!

To be honest I don't know; it's a 420 W unit from LC Power, I think ...

And I don't think my setup is very power-hungry: AMD Athlon CPU, AGP card, 2 PCI cards (TV tuner + SATA card), 2 optical drives (DVD reader, DVD burner), 2 hard drives.

But still, it would be odd for this to be the only problem I get.

Step 1: back up whatever you can ;) .

Hmm, that's fine, what I mostly have on it are backups of my ATA disk (around 30 GB of data) and a few videos ;)

Check the voltages in the BIOS or with "sensors" (you need to run "sensors-detect" first).

The voltage sensors have been giving me completely wrong readings for a long time (with 2 different power supplies), so I think the sensors are dead.


Well, I reinstalled lm-sensors anyway ...

It returns this:

$ sensors
Adapter: ISA adapter
VCore 1:	 +1.65 V  (min =  +0.00 V, max =  +4.08 V)
VCore 2:	 +1.57 V  (min =  +0.00 V, max =  +4.08 V)
+3.3V:	   +2.59 V  (min =  +0.00 V, max =  +4.08 V)
+5V:		 +2.47 V  (min =  +0.00 V, max =  +6.85 V)
+12V:	   +13.25 V  (min =  +0.00 V, max = +16.32 V)
-12V:		-4.90 V  (min = -27.36 V, max =  +3.93 V)
-5V:		 -0.96 V  (min = -13.64 V, max =  +4.03 V)
Stdby:	   +5.24 V  (min =  +0.00 V, max =  +6.85 V)
VBat:		+4.08 V
fan1:	   1875 RPM  (min =	0 RPM, div = 4)
fan2:		927 RPM  (min =	0 RPM, div = 8)
M/B Temp:	+69.0°C  (low  = +69.0°C, high = +75.0°C)  sensor = thermal diode
CPU Temp:	 -1.0°C  (low  =  +0.0°C, high = +127.0°C)  sensor = disabled
Temp3:		-1.0°C  (low  =  +0.0°C, high = +127.0°C)  sensor = disabled
cpu0_vid:   +1.650 V
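
As a sanity check on readings like these, here is a small sketch that compares the main rails with their nominal values. The tolerances are the usual ATX design guide figures (±5% on the positive rails, ±10% on -12V); the dictionary and function names are made up for this example.

```python
# Nominal voltage and allowed relative tolerance per rail
# (assumed from the ATX design guide: 5% on positive rails, 10% on -12V).
RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
}


def out_of_tolerance(readings):
    """Return {rail: relative_deviation} for rails outside their tolerance."""
    bad = {}
    for rail, measured in readings.items():
        nominal, tol = RAILS[rail]
        deviation = (measured - nominal) / abs(nominal)
        if abs(deviation) > tol:
            bad[rail] = round(deviation, 3)
    return bad


# Values taken from the sensors output above.
readings = {"+3.3V": 2.59, "+5V": 2.47, "+12V": 13.25, "-12V": -4.90}
print(out_of_tolerance(readings))
```

All four rails come out far outside tolerance, with +5V reading roughly half its nominal value. A machine that boots and runs with such figures is more plausibly suffering from a misconfigured or dead sensor chip than from a power supply this far off, so measuring with a multimeter is the safer way to settle it.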

edit 2: well, I think I'm going to have to buy a multimeter ...

1 year later...


This is a pretty big thread bump, but I finally found the solution (a few months ago)...

The culprit was .... the SATA cable!

The drive now works without any errors (well, there's another issue: it randomly whistles ... and sounds like a turbine after the disk stops ....)

Could you have blocked the hole that lets out the hot air generated by the spinning platters?

