#contents
----
Last modified &lastmod;
----
*SMART error (CurrentPendingSector) detected on host (ZFS edition) [#d3de4893]
[[SMART error (CurrentPendingSector) detected on host]]


 # zpool status
   pool: tank
  state: ONLINE
   scan: scrub canceled on Fri Aug 14 08:50:50 2015
 config:
 
         NAME        STATE     READ WRITE CKSUM
         tank        ONLINE       0     0     0
           raidz1-0  ONLINE       0     0     0
             ada1    ONLINE       0     0     0
             ada2    ONLINE       0     0     0
             ada3    ONLINE       0     0     0
 
 errors: No known data errors
 
   pool: zfspool
  state: ONLINE
   scan: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         zfspool     ONLINE       0     0     0
           ada0p3    ONLINE       0     0     0
 
 errors: No known data errors

A FreeBSD 9.x server with the pool layout above sent the following email:

** SMART error (CurrentPendingSector) detected on host: blackcube.smb.net, August 13, 2015 [#l31ae9d9]
 This email was generated by the smartd daemon running on:
 
   host name: blackcube.smb.net
   DNS domain: smb.net
   NIS domain: 
 
 The following warning/error was logged by the smartd daemon:
 
 Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 
 
 For details see host's SYSLOG.
 
 You can also use the smartctl utility for further investigation.
 No additional email messages about this problem will be sent.

The log looks like this:

 root@blackcube:/home/kuji # grep -i "smartd" /var/log/messages | tail
 Aug 14 06:27:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 06:27:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
 Aug 14 06:57:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 06:57:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
 Aug 14 07:27:16 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 07:27:16 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
 Aug 14 07:57:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 07:57:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
 Aug 14 08:27:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 08:27:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors

The self-test log of /dev/ada1 looks like this:


 # smartctl /dev/ada1 --log=selftest
 smartctl 5.43 2012-06-30 r3573 [FreeBSD 9.1-RELEASE-p22 amd64] (local build)
 Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 
 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error       00%     51597         -
 # 2  Short offline       Interrupted (host reset)      00%     51573         -
 # 3  Short offline       Completed without error       00%     51549         -
 # 4  Short offline       Interrupted (host reset)      00%     51525         -
 # 5  Short offline       Interrupted (host reset)      00%     51501         -
 # 6  Short offline       Completed without error       00%     51477         -
 # 7  Extended offline    Completed: read failure       90%     51457         51978742
 # 8  Short offline       Interrupted (host reset)      00%     51453         -
 # 9  Short offline       Interrupted (host reset)      00%     51429         -
 #10  Short offline       Interrupted (host reset)      00%     51405         -
 #11  Short offline       Completed without error       00%     51381         -
 #12  Short offline       Completed without error       00%     51357         -
 #13  Short offline       Completed without error       00%     51333         -
 #14  Short offline       Completed without error       00%     51309         -
 #15  Extended offline    Completed: read failure       90%     51296         51978742
 #16  Extended offline    Completed: read failure       90%     51289         51978742
 #17  Short offline       Interrupted (host reset)      00%     51285         -
 #18  Short offline       Completed without error       00%     51261         -
 #19  Short offline       Completed without error       00%     51237         -
 #20  Short offline       Completed without error       00%     51213         -
 #21  Short offline       Completed without error       00%     51173         -
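
The failing LBA can be pulled out of this log mechanically. The following is a minimal sketch, assuming the column layout shown above, where the LBA is the last field of every "read failure" row; `failing_lbas` is a hypothetical helper name, not part of smartmontools.

```shell
# Print the distinct LBA_of_first_error values found in
# `smartctl --log=selftest` output read from stdin.
failing_lbas() {
    # Rows that report a read failure carry the LBA in the last column.
    awk '/read failure/ { print $NF }' | sort -n -u
}

# Sample rows copied from the log above.
sample_log='# 6  Short offline       Completed without error       00%     51477         -
# 7  Extended offline    Completed: read failure       90%     51457         51978742
#15  Extended offline    Completed: read failure       90%     51296         51978742'

printf '%s\n' "$sample_log" | failing_lbas   # prints 51978742
```

On the real system the same function could be fed with `smartctl /dev/ada1 --log=selftest | failing_lbas`.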

So, taking the failing LBA from the log and applying the block-number formula:

      b = (int)((L-S)*512/B)
       where:
       b = File System block number
       B = File system block size in bytes
       L = LBA of bad sector
       S = Starting sector of partition as shown by fdisk -lu
       and (int) denotes the integer part.
 
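 Substituting S = 0, B = 4096, L = 51978742 into the formula:
 
 b = (int)(51978742 * 512 / 4096) = (int) 6497342.75 (fractional part truncated)

Then I tried

 # dd if=/dev/zero of=/dev/ada1 bs=4096 count=1 seek=6497334

but nothing changed. Note that the formula gives block 6497342, whereas the seek value actually used was 6497334.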
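
One way to see whether a given dd write touches the failing sector is to convert `bs` and `seek` back into a range of 512-byte LBAs. The following is a minimal sketch, assuming 512-byte logical sectors, a `bs` that is a multiple of 512, and a partition start of 0 as in this setup; `dd_lba_range` and `dd_covers_lba` are hypothetical helper names.

```shell
# Print the first and last 512-byte LBA written by
# `dd bs=$1 seek=$2 count=1` on a whole-disk device.
dd_lba_range() {
    sectors_per_block=$(( $1 / 512 ))
    first=$(( $2 * sectors_per_block ))
    last=$(( first + sectors_per_block - 1 ))
    echo "$first $last"
}

# Succeed if LBA $3 lies inside the write described by bs=$1 seek=$2.
dd_covers_lba() {
    range=$(dd_lba_range "$1" "$2")
    first=${range% *}
    last=${range#* }
    [ "$3" -ge "$first" ] && [ "$3" -le "$last" ]
}

dd_lba_range 4096 6497334        # prints 51978672 51978679
if dd_covers_lba 4096 6497334 51978742; then
    echo "covered"
else
    echo "not covered"           # this branch is taken
fi
```

With these numbers the write lands on LBAs 51978672 to 51978679, which does not include 51978742, so an unchanged pending-sector count is what one would expect.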

So next, the same formula with a 16k block size:

       b = (int)((L-S)*512/B)
       where:
       b = File System block number
       B = File system block size in bytes (dumpfs 16384)
       L = LBA of bad sector
       S = Starting sector of partition as shown by fdisk
       and (int) denotes the integer part.
 
 Substituting S = 0, B = 16384, L = 51978742 into the formula:
 
 b = (int)(51978742 * 512 / 16384) = (int) 1624335.6875 (fractional part truncated)
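
The formula maps directly onto shell integer arithmetic, whose division truncates just like `(int)`. The following is a minimal sketch, assuming a shell with 64-bit arithmetic (the intermediate product exceeds 32 bits); `lba_to_block` is a hypothetical helper name.

```shell
# b = (int)((L - S) * 512 / B)
#   $1 = L, LBA of the bad sector
#   $2 = S, starting sector of the partition
#   $3 = B, block size in bytes
lba_to_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

lba_to_block 51978742 0 4096     # prints 6497342
lba_to_block 51978742 0 16384    # prints 1624335
```

The result is the value to pass as `seek=` to dd when `bs=` is set to the same block size B.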

So,

 # sysctl kern.geom.debugflags=0x10
 kern.geom.debugflags: 0 -> 16
 # dd if=/dev/zero of=/dev/ada1 bs=16384 count=1 seek=1624335
 1+0 records in
 1+0 records out
 16384 bytes transferred in 0.000646 secs (25367101 bytes/sec)
 # sysctl kern.geom.debugflags=0
 kern.geom.debugflags: 16 -> 0
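
The three steps above always go together: setting `kern.geom.debugflags` to 0x10 lets the write through to a provider that is in use, dd zeroes the block, and the flag is reset afterwards. The following is a sketch that only prints the sequence for review instead of running it; `zero_block_cmds` is a hypothetical helper name, and the device, block size and block number are taken from the example above.

```shell
# Print (do not execute) the command sequence for zeroing one block.
#   $1 = device, $2 = block size in bytes, $3 = block number
zero_block_cmds() {
    echo "sysctl kern.geom.debugflags=0x10"
    echo "dd if=/dev/zero of=$1 bs=$2 count=1 seek=$3"
    echo "sysctl kern.geom.debugflags=0"
}

zero_block_cmds /dev/ada1 16384 1624335
```

Printing first makes it easy to compare the seek value against the calculation before anything is overwritten.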

After doing this, the log changed as follows:

 root@blackcube:/home/kuji # grep -i "smartd" /var/log/messages | tail
 Aug 14 06:57:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 06:57:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
 Aug 14 07:27:16 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 07:27:16 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
 Aug 14 07:57:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 07:57:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
 Aug 14 08:27:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
 Aug 14 08:27:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
 Aug 14 08:57:15 blackcube smartd[1066]: Device: /dev/ada1, 1 Currently unreadable (pending) sectors (changed -1)
 Aug 14 08:57:15 blackcube smartd[1066]: Device: /dev/ada1, 1 Offline uncorrectable sectors (changed -1)
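
To follow the pending-sector count over time without reading every line, the smartd messages can be reduced to a timestamp and a number. The following is a minimal sketch, assuming the syslog line format shown above; `pending_counts` is a hypothetical helper name.

```shell
# From smartd syslog lines on stdin, print "month day time count"
# for every "Currently unreadable (pending) sectors" message.
pending_counts() {
    awk '/Currently unreadable \(pending\) sectors/ {
        # The count is the field immediately before the word "Currently".
        for (i = 2; i <= NF; i++)
            if ($i == "Currently") print $1, $2, $3, $(i - 1)
    }'
}

# Sample lines copied from the log above.
sample='Aug 14 08:27:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Currently unreadable (pending) sectors
Aug 14 08:27:15 blackcube smartd[1066]: Device: /dev/ada1, 2 Offline uncorrectable sectors
Aug 14 08:57:15 blackcube smartd[1066]: Device: /dev/ada1, 1 Currently unreadable (pending) sectors (changed -1)'

printf '%s\n' "$sample" | pending_counts
# prints:
# Aug 14 08:27:15 2
# Aug 14 08:57:15 1
```

On the real system the input would come from `grep -i smartd /var/log/messages`.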

The error count has dropped from 2 to 1. Checking the disk:

 # smartctl /dev/ada1 --log=selftest
 smartctl 5.43 2012-06-30 r3573 [FreeBSD 9.1-RELEASE-p22 amd64] (local build)
 Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 
 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error       00%     51597         -
 # 2  Short offline       Interrupted (host reset)      00%     51573         -
 # 3  Short offline       Completed without error       00%     51549         -
 # 4  Short offline       Interrupted (host reset)      00%     51525         -
 # 5  Short offline       Interrupted (host reset)      00%     51501         -
 # 6  Short offline       Completed without error       00%     51477         -
 # 7  Extended offline    Completed: read failure       90%     51457         51978742
 # 8  Short offline       Interrupted (host reset)      00%     51453         -
 # 9  Short offline       Interrupted (host reset)      00%     51429         -
 #10  Short offline       Interrupted (host reset)      00%     51405         -
 #11  Short offline       Completed without error       00%     51381         -
 #12  Short offline       Completed without error       00%     51357         -
 #13  Short offline       Completed without error       00%     51333         -
 #14  Short offline       Completed without error       00%     51309         -
 #15  Extended offline    Completed: read failure       90%     51296         51978742
 #16  Extended offline    Completed: read failure       90%     51289         51978742
 #17  Short offline       Interrupted (host reset)      00%     51285         -
 #18  Short offline       Completed without error       00%     51261         -
 #19  Short offline       Completed without error       00%     51237         -
 #20  Short offline       Completed without error       00%     51213         -
 #21  Short offline       Completed without error       00%     51173         -

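No change!? (The self-test log only lists tests that have already run, so it stays the same until a new test completes.)

So, run the test once more: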


 # smartctl --test=long /dev/ada1
 smartctl 5.43 2012-06-30 r3573 [FreeBSD 9.1-RELEASE-p22 amd64] (local build)
 Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 
 === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
 Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
 Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
 Testing has begun.
 Please wait 116 minutes for test to complete.
 Test will complete after Fri Aug 14 11:46:40 2015 
 
 Use smartctl -X to abort test.


This time a different location is reported...?

 # smartctl /dev/ada1 --log=selftest
 smartctl 5.43 2012-06-30 r3573 [FreeBSD 9.1-RELEASE-p22 amd64] (local build)
 Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 
 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed: read failure       90%     51603         101630056
 # 2  Short offline       Completed without error       00%     51597         -
 # 3  Short offline       Interrupted (host reset)      00%     51573         -
 # 4  Short offline       Completed without error       00%     51549         -
 # 5  Short offline       Interrupted (host reset)      00%     51525         -
 # 6  Short offline       Interrupted (host reset)      00%     51501         -
 # 7  Short offline       Completed without error       00%     51477         -
 # 8  Extended offline    Completed: read failure       90%     51457         51978742
 # 9  Short offline       Interrupted (host reset)      00%     51453         -
 #10  Short offline       Interrupted (host reset)      00%     51429         -
 #11  Short offline       Interrupted (host reset)      00%     51405         -
 #12  Short offline       Completed without error       00%     51381         -
 #13  Short offline       Completed without error       00%     51357         -
 #14  Short offline       Completed without error       00%     51333         -
 #15  Short offline       Completed without error       00%     51309         -
 #16  Extended offline    Completed: read failure       90%     51296         51978742
 #17  Extended offline    Completed: read failure       90%     51289         51978742
 #18  Short offline       Interrupted (host reset)      00%     51285         -
 #19  Short offline       Completed without error       00%     51261         -
 #20  Short offline       Completed without error       00%     51237         -
 #21  Short offline       Completed without error       00%     51213         -

 # grep -i "smartd" /var/log/messages | tail
 Aug 14 10:13:12 blackcube smartd[1066]: Device: /dev/ada1, 1 Currently unreadable (pending) sectors
 Aug 14 10:13:12 blackcube smartd[1066]: Device: /dev/ada1, 1 Offline uncorrectable sectors
 Aug 14 10:13:12 blackcube smartd[1066]: Device: /dev/ada1, previous self-test completed with error (read test element)
 Aug 14 10:13:12 blackcube smartd[1066]: Device: /dev/ada1, Self-Test Log error count increased from 3 to 4
 Aug 14 10:43:12 blackcube smartd[1066]: Device: /dev/ada1, 1 Currently unreadable (pending) sectors
 Aug 14 10:43:12 blackcube smartd[1066]: Device: /dev/ada1, 1 Offline uncorrectable sectors
 Aug 14 11:13:13 blackcube smartd[1066]: Device: /dev/ada1, 1 Currently unreadable (pending) sectors
 Aug 14 11:13:13 blackcube smartd[1066]: Device: /dev/ada1, 1 Offline uncorrectable sectors
 Aug 14 11:43:12 blackcube smartd[1066]: Device: /dev/ada1, 1 Currently unreadable (pending) sectors
 Aug 14 11:43:12 blackcube smartd[1066]: Device: /dev/ada1, 1 Offline uncorrectable sectors

Calculation for the new LBA:

 b = (int)(101630056 * 512 / 16384) = (int) 3175939.25 (fractional part truncated)

 # sysctl kern.geom.debugflags=0x10
 # dd if=/dev/zero of=/dev/ada1 bs=16384 count=1 seek=3175939
 # sysctl kern.geom.debugflags=0




----
Total access &counter(total);: today &counter(today);: yesterday &counter(yesterday);
#counter([total|today|yesterday]);
