[[Guard]]

#contents

http://hiro-system.blog.ocn.ne.jp/blog/2010/11/smartd_995c.html
http://www.wizard-limit.net/mt/pc/archives/2011_08.html

***First response when smartd reports an error [#q2b4c761]
Check the most recent smartd messages in the system log:
 grep -i "smartd" /var/log/messages | tail
On guard this shows:
 guard# grep -i "smartd" /var/log/messages | tail
 Jan 11 12:39:00 guard smartd[596]: Device: /dev/ada0, 104 Currently unreadable (pending) sectors
 Jan 11 13:09:00 guard smartd[596]: Device: /dev/ada0, 104 Currently unreadable (pending) sectors
 Jan 11 13:39:00 guard smartd[596]: Device: /dev/ada0, 104 Currently unreadable (pending) sectors

***Attempting to repair the sector error [#nb685c9b]
First, look at the self-test log to find the LBA of the first error:
 # smartctl /dev/ada0 --log=selftest
 smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-RELEASE-p4 amd64] (local build)
 Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed: read failure       40%       359         577493400
 # 2  Short offline       Completed without error       00%       352         -
 # 3  Short offline       Completed without error       00%       328         -
 # 4  Short offline       Completed without error       00%       311         -
 # 5  Short offline       Completed without error       00%       304         -
 # 6  Short offline       Completed without error       00%       280         -
 # 7  Short offline       Completed without error       00%       256         -
 # 8  Short offline       Completed without error       00%       232         -
 # 9  Extended offline    Completed without error       00%       213         -
 #10  Short offline       Completed without error       00%       208         -
 #11  Short offline       Completed without error       00%       184         -
 #12  Short offline       Completed without error       00%       160         -
 #13  Short offline       Completed without error       00%       136         -
 #14  Short offline       Completed without error       00%       112         -
 #15  Short offline       Completed without error       00%        88         -
 #16  Short offline       Completed without error       00%        64         -
 #17  Extended offline    Completed without error       00%        45         -
 #18  Short offline       Completed without error       00%        40         -
 guard#
Next, check the disk layout with fdisk and disklabel:
 # fdisk
 ******* Working on device /dev/ada0 *******
 parameters extracted from in-core disklabel are:
 cylinders=969021 heads=16 sectors/track=63 (1008 blks/cyl)
 Figures below won't work with BIOS for partitions not in cyl 1
 parameters to be used for BIOS calculations are:
 cylinders=969021 heads=16 sectors/track=63 (1008 blks/cyl)
 Media sector size is 512
 Warning: BIOS sector numbering starts with sector 1
 Information from DOS bootblock is:
 The data for partition 1 is:
 sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
     start 64, size 976773103 (476939 Meg), flag 80 (active)
         beg: cyl 0/ head 1/ sector 2;
         end: cyl 1023/ head 255/ sector 63
 The data for partition 2 is:
 <UNUSED>
 The data for partition 3 is:
 <UNUSED>
 The data for partition 4 is:
 <UNUSED>
 # disklabel -A /dev/ada0s1
 # /dev/ada0s1:
 type: unknown
 disk:
 label:
 flags:
 bytes/sector: 512
 sectors/track: 63
 tracks/cylinder: 16
 sectors/cylinder: 1008
 cylinders: 969020
 sectors/unit: 976773103
 rpm: 3600
 interleave: 0
 trackskew: 0
 cylinderskew: 0
 headswitch: 0           # milliseconds
 track-to-track seek: 0  # milliseconds
 drivedata: 0
 8 partitions:
 #          size     offset    fstype   [fsize bsize bps/cpg]
   a:  968884224          0    4.2BSD        0     0     0
   b:    7888878  968884224      swap
   c:  976773103          0    unused        0     0     # "raw" part, don't edit
So disklabel reports bsize as 0 here...?

***To find the file system block size, use the following command. [#m5fe276c]
 # dumpfs /some/filesystem | grep '^bsize'
So, on guard:
 guard# dumpfs /dev/ad4s1 | grep '^bsize'
 bsize   16384   shift   14      mask    0xffffc000
On piano2nd, gpart list suggests that ada0p2 is the file system partition, so:
 root@piano2nd:~ # dumpfs /dev/ada0p2 | grep '^bsize'
 bsize   32768   shift   15      mask    0xffff8000
To confirm, look at the stat output of a small file:
 # stat .screenrc
 114 5056131 -rw-r--r-- 1 root wheel 10102700 54 "Jan 12 10:43:37 2018" "Dec 29 06:40:50 2017" "Dec 29 06:40:58 2017" "Dec 29 06:40:50 2017" 32768 8 0 .screenrc
According to http://d.hatena.ne.jp/parasporospa/touch/searchdiary?word=*%5Bunix%5D&of=20 , st_blksize (the optimal block size for file system I/O operations) is 16384. Here it is 32768.

***Compute the seek position from the following formula. [#t7d208fe]
http://hiro-system.blog.ocn.ne.jp/blog/2010/11/smartd_995c.html
http://see-take.blogspot.jp/2010/01/hddsmart.html

These pages give the following formula, but it does not apply as-is on FreeBSD.
 b = (int)((L-S)*512/B)
 where:
 b = File System block number
 B = File system block size in bytes
 L = LBA of bad sector
 S = Starting sector of partition as shown by fdisk -lu
 and (int) denotes the integer part.
Substituting S = 0, B = 4096 and L = 1043624 into the formula:
 b = (int)(1043624 * 512 / 4096) = (int) 130453 (fractional part truncated)
On FreeBSD it becomes the following, with B taken from dumpfs and S from fdisk:
 b = (int)((L-S)*512/B)
 where:
 b = File System block number
 B = File system block size in bytes (dumpfs 32768)
 L = LBA of bad sector
 S = Starting sector of partition as shown by fdisk
 and (int) denotes the integer part.

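The formula above can be sketched in a few lines of Python, which makes it easy to re-run with other values. This is only an illustration: the function name `lba_to_block` and its parameter names are made up for this page, and the 512-byte sector size is taken from the fdisk output ("Media sector size is 512").

```python
def lba_to_block(lba: int, start_sector: int, block_size: int,
                 sector_size: int = 512) -> int:
    """Return b = (int)((L - S) * sector_size / B).

    lba          -- L, LBA of the bad sector (from the smartctl self-test log)
    start_sector -- S, starting sector of the partition (0 for the whole disk)
    block_size   -- B, block size in bytes (bsize from dumpfs)
    """
    if block_size <= 0 or block_size % sector_size != 0:
        raise ValueError("block_size must be a positive multiple of sector_size")
    if lba < start_sector:
        raise ValueError("lba lies before the start of the partition")
    # Integer division truncates the fractional part, like (int) in the formula.
    return (lba - start_sector) * sector_size // block_size


if __name__ == "__main__":
    # Values from this page: L = 577493400, S = 0, B = 32768.
    print(lba_to_block(577493400, 0, 32768))  # 9023334
```

Using integer division instead of floating point avoids rounding surprises with large LBAs.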
Substituting S = 0, B = 32768 and L = 577493400 into the formula (with the whole-disk device /dev/ada0 as the dd target, offsets count from sector 0 of the disk):
 b = (int)(577493400 * 512 / 32768) = (int) 9023334.375 (fractional part truncated)
Zero-fill the corresponding block with dd.
 guard# dd if=/dev/zero of=/dev/ada0 bs=32768 count=1 seek=9023334
 dd: /dev/ada0: Operation not permitted
That is refused, so after some googling, set
 guard# sysctl kern.geom.debugflags=0x10
 kern.geom.debugflags: 0 -> 16
and then run dd again:
 guard# dd if=/dev/zero of=/dev/ada0 bs=32768 count=1 seek=9023334
 1+0 records in
 1+0 records out
 16384 bytes transferred in 0.000281 secs (58286240 bytes/sec)
Once the block has been overwritten, put the flag back to its original value.
 guard# sysctl kern.geom.debugflags=0
 kern.geom.debugflags: 16 -> 0
Then check whether the drive is back to normal!
 guard# smartctl --test=long /dev/ad4
 smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.1-RELEASE-p13 i386] (local build)
 Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
 Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
 Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
 Testing has begun.
 Please wait 40 minutes for test to complete.
 Test will complete after Sat Feb 16 13:59:56 2013
 Use smartctl -X to abort test.

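While the long test runs, the dd arguments can be double-checked by working out which 512-byte sectors a single block at that seek position covers, and whether the reported LBA falls inside it. The sketch below is only an illustration; `dd_sector_range` is a hypothetical helper name, and it assumes 512-byte sectors and a dd target that starts at sector 0 of the disk.

```python
def dd_sector_range(seek: int, bs: int, sector_size: int = 512) -> tuple[int, int]:
    """Return (first, last) sector written by dd with count=1 at this seek."""
    if bs <= 0 or bs % sector_size != 0:
        raise ValueError("bs must be a positive multiple of sector_size")
    sectors_per_block = bs // sector_size
    first = seek * sectors_per_block
    return first, first + sectors_per_block - 1


if __name__ == "__main__":
    bad_lba = 577493400  # LBA_of_first_error from the self-test log
    first, last = dd_sector_range(seek=9023334, bs=32768)
    print(first, last)               # 577493376 577493439
    print(first <= bad_lba <= last)  # True
```

With bs=32768 one block spans 64 sectors, so the write covers the bad LBA together with its 63 neighbours in the same block.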
... It is now 14:00, so check the result:
 guard# smartctl /dev/ad4 --log=selftest
 smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.1-RELEASE-p13 i386] (local build)
 Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 === START OF READ SMART DATA SECTION ===
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed without error       00%     19437         -
 # 2  Extended offline    Completed: read failure       90%     19431         310658987
 # 3  Short offline       Completed: read failure       90%     19427         310658987
 # 4  Short offline       Completed: read failure       90%     19403         310658987
 # 5  Short offline       Completed: read failure       90%     19379         310658987
 # 6  Short offline       Completed: read failure       90%     19355         310658987
 # 7  Short offline       Completed: read failure       90%     19331         310658987
 # 8  Short offline       Completed: read failure       90%     19307         310658987
 # 9  Short offline       Completed: read failure       90%     19283         310658987
 #10  Extended offline    Completed: read failure       90%     19263         310658987
 #11  Short offline       Completed: read failure       90%     19259         310658987
 #12  Short offline       Completed: read failure       90%     19235         310658987
 #13  Short offline       Completed: read failure       90%     19211         310658987
 #14  Short offline       Completed: read failure       90%     19187         310658987
 #15  Short offline       Completed: read failure       90%     19163         310658987
 #16  Short offline       Completed: read failure       90%     19139         310658987
 #17  Short offline       Completed: read failure       90%     19115         310658987
 #18  Extended offline    Completed: read failure       90%     19095         310658987
 #19  Short offline       Completed: read failure       90%     19091         310658987
 #20  Short offline       Completed: read failure       90%     19067         310658987
 #21  Short offline       Completed: read failure       90%     19043         310658987
 20 of 20 failed self-tests are outdated by newer successful extended offline self-test # 1
 guard#
Looks like it worked! ( ´▽`)ノ
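Reading a long self-test log by eye is error-prone, so here is a rough Python sketch that pulls the rows out of `smartctl --log=selftest` text and reports whether the newest test (row # 1) passed. It is an assumption-laden illustration: the names `parse_selftest_log` and `latest_test_passed` are invented for this page, and the regular expression only recognises rows whose status starts with "Completed", as in the logs shown here.

```python
import re

# One row of the self-test table, e.g.
# "# 2  Extended offline    Completed: read failure       90%     19431         310658987"
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+"
    r"(?P<desc>.+?)\s+"
    r"(?P<status>Completed.*?)\s+"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\d+|-)\s*$"
)


def parse_selftest_log(text: str) -> list[dict]:
    """Return the table rows as dicts, newest test first (as smartctl prints them)."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, banner, or a status this sketch does not handle
        lba = m.group("lba")
        rows.append({
            "num": int(m.group("num")),
            "description": m.group("desc"),
            "status": m.group("status"),
            "lifetime_hours": int(m.group("hours")),
            "lba": None if lba == "-" else int(lba),
        })
    return rows


def latest_test_passed(rows: list[dict]) -> bool:
    """True if the newest logged self-test completed without error."""
    return bool(rows) and rows[0]["status"] == "Completed without error"


SAMPLE = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     19437         -
# 2  Extended offline    Completed: read failure       90%     19431         310658987
# 3  Short offline       Completed: read failure       90%     19427         310658987
"""

if __name__ == "__main__":
    rows = parse_selftest_log(SAMPLE)
    print(latest_test_passed(rows))   # True
    print([r["lba"] for r in rows])   # [None, 310658987, 310658987]
```

In practice the text would come from saving the smartctl output to a file; the sample here is just the first three rows of the log above.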