[[Guard]]
#contents
*Emergency response to smartd error messages: command summary [#y75b9039]
*** # grep -i "smartd" /var/log/messages | tail [#q00a8915]
Jan 12 11:09:01 guard smartd[596]: Device: /dev/ada0, 96 Currently unreadable (pending) sectors (changed -8)
Jan 12 11:39:01 guard smartd[596]: Device: /dev/ada0, 96 Currently unreadable (pending) sectors
Jan 12 12:09:01 guard smartd[596]: Device: /dev/ada0, 96 Currently unreadable (pending) sectors
Jan 12 12:09:01 guard smartd[596]: Device: /dev/ada0, previous self-test completed with error (read test element)
Jan 12 12:09:01 guard smartd[596]: Device: /dev/ada0, Self-Test Log error count increased from 1 to 2
Jan 12 12:39:01 guard smartd[596]: Device: /dev/ada0, 88 Currently unreadable (pending) sectors (changed -8)
Jan 12 13:09:01 guard smartd[596]: Device: /dev/ada0, 88 Currently unreadable (pending) sectors
Jan 12 13:39:00 guard smartd[596]: Device: /dev/ada0, 80 Currently unreadable (pending) sectors (changed -8)
Jan 12 13:39:01 guard smartd[596]: Device: /dev/ada0, Self-Test Log error count increased from 2 to 3
Jan 12 14:09:00 guard smartd[596]: Device: /dev/ada0, 80 Currently unreadable (pending) sectors
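The figure to watch in these messages is the pending-sector count, which drops from 96 to 80 over a few hours here. As a minimal sketch, the counts can be pulled out of the log lines as follows. The function name `pending_counts` and the regular expression are illustrative assumptions based only on the message layout shown above; they are not part of smartd.

```python
import re

# Assumed message layout, copied from the smartd lines shown above.
PENDING_RE = re.compile(
    r"Device: (?P<dev>[^,\s]+), (?P<count>\d+) "
    r"Currently unreadable \(pending\) sectors"
)

def pending_counts(lines):
    """Return (device, count) for every pending-sector message in lines."""
    found = []
    for line in lines:
        match = PENDING_RE.search(line)
        if match:
            found.append((match.group("dev"), int(match.group("count"))))
    return found

sample = [
    "Jan 12 12:09:01 guard smartd[596]: Device: /dev/ada0, 96 Currently unreadable (pending) sectors",
    "Jan 12 12:09:01 guard smartd[596]: Device: /dev/ada0, Self-Test Log error count increased from 1 to 2",
    "Jan 12 12:39:01 guard smartd[596]: Device: /dev/ada0, 88 Currently unreadable (pending) sectors (changed -8)",
]
print(pending_counts(sample))
```

Lines that do not report a pending-sector count, such as the Self-Test Log message, are simply skipped.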
*** # smartctl /dev/ada0 --log=selftest [#pb39d335]
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-RELEASE-p4 amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 10% 363 975890088
# 2 Extended offline Completed: read failure 10% 362 975884072
# 3 Extended offline Completed: read failure 10% 361 975876336
# 4 Short offline Completed without error 00% 360 -
# 5 Extended offline Completed: read failure 40% 359 577493400
# 6 Short offline Completed without error 00% 352 -
# 7 Short offline Completed without error 00% 328 -
# 8 Short offline Completed without error 00% 311 -
#
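The value needed from this log is the LBA_of_first_error column of the failed tests. The following is a rough sketch of extracting it, assuming each row ends with a percentage, the lifetime hours, and either an LBA or `-`, as in the output above. `failing_lbas` and `ROW_RE` are hypothetical names introduced for illustration.

```python
import re

# Assumed row layout: "# N <description and status> NN% <hours> <LBA or ->"
ROW_RE = re.compile(r"^#\s*(\d+)\s+(.+?)\s+(\d+)%\s+(\d+)\s+(\d+|-)\s*$")

def failing_lbas(log_text):
    """Return the distinct LBA_of_first_error values in log order."""
    lbas = []
    for line in log_text.splitlines():
        match = ROW_RE.match(line.strip())
        if match and match.group(5) != "-":
            lba = int(match.group(5))
            if lba not in lbas:
                lbas.append(lba)
    return lbas

sample_log = """\
# 1 Extended offline Completed: read failure 10% 363 975890088
# 2 Extended offline Completed: read failure 10% 362 975884072
# 4 Short offline Completed without error 00% 360 -
# 5 Extended offline Completed: read failure 40% 359 577493400
#
"""
print(failing_lbas(sample_log))
```

Rows for tests that completed without error carry `-` in the last column and are ignored.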
With a block size of 32768:
b = (int)(975890088 * 512 / 32768) = (int) 15248282.625 = 15248282 (fractional part truncated)
*** # sysctl kern.geom.debugflags=0x10 [#v8eb0d4f]
kern.geom.debugflags: 0 -> 16
*** # dd if=/dev/zero of=/dev/ada0 bs=32768 count=1 seek=15248282 [#paff7c7c]
1+0 records in
1+0 records out
32768 bytes transferred in 0.000234 secs (140105438 bytes/sec)
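One way to double-check the seek value is to work backwards from it and confirm that the block written by dd covers the failing LBA. This is a sketch under the assumption of 512-byte sectors (the fdisk output on this page reports "Media sector size is 512"); `block_lba_range` is a hypothetical helper name.

```python
SECTOR_SIZE = 512  # bytes per sector (assumption, matching the fdisk output)

def block_lba_range(seek, bs, sector_size=SECTOR_SIZE):
    """Return (first_lba, last_lba) covered by one dd block written at seek."""
    sectors_per_block = bs // sector_size
    first = seek * sectors_per_block
    return first, first + sectors_per_block - 1

first, last = block_lba_range(seek=15248282, bs=32768)
print(first, last)
print(first <= 975890088 <= last)
```

With bs=32768 a single block spans 64 sectors, so the write zeroes all 64 of them and not only the one reported in the self-test log.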
*** # sysctl kern.geom.debugflags=0 [#rc7012c8]
kern.geom.debugflags: 16 -> 0
*** # smartctl --test=long /dev/ada0 [#v22d083d]
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-RELEASE-p4 amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 62 minutes for test to complete.
Test will complete after Fri Jan 12 15:35:51 2018
Use smartctl -X to abort test.
----
http://hiro-system.blog.ocn.ne.jp/blog/2010/11/smartd_995c.html
http://www.wizard-limit.net/mt/pc/archives/2011_08.html
**Emergency response when smartd reports an error [#q2b4c761]
grep -i "smartd" /var/log/messages | tail
guard# grep -i "smartd" /var/log/messages | tail
Feb 14 05:57:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 06:27:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 06:57:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 07:27:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 07:57:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 08:27:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 08:57:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 09:27:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 09:57:57 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
Feb 14 10:27:56 guard smartd[779]: Device: /dev/ad4, 1 Currently unreadable (pending) sectors
guard#
Jan 11 12:39:00 guard smartd[596]: Device: /dev/ada0, 104 Currently unreadable (pending) sectors
Jan 11 13:09:00 guard smartd[596]: Device: /dev/ada0, 104 Currently unreadable (pending) sectors
Jan 11 13:39:00 guard smartd[596]: Device: /dev/ada0, 104 Currently unreadable (pending) sectors
**Attempting to repair the sector error [#nb685c9b]
guard# smartctl /dev/ad4 --log=selftest
smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.1-RELEASE-p13 i386] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
# smartctl /dev/ada0 --log=selftest
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-RELEASE-p4 amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 19379 310658987
# 2 Short offline Completed: read failure 90% 19355 310658987
# 3 Short offline Completed: read failure 90% 19331 310658987
# 4 Short offline Completed: read failure 90% 19307 310658987
# 5 Short offline Completed: read failure 90% 19283 310658987
# 6 Extended offline Completed: read failure 90% 19263 310658987
# 7 Short offline Completed: read failure 90% 19259 310658987
# 8 Short offline Completed: read failure 90% 19235 310658987
# 9 Short offline Completed: read failure 90% 19211 310658987
#10 Short offline Completed: read failure 90% 19187 310658987
#11 Short offline Completed: read failure 90% 19163 310658987
#12 Short offline Completed: read failure 90% 19139 310658987
#13 Short offline Completed: read failure 90% 19115 310658987
#14 Extended offline Completed: read failure 90% 19095 310658987
#15 Short offline Completed: read failure 90% 19091 310658987
#16 Short offline Completed: read failure 90% 19067 310658987
#17 Short offline Completed: read failure 90% 19043 310658987
#18 Short offline Completed: read failure 90% 19019 310658987
#19 Short offline Completed: read failure 90% 18995 310658987
#20 Short offline Completed: read failure 90% 18971 310658987
#21 Short offline Completed: read failure 90% 18947 310658987
# 1 Extended offline Completed: read failure 40% 359 577493400
# 2 Short offline Completed without error 00% 352 -
# 3 Short offline Completed without error 00% 328 -
# 4 Short offline Completed without error 00% 311 -
# 5 Short offline Completed without error 00% 304 -
# 6 Short offline Completed without error 00% 280 -
# 7 Short offline Completed without error 00% 256 -
# 8 Short offline Completed without error 00% 232 -
# 9 Extended offline Completed without error 00% 213 -
#10 Short offline Completed without error 00% 208 -
#11 Short offline Completed without error 00% 184 -
#12 Short offline Completed without error 00% 160 -
#13 Short offline Completed without error 00% 136 -
#14 Short offline Completed without error 00% 112 -
#15 Short offline Completed without error 00% 88 -
#16 Short offline Completed without error 00% 64 -
#17 Extended offline Completed without error 00% 45 -
#18 Short offline Completed without error 00% 40 -
guard#
***guard backup destinations [#a004a988]
30 2 * * * root /root/bin/backup_to_k222_all.sh
30 0 * * * root /root/bin/backup_to_BlackHole_all.sh
# smartctl /dev/sdz --log=selftest
smartctl version x.xx Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 10% 19935 1043624
# 2 Extended offline Completed without error 00% 18472 -
# 3 Short offline Completed without error 00% 18469 -
# 4 Extended offline Completed without error 00% 18447 -
# 5 Extended offline Completed without error 00% 16669 -
# 6 Short offline Completed without error 00% 16645 -
# 7 Extended offline Completed without error 00% 13278 -
# 8 Short offline Completed without error 00% 12081 -
guard# fdisk
******* Working on device /dev/ad4 *******
# fdisk
******* Working on device /dev/ada0 *******
parameters extracted from in-core disklabel are:
cylinders=310020 heads=16 sectors/track=63 (1008 blks/cyl)
cylinders=969021 heads=16 sectors/track=63 (1008 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=310020 heads=16 sectors/track=63 (1008 blks/cyl)
cylinders=969021 heads=16 sectors/track=63 (1008 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 63, size 312496317 (152586 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 1023/ head 3/ sector 63
start 64, size 976773103 (476939 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 2;
end: cyl 1023/ head 255/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
guard#
guard# disklabel -A /dev/ad4s1
# /dev/ad4s1:
type: ESDI
disk: ad4s1
# disklabel -A /dev/ada0s1
# /dev/ada0s1:
type: unknown
disk:
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 310020
sectors/unit: 312500160
cylinders: 969020
sectors/unit: 976773103
rpm: 3600
interleave: 1
interleave: 0
trackskew: 0
cylinderskew: 0
headswitch: 0 # milliseconds
track-to-track seek: 0 # milliseconds
drivedata: 0
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
a: 1048576 0 4.2BSD 0 0 0
b: 4092240 1048576 swap
c: 312496317 0 unused 0 0 # "raw" part, don't edit
d: 4143104 5140816 4.2BSD 0 0 0
e: 1048576 9283920 4.2BSD 0 0 0
f: 302163821 10332496 4.2BSD 0 0 0
disklabel: partition c doesn't cover the whole unit!
disklabel: An incorrect partition c may cause problems for standard system utilities
guard#
# size offset fstype [fsize bsize bps/cpg]
a: 968884224 0 4.2BSD 0 0 0
b: 7888878 968884224 swap
c: 976773103 0 unused 0 0 # "raw" part, don't edit
So bsize is shown as 0 here...?
**To find the file system block size, use the following command. [#m5fe276c]
# dumpfs /some/filesystem | grep '^bsize'
So, accordingly:
guard# dumpfs /dev/ad4s1 | grep '^bsize'
bsize 16384 shift 14 mask 0xffffc000
On piano2nd, gpart list suggests that ada0p2 is the file system partition, so:
root@piano2nd:~ # dumpfs /dev/ada0p2 | grep '^bsize'
bsize 32768 shift 15 mask 0xffff8000
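The three numbers on the bsize line appear to describe the same quantity in different forms. The sketch below parses such a line and checks that bsize equals 2 to the power of shift and that mask clears exactly the low bits. The 32-bit mask width is an assumption inferred from the two outputs shown above, and both function names are hypothetical.

```python
def parse_bsize_line(line):
    """Parse a dumpfs line such as 'bsize 32768 shift 15 mask 0xffff8000'."""
    fields = line.split()
    values = dict(zip(fields[0::2], fields[1::2]))
    return int(values["bsize"]), int(values["shift"]), int(values["mask"], 16)

def is_consistent(bsize, shift, mask):
    """True if bsize == 2**shift and mask clears exactly the low shift bits."""
    return bsize == 1 << shift and mask == (0xFFFFFFFF & ~(bsize - 1))

for text in ("bsize 16384 shift 14 mask 0xffffc000",
             "bsize 32768 shift 15 mask 0xffff8000"):
    print(parse_bsize_line(text), is_consistent(*parse_bsize_line(text)))
```

The bsize value is the one used as B when computing the dd seek position.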
guard# stat
100728576 88 crw--w---- 1 root tty 88 0 "Feb 14 12:47:16 2013" "Feb 14 12:47:16 2013" "Feb 14 12:47:16 2013" "Jan 1 08:59:59 1970" 4096 0 0 /dev/pts/1
guard# stat w-filter_1_02.sh
80 16587 -rwxr-xr-x 1 root wheel 68531 1776 "Jan 12 14:00:52 2012" "Apr 27 14:46:06 2007" "Nov 15 11:53:50 2010" "Apr 27 14:46:06 2007" 16384 4 0 w-filter_1_02.sh
Check by running stat on a small file:
# stat .screenrc
114 5056131 -rw-r--r-- 1 root wheel 10102700 54 "Jan 12 10:43:37 2018" "Dec 29 06:40:50 2017" "Dec 29 06:40:58 2017" "Dec 29 06:40:50 2017" 32768 8 0 .screenrc
http://d.hatena.ne.jp/parasporospa/touch/searchdiary?word=*%5Bunix%5D&of=20
According to the page above, st_blksize is the optimal block size for file system I/O operations. On the old disk it was 16384; here it is 32768.
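The same field can be read programmatically. As a small sketch, Python's `os.stat` exposes `st_blksize` on Unix-like platforms that provide it; the wrapper name `preferred_block_size` is an illustrative assumption.

```python
import os

def preferred_block_size(path):
    """Return st_blksize, the preferred I/O block size reported for path."""
    return os.stat(path).st_blksize

# Query the current directory; the value depends on the file system in use.
print(preferred_block_size("."))
```

The printed value varies by system, so it should be compared against the dumpfs bsize rather than taken on its own.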
***以下の公式からseek箇所を算出。 [#t7d208fe]
**以下の公式からseek箇所を算出。 [#t7d208fe]
http://hiro-system.blog.ocn.ne.jp/blog/2010/11/smartd_995c.html
http://see-take.blogspot.jp/2010/01/hddsmart.html
These pages give the following formula, but it does not apply as-is on FreeBSD.
b = (int)((L-S)*512/B)
where:
b = File System block number
B = File system block size in bytes
L = LBA of bad sector
S = Starting sector of partition as shown by fdisk -lu
and (int) denotes the integer part.
Substituting S = 0, B = 4096, L = 1043624 into the formula:
Substituting S = 0, B = 4096, L = 577493400 into the formula:
b = (int)(1043624 * 512 / 4096) = (int) 130453 (fractional part truncated)
So, here it becomes the following.
b = (int)((L-S)*512/B)
where:
b = File System block number
B = File system block size in bytes (dumpfs 16384)
B = File system block size in bytes (dumpfs 32768)
L = LBA of bad sector
S = Starting sector of partition as shown by fdisk
and (int) denotes the integer part.
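The formula above can be sketched in a few lines. Integer division performs the (int) truncation, and S is taken as 0 on the assumption that dd writes to the whole-disk device (for example /dev/ada0) and not to a partition. `lba_to_block` is a hypothetical helper name.

```python
def lba_to_block(lba, start_sector, block_size, sector_size=512):
    """b = (int)((L - S) * 512 / B); floor division does the truncation."""
    return (lba - start_sector) * sector_size // block_size

# L = 310658987 with B = 16384 (dumpfs bsize of the old disk)
print(lba_to_block(310658987, 0, 16384))  # → 9708093
# L = 577493400 with B = 32768 (dumpfs bsize of the new disk)
print(lba_to_block(577493400, 0, 32768))  # → 9023334
```

The result is the value passed to dd as seek, with bs set to the same B.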
Substituting S = 0, B = 16384, L = 310658987 into the formula:
Substituting S = 0, B = 32768, L = 577493400 into the formula:
b = (int)(310658987 * 512 / 16384) = (int) 9708093.34375 (fractional part truncated)
b = (int)(577493400 * 512 / 32768) = (int) 9023334.375 (fractional part truncated)
b = (int)(310658987 * 512 / 16384) = (int) 9708093.34375
b = (int)(310658987 * 512 / 32768) = (int) 4854046.671875
192384*512/32768= 3006
That is the result. Fill the corresponding location with zeros using dd.
guard# dd if=/dev/zero of=/dev/ad4 bs=16384 count=1 seek=9708093
dd: /dev/ad4: Operation not permitted
guard# dd if=/dev/zero of=/dev/ada0 bs=32768 count=1 seek=9023334
dd: /dev/ada0: Operation not permitted
Since dd returns this error, after a web search, set:
guard# sysctl kern.geom.debugflags=0x10
kern.geom.debugflags: 0 -> 16
and then run:
guard# dd if=/dev/zero of=/dev/ad4 bs=16384 count=1 seek=9708093
guard# dd if=/dev/zero of=/dev/ada0 bs=32768 count=1 seek=9023334
guard# dd if=/dev/zero of=/dev/ada0 bs=32768 count=1 seek=9023334
1+0 records in
1+0 records out
16384 bytes transferred in 0.000281 secs (58286240 bytes/sec)
After overwriting the block like this,
guard# sysctl kern.geom.debugflags=0
kern.geom.debugflags: 16 -> 0
the flag is set back to its original value.
Then check whether the disk is back to normal!
guard# smartctl --test=long /dev/ad4
smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.1-RELEASE-p13 i386] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 40 minutes for test to complete.
Test will complete after Sat Feb 16 13:59:56 2013
Use smartctl -X to abort test.
guard#
... It is now 14:00, so:
guard# smartctl /dev/ad4 --log=selftest
smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.1-RELEASE-p13 i386] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 19437 -
# 2 Extended offline Completed: read failure 90% 19431 310658987
# 3 Short offline Completed: read failure 90% 19427 310658987
# 4 Short offline Completed: read failure 90% 19403 310658987
# 5 Short offline Completed: read failure 90% 19379 310658987
# 6 Short offline Completed: read failure 90% 19355 310658987
# 7 Short offline Completed: read failure 90% 19331 310658987
# 8 Short offline Completed: read failure 90% 19307 310658987
# 9 Short offline Completed: read failure 90% 19283 310658987
#10 Extended offline Completed: read failure 90% 19263 310658987
#11 Short offline Completed: read failure 90% 19259 310658987
#12 Short offline Completed: read failure 90% 19235 310658987
#13 Short offline Completed: read failure 90% 19211 310658987
#14 Short offline Completed: read failure 90% 19187 310658987
#15 Short offline Completed: read failure 90% 19163 310658987
#16 Short offline Completed: read failure 90% 19139 310658987
#17 Short offline Completed: read failure 90% 19115 310658987
#18 Extended offline Completed: read failure 90% 19095 310658987
#19 Short offline Completed: read failure 90% 19091 310658987
#20 Short offline Completed: read failure 90% 19067 310658987
#21 Short offline Completed: read failure 90% 19043 310658987
20 of 20 failed self-tests are outdated by newer successful extended offline self-test # 1
guard#
Looks like it worked! ( ´▽`)ノ
2013/02/16 14:07
guard# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/ad4s1a 507630 341756 125264 73% /
devfs 1 1 0 100% /dev
/dev/ad4s1e 507630 16 467004 0% /tmp
/dev/ad4s1f 146328056 14727436 119894376 11% /usr
/dev/ad4s1d 2000622 198004 1642570 11% /var
guard#
268 6:54 smartctl --test=short /dev/ada1
269 6:55 cat /var/log/console.log
270 6:55 smartctl /dev/ada1 --log=selftest
271 6:57 dd if=/dev/zero of=/dev/ada1 bs=16384 count=1 seek=6058
272 6:57 smartctl --test=short /dev/ada1
273 6:59 smartctl /dev/ada1 --log=selftest
274 7:00 dd if=/dev/zero of=/dev/ada1 bs=16384 count=1 seek=6104
275 7:00 smartctl --test=short /dev/ada1
276 7:02 smartctl /dev/ada1 --log=selftest
277 7:03 dd if=/dev/zero of=/dev/ada1 bs=16384 count=1 seek=6150
278 7:03 smartctl --test=short /dev/ada1
279 7:05 smartctl /dev/ada1 --log=selftest
280 7:05 dd if=/dev/zero of=/dev/ada1 bs=16384 count=1 seek=6196
281 7:06 smartctl --test=short /dev/ada1
282 7:14 smartctl /dev/ada1 --log=selftest
283 7:14 dd if=/dev/zero of=/dev/ada1 bs=16384 count=1 seek=6298
284 7:14 smartctl --test=short /dev/ada1
285 7:15 history