Monday, November 23, 2015

Fixing server error: SMART error (CurrentPendingSector) detected on host: (your_server_hostname)

This morning I got an email from one of my servers with this content:

-----------------

This email was generated by the smartd daemon running on:

   host name: (your_server_hostname)
  DNS domain: yourdomain.com
  NIS domain: (none)

The following warning/error was logged by the smartd daemon:

Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors


For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
The original email about this issue was sent at Sat Nov 21 15:15:52 2015 CST
Another email message will be sent in 24 hours if the problem persists.



---------------------


The confusing part is that when I check the hard drive /dev/sdb with smartctl (from smartmontools), it actually says PASSED!

smartctl -H /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-40-pve] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
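
As far as I understand it, the -H check only reports the drive's own threshold-based verdict, so it can say PASSED while an individual attribute such as the pending sector count is already non-zero. A minimal sketch for pulling that count out of the attribute table, assuming the usual ten-column layout printed by smartctl -A (pending_sectors is a made-up helper name, not part of smartmontools):

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin and prints the
# raw value (10th column) of the Current_Pending_Sector attribute.
# Prints nothing if the drive does not report that attribute.
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $10 }'
}

# Usage on a real host (needs root and smartmontools installed):
#   smartctl -A /dev/sdb | pending_sectors
```

Anything other than 0 here matches the "Currently unreadable (pending) sectors" warning in the smartd email.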



But when I run a self-test:

smartctl --test=short /dev/sdb

or 

smartctl --test=long /dev/sdb


and check the result using:


smartctl -a /dev/sdb


I found some errors:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      6518         84256


As you can see, the drive really does have a problem, and it only showed up in the self-test, not in the overall health check.
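
To avoid scanning the full smartctl -a output by eye, the failing entries in a self-test log like the one above can be filtered out. This is only a rough sketch: failed_selftests is a hypothetical helper name, and it assumes failed entries contain the word "failure" in the Status column, as in "Completed: read failure".

```shell
# Hypothetical helper: reads `smartctl -l selftest` output on stdin and
# prints only the numbered log entries whose status mentions a failure.
failed_selftests() {
    grep -E '^# *[0-9]+ ' | grep -i 'failure'
}

# Usage on a real host (needs root and smartmontools installed):
#   smartctl -l selftest /dev/sdb | failed_selftests
```

If this prints nothing, no logged self-test reported a failure; any line it does print includes the LBA_of_first_error column.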

I replaced the bad drive /dev/sdb and ran another short test. The problem was confirmed and is now fixed.


