linux监控磁盘?linux磁盘空间满了怎么办

本篇文章给大家谈谈linux监控磁盘,以及linux磁盘空间满了怎么办对应的知识点,文章可能有点长,但是希望大家可以阅读完,增长自己的知识,最重要的是希望对各位有所帮助,可以解决了您的问题,不要忘了收藏本站喔。

在Linux中使用Smartctl监控磁盘性能的方法

Smartctl(S.M.A.R.T自监控,分析和报告技术)是类Unix系统下实施SMART任务命令行套件或工具,它用于打印SMART自检和错误日志,启用并禁用SMRAT自动检测,以及初始化设备自检。

Smartctl对于Linux物理服务器十分有用,在这些服务器上,可以对智能磁盘进行错误检查,并将与硬件RAID相关的磁盘信息摘录下来。

在本帖中,我们将讨论smartctl命令的一些实用样例。如果你的Linux上海没有安装smartctl,请按以下步骤来安装。

安装 Smartctl

对于 Ubuntu

复制代码代码如下:$ sudo apt-get install smartmontools

对于 CentOS& RHEL

复制代码代码如下:# yum install smartmontools

启动Smartctl服务

对于 Ubuntu

复制代码代码如下:$ sudo/etc/init.d/smartmontools start

对于 CentOS& RHEL

复制代码代码如下:# service smartd start; chkconfig smartd on

样例

样例:1检查磁盘的 Smart功能是否启用

复制代码代码如下:root@linuxtechi:~# smartctl-i/dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION===

Model Family: Seagate Momentus 5400.6

Device Model: ST9320325AS

Serial Number: 5VD2V59T

LU WWN Device Id: 5 000c50 020a37ec4

Firmware Version: 0002BSM1

User Capacity: 320,072,933,376 bytes [320 GB]

Sector Size: 512 bytes logical/physical

Rotation Rate: 5400 rpm

Device is: In smartctl database [for details use:-P show]

ATA Version is: ATA8-ACS T13/1699-D revision 4

SATA Version is: SATA 2.6, 1.5 Gb/s

Local Time is: Sun Nov 16 12:32:09 2014 IST

SMART support is: Available- device has SMART capability.

SMART support is: Enabled

这里‘/dev/sdb’是你的硬盘。上面输出中的最后两行显示了SMART功能已启用。

样例:2启用磁盘的 Smart功能

复制代码代码如下:root@linuxtechi:~# smartctl-s on/dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF ENABLE/DISABLE COMMANDS SECTION===

SMART Enabled.

样例:3禁用磁盘的 Smart功能

复制代码代码如下:root@linuxtechi:~# smartctl-s off/dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF ENABLE/DISABLE COMMANDS SECTION===

SMART Disabled. Use option-s with argument'on' to enable it.

样例:4显示磁盘的详细 Smart信息

复制代码代码如下:root@linuxtechi:~# smartctl-a/dev/sdb// For IDE drive

root@linuxtechi:~# smartctl-a-d ata/dev/sdb// For SATA drive

样例:5显示磁盘总体健康状况

复制代码代码如下:root@linuxtechi:~# smartctl-H/dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION===

SMART overall-health self-assessment test result: PASSED

Warning: This result is based on an Attribute check.

Please note the following marginal Attributes:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

190 Airflow_Temperature_Cel 0x0022 067 045 045 Old_age Always In_the_past 33(Min/Max 25/33)

样例:6使用long和short选项测试硬盘

Long测试

复制代码代码如下:root@linuxtechi:~# smartctl--test=long/dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION===

Sending command:"Execute SMART Extended self-test routine immediately in off-line mode".

Drive command"Execute SMART Extended self-test routine immediately in off-line mode" successful.

Testing has begun.

Please wait 102 minutes for test to complete.

Test will complete after Sun Nov 16 14:29:43 2014

Use smartctl-X to abort test.

或者,我们可以重定向测试输出到日志文件,就像下面这样

复制代码代码如下:root@linuxtechi:~# smartctl--test=long/dev/sdb>/var/log/long.text

Short测试

复制代码代码如下:root@linuxtechi:~# smartctl--test=short/dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION===

Sending command:"Execute SMART Short self-test routine immediately in off-line mode".

Drive command"Execute SMART Short self-test routine immediately in off-line mode" successful.

Testing has begun.

Please wait 1 minutes for test to complete.

Test will complete after Sun Nov 16 12:51:45 2014

Use smartctl-X to abort test.

复制代码代码如下:root@linuxtechi:~# smartctl--test=short/dev/sdb>/var/log/short.text

注意:short测试将花费最多2分钟,而在long测试中没有时间限制,因为它会读取并验证磁盘的每个段。

样例:7查看驱动器的自检结果

复制代码代码如下:root@linuxtechi:~# smartctl-l selftest/dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION===

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Short offline Completed: read failure 90% 492 210841222

# 2 Extended offline Completed: read failure 90% 492 210841222

样例:8计算测试时间估值

复制代码代码如下:root@linuxtechi:~# smartctl-c/dev/sdb

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION===

General SMART Values:

Offline data collection status:(0x00) Offline data collection activity

was never started.

Auto Offline Data Collection: Disabled.

Self-test execution status:( 121) The previous self-test completed having

the read element of the test failed.

Total time to complete Offline

data collection:( 0) seconds.

Offline data collection

capabilities:(0x73) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

No Offline surface scan supported.

Self-test supported.

Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities:(0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability:(0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time:( 1) minutes.

Extended self-test routine

recommended polling time:( 102) minutes.

Conveyance self-test routine

recommended polling time:( 2) minutes.

SCT capabilities:(0x103b) SCT Status supported.

SCT Error Recovery Control supported.

SCT Feature Control supported.

SCT Data Table supported.

样例:9显示磁盘错误日志

复制代码代码如下:root@linuxtechi:~# smartctl-l error/dev/sdb

Sample Output

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-32-generic](local build)

Copyright(C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION===

SMART Error Log Version: 1

ATA Error Count: 5

CR= Command Register [HEX]

FR= Features Register [HEX]

SC= Sector Count Register [HEX]

SN= Sector Number Register [HEX]

CL= Cylinder Low Register [HEX]

CH= Cylinder High Register [HEX]

DH= Device/Head Register [HEX]

DC= Device Command Register [HEX]

ER= Error register [HEX]

ST= Status register [HEX]

Powered_Up_Time is measured from power on, and printed as

DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

SS=sec, and sss=millisec. It"wraps" after 49.710 days.

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

----------------------------------------------------

25 da 08 e7 e5 a5 4c 00 00:30:44.515 READ DMA EXT

25 da 08 df e5 a5 4c 00 00:30:44.514 READ DMA EXT

25 da 80 5f e5 a5 4c 00 00:30:44.502 READ DMA EXT

25 da f0 5f e6 a5 4c 00 00:30:44.496 READ DMA EXT

25 da 10 4f e6 a5 4c 00 00:30:44.383 READ DMA EXT

如何使用PHP实时监控Linux服务器的cpu,内存,硬盘信息

1,Linux下可以在/proc/cpuinfo中看到每个cpu的详细信息。但是对于双核的cpu,在cpuinfo中会看到两个cpu。常常会让人误以为是两个单核的cpu。

其实应该通过Physical

Processor

ID来区分单核和双核。而Physical

Processor

ID可以从cpuinfo或者dmesg中找到.

flags

如果有

ht

说明

支持超线程技术

判断物理CPU的个数可以查看physical

id

的值,相同则为同一个物理CPU

2,查看内存大小:

cat

/proc/meminfo

|grep

MemTotal

3,其他一些可以查看详细

linux系统

信息的命令和方法:

uname

-a

#

查看内核/操作系统/

CPU信息

的linux系统信息命令

head

-n

1

/etc/issue

#

查看操作系统版本,是数字1不是字母L

cat

/proc/cpuinfo

#

查看CPU信息的linux系统信息命令

hostname

#

查看计算机名的linux系统信息命令

lspci

-tv

#

列出所有

PCI设备

lsusb

-tv

#

列出所有USB设备的linux系统信息命令

lsmod

#

列出加载的内核模块

env

#

查看

环境变量

资源

free

-m

#

查看内存使用量和

交换区

使用量

df

-h

#

查看各分区使用情况

du

-sh

#

查看指定目录的大小

grep

MemTotal

/proc/meminfo

#

查看内存总量

grep

MemFree

/proc/meminfo

#

查看空闲内存量

uptime

#

查看系统

运行时间

、用户数、负载

cat

/proc/loadavg

#

查看系统负载磁盘和分区

mount

|

column

-t

#

查看挂接的分区状态

fdisk

-l

#

查看所有分区

swapon

-s

#

查看所有

交换分区

hdparm

-i

/dev/hda

#

查看磁盘参数(仅适用于

IDE设备

)

dmesg

|

grep

IDE

#

查看启动时IDE设备检测状况网络

ifconfig

#

查看所有网络接口的属性

iptables

-L

#

查看防火墙设置

route

-n

#

查看

路由表

netstat

-lntp

#

查看所有监听端口

netstat

-antp

#

查看所有已经建立的连接

netstat

-s

#

查看

网络统计

信息进程

ps

-ef

#

查看所有进程

top

#

实时显示

进程状态

用户

w

#

查看活动用户

id

#

查看指定用户信息

last

#

查看

用户登录

日志

cut

-d:

-f1

/etc/passwd

#

查看系统所有用户

cut

-d:

-f1

/etc/group

#

查看系统所有组

crontab

-l

#

查看当前用户的计划任务服务

chkconfig

–list

#

列出所有系统服务

chkconfig

–list

|

grep

on

#

列出所有启动的系统服务程序

rpm

-qa

#

查看所有安装的软件包

cat

/proc/cpuinfo

:查看CPU相关参数的linux系统命令

cat

/proc/partitions

:查看linux硬盘和分区信息的系统信息命令

cat

/proc/meminfo

:查看linux系统内存信息的linux系统命令

cat

/proc/version

:查看版本,类似uname

-r

cat

/proc/ioports

:查看设备io端口

cat

/proc/interrupts

:查看中断

cat

/proc/pci

:查看pci设备的信息

cat

/proc/swaps

:查看所有swap分区的信息

几个常用的Linux操作系统监控脚本代码

本文介绍了几个常用的Linux监控脚本,可以实现主机网卡流量、系统状况、主机磁盘空间、CPU和内存的使用情况等方面的自动监控与报警。根据自己的需求写出的shell脚本更能满足需求,更能细化主机监控的全面性。

最近时不时有互联网的朋友问我关于服务器监控方面的问题,问常用的服务器监控除了用开源软件,比如:cacti,nagios监控外是否可以自己写shell脚本呢?根据自己的需求写出的shell脚本更能满足需求,更能细化主机监控的全面性。

下面是我常用的几个主机监控的脚本,大家可以根据自己的情况再进行修改,希望能给大家一点帮助。

1、查看主机网卡流量

复制代码代码如下:

#!/bin/bash#network#Mike.Xu while:; do time='date+%m"-"%d""%k":"%M' day='date+%m"-"%d' rx_before='ifconfig eth0|sed-n"8"p|awk'{print$2}'|cut-c7-' tx_before='ifconfig eth0|sed-n"8"p|awk'{print$6}'|cut-c7-' sleep 2 rx_after='ifconfig eth0|sed-n"8"p|awk'{print$2}'|cut-c7-' tx_after='ifconfig eth0|sed-n"8"p|awk'{print$6}'|cut-c7-' rx_result=$[(rx_after-rx_before)/256] tx_result=$[(tx_after-tx_before)/256] echo"$time Now_In_Speed:"$rx_result"kbps Now_OUt_Speed:"$tx_result"kbps" sleep 2 done

2、系统状况监控

复制代码代码如下:

#!/bin/sh#systemstat.sh#Mike.Xu IP=192.168.1.227 top-n 2| grep"Cpu"》./temp/cpu.txt free-m| grep"Mem"》./temp/mem.txt df-k| grep"sda1"》./temp/drive_sda1.txt#df-k| grep sda2》./temp/drive_sda2.txt df-k| grep"/mnt/storage_0"》./temp/mnt_storage_0.txt df-k| grep"/mnt/storage_pic"》./temp/mnt_storage_pic.txt time=`date+%m"."%d""%k":"%M` connect=`netstat-na| grep"219.238.148.30:80"| wc-l` echo"$time$connect"》./temp/connect_count.txt

3、监控主机的磁盘空间,当使用空间超过90%就通过发mail来发警告

复制代码代码如下:

#!/bin/bash#monitor available disk space SPACE='df| sed-n'//$/ p'| gawk'{print$5}'| sed's/%//' if [$SPACE-ge 90 ] then fty89@163.com fi

4、监控CPU和内存的使用情况

复制代码代码如下:

#!/bin/bash#script to capture system statistics OUTFILE=/home/xu/capstats.csv

  DATE='date+%m/%d/%Y'

  TIME='date+%k:%m:%s'

  TIMEOUT='uptime'

  VMOUT='vmstat 1 2'

  USERS='echo$TIMEOUT| gawk'{print$4}''

  LOAD='echo$TIMEOUT| gawk'{print$9}'| sed"s/,//''

  FREE='echo$VMOUT| sed-n'/[0-9]/p'| sed-n'2p'| gawk'{print$4}''

  IDLE='echo$VMOUT| sed-n'/[0-9]/p'| sed-n'2p'|gawk'{print$15}''

  echo"$DATE,$TIME,$USERS,$LOAD,$FREE,$IDLE"》$OUTFILE

5、全方位监控主机

复制代码代码如下:

#!/bin/bash# check_xu.sh# 0****/home/check_xu.sh DAT="`date+%Y%m%d`" HOUR="`date+%H`" DIR="/home/oslog/host_${DAT}/${HOUR}" DELAY=60 COUNT=60# whether the responsible directory exist if! test-d${DIR} then/bin/mkdir-p${DIR} fi# general check export TERM=linux/usr/bin/top-b-d${DELAY}-n${COUNT}${DIR}/top_${DAT}.log 21# cpu check/usr/bin/sar-u${DELAY}${COUNT}${DIR}/cpu_${DAT}.log 21#/usr/bin/mpstat-P 0${DELAY}${COUNT}${DIR}/cpu_0_${DAT}.log 21#/usr/bin/mpstat-P 1${DELAY}${COUNT}${DIR}/cpu_1_${DAT}.log 21# memory check/usr/bin/vmstat${DELAY}${COUNT}${DIR}/vmstat_${DAT}.log 21# I/O check/usr/bin/iostat${DELAY}${COUNT}${DIR}/iostat_${DAT}.log 21# network check/usr/bin/sar-n DEV${DELAY}${COUNT}${DIR}/net_${DAT}.log 21#/usr/bin/sar-n EDEV${DELAY}${COUNT}${DIR}/net_edev_${DAT}.log 21

放在crontab里每小时自动执行:

0****/home/check_xu.sh

这样会在/home/oslog/host_yyyymmdd/hh目录下生成各小时cpu、内存、网络,IO的统计数据。

如果某个时间段产生问题了,就可以去看对应的日志信息,看看当时的主机性能如何。

阅读剩余
THE END