1. CPU Performance
- Current Usage: `top`, `htop`, or `mpstat`.
- Load Average: `uptime`, or check the first line of `top` (the three numbers at the top-right are the 1-, 5-, and 15-minute averages).
- Compare the load average to the number of CPU cores (`nproc`); a sustained load above the core count suggests tasks are queuing for CPU or I/O.
- Processes: Monitor high-CPU-consuming processes using `top` or `ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu`.
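The load-versus-cores comparison above can be sketched as a small shell function. The function name `load_status` and the per-core thresholds (0.7 and 1.0) are illustrative assumptions, not fixed rules; tune them to the workload.

```shell
# Classify a load average relative to the number of CPU cores.
# The 0.7 and 1.0 per-core thresholds are assumptions for illustration.
load_status() {
    # $1 = load average (for example the 1-minute value), $2 = core count
    awk -v avg="$1" -v cores="$2" 'BEGIN {
        per_core = avg / cores
        if (per_core >= 1.0)      print "overloaded"
        else if (per_core >= 0.7) print "busy"
        else                      print "ok"
    }'
}

# On a live Linux system, the first field of /proc/loadavg is the 1-minute load:
# load_status "$(cut -d " " -f1 /proc/loadavg)" "$(nproc)"
```

Taking the values as arguments, rather than reading them inside the function, keeps the check easy to reuse with the 5- or 15-minute averages.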
2. Memory Usage
- Total and Free Memory: `free -h` or `vmstat -s`.
- Swap Usage: Check if swap space is being heavily used (`free -h` or `swapon -s`).
- Processes Using Most Memory: `top` or `ps -eo pid,ppid,cmd,%mem --sort=-%mem`.
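For scripted checks, the figures that `free` reports can also be derived from `/proc/meminfo`. The sketch below assumes a Linux kernel that exposes the `MemAvailable` field (3.14 or later); the function name `mem_used_pct` is hypothetical.

```shell
# Print the percentage of memory in use, computed as
# (MemTotal - MemAvailable) / MemTotal, truncated to an integer.
# Reads text in /proc/meminfo format from stdin.
mem_used_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }'
}

# On a live Linux system:
# mem_used_pct < /proc/meminfo
```

Using `MemAvailable` rather than `MemFree` avoids counting reclaimable page cache as used memory.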
3. Disk Usage
- Available Space: `df -h` to check disk usage across filesystems.
- Inode Usage: `df -i` to check inode utilization.
- Disk I/O: `iostat`, `iotop`, or `dstat`.
- Error Messages: Review logs in `/var/log/` for any disk-related errors.
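A threshold check on disk usage might look like the following sketch. It parses the POSIX-format output of `df -P`; the function name `disks_over` is hypothetical, and the parsing assumes mount points without spaces.

```shell
# Print mount points whose usage is at or above a given percentage.
# $1 = threshold percent; stdin = output of `df -P`.
# Assumes mount points contain no spaces.
disks_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)            # strip the trailing percent sign
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}

# On a live system:
# df -P | disks_over 80
```

The same function works for inode usage, since `df -Pi` produces the same column layout.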
4. Network Performance
- Network Usage: `iftop`, `ip -s link`, or `netstat`.
- Connections: `ss` or `netstat` to check open connections and ports.
- Packet Loss/Latency: `ping`, `traceroute`, or `mtr`.
- Bandwidth Monitoring: `vnstat`, `iftop`, or `nload`.
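To use packet loss in a script, the summary line printed by `ping` can be parsed. This sketch assumes the summary contains the phrase `N% packet loss`, as the common Linux and BSD `ping` implementations print it; the function name `packet_loss` is hypothetical.

```shell
# Extract the packet-loss percentage from ping output.
# stdin = output of `ping -c N host`; prints the number without the % sign.
# Assumes a summary line containing "<number>% packet loss".
packet_loss() {
    sed -n 's/.* \([0-9.][0-9.]*\)% packet loss.*/\1/p'
}

# On a live system (example.com is a placeholder host):
# ping -c 5 example.com | packet_loss
```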
5. System Logs
- General System Logs: `journalctl` or `/var/log/syslog` (for system-wide events).
- Kernel Logs: `dmesg` or `journalctl -k` to check for hardware errors or warnings.
6. Uptime and System Load
- Uptime: The `uptime` command provides server uptime and load averages.
- Load Analysis: Investigate load spikes with `sar` or `atop`.
7. Running Services and Processes
- Service Status: `systemctl status <service>` or `service <service> status`.
- Zombie/Unnecessary Processes: `ps aux | awk '$8 ~ /^Z/'` to list zombie processes (a plain `ps aux | grep Z` also matches any line that merely contains the letter Z).
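Because a zombie can only be cleaned up by its parent, it helps to report the parent PID as well. The sketch below filters `ps` output on the process state column; the function name `list_zombies` is hypothetical, and it assumes command names without spaces.

```shell
# List zombie processes as "PID PPID COMMAND".
# stdin = output of `ps -eo pid=,ppid=,stat=,comm=` (headers suppressed by "=").
# A state string starting with "Z" marks a zombie (defunct) process.
list_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# On a live system:
# ps -eo pid=,ppid=,stat=,comm= | list_zombies
```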
8. Security
- Users Logged In: `who`, `w`, or `last`.
- Unauthorized Logins: Review `/var/log/secure` (RHEL-based systems) or `/var/log/auth.log` (Debian-based systems).
- Firewall Rules: `iptables -L` or `ufw status`.
- Listening Ports: `ss -tuln` or `netstat -tuln`.
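One way to review the authentication log for unauthorized login attempts is to count failed SSH password attempts per source address. This sketch assumes the usual OpenSSH message form `Failed password for ... from <address> port ...`; the function name `failed_logins_by_ip` is hypothetical.

```shell
# Count failed SSH password attempts per source address, most frequent first.
# stdin = contents of the auth log; output lines are "COUNT ADDRESS".
# Assumes OpenSSH-style "Failed password ... from <address> port ..." lines.
failed_logins_by_ip() {
    awk '/Failed password/ {
        for (i = 1; i < NF; i++)
            if ($i == "from") { count[$(i + 1)]++; break }
    }
    END { for (ip in count) print count[ip], ip }' | sort -rn
}

# On a live system (path depends on the distribution):
# failed_logins_by_ip < /var/log/auth.log
```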
9. Hardware Health
- Temperature and Fan Speed: `sensors` (part of the lm-sensors package).
- RAID Status: Check using `mdadm` or vendor tools if RAID is configured.
10. Scheduled Jobs
- Cron Jobs: `crontab -l` (the current user's jobs) or check `/etc/crontab` (system-wide jobs).
- Failures: Examine `/var/log/syslog` (or `/var/log/cron` on RHEL-based systems) for cron-related logs.
11. Backup Status
- Backup Logs: Ensure regular backups are occurring as scheduled.
- Verify Integrity: Test restore procedures periodically.
Automation Tools for Server Health Monitoring
- Nagios, Zabbix, Prometheus, or Datadog for continuous monitoring.
- Custom scripts combining commands like `top`, `df`, `iostat`, and log parsing can provide quick insights.
By periodically reviewing these parameters, you can ensure the Linux server’s health and address potential issues proactively.