1. CPU Performance

Current Usage: top, htop, or mpstat
Load Average: uptime or check the output of top (the three numbers at the top-right).
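Compare the load average with the number of CPU cores (nproc); a load that stays above the core count means processes are queuing for CPU time or waiting on I/O.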
Processes: Monitor high-CPU-consuming processes using top or ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu.
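
The load-versus-cores comparison above can be sketched as a pair of small shell functions. This is a minimal illustration rather than a definitive check: the function names load_per_core and load_state are hypothetical, and treating "load greater than core count" as "high" is a rule of thumb, not a hard limit.

```shell
# load_per_core LOAD CORES: print LOAD divided by CORES with two decimals.
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# load_state LOAD CORES: print "high" when LOAD exceeds CORES, else "ok".
load_state() {
  awk -v load="$1" -v cores="$2" 'BEGIN { print ((load + 0 > cores + 0) ? "high" : "ok") }'
}

# Example on a live Linux system (first field of /proc/loadavg is the
# 1-minute load average):
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

A value near or below 1.00 per core usually suggests the CPUs are keeping up; the 5- and 15-minute averages (fields 2 and 3 of /proc/loadavg) can be passed in the same way to see whether a spike is sustained.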

2. Memory Usage

Total and Free Memory: free -h or vmstat -s.
Swap Usage: Check whether swap space is being heavily used (free -h or swapon --show; the older swapon -s still works but is deprecated).
Processes Using Most Memory: top or ps -eo pid,ppid,cmd,%mem --sort=-%mem.
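
As a rough sketch of turning the memory figures above into a single number, the function below reads /proc/meminfo-style text and prints the percentage of RAM in use. The name mem_used_pct is hypothetical, and the calculation assumes the kernel exposes a MemAvailable field (present on reasonably recent kernels); on systems without it the result would not be meaningful.

```shell
# mem_used_pct: read /proc/meminfo-style text on stdin and print the
# percentage of RAM in use, counting MemAvailable as memory still usable.
mem_used_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%.0f\n", (total - avail) * 100 / total }'
}

# Example on a live Linux system:
#   mem_used_pct < /proc/meminfo
```

MemAvailable is used instead of MemFree because Linux keeps otherwise idle memory busy as page cache, so a low "free" figure on its own is not a sign of memory pressure.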

3. Disk Usage

Available Space: df -h to check disk usage across filesystems.
Inode Usage: df -i to check inode utilization.
Disk I/O: iostat, iotop, or dstat.
Error Messages: Review logs in /var/log/ for any disk-related errors.
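
The disk space check above can be reduced to a threshold filter. The sketch below is one possible approach, not a standard tool: full_filesystems is a hypothetical name, the limit is whatever percentage you choose, and the parsing assumes POSIX-format df output (df -P) with mount points that contain no spaces.

```shell
# full_filesystems LIMIT: read `df -P` output on stdin and print the mount
# point and usage of every filesystem at or above LIMIT percent.
full_filesystems() {
  awk -v limit="$1" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)                      # "95%" -> "95"
    if (pct + 0 >= limit + 0) print $6, $5
  }'
}

# Example on a live system:
#   df -P | full_filesystems 90
```

With GNU df, the inode report from df -Pi appears to use the same column positions, so the same filter should also work for inode usage, though that is worth verifying on your own system.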

4. Network Performance

Network Usage: iftop, ip -s link, or netstat.
Connections: ss or netstat to check open connections and ports.
Packet Loss/Latency: ping, traceroute, or mtr.
Bandwidth Monitoring: vnstat, iftop, or nload.

5. System Logs

General System Logs: journalctl or /var/log/syslog (for system-wide events).
Kernel Logs: dmesg or journalctl -k to check for hardware errors or warnings.

6. Uptime and System Load

Uptime: uptime command provides server uptime and load averages.
Load Analysis: Investigate load spikes with sar or atop.

7. Running Services and Processes

Service Status: systemctl status <service> or service <service> status.
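Zombie/Unnecessary Processes: ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/' lists zombie processes (state Z). A plain ps aux | grep Z also matches any line that merely contains the letter Z, including the grep command itself.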
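
Filtering on the process state column is more precise than searching the whole line for a letter. A minimal sketch, assuming the input comes from ps -eo pid,ppid,stat,comm (the function name zombies is hypothetical):

```shell
# zombies: read `ps -eo pid,ppid,stat,comm` output on stdin and print only
# the rows whose STAT column (third field) starts with Z.
zombies() {
  awk '$3 ~ /^Z/'
}

# Example on a live system:
#   ps -eo pid,ppid,stat,comm | zombies
```

The PPID column is included because a zombie can only be cleared by its parent reaping it (or by the parent exiting), so the parent is usually the process to investigate.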

8. Security

Users Logged In: who, w, or last.
Unauthorized Logins: Review /var/log/secure (RHEL-based systems) or /var/log/auth.log (Debian/Ubuntu).
Firewall Rules: iptables -L or ufw status.
Listening Ports: ss -tuln or netstat -tuln.

9. Hardware Health

Temperature and Fan Speed: sensors (part of lm-sensors package).
RAID Status: Check using mdadm or vendor tools if RAID is configured.

10. Scheduled Jobs

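Cron Jobs: crontab -l (current user's jobs), or check /etc/crontab and /etc/cron.d/ for system-wide jobs.
Failures: Examine /var/log/syslog (Debian/Ubuntu) or /var/log/cron (RHEL-based systems) for cron-related entries.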

11. Backup Status

Backup Logs: Review the backup logs to confirm that backups are completing as scheduled.
Verify Integrity: Test restore procedures periodically.

Automation Tools for Server Health Monitoring

Nagios, Zabbix, Prometheus, or Datadog for continuous monitoring.
Custom scripts combining commands like top, df, iostat, and log parsing can provide quick insights.
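
One way such a custom script could be structured is around a single reporting helper that each check feeds a value and a limit. This is only a sketch: the names status and health_report are hypothetical, and the thresholds (core count for load, 90 percent for the root filesystem) are illustrative assumptions, not recommendations.

```shell
# status LABEL VALUE LIMIT: print one report line, WARN when VALUE > LIMIT.
status() {
  awk -v label="$1" -v value="$2" -v limit="$3" 'BEGIN {
    state = (value + 0 > limit + 0) ? "WARN" : "OK"
    printf "%-4s %s=%s (limit %s)\n", state, label, value, limit
  }'
}

# health_report: example wrapper that feeds live values into status.
# Assumes a Linux system with /proc/loadavg, nproc, and POSIX-format df.
health_report() {
  status load1 "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
  status root_disk_pct "$(df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')" 90
}
```

Running health_report from cron and alerting on any line that starts with WARN gives a simple stopgap until one of the monitoring systems listed above is in place.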

By periodically reviewing these parameters, you can ensure the Linux server’s health and address potential issues proactively.

journalctl is a command-line tool in Linux used to query and view logs managed by the systemd-journald service, which is part of the systemd system and service manager. It gives access to log data from various sources in a consolidated, searchable format, covering everything from kernel and system logs to application logs for services that run under systemd.

Here’s a quick overview of how to use journalctl:

1. View All Logs:

journalctl

2. View Most Recent Logs First (reverse order):

journalctl -r

3. Follow Logs in Real-Time (similar to tail -f):

journalctl -f

4. Specify a Service:

journalctl -u [service-name]

5. Filter by Time:

journalctl --since "YYYY-MM-DD HH:MM:SS" --until "YYYY-MM-DD HH:MM:SS"
journalctl --since "1 hour ago"

6. Filter by Priority:

journalctl -p [priority]

7. View Kernel Messages:

journalctl -k

8. Advanced Filtering:

journalctl -u nginx --since "2024-10-01" --until "2024-10-31" -p warning

Ahsan Habib

What and how to check any Linux Server/Systems health