How to Check the Health of Any Linux Server or System

1. CPU Performance

  • Current Usage: top, htop, or mpstat
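  • Load Average: uptime, or the first line of top (the three numbers are the 1-, 5-, and 15-minute averages).
  • Compare the load average to the number of CPU cores (nproc): a sustained value above the core count means tasks are waiting for CPU or I/O.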
  • Processes: Monitor high-CPU-consuming processes using top or ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu.
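
The load-versus-cores comparison above can be sketched as a small helper. The function name load_per_core is hypothetical, and the sketch assumes a Linux system with /proc/loadavg and nproc available.

```shell
#!/bin/sh
# Print the load average divided by the number of CPU cores.
# Arguments: <load> <cores>. Both are optional; when omitted, the 1-minute
# load is read from /proc/loadavg and the core count from nproc.
load_per_core() {
    load=${1:-$(cut -d ' ' -f 1 /proc/loadavg)}
    cores=${2:-$(nproc)}
    awk -v l="$load" -v c="$cores" 'BEGIN { printf "%.2f\n", l / c }'
}
```

Calling load_per_core with no arguments uses the live values; a result that stays above 1.00 is a reasonable cue to look at the top CPU consumers.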

2. Memory Usage

  • Total and Free Memory: free -h or vmstat -s.
  • Swap Usage: Check whether swap space is being heavily used (free -h or swapon --show).
  • Processes Using Most Memory: top or ps -eo pid,ppid,cmd,%mem --sort=-%mem.
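
As a rough illustration of the memory check, the sketch below turns /proc/meminfo into a single "percent used" figure. The function name mem_used_percent is hypothetical, and it assumes a kernel that reports MemAvailable (3.14 or newer).

```shell
#!/bin/sh
# Read /proc/meminfo-style text on stdin and print the percentage of RAM
# in use, computed as (MemTotal - MemAvailable) / MemTotal.
mem_used_percent() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", (total - avail) * 100 / total }'
}
```

Typical use would be: mem_used_percent < /proc/meminfo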

3. Disk Usage

  • Available Space: df -h to check disk usage across filesystems.
  • Inode Usage: df -i to check inode utilization.
  • Disk I/O: iostat, iotop, or dstat.
  • Error Messages: Review logs in /var/log/ for any disk-related errors.
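
The disk-space check lends itself to a simple threshold filter. This is a minimal sketch, assuming input in the POSIX df -P layout; the function name disks_over and the default limit of 90 are illustrative choices, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print "mountpoint use%" for every
# filesystem whose usage is at or above the threshold (default 90).
disks_over() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, $5
    }'
}
```

Typical use would be: df -P | disks_over 80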

4. Network Performance

  • Network Usage: iftop, ip -s link, or netstat.
  • Connections: ss or netstat to check open connections and ports.
  • Packet Loss/Latency: ping, traceroute, or mtr.
  • Bandwidth Monitoring: vnstat, iftop, or nload.
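
For the packet-loss check, the summary line that ping prints can be reduced to a single number. A small sketch follows; the function name ping_loss is hypothetical, and it assumes the usual "N% packet loss" summary wording.

```shell
#!/bin/sh
# Read ping output on stdin and print the packet-loss percentage
# taken from the summary line (without the % sign).
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

Typical use would be: ping -c 5 192.168.1.1 | ping_loss (the address is only an example).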

5. System Logs

  • General System Logs: journalctl, or /var/log/syslog on Debian/Ubuntu and /var/log/messages on RHEL/CentOS (for system-wide events).
  • Kernel Logs: dmesg or journalctl -k to check for hardware errors or warnings.

6. Uptime and System Load

  • Uptime: uptime command provides server uptime and load averages.
  • Load Analysis: Investigate load spikes with sar or atop.

7. Running Services and Processes

  • Service Status: systemctl status <service> or service <service> status.
  • Zombie/Unnecessary Processes: ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/' to list zombie processes (a plain grep Z also matches any line that merely contains a capital Z).
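
A zombie is only cleared once its parent reaps it, so the parent PID is usually the useful piece of information. The sketch below extracts it; the function name zombie_parents is hypothetical and the input is assumed to come from ps -eo pid,ppid,stat,comm.

```shell
#!/bin/sh
# Read `ps -eo pid,ppid,stat,comm` output on stdin and print the unique
# parent PIDs of all processes whose STAT column starts with Z (zombie).
zombie_parents() {
    awk 'NR > 1 && $3 ~ /^Z/ { print $2 }' | sort -un
}
```

Typical use would be: ps -eo pid,ppid,stat,comm | zombie_parents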

8. Security

  • Users Logged In: who, w, or last.
  • Unauthorized Logins: Review /var/log/secure (RHEL/CentOS) or /var/log/auth.log (Debian/Ubuntu).
  • Firewall Rules: sudo iptables -L -n or sudo ufw status.
  • Listening Ports: ss -tuln or netstat -tuln.
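
Reviewing the authentication log is easier with a per-address tally of failed SSH logins. This is a sketch only: the function name failed_logins is hypothetical, and it assumes the standard sshd "Failed password for ... from <ip> port ..." message format.

```shell
#!/bin/sh
# Read sshd log lines on stdin and print "count ip" for failed password
# attempts, with the most frequent source address first.
failed_logins() {
    awk '/Failed password/ {
        for (i = 1; i < NF; i++)
            if ($i == "from") { print $(i + 1); break }
    }' | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

Typical use would be: sudo cat /var/log/auth.log | failed_logins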

9. Hardware Health

  • Temperature and Fan Speed: sensors (part of lm-sensors package).
  • RAID Status: Check using mdadm or vendor tools if RAID is configured.

10. Scheduled Jobs

  • Cron Jobs: crontab -l or check /etc/crontab.
  • Failures: Examine /var/log/syslog (Debian/Ubuntu) or /var/log/cron (RHEL/CentOS) for cron-related entries.

11. Backup Status

  • Backup Logs: Review the backup logs to confirm that backups are running as scheduled.
  • Verify Integrity: Test restore procedures periodically.
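
One simple way to confirm that backups are still being produced is to check the age of the newest backup file. The sketch below assumes find supports -mmin (GNU and BSD find do); the function name backup_is_fresh and the 24-hour default are hypothetical.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given file exists and was modified
# within the last N hours (default 24); fail otherwise.
backup_is_fresh() {
    file=$1
    max_hours=${2:-24}
    [ -f "$file" ] && [ -n "$(find "$file" -mmin "-$((max_hours * 60))")" ]
}
```

Typical use would be: backup_is_fresh /path/to/latest-backup.tar.gz 24 || echo "backup is stale" (the path is a placeholder). A fresh file does not prove the backup is restorable, so this complements a restore test rather than replacing it.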

Automation Tools for Server Health Monitoring

  • Nagios, Zabbix, Prometheus, or Datadog for continuous monitoring.
  • Custom scripts combining commands like top, df, iostat, and log parsing can provide quick insights.
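
A custom script of the kind described above might look like the following sketch. The names run_check and health_report are hypothetical, and the set of checks is only an example; tools that are not installed are skipped rather than treated as errors.

```shell
#!/bin/sh
# Print a section header, then run the given command if it is installed.
# Usage: run_check "<title>" <command> [args...]
run_check() {
    title=$1
    shift
    printf '== %s ==\n' "$title"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        printf 'skipped: %s not installed\n' "$1"
    fi
}

# Example report built from commands mentioned in this article.
health_report() {
    run_check "Load" uptime
    run_check "Memory" free -h
    run_check "Disk space" df -h
    run_check "Disk I/O" iostat
    run_check "Failed units" systemctl --failed
}
```

Calling health_report prints one labelled section per check, which is convenient to redirect into a dated file or send by mail from cron.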

By periodically reviewing these parameters, you can ensure the Linux server’s health and address potential issues proactively.

Checking Linux Logs: All About journalctl

journalctl is a command-line tool for querying and viewing the logs collected by systemd-journald, the logging service of the systemd system and service manager. It gives access to log data from many sources in one consolidated, searchable format, from kernel and system messages to the logs of applications that run as systemd services.

Here’s a quick overview of how to use journalctl:

1. View All Logs:

journalctl

2. View Most Recent Logs First (reverse order):

journalctl -r

3. Follow Logs in Real-Time (similar to tail -f):

journalctl -f

4. Specify a Service:

journalctl -u [service-name]

5. Filter by Time:

journalctl --since "YYYY-MM-DD HH:MM:SS" --until "YYYY-MM-DD HH:MM:SS"
journalctl --since "1 hour ago"

6. Filter by Priority:

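journalctl -p [priority]

The priority is a syslog level name or number, from emerg (0) to debug (7); messages at that level and anything more severe are shown.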

7. View Kernel Messages:

journalctl -k

8. Advanced Filtering:

journalctl -u nginx --since "2024-10-01" --until "2024-10-31" -p warning
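
The filters above combine well with a little post-processing. As a sketch, the helper below counts journal lines per source so the noisiest service stands out. The function name count_by_ident is hypothetical, and it assumes the default short output format (timestamp, host, then identifier[pid]:).

```shell
#!/bin/sh
# Read `journalctl -o short` lines on stdin and print "count identifier",
# busiest identifier first. Lines without an "identifier:" field are ignored.
count_by_ident() {
    awk '$5 ~ /:$/ {
        id = $5
        sub(/:$/, "", id)            # drop the trailing colon
        sub(/\[[0-9]+\]$/, "", id)   # drop the [pid] suffix if present
        print id
    }' | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

Typical use would be: journalctl -p err -b --no-pager -o short | count_by_ident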

All About tcpdump

Install tcpdump on Ubuntu/Debian:

sudo apt-get install tcpdump

Install tcpdump on RHEL/CentOS:

sudo yum install tcpdump

sudo tcpdump [options] [filter expression]
sudo tcpdump -i eth1 #capture on interface eth1
sudo tcpdump udp #UDP traffic only
sudo tcpdump port 80 #traffic to or from port 80
sudo tcpdump dst port 80 #traffic destined for port 80
sudo tcpdump src host 1.2.3.4 #traffic sent by 1.2.3.4
sudo tcpdump 'src port 22 and dst host 1.2.3.4' #combine filters with the and / or operators
sudo tcpdump 'src port 22 or src port 443'
sudo tcpdump host 1.2.3.4 -w /home/users/demo/demo.dump #write the raw packets to a file
tcpdump -r /home/users/demo/demo.dump #read the raw file
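
Once a capture exists, a quick summary of who sent the most packets is often the first question. The following sketch tallies IPv4 source addresses from tcpdump's text output; the function name top_sources is hypothetical, and it assumes the capture is printed with -nn so addresses and ports stay numeric.

```shell
#!/bin/sh
# Read `tcpdump -nn` text output on stdin and print "count source-ip"
# for IPv4 packets, most active source first.
top_sources() {
    awk '$2 == "IP" {
        n = split($3, p, ".")
        # addr.port has five dot-separated parts; a bare address has four
        if (n == 5) src = p[1] "." p[2] "." p[3] "." p[4]
        else src = $3
        print src
    }' | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

Typical use would be: tcpdump -nn -r /home/users/demo/demo.dump | top_sources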



Change the Hostname and IP Address on Ubuntu Server 24.04

A. Change hostname

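$ sudo hostnamectl set-hostname new-hostname

Check hostname

$ hostname

The new name takes effect immediately; reboot the server, or log out and back in, so that running sessions and services pick it up.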

B. Change IP Address

Locate the Netplan configuration file: /etc/netplan/01-netcfg.yaml or /etc/netplan/50-cloud-init.yaml

Edit the IP and interface settings:

$ sudo nano /etc/netplan/01-netcfg.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    enp0s3:
      dhcp4: no
      addresses:
        - 192.168.1.10/24
      routes:
        - to: default
          via: 192.168.1.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]

Ensure that the Netplan configuration file permissions are secure to prevent unauthorized access.

$ sudo chmod 600 /etc/netplan/01-netcfg.yaml

Apply the configuration change

$ sudo netplan apply

Verify the IP address

$ ip a
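
The verification step can also be scripted. As a sketch, the helper below checks that a given interface carries a given address; the function name has_address is hypothetical, and it expects the one-line-per-address output of ip -o -4 addr show.

```shell
#!/bin/sh
# Read `ip -o -4 addr show` output on stdin and succeed (exit status 0)
# if interface $1 carries the IPv4 address $2, given in CIDR form.
has_address() {
    awk -v ifc="$1" -v addr="$2" '
        $2 == ifc && $3 == "inet" && $4 == addr { found = 1 }
        END { exit !found }'
}
```

Typical use would be: ip -o -4 addr show | has_address enp0s3 192.168.1.10/24 && echo "address applied"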

Get Linux system information with the dmidecode command:

[habib@localhost ~]$ sudo dmidecode -t system

# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.

Handle 0x0100, DMI type 1, 27 bytes
System Information
	Manufacturer: RDO
	Product Name: OpenStack Compute
	Version: 19.3.2-1.el7
	Serial Number: 9b3570ed-2a79-4c7d-91b8-f4fad8c4ec52
	UUID: 9b3570ed-2a79-4c7d-91b8-f4fad8c4ec52
	Wake-up Type: Power Switch
	SKU Number: Not Specified
	Family: Virtual Machine

Handle 0x2000, DMI type 32, 11 bytes
System Boot Information
	Status: No errors detected

[habib@localhost ~]$
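
When only one value from this output is needed, for example in an inventory script, it can be picked out by field name. A minimal sketch follows; the function name dmi_field is hypothetical, and it assumes the "Name: value" layout shown above.

```shell
#!/bin/sh
# Read dmidecode output on stdin and print the value of the first field
# whose name matches $1 exactly (leading indentation is ignored).
dmi_field() {
    awk -v key="$1" '{
        line = $0
        sub(/^[ \t]+/, "", line)
        i = index(line, ": ")
        if (i > 0 && substr(line, 1, i - 1) == key) {
            print substr(line, i + 2)
            exit
        }
    }'
}
```

Typical use would be: sudo dmidecode -t system | dmi_field "Product Name". For the common system fields, dmidecode can also print a single value directly, for example sudo dmidecode -s system-product-name.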