Mastering Linux System Monitoring with htop, iostat, and sar
System monitoring is crucial for maintaining healthy Linux servers and identifying performance bottlenecks before they impact your applications. This guide covers essential monitoring tools that every Linux administrator should master.
Why System Monitoring Matters
Effective monitoring helps you:
- Identify resource bottlenecks early
- Plan capacity upgrades
- Troubleshoot performance issues
- Optimize system configuration
- Maintain service availability
htop: Interactive Process Viewer
htop is an enhanced version of the traditional top command, providing a more user-friendly interface for monitoring system processes.
Installation
# Ubuntu/Debian
sudo apt install htop -y
# CentOS/RHEL/Rocky Linux
sudo yum install htop -y
# Arch Linux
sudo pacman -S htop
Basic Usage
Simply run:
htop
Understanding the htop Interface
Header Section
- CPU Usage: Shows per-core CPU utilization with color coding
- Memory: RAM and swap usage with visual bars
- Load Average: 1, 5, and 15-minute load averages
- Tasks: Number of total, running, sleeping, stopped, and zombie processes
- Uptime: System uptime information
Color Coding
- Blue: Low-priority processes
- Green: Normal user processes
- Red: Kernel processes
- Yellow: IRQ time
- Magenta: Soft IRQ time
- Cyan: Steal time (virtualized environments)
Key Commands in htop
# Navigation
↑/↓ # Navigate process list
Home/End # Jump to top/bottom
PgUp/PgDn # Page up/down
# Sorting
F6 or > # Sort by different columns
P # Sort by CPU usage
M # Sort by memory usage
T # Sort by TIME+
# Process Management
F9 or k # Kill selected process
F7/F8 # Change process priority
F4 or \ # Filter processes
F5 or t # Tree view toggle
# Display Options
F2 # Setup screen
H # Hide/show user threads
K # Hide/show kernel threads
Customizing htop
Press F2 to access setup options:
- Colors: Choose color schemes
- Display Options: Show/hide various elements
- Columns: Add/remove columns like PPID, USER, PRIORITY
- Meters: Customize header layout
htop Configuration File
htop saves configuration in ~/.config/htop/htoprc:
# View current configuration
cat ~/.config/htop/htoprc
# Reset to defaults
rm ~/.config/htop/htoprc
iostat: I/O Statistics
iostat provides detailed input/output statistics for devices and partitions, helping identify storage bottlenecks.
Installation
iostat is part of the sysstat package:
# Ubuntu/Debian
sudo apt install sysstat -y
# CentOS/RHEL/Rocky Linux
sudo yum install sysstat -y
Basic Usage
# Basic I/O statistics
iostat
# Extended statistics every 2 seconds
iostat -x 2
# Monitor specific device
iostat -x sda 1
# Show statistics since boot
iostat -x 1 1
Understanding iostat Output
Device Statistics
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
sda 1.25 2.30 45.23 123.45 0.12 0.45 8.76 16.38 2.34 4.56 0.01 36.18 53.67 1.23 0.45
Key metrics:
- r/s, w/s: Reads/writes per second
- rkB/s, wkB/s: Kilobytes read/written per second
- %util: Percentage of time device was busy
- await: Average time for I/O requests (milliseconds)
- svctm: Average service time
Useful iostat Commands
# Show extended statistics for all devices
iostat -x
# Monitor every 5 seconds, 12 times
iostat -x 5 12
# Show statistics for specific devices
iostat -x sda sdb 2
# Include timestamps
iostat -xt 1
# Show CPU and device statistics
iostat -c -d 2
Interpreting Performance Issues
High %util
- Indicates storage bottleneck
- Consider faster storage or load balancing
High await/svctm
- Slow response times
- Check for failing drives or overloaded storage
High r/s or w/s with low throughput
- Many small I/O operations
- Consider optimizing application I/O patterns
sar: System Activity Reporter
sar collects and reports system activity information, providing historical data for performance analysis.
Basic Usage
# Current CPU usage
sar
# CPU usage every 2 seconds, 5 times
sar 2 5
# Memory usage
sar -r
# I/O statistics
sar -b
# Network statistics
sar -n DEV
Common sar Options
CPU Monitoring
# Overall CPU usage
sar -u 1 10
# Per-CPU statistics
sar -P ALL 1 5
# CPU usage for specific CPU
sar -P 0 1 5
Memory Monitoring
# Memory utilization
sar -r 1 10
# Memory statistics including buffers/cache
sar -R 1 5
# Swap space utilization
sar -S 1 10
I/O Monitoring
# I/O transfer rates
sar -b 1 10
# Block device statistics
sar -d 1 10
# Specific device monitoring
sar -d -p 1 10
Network Monitoring
# Network interface statistics
sar -n DEV 1 10
# Network error statistics
sar -n EDEV 1 10
# TCP statistics
sar -n TCP 1 10
# UDP statistics
sar -n UDP 1 10
Historical Data Analysis
sar stores historical data in /var/log/sysstat/ (or /var/log/sa/):
# View yesterday's CPU data
sar -u -f /var/log/sysstat/saDD
# View specific time range
sar -r -s 10:00:00 -e 18:00:00
# Generate daily report
sar -A > daily_report.txt
Setting Up Data Collection
Enable automatic data collection by editing /etc/cron.d/sysstat:
# Collect data every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 1 1
# Generate daily reports
53 23 * * * root /usr/lib64/sa/sa2 -A
Advanced Monitoring Techniques
Combining Tools for Comprehensive Analysis
Real-time Performance Script
#!/bin/bash
# performance_monitor.sh
echo "=== System Performance Monitor ==="
echo "Date: $(date)"
echo
echo "=== CPU Usage ==="
sar -u 1 1 | tail -n 1
echo "=== Memory Usage ==="
sar -r 1 1 | tail -n 1
echo "=== I/O Statistics ==="
iostat -x 1 1 | grep -A 20 "Device"
echo "=== Top Processes ==="
ps aux --sort=-%cpu | head -6
echo "=== Load Average ==="
uptime
Performance Alert Script
#!/bin/bash
# alert_monitor.sh
CPU_THRESHOLD=80
MEM_THRESHOLD=85
DISK_THRESHOLD=90
# Check CPU usage
CPU_USAGE=$(sar -u 1 1 | tail -1 | awk '{print 100-$8}')
if (( $(echo "$CPU_USAGE > $CPU_THRESHOLD" | bc -l) )); then
echo "ALERT: High CPU usage: $CPU_USAGE%"
fi
# Check memory usage
MEM_USAGE=$(free | grep Mem | awk '{printf "%.1f", $3/$2 * 100.0}')
if (( $(echo "$MEM_USAGE > $MEM_THRESHOLD" | bc -l) )); then
echo "ALERT: High memory usage: $MEM_USAGE%"
fi
# Check disk usage
df -h | awk 'NR>1 {gsub(/%/,"",$5); if($5 > '$DISK_THRESHOLD') print "ALERT: High disk usage on " $6 ": " $5"%"}'
Creating Custom Monitoring Dashboards
Simple Web Dashboard with HTML
<!DOCTYPE html>
<html>
<head>
<title>System Monitor</title>
<meta http-equiv="refresh" content="30">
</head>
<body>
<h1>System Performance Dashboard</h1>
<pre id="stats">
<!-- Auto-refreshed system stats -->
</pre>
<script>
// Add JavaScript for real-time updates
</script>
</body>
</html>
Best Practices
1. Regular Monitoring Schedule
- Check system performance daily
- Review weekly trends
- Analyze monthly patterns
- Plan capacity based on historical data
2. Set Up Alerts
- Configure thresholds for critical metrics
- Use email or SMS notifications
- Implement escalation procedures
- Document response procedures
3. Baseline Performance
- Record normal operating parameters
- Document seasonal variations
- Track performance after changes
- Maintain performance history
4. Tool Selection Strategy
- Use htop for real-time process monitoring
- Use iostat for storage performance analysis
- Use sar for historical trend analysis
- Combine tools for comprehensive monitoring
Troubleshooting Common Issues
High CPU Usage
# Identify CPU-intensive processes
htop (sort by CPU)
ps aux --sort=-%cpu | head -10
# Check for runaway processes
sar -u 1 10
Memory Issues
# Check memory usage patterns
sar -r 1 10
free -h
# Identify memory-hungry processes
htop (sort by memory)
ps aux --sort=-%mem | head -10
I/O Bottlenecks
# Monitor I/O performance
iostat -x 1 10
iotop (if available)
# Check specific device performance
iostat -x sda 1 10
Conclusion
Mastering these monitoring tools—htop, iostat, and sar—provides you with comprehensive visibility into your Linux system's performance. Regular monitoring helps prevent issues, optimize resource usage, and maintain system reliability.
Key takeaways:
- Use htop for interactive process monitoring
- Use iostat to identify storage bottlenecks
- Use sar for historical performance analysis
- Combine tools for complete system visibility
- Set up automated monitoring and alerting
Remember that monitoring is most effective when it's consistent and proactive. Establish baselines, set up alerts, and regularly review performance trends to maintain optimal system health.
