Introduction: Why Log Analysis Is a Core Skill for System Administrators

When operating Linux servers, the first thing you check when an incident occurs is the logs. Logs are essentially the system's "black box," recording what happened, when, and how. Whether you're tracing the cause of a security breach, debugging a service outage, or identifying performance bottlenecks, logs are always the starting point.

In real production environments, log files can range from several GB to tens of GB, and it's not uncommon for millions of lines to accumulate in a single day. The ability to quickly and accurately extract the information you need from this massive volume of data is a core competency for system administrators and DevOps engineers.

This guide systematically covers the essential commands for Linux log analysis. From basic commands to advanced one-liners, it is organized around practical examples you can copy and use in production right away, with step-by-step explanations for beginners through advanced practitioners.

1. Linux Log System Architecture and Key Files

1.1 Log System Architecture

The Linux logging system is broadly divided into two frameworks:

  • Traditional syslog family: Tools like rsyslog and syslog-ng write logs as text files, stored under the /var/log/ directory.
  • systemd-journald: On systemd-based systems, logs are managed in binary format and queried using the journalctl command.

Most modern distributions (RHEL 8+, Ubuntu 20.04+, Debian 11+) run both systems in parallel. journald collects logs first, and rsyslog also saves them as text files.
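A quick way to confirm which of the two stacks a given host is actually running (a sketch; the unit names and paths below are common defaults and may differ on your distribution):

```shell
# Probe for each logging stack; these paths are typical defaults, not guarantees.
if command -v journalctl >/dev/null 2>&1; then
    echo "journalctl present (systemd-journald)"
fi
if [ -d /run/log/journal ] || [ -d /var/log/journal ]; then
    echo "journald journal directory found"
fi
if [ -e /var/log/syslog ] || [ -e /var/log/messages ]; then
    echo "text logs found (rsyslog/syslog-ng style)"
fi
```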

1.2 Essential Log Files You Must Know

Log File                    Contents                                       Primary Use
/var/log/syslog             General system messages (Debian/Ubuntu)        General troubleshooting
/var/log/messages           General system messages (RHEL/CentOS)          General troubleshooting
/var/log/auth.log           Authentication/login events (Debian/Ubuntu)    Security analysis, intrusion detection
/var/log/secure             Authentication/login events (RHEL/CentOS)      Security analysis, intrusion detection
/var/log/kern.log           Kernel messages                                Hardware, driver issues
/var/log/dmesg              Boot-time kernel messages                      Boot problems, hardware detection
/var/log/cron               Cron job execution records                     Scheduled task debugging
/var/log/maillog            Mail server logs                               Mail send/receive tracking
/var/log/nginx/             Nginx access/error logs                        Web traffic analysis
/var/log/apache2/           Apache access/error logs                       Web traffic analysis
/var/log/mysql/             MySQL/MariaDB logs                             DB query, error analysis
/var/log/audit/audit.log    SELinux/AppArmor audit logs                    Security auditing, policy violations
/var/log/boot.log           Service boot logs                              Service startup failure analysis
/var/log/lastlog            Last login records (binary)                    Query with lastlog command
/var/log/wtmp               Login/logout history (binary)                  Query with last command
/var/log/btmp               Failed login attempts (binary)                 Query with lastb command

TIP: Log file locations may vary depending on the distribution. Make it a habit to first check what log files exist by running ls -la /var/log/.

1.3 Understanding Log Rotation

Log files are periodically rotated by logrotate. Previous logs are archived with .1, .2, or .gz extensions:

# Log rotation example
/var/log/syslog          # Current log
/var/log/syslog.1        # Previous log (uncompressed)
/var/log/syslog.2.gz     # Older log (compressed)
/var/log/syslog.3.gz     # Even older log (compressed)

# Use zgrep, zcat to search compressed log files
zgrep "error" /var/log/syslog.2.gz
zcat /var/log/syslog.3.gz | grep "kernel"

# Check logrotate configuration
cat /etc/logrotate.conf
ls /etc/logrotate.d/

2. Basic Log Viewing Commands

2.1 tail - The Starting Point for Real-Time Log Monitoring

tail is the most fundamental and most frequently used command for log analysis. The -f option in particular is the cornerstone of real-time monitoring.

# View the last 10 lines (default)
tail /var/log/syslog

# View the last 50 lines
tail -n 50 /var/log/syslog
tail -50 /var/log/syslog          # Short form

# Real-time log monitoring (follow)
tail -f /var/log/syslog

# Real-time monitoring starting from the last 100 lines
tail -n 100 -f /var/log/syslog
tail -100f /var/log/syslog        # Short form

# Monitor multiple files simultaneously
tail -f /var/log/syslog /var/log/auth.log

# Continue tracking even after file rotation
tail -F /var/log/syslog            # Uppercase -F: reopens the file

# Real-time filtering for specific keywords
tail -f /var/log/syslog | grep --line-buffered "error"
tail -f /var/log/syslog | grep --line-buffered -i "fail\|error\|warn"
Warning: When piping tail -f to grep, use the --line-buffered option. grep block-buffers its output when writing to a pipe rather than a terminal, so without this flag matches can appear only after a long delay.

2.2 head - Viewing the Beginning of a File

# View the first 10 lines (default)
head /var/log/syslog

# View the first 30 lines
head -n 30 /var/log/syslog

# View everything except the last 10 lines
head -n -10 /var/log/syslog

2.3 cat, less, more - Viewing Entire Files

# Print entire file (suitable only for small files)
cat /var/log/cron

# Print with line numbers
cat -n /var/log/cron

# Browse page by page (most recommended)
less /var/log/syslog
# Keyboard shortcuts inside less:
#   / : Search forward
#   ? : Search backward
#   n : Next search result
#   N : Previous search result
#   G : Go to end of file
#   g : Go to beginning of file
#   q : Quit

# Real-time monitoring in less (similar to tail -f)
less +F /var/log/syslog
# Ctrl+C to switch to browse mode, Shift+F to resume follow mode
TIP: For large log files, always use less instead of cat. cat dumps the entire file to the terminal at once, flooding your scrollback, while less reads the file on demand and handles multi-GB files without strain.

2.4 wc - Quick Log Statistics

# Check total line count
wc -l /var/log/syslog

# Quickly count how many errors occurred today
grep -c "error" /var/log/syslog
grep -ci "error\|fail\|critical" /var/log/syslog

# Compare line counts across multiple log files
wc -l /var/log/syslog /var/log/auth.log /var/log/kern.log

3. grep - The Essential Weapon for Log Searching

It's no exaggeration to say that grep is the single most important command for log analysis. It quickly extracts the information you need from massive logs through string pattern matching.

3.1 Basic Search

# Search for a specific string
grep "error" /var/log/syslog

# Case-insensitive search
grep -i "error" /var/log/syslog

# Show line numbers
grep -n "error" /var/log/syslog

# Show only the match count
grep -c "error" /var/log/syslog

# Show only lines that do NOT match (inverse matching)
grep -v "info" /var/log/syslog

# Exact word matching (matches "error" but not "errorlog")
grep -w "error" /var/log/syslog

3.2 Advanced Search with Regular Expressions

# Use extended regular expressions (-E or egrep)
grep -E "error|fail|critical" /var/log/syslog

# Search for IP address patterns
grep -E "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" /var/log/auth.log

# Filter logs for a specific time window (e.g., 3 AM)
grep "^Mar  7 03:" /var/log/syslog

# Date + time range (2 AM to 5 AM)
grep -E "^Mar  7 0[2-5]:" /var/log/syslog

# Extract failed SSH login attempts
grep "Failed password" /var/log/auth.log

# Track sudo commands by a specific user
grep "sudo.*username" /var/log/auth.log

3.3 Context Search - Understanding What Happened Before and After an Error

# Show 5 lines after the match (After)
grep -A 5 "kernel panic" /var/log/kern.log

# Show 3 lines before the match (Before)
grep -B 3 "OOM" /var/log/syslog

# Show 5 lines before and after (Context)
grep -C 5 "segfault" /var/log/syslog

# Combined conditions: error AND nginx-related
grep "error" /var/log/syslog | grep "nginx"

# Search across multiple files simultaneously
grep -r "connection refused" /var/log/
grep -rl "connection refused" /var/log/    # Output filenames only

3.4 Searching Compressed Logs

# Search directly in gz compressed files
zgrep "error" /var/log/syslog.2.gz

# View compressed file contents
zcat /var/log/syslog.2.gz | less

# Search both current and archived logs
zgrep "error" /var/log/syslog /var/log/syslog.1 /var/log/syslog.*.gz

4. awk - The Powerhouse for Log Data Extraction and Processing

awk is a tool optimized for splitting and processing text by fields. It is extremely powerful when extracting specific columns from logs or performing conditional aggregation.

4.1 Basic Field Extraction

# syslog format: date hostname process: message
# $1=month, $2=day, $3=time, $4=hostname, $5=process

# Extract only the time and process fields
awk '{print $3, $5}' /var/log/syslog | tail -20

# Print specific fields (Nginx access log example)
# Format: IP - - [date] "request" status_code size "referer" "UA"
awk '{print $1, $9}' /var/log/nginx/access.log    # IP and status code

# Process tab-delimited files
awk -F'\t' '{print $1, $3}' logfile.tsv

# Comma-delimited (CSV)
awk -F',' '{print $1, $NF}' logfile.csv    # First and last fields

4.2 Conditional Filtering

# Extract only HTTP 500 errors (Nginx/Apache)
awk '$9 == 500' /var/log/nginx/access.log

# All 500-series errors
awk '$9 >= 500 && $9 < 600' /var/log/nginx/access.log

# Only responses larger than 1MB
awk '$10 > 1048576' /var/log/nginx/access.log

# Only requests from a specific IP
awk '$1 == "192.168.1.100"' /var/log/nginx/access.log

# Lines containing a specific string (similar to grep)
awk '/error/' /var/log/syslog
awk '/error/ && /nginx/' /var/log/syslog    # AND condition
awk '/error/ || /warn/' /var/log/syslog     # OR condition

4.3 Aggregation and Statistics

# Count requests by IP (Top accessing IPs)
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -20

# IP counting using awk alone
awk '{ip[$1]++} END {for (i in ip) print ip[i], i}' /var/log/nginx/access.log | sort -rn | head -20

# Count by status code
awk '{code[$9]++} END {for (c in code) print c, code[c]}' /var/log/nginx/access.log | sort -k2 -rn

# Request count by hour
awk '{split($4, a, ":"); hour=a[2]; h[hour]++} END {for (i in h) print i, h[i]}' /var/log/nginx/access.log | sort

# Total transferred bytes
awk '{sum += $10} END {printf "Total: %.2f GB\n", sum/1024/1024/1024}' /var/log/nginx/access.log

# Request count by URL path (regardless of GET/POST)
awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -30

4.4 Practical One-Liners

# Find slow requests (assuming log format includes response time)
# If $request_time is the last field in log_format:
awk '{if ($NF > 3.0) print $0}' /var/log/nginx/access.log

# Calculate requests per minute (key is day/month hour:minute)
awk '{split($4,a,"[:/]"); min=substr(a[1],2)"/"a[2]" "a[4]":"a[5]; m[min]++} END {for (i in m) print i, m[i]}' /var/log/nginx/access.log | sort | tail -20

# Calculate error rate
awk '{total++; if ($9 >= 400) errors++} END {printf "Total: %d, Errors: %d, Rate: %.2f%%\n", total, errors, (errors/total)*100}' /var/log/nginx/access.log

5. sed - Log Text Transformation and Extraction

sed is a stream editor useful for transforming log data or extracting specific ranges.

5.1 Basic Pattern Substitution and Extraction

# Substitute a specific string (for display)
sed 's/error/ERROR/gi' /var/log/syslog | head

# Print only a specific range of lines
sed -n '100,200p' /var/log/syslog          # Lines 100 through 200

# Extract content between specific patterns
sed -n '/Start of backup/,/End of backup/p' /var/log/syslog

# Extract a specific time range
sed -n '/Mar  7 14:00/,/Mar  7 15:00/p' /var/log/syslog

# Remove blank lines
sed '/^$/d' /var/log/application.log

# Remove comment (#) lines
sed '/^#/d' /etc/rsyslog.conf

# Mask IP addresses (for security reports)
sed -E 's/([0-9]+\.[0-9]+\.[0-9]+\.)[0-9]+/\1***/g' /var/log/auth.log

5.2 Multiple Commands and Advanced Usage

# Multiple substitutions at once
sed -e 's/error/ERROR/gi' -e 's/warning/WARNING/gi' /var/log/syslog

# Delete specific lines (output everything except lines 1-5)
sed '1,5d' /var/log/syslog

# Date format conversion example
echo "2026/03/07 14:30:00" | sed 's|\([0-9]*\)/\([0-9]*\)/\([0-9]*\)|\1-\2-\3|'

# Print only lines containing a specific pattern (similar to grep)
sed -n '/OOM/p' /var/log/syslog

6. sort, uniq, cut - Sorting and Aggregating Log Data

These three commands are useful on their own, but they truly shine when combined through pipelines.

6.1 sort - Sorting

# Basic sort (alphabetical)
sort /var/log/auth.log

# Reverse sort
sort -r /var/log/auth.log

# Numeric sort
sort -n data.log

# Sort by a specific field (3rd field, numeric descending)
sort -t' ' -k3 -rn access.log

# Sort by date
sort -k1,1M -k2,2n /var/log/syslog    # Month + day (numeric)

6.2 uniq - Deduplication and Counting

# Remove duplicates (must be used with sort!)
sort /var/log/auth.log | uniq

# Show duplicate counts
sort /var/log/auth.log | uniq -c

# Re-sort by duplicate count (frequency order)
sort /var/log/auth.log | uniq -c | sort -rn

# Show only duplicated lines
sort /var/log/auth.log | uniq -d

# Show only unique lines (appearing only once)
sort /var/log/auth.log | uniq -u
Warning: uniq only removes consecutive duplicates. You must always run sort first before using uniq to get correct results.
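You can see this behavior with a tiny inline stream:

```shell
# uniq alone collapses only adjacent duplicates; the second "a" survives:
printf 'a\nb\na\n' | uniq            # prints 3 lines: a, b, a

# sort first so duplicates become adjacent, then uniq deduplicates fully:
printf 'a\nb\na\n' | sort | uniq     # prints 2 lines: a, b
```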

6.3 cut - Extracting Fields

# Extract fields by delimiter
cut -d' ' -f1 /var/log/nginx/access.log        # Extract IP only
cut -d':' -f1 /etc/passwd                       # Usernames only

# Extract by character position
cut -c1-15 /var/log/syslog                      # First 15 chars (date/time)

# Extract multiple fields simultaneously
cut -d' ' -f1,4,7 /var/log/nginx/access.log    # 1st, 4th, 7th fields

6.4 Combined Pipeline Practical Examples

# Top 10 IPs with failed SSH logins
grep "Failed password" /var/log/auth.log | awk '{print $(NF-3)}' | sort | uniq -c | sort -rn | head -10

# Error count by hour
grep -i "error" /var/log/syslog | awk '{print $3}' | cut -d: -f1 | sort | uniq -c | sort -rn

# Top 20 Nginx request URLs
awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -20

# Count of unique IPs
awk '{print $1}' /var/log/nginx/access.log | sort -u | wc -l

# Log line count trend by date
awk '{print $1, $2}' /var/log/syslog | sort | uniq -c

7. journalctl - Everything About systemd Log Analysis

On systemd-based systems, journalctl is the most powerful log querying tool. It queries binary journals in a structured format and natively supports various filtering options.

7.1 Basic Queries

# View entire journal
journalctl

# View in reverse chronological order (newest first)
journalctl -r

# View only the last 50 lines
journalctl -n 50

# Real-time monitoring (same as tail -f)
journalctl -f

# Logs from a specific boot session
journalctl -b              # Current boot
journalctl -b -1           # Previous boot
journalctl --list-boots    # List of boot sessions

# Output without pager (useful for scripts)
journalctl --no-pager

7.2 Time-Based Filtering

# Since a specific date
journalctl --since "2026-03-07"

# Specific time range
journalctl --since "2026-03-07 14:00" --until "2026-03-07 16:00"

# Relative time
journalctl --since "1 hour ago"
journalctl --since "30 min ago"
journalctl --since "2 days ago"
journalctl --since yesterday
journalctl --since today

7.3 Service/Unit-Based Filtering

# View logs for a specific service
journalctl -u nginx.service
journalctl -u sshd.service
journalctl -u mysql.service

# Query multiple services simultaneously
journalctl -u nginx.service -u php-fpm.service

# Real-time monitoring of a specific service
journalctl -u nginx.service -f

# Only recent errors for a specific service
journalctl -u nginx.service -p err -n 50

7.4 Priority (Severity) Filtering

# Filter by severity level
# 0=emerg, 1=alert, 2=crit, 3=err, 4=warning, 5=notice, 6=info, 7=debug

journalctl -p emerg          # Emergency (system is unusable)
journalctl -p alert          # Immediate action required
journalctl -p crit           # Critical condition
journalctl -p err            # Error (most commonly used)
journalctl -p warning        # Warning

# Specify a range (only severe logs from emerg through crit)
journalctl -p emerg..crit

# Combine service + severity + time
journalctl -u sshd.service -p err --since "1 hour ago"

7.5 Output Format Control

# JSON format output (useful for parsing)
journalctl -u nginx.service -o json-pretty -n 5

# Concise single-line format
journalctl -o short-precise

# Available output formats
# short         : Default syslog style
# short-precise : Microsecond precision timestamps
# short-iso     : ISO 8601 format timestamps
# verbose       : Show all fields
# json          : JSON format (single line)
# json-pretty   : JSON format (indented)
# cat           : Message only (no timestamps)

# Filter by specific fields
journalctl _COMM=sshd               # Process name
journalctl _PID=1234                 # PID
journalctl _UID=0                    # Root user
journalctl _HOSTNAME=webserver01     # Hostname

7.6 Kernel Logs and Disk Usage

# Kernel messages only (similar to dmesg)
journalctl -k
journalctl --dmesg

# Check journal disk usage
journalctl --disk-usage

# Clean up journal (delete old logs)
sudo journalctl --vacuum-time=7d     # Delete entries older than 7 days
sudo journalctl --vacuum-size=500M   # Shrink archived journals to at most 500MB

8. Real-World Scenario-Based Log Analysis

8.1 SSH Brute Force Attack Analysis

# Check failed SSH login attempts
grep "Failed password" /var/log/auth.log | tail -20

# Extract attack IPs and sort by frequency
grep "Failed password" /var/log/auth.log | \
  awk '{print $(NF-3)}' | sort | uniq -c | sort -rn | head -20

# Analyze attempt times for a specific IP
grep "Failed password" /var/log/auth.log | \
  grep "192.168.1.100" | awk '{print $1, $2, $3}'

# Check successful logins
grep "Accepted" /var/log/auth.log | tail -20

# Attacks using invalid usernames
grep "Invalid user" /var/log/auth.log | \
  awk '{print $8}' | sort | uniq -c | sort -rn | head -20

# Attack frequency by hour
grep "Failed password" /var/log/auth.log | \
  awk '{print $3}' | cut -d: -f1 | sort | uniq -c | sort -rn

# SSH analysis with journalctl
journalctl -u sshd.service -p warning --since "24 hours ago" --no-pager

8.2 Web Server Troubleshooting (Nginx/Apache)

# Identify when 500 errors occurred
grep '" 500 ' /var/log/nginx/access.log | tail -30

# Calculate 5xx error rate
awk '{total++; if ($9 ~ /^5/) err++} END {printf "Total: %d, 5xx: %d (%.2f%%)\n", total, err, err/total*100}' \
  /var/log/nginx/access.log

# Analyze URL patterns causing errors
grep '" 500 ' /var/log/nginx/access.log | \
  awk '{print $7}' | sort | uniq -c | sort -rn | head -20

# Analyze Nginx error log
grep "error" /var/log/nginx/error.log | tail -30

# Check upstream timeouts
grep "upstream timed out" /var/log/nginx/error.log | wc -l

# Calculate requests per second (RPS)
awk '{split($4,a,"[:/]"); sec=a[4]":"a[5]":"a[6]; s[sec]++} END {for (i in s) print i, s[i]}' \
  /var/log/nginx/access.log | sort | tail -20

# Slow requests during a specific time window (request_time is the last field)
awk '$NF > 5.0 {print $4, $7, $NF}' /var/log/nginx/access.log | tail -20

8.3 Disk I/O and OOM (Out of Memory) Analysis

# Check OOM Killer process termination history
grep -i "oom" /var/log/syslog
grep "Out of memory" /var/log/syslog
dmesg | grep -i "oom"

# List processes killed by OOM
grep "Killed process" /var/log/syslog | \
  awk -F'[()]' '{print $2}' | sort | uniq -c | sort -rn

# Check OOM with journalctl
journalctl -k | grep -i "oom"
journalctl -k | grep "Out of memory"

# Disk-related errors
grep -i "I/O error" /var/log/syslog
grep -i "ext4.*error" /var/log/syslog
dmesg | grep -i "error"

# Detect filesystem remount to read-only
grep "Remounting filesystem read-only" /var/log/syslog

8.4 Service Start/Stop/Crash Tracking

# Service start/stop history
journalctl -u nginx.service | grep -i "started\|stopped\|failed"

# Analyze service failure causes
systemctl status nginx.service
journalctl -u nginx.service --since "10 min ago" -p err

# Analyze service restart frequency
journalctl -u mysql.service | grep -c "Started"

# Check core dumps (journald logs the phrase "dumped core")
journalctl | grep -i "dumped core"
coredumpctl list                    # When using systemd-coredump

# List failed services at boot
systemctl --failed

8.5 Cron Job Debugging

# Check cron execution history
grep CRON /var/log/syslog
grep CRON /var/log/cron                  # RHEL/CentOS

# Check cron execution for a specific user
grep "CRON.*username" /var/log/syslog

# Check cron errors only
grep CRON /var/log/syslog | grep -i "error\|fail"

# Check cron with journalctl (the unit is crond.service on RHEL/CentOS)
journalctl -u cron.service --since today

# Check cron execution of a specific script
grep "backup.sh" /var/log/syslog

9. Advanced Analysis Techniques and One-Liner Collection

9.1 Multi-File Analysis with xargs and find

# Search for errors in log files modified within the last 24 hours
find /var/log -name "*.log" -mtime -1 -exec grep -l "error" {} \;

# Search for a specific IP across all log files
find /var/log -type f -name "*.log" | xargs grep -l "192.168.1.100" 2>/dev/null

# Find log files above a certain size
find /var/log -type f -size +100M -exec ls -lh {} \;

# Review old log files (older than 30 days) before cleaning them up
find /var/log -name "*.gz" -mtime +30 -exec ls -la {} \;

9.2 Line-by-Line Processing with while read Loops

# Process failed IPs one by one (e.g., generate a blocklist)
grep "Failed password" /var/log/auth.log | \
  awk '{print $(NF-3)}' | sort -u | \
  while read -r ip; do
    count=$(grep -c "$ip" /var/log/auth.log)
    echo "$ip: $count attempts"
  done

# Alert when a specific error pattern occurs
tail -f /var/log/syslog | while IFS= read -r line; do
    echo "$line" | grep -q "CRITICAL" && echo "ALERT: $line" >> /tmp/alerts.log
done

9.3 Time-Based Analysis with the date Command

# Extract only logs from the last hour (syslog format)
# Note: this is a lexicographic string comparison, so it is only reliable within a single month
SINCE=$(date -d '1 hour ago' '+%b %e %H')
awk -v since="$SINCE" '$0 >= since' /var/log/syslog

# Search for yesterday's logs
YESTERDAY=$(date -d yesterday '+%b %e')
grep "^$YESTERDAY" /var/log/syslog

# Request count trend per minute (last hour)
awk '{print substr($4,14,5)}' /var/log/nginx/access.log | sort | uniq -c | tail -60

9.4 Best Practical One-Liner Favorites

# === Security Analysis ===

# 1. Top 10 SSH brute force attack IPs + frequency analysis
grep "Failed password" /var/log/auth.log | awk '{print $(NF-3)}' | sort | uniq -c | sort -rn | head -10

# 2. Currently active SSH sessions
who | grep pts
ss -tnp | grep :22

# 3. sudo command usage history
grep "COMMAND=" /var/log/auth.log | grep -o 'COMMAND=.*' | sort | uniq -c | sort -rn | head -20

# 4. Connection attempts on unusual ports
grep "refused connect" /var/log/syslog | awk '{print $NF}' | sort | uniq -c | sort -rn


# === Performance Analysis ===

# 5. Nginx response code distribution
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn

# 6. Traffic by hour
awk '{split($4,a,":"); print a[2]":00"}' /var/log/nginx/access.log | sort | uniq -c

# 7. URLs generating the largest responses
awk '{print $10, $7}' /var/log/nginx/access.log | sort -rn | head -20

# 8. Calculate bandwidth usage (daily)
awk '{sum+=$10} END {printf "%.2f GB\n", sum/1024/1024/1024}' /var/log/nginx/access.log


# === Troubleshooting ===

# 9. Comprehensive recent error log (deduplicated, sorted by frequency)
grep -i "error\|fail\|critical\|fatal" /var/log/syslog | \
  sed 's/^.*\] //' | sort | uniq -c | sort -rn | head -30

# 10. Service restart frequency
journalctl --since "7 days ago" | grep "Started" | \
  awk '{for(i=5;i<=NF;i++) printf "%s ", $i; print ""}' | sort | uniq -c | sort -rn | head -20

# 11. Check for kernel panics/OOPS
journalctl -k -p crit --since "7 days ago"
dmesg -T | grep -i "panic\|oops\|bug\|error"

# 12. Disk full warning related logs
grep -i "no space\|disk full\|ENOSPC" /var/log/syslog

10. Log Monitoring Automation

10.1 Simple Log Monitoring Script

#!/bin/bash
# log_monitor.sh - Log monitoring and alert script

LOG_FILE="/var/log/syslog"
ALERT_PATTERNS="error|critical|fatal|panic|OOM|segfault"
ALERT_LOG="/var/log/alert_monitor.log"
CHECK_INTERVAL=60

echo "[$(date)] Log monitor started for $LOG_FILE" | tee -a "$ALERT_LOG"

tail -Fn0 "$LOG_FILE" | while IFS= read -r line; do
    if echo "$line" | grep -qiE "$ALERT_PATTERNS"; then
        TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
        echo "[$TIMESTAMP] ALERT: $line" | tee -a "$ALERT_LOG"

        # Add notifications here (email, Slack, etc.)
        # echo "$line" | mail -s "Log Alert" admin@example.com
    fi
done

10.2 Daily Reports with logwatch

# Install logwatch
sudo apt install logwatch          # Debian/Ubuntu
sudo yum install logwatch          # RHEL/CentOS

# Generate a daily report
logwatch --detail High --range today --output stdout

# Send via email
logwatch --detail High --range yesterday --mailto admin@example.com --format html

# Analyze a specific service only
logwatch --service sshd --detail High --range today

10.3 multitail - Real-Time Multi-Log Monitoring

# Install multitail
sudo apt install multitail

# Monitor multiple logs simultaneously (split screen)
multitail /var/log/syslog /var/log/auth.log

# Color highlighting + filtering
multitail -ci green /var/log/nginx/access.log -ci red /var/log/nginx/error.log

# Apply color schemes and filters
multitail -e "error" /var/log/syslog -e "Failed" /var/log/auth.log

10.4 lnav - Structured Log Analyzer

# Install lnav
sudo apt install lnav

# Basic usage (auto format detection)
lnav /var/log/syslog

# Analyze multiple files simultaneously
lnav /var/log/syslog /var/log/auth.log

# Use SQL queries inside lnav
# ;SELECT log_time, log_body FROM syslog_log WHERE log_body LIKE '%error%' LIMIT 20

# Compressed files are also supported automatically
lnav /var/log/syslog.*.gz
TIP: lnav is a highly powerful terminal-based log analyzer that supports automatic log format detection, color-coded display, timeline views, histograms, and SQL queries. It is the best choice when you need to analyze logs extensively in the terminal.

11. Log Analysis Best Practices

11.1 Efficient Log Analysis Habits

  • Start with the big picture: Assess the scale with wc -l, check error frequency with grep -c, then proceed with detailed analysis.
  • Narrow down the time range: Narrowing the analysis window around the incident timestamp helps reduce unnecessary noise.
  • Build pipelines incrementally: Don't write a complex one-liner all at once. Add pipes one at a time and verify the results at each step.
  • Leverage aliases: Registering frequently used log analysis commands as aliases in ~/.bashrc boosts efficiency.
  • Save intermediate results with tee: Use command | tee /tmp/result.txt | next_command to save intermediate results to a file while continuing the pipeline.
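The "build pipelines incrementally" and tee habits can be sketched on a throwaway sample (the file paths here are hypothetical):

```shell
# Create a tiny sample access log (hypothetical data, Nginx combined format).
printf '%s\n' \
  '203.0.113.5 - - [07/Mar/2026:14:00:01 +0000] "GET / HTTP/1.1" 200 512' \
  '203.0.113.5 - - [07/Mar/2026:14:00:02 +0000] "GET /a HTTP/1.1" 500 99' \
  '198.51.100.7 - - [07/Mar/2026:14:00:03 +0000] "GET / HTTP/1.1" 200 512' \
  > /tmp/sample_access.log

# Step 1: extract one field and eyeball it before adding more stages.
awk '{print $1}' /tmp/sample_access.log

# Step 2: add aggregation, keeping the intermediate result with tee.
awk '{print $1}' /tmp/sample_access.log | sort | tee /tmp/ips_sorted.txt | uniq -c | sort -rn
```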

11.2 Useful Bash Alias Configuration

# Log analysis aliases to add to ~/.bashrc
alias syslog='tail -f /var/log/syslog'
alias authlog='tail -f /var/log/auth.log'
alias nginxlog='tail -f /var/log/nginx/access.log'
alias nginxerr='tail -f /var/log/nginx/error.log'

alias logerror='grep -i "error\|fail\|critical" /var/log/syslog | tail -50'
alias sshfail='grep "Failed password" /var/log/auth.log | awk '\''{print $(NF-3)}'\'' | sort | uniq -c | sort -rn | head -20'
alias topip='awk '\''{print $1}'\'' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -20'

alias jlog='journalctl -f'
alias jerr='journalctl -p err --since "1 hour ago" --no-pager'

11.3 Security Considerations

  • Log file permission management: Log files may contain sensitive information (IPs, usernames, query parameters), so set appropriate permissions (640 or lower).
  • Log integrity protection: Intruders may delete or tamper with logs, so for critical servers, use remote log collection (rsyslog remote forwarding, ELK, Loki, etc.) in parallel.
  • Log retention policies: Configure logrotate and retention periods appropriately, considering both legal requirements and disk capacity.
  • Privacy considerations: Personal information contained in logs (IPs, emails, etc.) must be managed properly in accordance with applicable privacy regulations.
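For the permissions point above, a minimal sketch, demonstrated on a scratch file rather than a real log. On a real server you would apply the same mode to files under /var/log as root, and the owning group (adm on Debian/Ubuntu) varies by distribution.

```shell
# Hypothetical scratch file standing in for a real log such as /var/log/auth.log.
touch /tmp/demo_auth.log
chmod 640 /tmp/demo_auth.log     # rw- for owner, r-- for group, nothing for others
stat -c '%a' /tmp/demo_auth.log  # prints: 640
```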

12. Command Quick Reference Cheat Sheet

Purpose                          Command
Real-time monitoring             tail -f /var/log/syslog
Real-time keyword filter         tail -f /var/log/syslog | grep --line-buffered "error"
Error count                      grep -ci "error" /var/log/syslog
Error context                    grep -C 5 "error" /var/log/syslog
Access count by IP               awk '{print $1}' access.log | sort | uniq -c | sort -rn
HTTP status code distribution    awk '{print $9}' access.log | sort | uniq -c | sort -rn
SSH attack IPs                   grep "Failed password" auth.log | awk '{print $(NF-3)}' | sort | uniq -c | sort -rn
Errors by hour                   grep "error" syslog | awk '{print $3}' | cut -d: -f1 | sort | uniq -c
Service logs                     journalctl -u nginx.service -p err --since "1 hour ago"
Kernel errors                    journalctl -k -p err
OOM check                        dmesg -T | grep -i "oom\|killed process"
Disk errors                      grep -i "I/O error\|read-only" /var/log/syslog
Search compressed logs           zgrep "error" /var/log/syslog.*.gz
Log line count                   wc -l /var/log/syslog
Journal disk usage               journalctl --disk-usage

Conclusion: Practical Instinct Is the Key to Log Analysis

Linux log analysis requires more than just knowing the tools and commands. What matters most is the practical instinct for knowing which log file to check first during an actual incident, what patterns to filter by, and how to interpret the results.

Try running the commands covered in this guide directly on your own server environment. The combination of grep and awk, the various filtering options of journalctl, and the sort | uniq -c | sort -rn pipeline will become powerful weapons that enable you to respond quickly in any log analysis situation once you get comfortable with them.

Here are the key principles to remember:

  • Always start by narrowing the time range - Don't search through the entire log; focus on the window around the incident.
  • Build pipelines incrementally - Verify results at each step as you add commands.
  • Save frequently used commands as aliases - In an emergency, typing an alias is faster than trying to recall the full command.
  • Use remote log collection in parallel - Local logs alone cannot guarantee security and availability.
  • Review logs regularly - Don't wait until an incident to look at logs; make it a habit to monitor log trends proactively.

Log analysis proficiency is an essential skill for system administrators, DevOps engineers, SREs, and security engineers alike. Start running the commands you learned today in your terminal and build your own analysis routine.