Lustre Debugging Tutorial

Lustre provides a comprehensive set of debugging tools for troubleshooting file-system issues, including an internal debugger, debug logs, configurable debug levels, buffer management, and a debug daemon. This tutorial covers Lustre 2.17.0 (January 2026) and is based on the Lustre Operations Manual (updated 2025); refer to the manual for full details. This expanded guide includes explanations for users with limited experience, best practices, warnings, and additional troubleshooting tips.

Introduction for Beginners

If you're new to Lustre debugging, understand that Lustre is a complex distributed filesystem, and issues can arise from network problems, metadata inconsistencies, or resource contention. Debugging tools help capture and analyze logs to identify root causes. Key concepts:

Best Practices: Start with minimal debug levels to avoid overwhelming logs. Reproduce issues in a test environment before enabling on production.

Warning: High debug levels can impact performance or fill disks—monitor CPU/memory/disk usage. Always disable after debugging to prevent overhead.

Internal Debugger

The Lustre kernel debug logging captures debug messages from Lustre kernel modules (e.g., mds, ost, lnet, ldlm, ptlrpc, etc.) and stores them in a circular debug buffer in kernel memory. For beginners: This is like a flight recorder for the filesystem, logging events for later analysis.

Key Features

Best Practices: Use markers (lctl mark "Start test") to bookmark logs. Sync node clocks with NTP for multi-node analysis.

Warning: Large buffers consume kernel memory—avoid exceeding available RAM to prevent OOM kills.

Message Types

Message types categorize log entries. Beginners: Start with error/warning for critical issues; add trace for detailed flows.

| Type | Description | Beginner Notes |
| --- | --- | --- |
| trace | Function entry/exit | Verbose; use for step-by-step debugging but expect large logs. |
| inode | Inode operations | Useful for file creation/deletion issues. |
| info | General non-critical info | Low overhead; good starting point. |
| warning | Significant but non-fatal issues | Alerts to potential problems. |
| error | Critical errors | Must-investigate; often with error codes. |
| emerg | Fatal conditions | System may crash; check immediately. |
| neterror | LNet/network errors | For connectivity issues. |
| rpctrace | RPC request/reply tracing | Tracks client-server communications. |
| malloc | Memory allocation tracking (used with `leak_finder.pl`) | Enable only for leak hunting; high overhead. |
| ha | Failover and recovery events | For high-availability setups. |
| quota | Space accounting | For quota-related errors. |
| sec | Security handling | For permission/ACL issues. |
| iotrace | IO path tracing | For performance bottlenecks in data paths. |

Commands

| Command | Purpose | Beginner Notes |
| --- | --- | --- |
| lctl debug_kernel FILENAME | Write buffer to FILENAME (ASCII or raw) or stdout | Use ASCII for readability; raw for tools like debug_file. |
| lctl clear | Clear kernel debug buffer | Do this before tests for clean logs. |
| lctl mark [TEXT] | Insert timestamped marker TEXT into the kernel debug log | Helps segment logs (e.g., "Before failure"). |
| lctl set_param debug=[+-]TYPE... | Enable or disable debug logging of TYPE messages | + adds, - removes; combine like +error+warning. |
| lctl set_param subsystem_debug=[+-]SUBSYS... | Enable or disable logging of SUBSYS messages | Target specific areas to reduce noise. |
| lctl debug_file INPUT OUTPUT | Convert binary INPUT debug log file dumped by kernel to text in OUTPUT file | Essential for analyzing daemon outputs. |

Reading Debug Logs

Debug logs are accessible via kernel buffer dumps and user-space tools. For beginners: Logs can be voluminous—use grep for keywords like "LustreError".
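Once a log has been decoded with `lctl debug_file`, ordinary text tools are enough for a first pass. The sketch below filters a small hand-made sample for "LustreError" and tallies hits per source file; the sample's line layout is only illustrative of decoded output, and the exact fields vary by release:

```shell
# Hypothetical sample standing in for decoded `lctl debug_file` output
# (the field layout is illustrative, not an exact reproduction).
cat > /tmp/sample_debug.txt <<'EOF'
00000400:00000001:0.0:1700000000.100000:0:1234:0:(ldlm_lock.c:200:ldlm_lock_create()) LustreError: lock timeout
00000001:00000001:0.0:1700000000.200000:0:1234:0:(mdc_request.c:50:mdc_getattr()) getattr done
00000400:00000001:0.0:1700000000.300000:0:1235:0:(ldlm_lock.c:210:ldlm_lock_match()) LustreError: no matching lock
EOF

# Count error lines, then tally them per source file.
grep -c 'LustreError' /tmp/sample_debug.txt
grep 'LustreError' /tmp/sample_debug.txt \
    | sed 's/.*(\([a-z_]*\.c\):.*/\1/' | sort | uniq -c
```

The same grep/sed/uniq pattern scales to real multi-gigabyte decoded logs.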

Access Methods

Best Practices: Dump logs immediately after issues to avoid overwrite. Compress large files (e.g., gzip).

Warning: Frequent dumps can add I/O load—schedule during low activity.

Analysis Tools

Changing Debugging Levels

Debug verbosity is controlled via global and subsystem-specific masks. Beginners: Levels range from silent (0) to very detailed (full); start low and increase as needed.

Global Debug Mask

lctl set_param debug=[+-]TYPE

Default includes warning, error, emerg, ha, config, console.

Best Practices: Use + for additive enabling; test in isolation.

Subsystem-Specific Debug

lctl set_param subsystem_debug=mds

Levels

| Level | Activation | Beginner Notes |
| --- | --- | --- |
| 0 | No messages | Silent mode for production. |
| 1 | Critical errors | Minimal logging. |
| 2 | Warnings + errors | Balance for monitoring. |
| 3+ | Detailed tracing (function calls, RPCs) | High detail; use briefly. |

Targeting Different Subsystems

Subsystems allow focused debugging. Beginners: Choose based on symptoms (e.g., mds for metadata slowness).

| Subsystem | Role | Key Debug Parameters | Categories/Masks (Examples) | Activation Examples | Beginner Notes |
| --- | --- | --- | --- | --- | --- |
| mdc (Metadata Client) | Client-side metadata ops (create, unlink, getattr) | mdc.debug, mdc_rpc.debug, mdc_request.debug | +all, +rpctrace, +trace | lctl set_param debug=+all | Start here for client file ops issues. |
| mds (Metadata Server) | Server-side metadata (layout, locks, recovery) | mds.debug, mds_request.debug, mds_reint.debug | +rpctrace, +dlmtrace, +inode | lctl set_param debug=+inode | For server-side bottlenecks. |
| osc (Object Storage Client) | Client-side I/O to OSTs | osc.debug, osc_request.debug, osc_io.debug | +iotrace, +rpctrace | lctl set_param debug=+iotrace | Data read/write problems. |
| ost (Object Storage Target) | Server-side data storage | ost.debug, ost_io.debug, ost_create.debug | +rpctrace, +dlmtrace, +inode | lctl set_param ost.debug=+rpctrace | Storage server issues. |
| ldlm (Lustre Distributed Lock Manager) | Manages distributed locks | ldlm.debug, ldlm_enqueue.debug, ldlm_cancel.debug | +dlmtrace, +rpctrace | lctl set_param debug=+dlmtrace | Lock contention or deadlocks. |
| ptlrpc (Portal RPC) | RPC communication layer | ptlrpc.debug, ptlrpc_request.debug, ptlrpc_reply.debug | +rpctrace, +neterror, +trace | lctl set_param debug=+rpctrace | Communication failures. |
| lnet (Lustre Network) | Network routing & communication | lnet.debug, lnet_ni.debug, lnet_router.debug | +neterror | lctl set_param debug=+neterror | Network-specific errors. |

View & List

lctl get_param debug
lctl debug_list types
lctl debug_list subsystems

Permanent Settings

lctl set_param -P debug=+malloc

Warning: Permanent settings (-P) apply cluster-wide—test first without -P.

Basic Debugging Settings

Maximum Debug Buffer Size

lctl set_param debug_mb=1024  # 1 GiB total (value is in MB)

Default: ~5 MB per CPU core. Buffer wraps on overflow.
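Sizing can be sketched with ordinary shell arithmetic. The 10 MB-per-core figure and the 5%-of-RAM cap below are example heuristics chosen for this sketch, not Lustre requirements:

```shell
# Example heuristic (assumption): ~10 MB of buffer per core,
# capped at 5% of RAM so the buffer cannot starve the kernel.
cores=$(nproc)
ram_mb=$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)
want=$((cores * 10))
cap=$((ram_mb / 20))
if [ "$want" -gt "$cap" ]; then want=$cap; fi
echo "suggested debug_mb=$want"
# Apply on a Lustre node:
#   lctl set_param debug_mb=$want
```

Whatever heuristic you pick, keep the warning above in mind: the buffer lives in kernel memory.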

Console Rate Limiting

Disable it by adding `options libcfs libcfs_console_ratelimit=0` to /etc/modprobe.d/lustre.conf, then reloading the Lustre modules.
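As a config sketch, the /etc/modprobe.d/lustre.conf entry is a single options line:

```conf
# /etc/modprobe.d/lustre.conf
# 0 disables console rate limiting; the default (1) suppresses repeated messages.
options libcfs libcfs_console_ratelimit=0
```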

Panic, Log Dump, and Upcall on LBUG

lctl set_param panic_on_lbug=1
lctl set_param debug_log_upcall=/path/to/script

Upcall script can automate dumps on errors.
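A minimal upcall sketch, assuming the kernel invokes the script with the path of the freshly dumped log as its first argument (that calling convention, the script path, and the archive directory are all assumptions for this example):

```shell
# Write a hypothetical upcall script; ARCHIVE defaults to an example path.
cat > /tmp/lustre_debug_upcall.sh <<'EOF'
#!/bin/sh
DUMP="$1"                               # path of the dumped debug log (assumption)
ARCHIVE="${ARCHIVE:-/var/crash/lustre}" # example archive directory
mkdir -p "$ARCHIVE"
# Timestamp the copy so repeated dumps do not overwrite each other.
cp "$DUMP" "$ARCHIVE/$(basename "$DUMP").$(date +%s)"
EOF
chmod +x /tmp/lustre_debug_upcall.sh
# Register it on a Lustre node:
#   lctl set_param debug_log_upcall=/tmp/lustre_debug_upcall.sh
```

Keeping the script tiny matters: it may run while the node is already in trouble.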

Debug Daemon

The debug daemon continuously dumps the kernel debug buffer to a file, preventing overflow and providing persistent logs. Despite the name, it runs as a kernel thread rather than a userspace process, flushing the buffer at regular intervals or on demand. For beginners: Think of it as a continuous logger to capture long sessions without losing data.

Why Use It

When to Use It

Best Practices: Run on all relevant nodes (clients/servers). Use large size limits for extended tests.

How to Set It Up

| Command | Description | Beginner Notes |
| --- | --- | --- |
| lctl debug_daemon start <filename> [MB] | Start logging to file (e.g., start /var/log/lustre.bin 40). The optional megabytes parameter limits the file size; the daemon overwrites old logs if the size is exceeded. | Use binary (.bin) for efficiency; decode later. |
| lctl debug_daemon stop | Stop and flush final buffer to file. | Always stop to ensure complete logs. |
| lctl debug_daemon dump | Manual flush without stopping. | Useful mid-test. |
| lctl debug_file <input> <output> | Decode binary log to text (e.g., lctl debug_file lustre.bin lustre.txt). | Text files are grep-friendly. |

Behavior: Runs as a kernel thread and stops automatically on shutdown. If the output file exists, it is overwritten. Use on servers and clients for distributed debugging.

Warning: Large files can fill disks—monitor with df; use rotation or limits.
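The monitoring advice can be scripted. A watchdog sketch follows; the log directory and the 90% threshold are example choices, and the actual stop command is left commented out since it only makes sense on a Lustre node:

```shell
# Check how full the filesystem holding the debug log is.
LOGDIR="${LOGDIR:-/var/log}"   # example location of the daemon's output file
used=$(df -P "$LOGDIR" | awk 'NR==2 {gsub(/%/, "", $5); print $5}')
if [ "$used" -ge 90 ]; then
    echo "WARNING: $LOGDIR is ${used}% full"
    # On a Lustre node you would stop the daemon here:
    #   lctl debug_daemon stop
fi
echo "current usage: ${used}%"
```

Run it from cron (or a loop) during long capture sessions.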

Troubleshooting the Debug Daemon

Example Workflow

# Enable debug
lctl set_param subsystem_debug=mds debug=+neterror

# Start daemon
lctl debug_daemon start /var/log/lustre_debug 1024

# Trigger issue

# Stop and decode
lctl debug_daemon stop
lctl debug_file /var/log/lustre_debug /var/log/lustre_debug.txt

# Analyze
grep "LustreError" /var/log/lustre_debug.txt
perl leak_finder.pl /var/log/lustre_debug.txt

Expanded Debug Daemon Examples

Example 1: Debugging MDS Recovery

# On MDS: Set debug for recovery
lctl set_param subsystem_debug=mds debug=+ha

# Start daemon with a 200 MB limit, writing to /tmp
lctl debug_daemon start /tmp/mds_recovery.bin 200

# Simulate recovery (e.g., unmount and remount MDT)

# Dump manually during process
lctl debug_daemon dump

# Stop after issue has been reproduced
lctl debug_daemon stop
lctl debug_file /tmp/mds_recovery.bin /tmp/mds_recovery.txt

# Analyze
grep "recovery" /tmp/mds_recovery.txt

Example 2: Multi-Node Network Debugging

# On Client: Debug LNet/RPC
lctl set_param debug=+neterror+rpctrace
lctl set_param subsystem_debug=+lnet+ptlrpc

# Start daemon
lctl debug_daemon start /tmp/client_net.bin 2048

# On Server: Mirror debug
lctl set_param debug=+neterror+rpctrace
lctl debug_daemon start /tmp/server_net.bin 2048

# Run I/O test (e.g., dd on client)

# Stop both, decode, compare timestamps

Example 3: Memory Leak Hunting

# Enable malloc tracking
lctl set_param debug=+malloc

# Start daemon
lctl debug_daemon start /tmp/memleak.bin 1024

# Run workload (e.g., create/delete files loop)

# Stop and analyze
lctl debug_daemon stop
lctl debug_file /tmp/memleak.bin /tmp/memleak.txt
perl leak_finder.pl /tmp/memleak.txt

Example 4: Long-Running Performance Debug

# Restrict debug output to I/O tracing and enlarge the buffer
lctl set_param debug=iotrace debug_mb=1024

# Start daemon with a large file-size limit for long runs
lctl debug_daemon start /var/log/perf_debug.bin 20480

# Run benchmark (e.g., IOR)

# Manual dump mid-run
lctl debug_daemon dump

# Stop at end
lctl debug_daemon stop

leak_finder.pl Usage

leak_finder.pl is a Perl script located in lustre/tests/ that analyzes debug logs captured with +malloc enabled, detecting memory leaks by matching kmalloced/kfreed pairs. It reports unpaired allocations as potential leaks, grouped by call site or function. Use it after reproducing the issue while capturing logs with malloc tracing to identify leaks in kernel modules. For beginners: Memory leaks occur when allocated memory isn't freed, leading to exhaustion over time.

Preparing for Usage

# Enable malloc tracing
lctl set_param debug=+malloc

# Generate log (e.g., via debug daemon)
lctl debug_daemon start /tmp/leak_log.bin 2048
# Run suspected leaky code
lctl debug_daemon stop
lctl debug_file /tmp/leak_log.bin /tmp/leak_log.txt

Best Practices: Run workloads that stress memory (e.g., repeated allocations). Clear buffer before starting.

Warning: +malloc adds significant overhead—use only in testing; disable in production.

Running the Script

perl /path/to/lustre/tests/leak_finder.pl /tmp/leak_log.txt

Options:

Example Output: Lists allocations without frees, with addresses, sizes, and call sites (e.g., "Unmatched kmalloc at obd_alloc: 1024 bytes").
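The pairing idea behind the script can be sketched in a few lines of awk over `+malloc` lines. The "kmalloced ... at ADDR" sample below is illustrative of decoded output, not an exact reproduction of the real log format:

```shell
# Hypothetical sample of decoded +malloc lines (format illustrative).
cat > /tmp/malloc_sample.txt <<'EOF'
kmalloced 'obd_device': 1024 at ffff000000000010
kfreed 'obd_device': 1024 at ffff000000000010
kmalloced 'lu_object': 512 at ffff000000000020
EOF

# Record each allocation by address; drop it when the matching free appears.
# Whatever survives to END is a candidate leak.
awk '
/kmalloced/ { for (i = 1; i <= NF; i++) if ($i == "at") alloc[$(i+1)] = $0 }
/kfreed/    { for (i = 1; i <= NF; i++) if ($i == "at") delete alloc[$(i+1)] }
END         { for (a in alloc) print "unmatched:", alloc[a] }
' /tmp/malloc_sample.txt
```

leak_finder.pl does the same matching with more context (sizes, call sites, grouping).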

Expanded leak_finder.pl Examples

Example 1: Basic Leak Detection

# Run script
perl leak_finder.pl /tmp/leak_log.txt

# Output example:
Unmatched kmallocs:
obd_alloc: 2048 bytes at 0x12345678 (called from function1)
lustre_inode_alloc: 1024 bytes at 0x87654321 (called from function2)

Example 2: Group by Function

perl leak_finder.pl --by-func /tmp/leak_log.txt

# Output example:
Leaks by function:
function1: 3 allocations, 12048 bytes
function2: 2 allocations, 200 bytes

Example 3: Long-Run Analysis

# Start daemon with malloc
lctl set_param debug=+malloc
lctl debug_daemon start /tmp/long_leak.bin 200

# Run extended workload

# Stop and analyze
lctl debug_daemon stop
lctl debug_file /tmp/long_leak.bin /tmp/long_leak.txt
perl leak_finder.pl --by-func /tmp/long_leak.txt

Example 4: Troubleshooting No Leaks Reported

# If no output: verify malloc tracing is enabled
lctl get_param debug  # output should include "malloc"

# Rerun workload, ensure logs capture full run
perl leak_finder.pl /tmp/leak_log.txt  # If empty, increase buffer size or use daemon

Notes: Higher levels increase overhead; disable in production.

Additional Resources and Troubleshooting

For more advanced debugging:

If logs show "LustreError: went back in time", check disk caches. For persistent issues, use LFSCK for consistency checks.