Progressive File Layouts (PFL)
Progressive File Layouts (PFL) is a Lustre feature introduced in version 2.10. It provides flexible, scalable file striping through composite layouts: a file is composed of multiple sub-layouts (components), each covering its own extent and striped across Object Storage Targets (OSTs) with its own stripe pattern. Layouts can be extended, migrated, and managed without recreating the file or migrating all of its data. This guide is based on the Lustre Operations Manual (updated 2025) for Lustre 2.17.0 (January 2026).
Core Concepts
| Concept | Description |
| --- | --- |
| Composite Layouts | Files can consist of multiple components (sub-files), each with its own striping (stripe size, count, OST pool, index). Components cover non-overlapping, adjacent extents starting at offset 0. |
| Dynamic Extension | PFL allows automatic striping extension to different OSTs when a file component is written for the first time. Unused components do not initialize their OST objects, reducing access overhead and OST inode usage. |
| Self-Extending Layout (SEL) | Introduced in Lustre 2.13. Extends PFL with space management for the case where an OST fills up while a file is being written. Set with --extension-size|-z. |
| File Level Redundancy (FLR) | In Lustre 2.11+, PFL supports mirrored files where each mirror uses different OSTs/pools for storage tiering and/or redundancy. Mirrors are marked stale when the file is modified; an external resync action is required to bring them back in sync. |
| Instantiation Delay | Only the first component's OST objects are instantiated when the layout is set. Other components have their OST objects allocated on the first write or truncate to an offset within their extent. |
| Extents & Holes | Component extents must be adjacent with no gaps. An extent end of -1 or eof means the extent runs to the maximum file offset. |
| Client Compatibility | Clients at 2.9 or earlier cannot create or use PFL files. Clients at 2.10+ can use PFL files, but may be unable to open a file whose layout type they do not support (e.g., overlapping components). |
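
The extent rules in the table above can be illustrated with a small shell sketch. The helper name `pfl_component_for_offset` is hypothetical and not part of Lustre; it only models how adjacent extents starting at offset 0 map a byte offset to a component, under the assumption that `1G` and `16G` are binary units. The number it prints is the position in the layout, which is not necessarily the Lustre component ID.

```shell
# Hypothetical helper (not a Lustre command): report which component of a
# composite layout covers a byte offset. Arguments after the offset are the
# component ends in bytes, in order; "eof" means "to the maximum file offset".
# The first component starts at 0 and each later one starts where the
# previous one ends, so the extents are adjacent by construction.
pfl_component_for_offset() {
    local offset="$1"
    shift
    local index=1
    local start=0
    local end
    for end in "$@"; do
        if [ "$end" = "eof" ] || [ "$offset" -lt "$end" ]; then
            echo "component $index: [$start, $end)"
            return 0
        fi
        start=$end
        index=$((index + 1))
    done
    echo "no component covers offset $offset"
    return 1
}

GiB=$((1024 * 1024 * 1024))
# Layout shaped like "-E 1G ... -E 16G ... -E eof"; look up offset 2 GiB.
pfl_component_for_offset $((2 * GiB)) "$GiB" $((16 * GiB)) eof
# prints: component 2: [1073741824, 17179869184)
```

Dropping the trailing `eof` argument models a layout without an EOF component: offsets past the last end are then reported as uncovered.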
Commands
| Command | Purpose | Key Options |
| --- | --- | --- |
| lfs setstripe | Create, modify, add/delete components; set defaults for directories. | -E COMPONENT_END, -c STRIPE_COUNT, -S STRIPE_SIZE, -i OST_INDEX, -p POOL, --comp-flags FLAGS |
| lfs getstripe | View layout, including PFL components, mirror IDs, flags, extents. | -I <comp_id> to show specific component |
| lfs migrate | Convert layouts (normal to composite, composite to composite, composite to normal). | Same syntax as setstripe with -E for extents |
| lfs mirror create | Create mirrored file/dir with PFL layouts per mirror. | -N MIRROR_COUNT, setstripe options per mirror, --flags=prefer |
| lfs mirror extend | Add mirrors to existing file using layout from victim file. | -f <victim_file>, --no-verify |
| lfs mirror split | Extract a mirror as a new file, or destroy it. | --mirror-id ID, --destroy|-d, -f NEW_FILE |
| lfs mirror resync | Resync stale mirrors from primary. | --only <id> |
| lfs mirror verify | Check data consistency across mirrors using checksums. | --only <id>, -v (verbose) |
| lfs find | Locate files by mirror count/state. | --mirror-count, --mirror-state {ro|wp|sp} |
| tunefs.lustre | Change target configuration after formatting; it is a general tuning tool rather than a PFL-specific one. | Post-formatting |
| lctl lfsck_start -t layout | Run LFSCK to check/repair layout inconsistencies involving PFL. | -A (all targets), -o (repair orphans), -n (dry-run) |
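
As a rough illustration of how the `-E` and `-c` options in the table combine, the sketch below assembles an `lfs setstripe` command line from END:STRIPE_COUNT pairs and prints it without running anything. The function name `build_setstripe` and the pair syntax are assumptions made for this example only.

```shell
# Hypothetical helper: build (but do not run) an "lfs setstripe" command
# from END:STRIPE_COUNT pairs, one pair per component.
build_setstripe() {
    local target="$1"
    shift
    local cmd="lfs setstripe"
    local spec
    for spec in "$@"; do
        # ${spec%%:*} is the component end, ${spec#*:} is the stripe count.
        cmd="$cmd -E ${spec%%:*} -c ${spec#*:}"
    done
    echo "$cmd $target"
}

build_setstripe /mnt/testfs/create_comp 1G:1 16G:4 eof:-1
# prints: lfs setstripe -E 1G -c 1 -E 16G -c 4 -E eof -c -1 /mnt/testfs/create_comp
```

Printing the command first makes it easy to review a layout before applying it on a real filesystem.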
Examples
# Basic Composite PFL File
lfs setstripe -E 1G -c 1 -E 16G -c 4 -E eof -c -1 -i 4 /mnt/testfs/create_comp
# Add Component to Existing File
lfs setstripe --component-add -E eof -c 4 -o 6-7,0,5 /mnt/testfs/add_comp
# Delete Last Component
lfs setstripe --component-del -I 5 /mnt/testfs/del_comp
# Set Directory Default with PFL
lfs setstripe -E 256M -c 1 -E 16G -c 4 -E eof -S 4M -c -1 /mnt/testfs/pfldir
# Create Mirrored File with PFL
lfs mirror create -N -S 4M -c 2 -p flash -N -c -1 -p archive /mnt/testfs/file1
# Extend Mirror Using Victim File
lfs setstripe -E 1G -c 2 -p none -E eof -c -1 /mnt/testfs/victim_file
lfs mirror extend --no-verify -N -f /mnt/testfs/victim_file /mnt/testfs/file1
# Resync & Verify Specific Mirrors
lfs mirror resync --only 2 /mnt/testfs/file1
lfs mirror verify -vvv /mnt/testfs/file1
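The resync and verify steps above could be chained in a small wrapper. This is a minimal sketch: the `run` and `resync_and_verify` functions and the `DRY_RUN` variable are hypothetical names introduced here, and by default the commands are only printed so the script can be tried without a Lustre client.

```shell
# Hypothetical wrapper: resync one mirror, then verify the file.
# With DRY_RUN=1 (the default in this sketch) commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

resync_and_verify() {
    local mirror_id="$1"
    local file="$2"
    # Verify only if the resync step succeeded.
    run lfs mirror resync --only "$mirror_id" "$file" &&
        run lfs mirror verify -v "$file"
}

resync_and_verify 2 /mnt/testfs/file1
```

Setting `DRY_RUN=0` would execute the two `lfs mirror` commands for real; verification is skipped if the resync fails.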
Best Practices
- Tune Striping: Align stripe size with the application's write size; the stripe size must be a multiple of 64KB. Use low stripe counts for small leading components and higher counts for the larger components that follow.
- Use OST Pools: Assign components to different fault domains (racks, OSSes) for redundancy and performance.
- Enable PFL Early: Use lfs setstripe -E ... MOUNTPOINT to set a default PFL layout right after mounting a new filesystem.
- Limit Components: Use a reasonable number of components (3-4) to minimize overhead.
- Monitor Layouts: Regularly run lfs getstripe to verify component instantiation and extension.
- Trigger Instantiation: Perform writes beyond current extent to instantiate delayed components.
- Set Limits: Use lctl set_param lod.*.max_stripecount to cap stripes if not all OSTs are needed.
- FLR Usage: After writes, run lfs mirror resync to sync stale mirrors. Use verify regularly.
- LFSCK Maintenance: Run lctl lfsck_start -t layout after restores or OST failures to repair inconsistencies.
- Avoid Holes: Ensure components are contiguous; no gaps allowed.
- Client Consistency: Ensure all clients are ≥2.10 for full PFL support.
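
The 64KB stripe-size rule from the list above can be checked before calling `lfs setstripe -S`. The following is a sketch only; `is_valid_stripe_size` is a hypothetical helper, and it takes the size in bytes rather than parsing suffixes such as `4M`.

```shell
# Hypothetical helper: succeed only if a stripe size given in bytes is a
# positive multiple of 64 KiB (65536 bytes).
is_valid_stripe_size() {
    local bytes="$1"
    local unit=65536
    [ "$bytes" -gt 0 ] && [ $((bytes % unit)) -eq 0 ]
}

for size in 65536 1000000 4194304; do
    if is_valid_stripe_size "$size"; then
        echo "$size: ok"
    else
        echo "$size: not a multiple of 64 KiB"
    fi
done
```

Here 65536 and 4194304 (4 MiB) pass, while 1000000 is rejected.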
Limitations
- No Shrinking: PFL layouts only grow. Reducing a component's stripe count, or removing any component other than the last, requires migrating the file to a new layout first.
- Write-Dependent: Extension and component instantiation require file writes.
- Component Deletion: Only allowed on last (unused) component; data in deleted component is lost.
- Max Stripes: Up to 2000 stripes in total per file; the usable number is lower when the layout has many components.
- Client Version: Older clients (2.9 and earlier) cannot create or fully use PFL files.
- Mirrors: Stale on write; require manual resync. No auto-repair.
- Interoperability: Layouts may not persist across upgrades without re-enabling PFL.
- No EOF Component: If the last component does not extend to eof, writes beyond its end fail with ENODATA. This can be used deliberately to restrict the maximum file size.
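
The size cap created by a layout without an EOF component can be reasoned about with simple arithmetic. The sketch below is illustrative only: `write_fits_layout` is a hypothetical helper, and the 256 MiB layout end is an assumed value (for example, a single `-E 256M` component).

```shell
# Hypothetical check: would a write of LENGTH bytes at OFFSET stay inside a
# layout whose last component ends at LAST_END bytes (no eof component)?
write_fits_layout() {
    local last_end="$1"
    local offset="$2"
    local length="$3"
    [ $((offset + length)) -le "$last_end" ]
}

MiB=$((1024 * 1024))
# Assumed layout end: 256 MiB. Try a 2 MiB write starting at 255 MiB.
if write_fits_layout $((256 * MiB)) $((255 * MiB)) $((2 * MiB)); then
    echo "write fits inside the layout"
else
    echo "write extends past the last component: ENODATA expected"
fi
```

In this example the write ends at 257 MiB, so the second branch is taken.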
Recent Changes (Up to Lustre 2.17.0)
- 2.10: PFL initially introduced: composite layouts, lfs setstripe enhancements, component management.
- 2.11: Integrated with FLR: mirrored files with PFL layouts per mirror; lfs mirror commands added.
- 2.11: DoM (Data on MDT) introduced; it builds on PFL for special layouts whose first component resides on the MDT.
- 2.12: LSoM (Lazy Size on MDT): stores approximate file size and block counts on the MDT so they can be read without querying OSTs, reducing metadata overhead.
- 2.13: Introduced SEL (Self-Extending Layout): dynamic extension with --extension-size, OST pool spillover support.
- 2.13+: Foreign Layout support for external systems using PFL-like mechanisms.
- 2.14: Added pool quotas for PFL-managed stripes; improved diagnostics in lfs getstripe.
Related Tools & Diagnostics
- lfs getstripe: Primary tool to inspect PFL layouts, component extents, mirror states, flags (init, stale, prefer).
- lctl get_param: Check LFSCK status and LOD stripe limits.
- lctl lfsck_start -t layout: Repair MDT-OST layout inconsistencies (dangling references, unreferenced objects).
- Procfs Monitoring: mdd.*.lfsck_layout, obdfilter.*.lfsck_layout for scan status.
- Lost+Found: Orphan objects from layout issues placed in .lustre/lost+found via LFSCK.
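
When polling the scan-status parameters listed above, it can help to extract just the status field. This sketch assumes the output contains a line of the form `status: VALUE`; the helper name `lfsck_status` is hypothetical, and the sample input is illustrative rather than captured from a real system.

```shell
# Hypothetical helper: print the value of the first "status:" line read
# from stdin (for example, output of an lfsck_layout parameter).
lfsck_status() {
    awk -F': *' '$1 == "status" { print $2; exit }'
}

# Illustrative sample input, not real LFSCK output.
printf 'name: lfsck_layout\nstatus: completed\nflags:\n' | lfsck_status
# prints: completed
```

On an MDS this might be used as `lctl get_param -n mdd.*.lfsck_layout | lfsck_status`, assuming the parameter output follows the format above.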