Btrfs - btrfs

Btrfs is a local file system based on the COW (copy-on-write) principle. COW means that modified data is written to a different block instead of overwriting the existing data, which reduces the risk of data corruption. Btrfs is also extent-based, which means that it stores data in contiguous areas of storage.

In addition to basic file system features, Btrfs offers RAID and volume management, pooling, snapshots, checksums, compression and other features.

To use Btrfs, make sure you have btrfs-progs installed on your machine.

Terminology

A Btrfs file system can have subvolumes, which are named binary subtrees of the main tree of the file system with their own independent file and directory hierarchy. A Btrfs snapshot is a special type of subvolume that captures a specific state of another subvolume. Snapshots can be read-write or read-only.

btrfs driver in LXD

The btrfs driver in LXD uses a subvolume per instance, image and snapshot. When a new entity is created (for example, when a new instance is launched), the driver creates a Btrfs snapshot.

Btrfs doesn’t natively support storing block devices. Therefore, when using Btrfs for VMs, LXD creates a big file on disk to store the VM. This approach is not very efficient and might cause issues when creating snapshots.

Btrfs can be used as a storage backend inside a container in a nested LXD environment. In this case, the parent container itself must use Btrfs. Note, however, that the nested LXD setup does not inherit the Btrfs quotas from the parent (see Quotas below).

Quotas

Btrfs supports storage quotas via qgroups. Btrfs qgroups are hierarchical, but new subvolumes will not automatically be added to the qgroups of their parent subvolumes. This means that users can trivially escape any quotas that are set. Therefore, if strict quotas are needed, you should consider using a different storage driver (for example, ZFS with refquota or LVM with Btrfs on top).

When using quotas, you must take into account that Btrfs extents are immutable. When blocks are written, they end up in new extents. The old extents remain until all their data is dereferenced or rewritten. This means that a quota can be reached even if the total amount of space used by the current files in the subvolume is smaller than the quota.

Note

This issue is seen most often when using VMs on Btrfs, due to the random I/O nature of using raw disk image files on top of a Btrfs subvolume.

Therefore, you should never use VMs with Btrfs storage pools.

If you really need to use VMs with Btrfs storage pools, set the instance root disk’s size.state property to twice the size of the root disk’s size. This configuration allows all blocks in the disk image file to be rewritten without reaching the qgroup quota. Setting the btrfs.mount_options storage pool option to compress-force can also avoid this scenario, because a side effect of enabling compression is to reduce the maximum extent size such that block rewrites don’t cause as much storage to be double-tracked. However, this is a storage pool option, and it therefore affects all volumes on the pool.
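The extent behavior described in this section can be illustrated with a toy model. The sketch below is not how Btrfs accounts for qgroup usage internally. It assumes a file that is laid out in fixed-size extents, that every rewritten block lands in a new one-block extent, and that an old extent stays charged to the quota until all of its blocks have been rewritten. The function name and parameters are hypothetical.

```python
def tracked_blocks(total_blocks, extent_blocks, rewritten):
    """Toy model: number of blocks charged to a quota after some rewrites.

    Assumptions (illustrative only, not Btrfs internals):
    - the file was first written as extents of `extent_blocks` blocks each
    - each rewritten block is stored in a new one-block extent
    - an old extent is released only when all of its blocks are rewritten
    """
    rewritten = set(rewritten)
    tracked = len(rewritten)  # new extents that hold the rewritten data
    for start in range(0, total_blocks, extent_blocks):
        extent = range(start, min(start + extent_blocks, total_blocks))
        # The old extent stays pinned while any block in it is still live.
        if any(block not in rewritten for block in extent):
            tracked += len(extent)
    return tracked


# One large extent, 99 of 100 blocks rewritten: almost twice the file size.
worst_case = tracked_blocks(100, 100, range(99))

# Same file with one-block extents: rewrites release old extents immediately.
small_extents = tracked_blocks(100, 1, range(0, 100, 2))
```

In this model, the tracked usage never exceeds twice the file size, which is consistent with the suggestion to set size.state to twice the root disk size. It also shows why a smaller maximum extent size reduces the amount of storage that is tracked twice.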

Configuration options

The following configuration options are available for storage pools that use the btrfs driver and for storage volumes in these pools.

Storage pool configuration

btrfs.mount_options

Mount options for block devices

Key: btrfs.mount_options
Type: string
Default: user_subvol_rm_allowed
Scope: global

size

Size of the storage pool (for loop-based pools)

Key: size
Type: string
Default: auto (20% of free disk space, >= 5 GiB and <= 30 GiB)
Scope: local

When creating loop-based pools, specify the size in bytes (suffixes are supported). You can increase the size to grow the storage pool.

The default (auto) creates a storage pool that uses 20% of the free disk space, with a minimum of 5 GiB and a maximum of 30 GiB.
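The rule for the auto default can be written as a short calculation. This is a minimal sketch of the rule as stated above, not LXD's implementation. The function name is hypothetical, and the rounding behavior (integer division) is an assumption.

```python
GIB = 1024 ** 3  # bytes in one GiB


def auto_pool_size(free_bytes):
    """Return 20% of the free space, clamped to the range 5 GiB to 30 GiB.

    Integer division is an assumption; LXD may round differently.
    """
    twenty_percent = free_bytes // 5
    return max(5 * GIB, min(30 * GIB, twenty_percent))
```

For example, with 100 GiB of free disk space, the function returns 20 GiB. With 10 GiB free it returns the 5 GiB minimum, and with 1 TiB free it returns the 30 GiB maximum.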

source

Path to an existing block device, loop file, or Btrfs subvolume

Key: source
Type: string
Scope: local

source.wipe

Whether to wipe the block device before creating the pool

Key: source.wipe
Type: bool
Default: false
Scope: local

Set this option to true to wipe the block device specified in source prior to creating the storage pool.

Tip

In addition to these configurations, you can also set default values for the storage volume configurations. See Configure default values for storage volumes.

Storage volume configuration

security.shared

Enable volume sharing

Key: security.shared
Type: bool
Default: same as volume.security.shared or false
Condition: virtual-machine or custom block volume
Scope: global

Enabling this option allows sharing the volume across multiple instances despite the possibility of data loss.

security.shifted

Enable ID shifting overlay

Key: security.shifted
Type: bool
Default: same as volume.security.shifted or false
Condition: custom volume
Scope: global

Enabling this option allows attaching the volume to multiple isolated instances.

security.unmapped

Disable ID mapping for the volume

Key: security.unmapped
Type: bool
Default: same as volume.security.unmapped or false
Condition: custom volume
Scope: global

size

Size/quota of the storage volume

Key: size
Type: string
Default: same as volume.size
Condition: appropriate driver
Scope: global

snapshots.expiry

When snapshots are to be deleted

Key: snapshots.expiry
Type: string
Default: same as volume.snapshots.expiry
Condition: custom volume
Scope: global

Specify an expression like 1M 2H 3d 4w 5m 6y.
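The following sketch shows one way to split such an expression into its parts. It assumes that the suffixes stand for minutes (M), hours (H), days (d), weeks (w), months (m) and years (y). This mapping is not stated in this section and should be treated as an assumption. The function name is hypothetical.

```python
import re

# Assumed meaning of each suffix (not confirmed by this section).
UNITS = {
    "M": "minutes",
    "H": "hours",
    "d": "days",
    "w": "weeks",
    "m": "months",
    "y": "years",
}


def parse_expiry(expression):
    """Split an expression like '1M 2H 3d' into a {unit: amount} mapping."""
    result = {}
    for token in expression.split():
        match = re.fullmatch(r"(\d+)([MHdwmy])", token)
        if match is None:
            raise ValueError(f"invalid expiry token: {token!r}")
        result[UNITS[match.group(2)]] = int(match.group(1))
    return result
```

Note that the suffixes are case-sensitive in this sketch: M and m are different units.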

snapshots.pattern

Template for the snapshot name

Key: snapshots.pattern
Type: string
Default: same as volume.snapshots.pattern or snap%d
Condition: custom volume
Scope: global

You can specify a naming template that is used for scheduled snapshots and unnamed snapshots.

The snapshots.pattern option takes a Pongo2 template string to format the snapshot name.

To add a time stamp to the snapshot name, use the Pongo2 context variable creation_date. Make sure to format the date in your template string to avoid forbidden characters in the snapshot name. For example, set snapshots.pattern to {{ creation_date|date:'2006-01-02_15-04-05' }} to name the snapshots after their time of creation, down to the precision of a second.

Another way to avoid name collisions is to use the placeholder %d in the pattern. For the first snapshot, the placeholder is replaced with 0. For subsequent snapshots, the existing snapshot names are taken into account to find the highest number at the placeholder’s position. This number is then incremented by one for the new name.
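The two naming approaches above can be sketched in Python. This is an illustration of the described behavior, not LXD's code, and it does not use Pongo2. The function names are hypothetical. The strftime format is the Python equivalent of the layout 2006-01-02_15-04-05 used in the example template.

```python
import re
from datetime import datetime


def timestamp_snapshot_name(created):
    """Format a creation time as year-month-day_hour-minute-second."""
    return created.strftime("%Y-%m-%d_%H-%M-%S")


def next_snapshot_name(pattern, existing):
    """Replace the first %d in the pattern with the next free number.

    The first snapshot gets 0. Otherwise the highest number found at the
    placeholder's position in the existing names is incremented by one.
    """
    prefix, suffix = pattern.split("%d", 1)
    matcher = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(suffix))
    highest = -1
    for name in existing:
        match = matcher.fullmatch(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return f"{prefix}{highest + 1}{suffix}"
```

Names that do not fit the pattern, such as manually named snapshots, are ignored when looking for the highest number.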

snapshots.schedule

Schedule for automatic volume snapshots

Key: snapshots.schedule
Type: string
Default: same as volume.snapshots.schedule
Condition: custom volume
Scope: global

Specify either a cron expression (<minute> <hour> <dom> <month> <dow>), a comma-separated list of schedule aliases (@hourly, @daily, @midnight, @weekly, @monthly, @annually, @yearly), or leave empty to disable automatic snapshots (the default).
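The three accepted forms can be told apart with a simple check. The sketch below only classifies a value by its shape. It does not validate the ranges of the cron fields, and it is not how LXD parses the option. The function name is hypothetical.

```python
ALIASES = {
    "@hourly", "@daily", "@midnight", "@weekly",
    "@monthly", "@annually", "@yearly",
}


def classify_schedule(value):
    """Return 'disabled', 'aliases' or 'cron' for a snapshots.schedule value."""
    value = value.strip()
    if not value:
        return "disabled"  # empty value: no automatic snapshots
    parts = [part.strip() for part in value.split(",")]
    if all(part in ALIASES for part in parts):
        return "aliases"
    # A cron expression has five whitespace-separated fields.
    if len(value.split()) == 5:
        return "cron"
    raise ValueError(f"unrecognized schedule: {value!r}")
```

For example, the value 0 6 * * * is classified as a cron expression, and @daily,@weekly as a list of aliases.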

volatile.uuid

The volume’s UUID

Key: volatile.uuid
Type: string
Default: random UUID
Scope: global

Storage bucket configuration

To enable storage buckets for local storage pool drivers and allow applications to access the buckets via the S3 protocol, you must configure the core.storage_buckets_address server setting.

size

Size/quota of the storage bucket

Key: size
Type: string
Default: same as volume.size
Condition: appropriate driver
Scope: local