BTRFS Reflink | copy files efficiently

Diogo
Jan 2, 2024

Copy files more efficiently with the cp command, taking advantage of the BTRFS CoW mechanism, without duplicating space unnecessarily!

If you use the BTRFS file system, you have an alternative way to copy files. A conventional copy duplicates every data block, so the copy occupies as much space again as the original. With Copy-on-write (CoW), the copy can share the original's blocks and take up almost no extra space.

Copy-on-write

BTRFS uses the copy-on-write data management technique for all files by default. When a file is modified, the original data block is not overwritten in place as it would be in a traditional file system. Instead, the modified data is written to a new location, and the metadata is then updated to point to that new location. Because the old block stays intact until the update completes, this improves data integrity.

Reflink

A reflink is a shallow copy of a file: the copy shares its data blocks with the original, but the two files are otherwise independent, and a change to one does not affect the other. This builds on the underlying Copy-on-Write mechanism.

In effect, a reflink only creates separate metadata that points to the shared blocks, which is usually much faster than a deep copy of all the blocks.
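The independence described above can be checked with a small experiment. This is only a sketch: it assumes GNU cp, the file names are made up for illustration, and it uses --reflink=auto so that it also runs on file systems without reflink support (cp then falls back to a regular copy).

```shell
# Sketch: a reflink copy is independent of its source.
# Assumes GNU coreutils; file names are hypothetical.
workdir=$(mktemp -d)
printf 'original data\n' > "$workdir/source.txt"

# Create the copy (shallow where the file system supports reflinks).
cp --reflink=auto "$workdir/source.txt" "$workdir/clone.txt"

# Modify only the clone; with CoW the change goes to newly written blocks.
printf 'extra line\n' >> "$workdir/clone.txt"

cat "$workdir/source.txt"        # the source still holds only its original line
wc -l < "$workdir/clone.txt"     # the clone now has one more line
```

On BTRFS, only the blocks touched by the append stop being shared; the rest of the data remains common to both files.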

Requirements

  • The storage drive must be formatted with the BTRFS file system or another copy-on-write file system.
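  • Linux kernel 5.18 or above if you need to reflink between different mount points of the same file system; reflinks within a single mount also work on older kernels (check with uname -r)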
  • CoW must be enabled in the file system. It is enabled by default unless the file system is mounted with the nodatacow option or a file carries the NOCOW attribute (set with chattr +C)
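The requirements above can be inspected from a terminal before relying on reflinks. The following is a rough sketch, assuming GNU coreutils on Linux; target_dir is a hypothetical variable naming the directory you plan to copy into, and findmnt is used only if it is installed.

```shell
# Sketch: inspect the environment before relying on reflinks.
# "target_dir" is a hypothetical name; it defaults to the current directory.
target_dir=${1:-.}

kernel_release=$(uname -r)                # kernel version to compare against the list above
fs_type=$(stat -f -c %T "$target_dir")    # prints "btrfs" on a BTRFS volume

echo "kernel: $kernel_release"
echo "file system of $target_dir: $fs_type"

if [ "$fs_type" = "btrfs" ]; then
    echo "BTRFS detected; reflink copies should be possible here."
else
    echo "Not BTRFS; reflinks work only if this file system supports them."
fi

# Show the mount options, if findmnt is available; look for nodatacow here.
if command -v findmnt >/dev/null 2>&1; then
    findmnt -no OPTIONS --target "$target_dir" || true
fi
```

This check is only advisory. The definitive test is whether cp --reflink=always succeeds on the files in question.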

Copy files with Reflink using cp

Syntax

cp --reflink=always source target

With --reflink=always, cp performs a shallow copy in which data blocks are copied only when they are modified. If a shallow copy is not possible, the copy fails. With --reflink=auto, cp falls back to a standard copy instead of failing.
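Because --reflink=always fails outright when blocks cannot be shared, a script may want to try it first and report when a full copy was needed instead. The helper below is a hypothetical sketch, not part of cp; the function name reflink_copy and the demo file names are assumptions made for illustration.

```shell
# Hypothetical helper: prefer a reflink and report when a full copy was needed.
reflink_copy() {
    rc_src=$1
    rc_dst=$2
    if cp --reflink=always "$rc_src" "$rc_dst" 2>/dev/null; then
        echo "reflinked: $rc_dst"
    else
        # The file system (or this source/target pair) cannot share blocks.
        cp "$rc_src" "$rc_dst" && echo "full copy: $rc_dst"
    fi
}

# Usage on a throwaway file.
demo_dir=$(mktemp -d)
printf 'hello\n' > "$demo_dir/a.txt"
reflink_copy "$demo_dir/a.txt" "$demo_dir/b.txt"
```

When the outcome does not need to be reported, cp --reflink=auto gives the same fallback behavior in a single command.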

Example

Example of copying a 4GB file

I created a 4GB file and three copies of it using reflink. The data was not actually duplicated. Instead, the metadata of each copy points to the same data blocks as the original, so the data is shared among all four files and takes up the space of only one.

Shared use of disk and space for each file
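The experiment above can be reproduced roughly as follows. This is a scaled-down sketch, not the exact commands used for the article: the variable names, the file names, and the 8 MB default size are assumptions, and --reflink=auto is used so the script also runs outside BTRFS (where the copies are then full copies). Set SIZE_MB=4096 and point EXAMPLE_DIR at a BTRFS directory to match the 4GB case.

```shell
# Sketch of the experiment, scaled down; names and sizes are assumptions.
size_mb=${SIZE_MB:-8}                        # the article used a 4GB file (4096)
example_dir=${EXAMPLE_DIR:-$(mktemp -d)}     # point this at a BTRFS directory

# Create the source file filled with random data.
dd if=/dev/urandom of="$example_dir/file.img" bs=1M count="$size_mb" status=none

# Make three copies; they share blocks with the source where reflinks work.
for n in 1 2 3; do
    cp --reflink=auto "$example_dir/file.img" "$example_dir/copy$n.img"
done

# Each file reports its full size individually.
ls -lh "$example_dir"

# On BTRFS, show how much of that data is shared and how much is exclusive.
if command -v btrfs >/dev/null 2>&1 &&
   [ "$(stat -f -c %T "$example_dir")" = "btrfs" ]; then
    btrfs filesystem du "$example_dir" || true
fi
```

On a BTRFS volume, the shared column reported by btrfs filesystem du should account for nearly all of the data, while ls still lists every file at its full size.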

Cons

Reflinks cannot cross file systems. Two separate file systems have no blocks in common, so block sharing between them cannot work.
