Backing Up Linux to Backblaze B2 with Duplicity and Restic

By Roderick Bauer | October 19th, 2017

How to Back Up Linux to B2 with Duplicity and Restic

Linux users have a variety of options for handling data backup. The choices range from free and open-source programs to paid commercial tools, and include applications that are purely command-line based (CLI) and others that have a graphical interface (GUI), or both.

If you take a look at our Backblaze B2 Cloud Storage Integrations page, you will see a number of offerings that enable you to back up your Linux desktops and servers to Backblaze B2. These include CloudBerry, Duplicity, Duplicacy, 45 Drives, GoodSync, HashBackup, QNAP, Restic, and Rclone, plus other choices for NAS and hybrid uses.

In this post, we’ll discuss two popular open-source command-line programs: an older one, Duplicity, and a newer player, Restic.

Backing Up Linux: Old School vs. New School

We’re highlighting Duplicity and Restic because they exemplify two different philosophical approaches to data backup: “Old School” (Duplicity) vs “New School” (Restic).

Old School (Duplicity)

In the old school model, data is written sequentially to the storage medium. Once a section of data is recorded, new data is written starting where that section of data ends. It’s not possible to go back and change the data that’s already been written.

This old-school model has long been associated with the use of magnetic tape, a prime example of which is the LTO (Linear Tape-Open) standard. In this “write once” model, files are always appended to the end of the tape. If a file is modified and overwritten or removed from the volume, the associated tape blocks used are not freed up: they are simply marked as unavailable, and the used volume capacity is not recovered. Data is deleted and capacity recovered only if the whole tape is reformatted. As a Linux/Unix user, you undoubtedly are familiar with the TAR archive format, which is an acronym for Tape ARchive. TAR has been around since 1979 and was originally developed to write data to sequential I/O devices with no file system of their own.
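
As a rough illustration of the append-only behavior described above, here is a small Python sketch using the standard library's tarfile module in append mode. The file name notes.txt and the helper names are invented for the example, and an ordinary file stands in for the tape; it is not meant to show how any particular backup tool works.

```python
import io
import os
import tarfile
import tempfile


def append_version(archive_path, name, data):
    """Append one version of a file to the end of an uncompressed tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    # Mode "a" appends after the last member (and creates the archive if missing).
    with tarfile.open(archive_path, "a") as tar:
        tar.addfile(info, io.BytesIO(data))


def demo():
    """Write the same file twice and report what the archive now holds."""
    with tempfile.TemporaryDirectory() as workdir:
        archive_path = os.path.join(workdir, "backup.tar")
        append_version(archive_path, "notes.txt", b"first draft")
        append_version(archive_path, "notes.txt", b"second draft, rewritten")
        with tarfile.open(archive_path, "r") as tar:
            names = tar.getnames()
            # getmember() returns the last occurrence of a name.
            latest = tar.extractfile(tar.getmember("notes.txt")).read()
            # The superseded version still occupies space earlier in the archive.
            oldest = tar.extractfile(tar.getmembers()[0]).read()
    return names, latest, oldest


if __name__ == "__main__":
    print(demo())
```

The "modified" file does not replace the earlier one: both versions remain in the archive, and the space used by the first is only reclaimed by rewriting the whole archive.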

It is from the use of tape that we get the full backup/incremental backup approach to backups. A backup sequence begins with a full backup of data. Each incremental backup contains what’s been changed since the last backup until the next full backup is made and the process starts over, filling more and more tape or whatever medium is being used. Recovering a system can be time-consuming if there were a number of incremental updates following a full backup, as the full backup must be restored first, followed by each incremental in order, to obtain the latest state of the data. Most customers do frequent full backups to avoid this.

This is the model used by Duplicity: full and incremental backups. Duplicity backs up files by producing encrypted, digitally signed, versioned, TAR-format volumes and uploading them to a remote location, including Backblaze B2 Cloud Storage. Released under the terms of the GNU General Public License (GPL), Duplicity is free software.

With Duplicity, the first archive is a complete (full) backup, and subsequent (incremental) backups only add differences from the latest full or incremental backup. Chains consisting of a full backup and a series of incremental backups can be recovered to the point in time that any of the incremental steps were taken. If any of the incremental backups are missing, then reconstructing a complete and current backup is much more difficult and sometimes impossible.
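
The chain behavior described above can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions: backups are plain dictionaries of path to content, and the function names are hypothetical. It does not reflect Duplicity's actual volume format or delta mechanism.

```python
def take_full(files):
    """A full backup is a complete copy of the current state."""
    return {"type": "full", "changes": dict(files), "deleted": []}


def take_incremental(previous_state, files):
    """Record only what changed relative to the state at the previous backup."""
    changes = {p: c for p, c in files.items() if previous_state.get(p) != c}
    deleted = [p for p in previous_state if p not in files]
    return {"type": "incremental", "changes": changes, "deleted": deleted}


def restore(chain, upto=None):
    """Replay the full backup, then each incremental in order, up to index upto."""
    if not chain or chain[0] is None or chain[0]["type"] != "full":
        raise ValueError("chain must start with a full backup")
    steps = chain if upto is None else chain[: upto + 1]
    state = {}
    for index, backup in enumerate(steps):
        if backup is None:
            # A gap in the chain makes every later point in time unrecoverable.
            raise ValueError(f"backup {index} is missing from the chain")
        state.update(backup["changes"])
        for path in backup["deleted"]:
            state.pop(path, None)
    return state


if __name__ == "__main__":
    day0 = {"a.txt": "v1", "b.txt": "v1"}
    day1 = {"a.txt": "v2", "b.txt": "v1"}
    day2 = {"a.txt": "v2", "c.txt": "v1"}
    chain = [take_full(day0), take_incremental(day0, day1), take_incremental(day1, day2)]
    print(restore(chain, upto=1))  # state as of the first incremental
    print(restore(chain))          # latest state
```

Restoring to any point means replaying every step before it, which is why a long chain is slow to restore and why one missing incremental breaks everything after it.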

Duplicity help text (partial)

Duplicity is available under many Unix-like operating systems (such as Linux, BSD, and Mac OS X) and ships with many popular Linux distributions including Ubuntu, Debian, and Fedora. It also can be used with Windows under Cygwin.

We recently published a KB article on How to configure Backblaze B2 with Duplicity on Linux that demonstrates how to set up Duplicity with B2 and back up and restore a directory from Linux.

New School (Restic)

With the arrival of non-sequential storage media, such as disk drives, and new ideas such as deduplication, comes the new school approach, which is used by Restic. Data can be written and changed anywhere on the storage medium. Much of the efficiency of this approach comes from deduplication, a process that eliminates redundant copies of data and reduces storage overhead. Deduplication techniques ensure that only one instance of each unique piece of data is retained on the storage media, greatly increasing storage efficiency and flexibility.

Restic is a recently available multi-platform command line backup software program that is designed to be fast, efficient, and secure. Restic supports a variety of backends for storing backups, including a local server, SFTP server, HTTP Rest server, and a number of cloud storage providers, including Backblaze B2.

Files are uploaded to a B2 bucket as deduplicated, encrypted chunks. Each time a backup runs, only changed data is backed up. On each backup run, a snapshot is created enabling restores to a specific date or time.
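
To make the idea of deduplicated chunks and snapshots more concrete, here is a minimal Python sketch of a content-addressed chunk store. It rests on several simplifying assumptions: chunks are fixed-size (and unrealistically tiny), everything lives in memory, and encryption is left out entirely. The class and method names are hypothetical and are not Restic's API or repository format.

```python
import hashlib

CHUNK_SIZE = 4  # Unrealistically small so the example is easy to follow.


class ChunkStore:
    """Stores each unique chunk once, keyed by its SHA-256 hash."""

    def __init__(self):
        self.chunks = {}     # chunk id -> chunk bytes
        self.snapshots = []  # each snapshot maps path -> list of chunk ids

    def backup(self, files):
        """Store any chunks not seen before and record a new snapshot."""
        snapshot = {}
        new_chunks = 0
        for path, data in files.items():
            ids = []
            for start in range(0, len(data), CHUNK_SIZE):
                chunk = data[start:start + CHUNK_SIZE]
                chunk_id = hashlib.sha256(chunk).hexdigest()
                if chunk_id not in self.chunks:
                    self.chunks[chunk_id] = chunk
                    new_chunks += 1
                ids.append(chunk_id)
            snapshot[path] = ids
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1, new_chunks

    def restore(self, snapshot_id):
        """Rebuild every file in a snapshot from its list of chunk ids."""
        snapshot = self.snapshots[snapshot_id]
        return {
            path: b"".join(self.chunks[chunk_id] for chunk_id in ids)
            for path, ids in snapshot.items()
        }


if __name__ == "__main__":
    store = ChunkStore()
    first = {"a.txt": b"AAAABBBBCCCC"}
    second = {"a.txt": b"AAAABBBBDDDD", "copy.txt": b"AAAABBBB"}
    print(store.backup(first))   # snapshot id and number of chunks stored
    print(store.backup(second))  # only the changed chunk is new
    print(store.restore(0) == first)
```

Every snapshot can be restored on its own, without replaying a chain, because each one simply lists the chunks it needs; unchanged data costs no additional storage.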

Restic assumes that the storage location for the repository is shared, so it always encrypts the backed-up data. This is in addition to any encryption and security from the storage provider.

Restic help text

Restic is free and open-source software, licensed under the BSD 2-Clause License and actively developed on GitHub.

There’s a lot more you can do with Restic, including adding tags, mounting a repository locally, and scripting. To learn more, you can review the documentation at https://restic.readthedocs.io.

Alongside this blog post, we published a KB article, How to configure Backblaze B2 with Restic on Linux, in which we show how to set up Restic for use with B2 and how to back up and restore a home directory from Linux to B2.

Which Linux Backup Method is Right for You?

While Duplicity is a popular, widely-available, and useful program, many users of cloud storage such as B2 are moving to new-school solutions like Restic that take better advantage of the non-sequential access capabilities and speed of modern storage media used by cloud storage providers.

Tell us how you’re backing up Linux

Please let us know in the comments what you’re using for Linux backups, and if you have experience using Duplicity, Restic, or other Unix/Linux backup software with Backblaze B2.

Roderick Bauer

Content Director at Backblaze
Roderick enjoys sailing on San Francisco Bay, motorcycling, cooking, reading, and writing about tech and culture. He is Content Director for Backblaze.

Category:  Cloud Storage
  • Restic looks promising; however, it uses symmetric cryptography to secure your data, and the password is stored on the server when backups are scripted. I consider this a security risk for corporate data. It cannot yet beat the way Duplicity encrypts your stored data with asymmetric cryptography (PGP). Symmetric keys can be brute-forced in one or two days with a botnet. So unless this is solved in Restic, I do not consider it a secure backup solution. Additionally, I would like to see close integration with ZFS (snapshots). Satoshin.

    • I use pass to store my restic key, as well as my B2 account ID and key … and then have a “wrapper script” which grabs the info and kicks off restic. If gpg-agent has not unlocked my private key, I get a pop-up dialogue prompting for my PGP password.

  • Kristian

    B2 is still lacking a PUT/append-only mode or a feature to disable deleting (from the API) of versioned objects.

    How is a backup useful, if the attacker can just delete it?

  • Richard Aldridge

    Have you looked at Duplicacy? I’ve tried Restic and found Duplicacy to be much faster.

    https://github.com/gilbertchen/duplicacy

  • Jonathan Cormier

    How well does Restic scale to large backups? In the millions of files and TBs of data?

    • I’d like to hear from users who are successfully doing that.

  • Preston

    Ok, been playing around with Restic this morning. It is seriously awesome!

    • Nice to hear you have found a solution that works for you.

  • I’ve been using B2 for incremental-forever backups since Duplicity first supported it, as a system backup for my pair of Prod and Test Ubuntu VPSs.

    Biggest initial hurdle was getting the Duplicity options “right”. Here is what I ended up with for Ubuntu 16.04 LTS: --timeout 120 --num-retries 12 --backend-retry-delay 90 --allow-source-mismatch --no-encryption --tempdir /tmp --exclude /tmp --exclude /var/lib/lxcfs --exclude /run --exclude /proc --exclude /sys --exclude /root/.cache

    My latest iteration has been to switch VPS providers, to get large storage at an affordable price (Time4VPS offers 1TB for a few Euros a month) and turn them into Prod and Test Backup/Archive Servers. Every night, Cron triggers (via http to PHP code) a mysqldump of all MySQL databases on a half dozen Linux Shared Hosting accounts on various hosting companies, and then uses lftp on the Server to upload those mysqldump files and all the other web and non-web files (via the lftp mirror command). Then I use Duplicity to do incremental-forever backups of everything to B2, with a different Bucket for each database, and a Bucket for each Shared Hosting account’s files, even if that includes several web sites.

    Sounds complicated, but it beats paying for the amount of Backup coverage I want from the Shared Hosting vendors that I use.

    Thanks to this article, I’ll now be checking out Restic as a replacement for Duplicity.

  • Nice article! Been waiting a long while for you guys to write about backing up Linux :) finally got it! Haha thanks!
    https://mybuddyben.com/technology/guides-and-tutorials/full-backup-linux-server-b2-backblaze-hashbackup/
    I also already wrote an article, in which I explain how to set up Hashbackup for B2. Check it out if you want to give Hashbackup a try!

  • Kristian

    # Copy-paste: https://www.backblaze.com/blog/sync-vs-backup-vs-storage/#comment-3384934971

    I use a combo.

    My laptop is synced to my server with Syncthing, and the server instance of Syncthing has versioning enabled, so if I accidentally delete a file, I can quickly find it on the server.

    At 02:00 every night, BorgBackup creates a backup on an external HDD, and lastly I use rclone to sync that backup to B2.

    So in total I have 4 copies of my data.
    1 copy on my laptop.
    2 copies on my server (on 2 different HDDs).
    1 copy on B2.

    So if my laptop gets stolen, I still have 3 copies, but if the house burns down or some sort of power surge happens, I still have the B2 copy.
    My last concern now is bitrot/silent corruption.

  • Ryan Flowers

    I have a home server with plenty of storage (but not a lot, just 1TB) and an OVH VPS with 40 GB of space, and host several sites on it. I use autossh to keep a reverse ssh tunnel open and just rsync backups home once a week. http://miscdotgeek.com/reverse-ssh-tunnel/ tells part of that story. I manage others, though, and am currently not happy with my backups, so I will likely check out B2 and Restic for my managed customers.

  • Tina Femea

    I run a small web/email server on Linode for about a dozen domains. I used to use rsync and a series of custom scripts to back up to an always-on home server. I did full backups on the first of every month and incrementals every night after that. When I decided to move away from the home server as backup, I switched to rclone and B2. I just back up everything that’s changed every night, and I keep 90 days of versions using the bucket management options.

    On top of that, I still use /etc for all my configuration files, and I keep that directory in git. Every time I make a change, I check it in (and push it to a github private repository).