Synology Hybrid Backup, Part 1

Where is my data!?

We’ve all had it happen: that moment when the computer freezes, and hours of work are lost forever. But when the loss is catastrophic (a bad disk drive, the dreaded ‘clicking’ noise that signals you’re in deep, deep trouble, etc.) you know just how soul-crushing it can be.

I have a particular plan in mind for satisfying my own personal paranoia about data loss. After spending numerous days and weeks fighting with individual external hard drives, a few years ago I decided to move to network-attached storage (NAS) devices. I’ve made no secret that I consider my move from Drobo to Synology one of the best ones I’ve ever made as a consumer.

A few months ago, Synology announced some major work done in their operating system, DSM, to include some robust backup strategies, including encryption, cloud storage, and synchronization. As I wrote at the time, this could be a big deal (especially for me), and I’ve been dying to try it out.

I finally got the chance.

Backup Philosophy

In the past I have had the unfortunate circumstance of losing both my main data platform and the backup drive within days of each other (due to completely independent circumstances). This resulted in the catastrophic loss of very personal photos (I’ve lost all the photos I had of my dogs’ puppy years, for instance).

Since that time I’ve wholly embraced the 3-2-1 guidance of backups:

  • 3 different copies
  • 2 different media (e.g., disk and tape, disk and DVD, etc.)
  • 1 offsite location
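
As a quick sanity check, the three rules above can be written down as a tiny script. This is only a sketch of my own: the `Copy` class, the `satisfies_3_2_1` function, and the media labels are hypothetical names I made up for illustration, and what counts as a "different medium" is a judgment call you have to make for yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str      # hypothetical label for one copy of the data
    medium: str    # what the copy lives on, e.g. "internal-disk", "nas", "cloud"
    offsite: bool  # True if the copy is stored away from the original


def satisfies_3_2_1(copies):
    """Return True when the copies meet the 3-2-1 guidance."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


plan = [
    Copy("working copy", "internal-disk", False),
    Copy("local backup", "nas", False),
    Copy("offsite copy", "cloud", True),
]

print(satisfies_3_2_1(plan))      # True
print(satisfies_3_2_1(plan[:2]))  # False: only two copies, none offsite
```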

One thing to remember when it comes to Backup and Disaster Recovery is that the conditions that cause one copy to fail are often the same conditions that cause the others to fail.

For example, if you have two disks, side by side, on your desktop that you bought at the same time, the likelihood of failure increases for both as they age. Because they were bought at the same time, it is quite possible that they will fail at around the same time, too.

Likewise, if (heaven forbid) there is a fire in your home, both devices are subjected to the same disaster conditions, which is why it’s always good to have a copy of your data in a safe, secure, offsite location. Many people are turning to Cloud storage solutions for this. One additional option (which I do choose) is to store a drive in a safety deposit box in a bank, and rotate that out on a regular basis with updated copies.

Most people, sadly, do not back up their data at all. For those working on Macintosh systems, the availability of Time Machine built into the OS has helped tremendously, but unfortunately my experience with it has been rough. When things go wrong, as they often seem to do (for me; speaking entirely of my own experience), I can count on hours of troubleshooting.

For that reason I decided to look at Synology’s options for backup, syncing, and cloud storage. It’s important to note that each of these serves a different purpose, behaves differently, and requires a different approach. Fortunately my personal needs tend to cover a wide variety of these and I’ll go into each of them in turn.

[Note: There are several software options available for backing up your computer to a Synology, each with its own strengths and weaknesses. For me, I was curious about using an all-Synology solution to see what’s what.]

The Task

Logical backup workflow for my environment

My computer system is disaggregated for easier organization (easier for me, at least). My boot drive is an SSD and very, very fast. Its capacity isn’t very great, however (less than 500 GB), and as a result I move all the data off onto other drives for optimal space allocation (SSDs do not like to be full).

To solve my 3-2-1 backup needs, I need to do a few things:

  • Aggregate the data from all the internal drives to the Synology DS1813+
  • Back up the aggregated data from the DS1813+ to the DS1511+
  • Encrypt that data on the DS1511+ when sending to an off-site provider
  • Send the data to a cloud service provider over an encrypted link
  • Store the data on a remote service provider in an encrypted form

Fortunately for me, there are tools built into the Synology OS, as well as third-party tools available for download, that can accomplish this. Unfortunately for me, there is no single tool that will do everything I need, and as a result I need to be very careful as to which tool I use for which task.

RAID, Syncing, Backup, and Copying

The remainder of this post goes into some of the specifics of terminology, especially as it relates to Synology’s use of them. If you want to skip this first section and go straight to the “how to do this” post, click here, but I’ll be referring back to this background quite a bit.

Data protection is a science as well as an art form. Many of the terms used have subtle differences that the average layman may be confused about. For that reason, I’m going to briefly cover what some of the terms mean within the context of using Synology NAS systems to protect your data.

RAID
RAID is not backup!

Let’s start off with the most important part: RAID is not backup.

When you set up your Synology it will ask what type of data protection you want to have. You can use Synology’s built-in system (SHR) and it will ask you if you want to protect against 1 drive failure or 2 drive failures. This does not mean that your data is safe from failure. On the contrary, all it means is that the risk of your data being affected by the failure of one physical drive is now spread across multiple drives.

There are still many potential points of failure, and do not count on your Synology as a backup solution if you store your main data on it! I’m not going to go into the reasons here, because it falls outside the scope of this blog, but there are many resources if you want to find out more about how and why RAID is not a backup.

Syncing

Many people are already familiar with Syncing. If you have used Dropbox, for example, you know that a folder on your hard drive is shared with a remote folder somewhere else and the two are mirrored across a (local or remote) network.

Syncing is not backup either!

One of the important aspects of syncing is that what you do to one folder, you do to the other. That is, if you save a file in one folder, it will automatically be saved in the other system as well. Similarly, if you delete a file in one, it will delete in the other.

For that reason, Syncing (as a concept) is not backup either.

Why not?

It’s important to remember that backups are only half of the equation; you must also be able to restore. Suppose you accidentally delete a file or folder from your original location and wish to grab it from the remote side. Oooops! It’s gone from there too.

Until you’re more familiar with Synology’s approach, this can wind up becoming a bit complicated, because you have the ability to offset the syncing, which means that you can delay changes to the remote folder. As a result, it is possible to go back and rescue files and folders from the remote location before the next sync process takes place. As I describe it here, it doesn’t sound all that complicated, but when you start to fiddle with the knobs inside the UI, trust me, it helps to understand the bigger picture.

Depending on the size of the data deleted, the ability to copy it back before the delete command is initiated can be troublesome. Either way, the restoration functionality is not part of the normal process, and cannot be considered a backup solution.
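
To make the difference concrete, here is a toy model of my own (it is not how Synology implements anything): the `sync` and `backup` functions and the file names are hypothetical, and folders are just Python dictionaries. A sync makes the replica match the source, deletions included, while a backup keeps earlier points in time around.

```python
def sync(source, replica):
    """One-way mirror: make the replica identical to the source, deletions included."""
    replica.clear()
    replica.update(source)


def backup(source, versions):
    """Store a point-in-time copy; earlier versions are kept untouched."""
    versions.append(dict(source))


source = {"puppy.jpg": "photo-bytes", "notes.txt": "v1"}
replica = {}
versions = []

# First run: both mechanisms capture everything.
sync(source, replica)
backup(source, versions)

# Accidental deletion on the source, followed by the next scheduled run.
del source["puppy.jpg"]
sync(source, replica)
backup(source, versions)

print("puppy.jpg" in replica)      # False: the deletion was mirrored
print("puppy.jpg" in versions[0])  # True: the older version can still be restored
```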

So what’s it good for?

Generally speaking, Syncing (and Versioning, too) is about “data availability.” The goal is to prevent human error, viruses, malicious intent, etc., from disrupting the continuity of your workflow and productivity.

Most importantly, and this is what Synology recommends, syncing is great as a “poor-man’s high availability” (my words, not theirs). It’s not really “High” Availability (HA), but about as close as you can get. If your first DiskStation has a catastrophic failure and goes offline, all your data is available on the secondary system (at least, from the last time it was scheduled/available to sync).

You can simply point your computers and mobile devices to the secondary DiskStation and away you go, while you try to fix your primary system. Just remember you’re flying without a net at that point, unless you have a tertiary mechanism for recovery if that system should fail (e.g., offsite backup, cloud storage, etc.).

Now, DSM 5.2 is capable of true HA, given the correct models. Unfortunately in my case, the DS1511+ does not have the correct hardware to permit HA configuration. So, for this reason I’m going to have to forgo this capability, given my setup.

Syncing is also great for ensuring that multiple devices have access to the latest data, and indeed Synology provides mobile applications that can be connected with a sync’d folder share. Unlike Dropbox’s two-way synchronization paradigm, however, Synology’s – called Shared Folder Sync – is one-way. That is, if you add or delete anything manually on the remote side (which is not easy to do, btw), the original folder is not touched.

Think of it this way: setting up a synchronization shared folder in Synology is similar to setting up a repository that mobile devices can then access.

In Synology’s DSM 5.2 interface, Syncing lives in the Backup and Replication application, which you can open from the main dashboard.

Backup and Replication location in the Dashboard

Backup and Time Backup

This is where things get interesting and, also, confusing.

Synology has a number of backup options to choose from, including the built-in Backup and Replication solution (mentioned above). They also have Time Backup, which, if you’re familiar with Apple’s Time Machine, looks very similar in terms of functionality.

Time Backup can be installed from Package Center

Synology’s Backup methodology, unlike RAID and Syncing, is a true Backup system. Compared with Apple’s Time Machine, however, there are a couple of notable differences.

Most importantly, Synology’s Backup systems have a different versioning mechanism. Apple’s Time Machine measures how much disk space you have, and will rotate older files out of the backup repository when you start to run out of space.

Synology does not appear to examine your available capacity, however. Instead, the rotation schedule is based upon a time rotation. It is up to users to determine how many versions they wish to retain, and as a result it is important that they have a good awareness of their capacity and usage patterns.
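
Since it is up to you to work out whether your retention settings fit on the destination, a back-of-the-envelope estimate helps. The sketch below is my own and rests on a loud assumption: that the first version is a full copy and each later version only stores what changed. I have not verified how Synology actually stores versions, and the function name and the numbers are hypothetical.

```python
def estimated_backup_size_gb(full_size_gb, changed_per_version_gb, versions_kept):
    """Rough size of a backup set that keeps `versions_kept` versions.

    Assumes (my assumption, not Synology's documented behavior) that the
    first version is a full copy and every later version only stores the
    data that changed since the previous one.
    """
    if versions_kept < 1:
        raise ValueError("versions_kept must be at least 1")
    return full_size_gb + changed_per_version_gb * (versions_kept - 1)


# Hypothetical numbers: 2 TB of data, about 20 GB changing between versions.
needed = estimated_backup_size_gb(2000, 20, versions_kept=30)
print(needed)          # 2580
print(needed <= 3000)  # True: fits on a 3 TB destination under these assumptions
```

If your data changes a lot between runs, the second term dominates quickly, which is exactly why it pays to know your usage patterns before picking a version count.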

It’s important to note that DSM 5.2 will be the last version of DSM that will have Time Backup as a separate application. Future versions of DSM will have Time Backup abilities built in for better functionality and versatility, Synology says. Synology also says that existing users of Time Backup will not be affected by this change.

Both Shared Folder Sync and Backup allow you to select individual folders for replication purposes, though I found the interface for Backup to be far more intuitive with fewer clicks (in my experience, fewer clicks means fewer things to screw up).

Unlike Synology’s Shared Folder Sync function, however, Synology’s built-in backup functions allow you to do either data (file-based) backup, or LUN (block-based for iSCSI snapshots) backup tasks:

For our purposes, we want "Data backup"

In this case, we’re not working with iSCSI LUNs, but it’s useful to know that iSCSI LUN snapshotting is possible using Synology’s DSM software, with a few additional features specific to iSCSI volumes.

Both of these features, Backup and Sync, have wizards that walk you through the process of selecting the folders you want to replicate. While the basics are the same (choose the source, login credentials, etc.), there are key behavioral and administrative differences that we’ll get into in the next article. Also, undoing your configuration is more difficult with syncing than with backup, so make sure you know what type of replication you want (sync or backup) and why!

In my case, Backup helps solve the problem of getting from the DS1813+ to the DS1511+, but it does not help me solve the problem of saving from my Mac. For that, I’ll need another option, such as Cloud Station.

Cloud Station (Backup)

Earlier, when we looked at Shared Folder Sync, I noted that this was a one-way replication method. Sometimes, however, you want to do two-way synchronization, similar to the way Dropbox works. For that, you’d use Cloud Station, which is available in the Package Center:

Cloud Station can give you 2-way synchronization

Cloud Station, in practice, consists of two parts: a service that you have to run on the Synology itself (see the screenshot above), and client software that you need to install on whatever device/computer you wish to use with the service.

So, for example, in this case if I want to have folders and volumes on my Mac sync with the Synology, I have to download the client for Macintosh.

If I wish to sync folders and volumes from one Synology to the other, bi-directionally, I have to download the client for the Synology (in the package center) and not use the Shared Folder Sync option.

If I wish to sync folders using a mobile device, I have to download the DS Cloud app for my iPad, iPhone, or Android Device:

QR Codes right on the screen make this very easy

While I’ll go into further detail later on, the effective process is that you establish the service on the Synology you wish to use as the destination (in my case it’s the DS1813+ in the middle of my diagram at the beginning of this blog), and then use the client software to do the configuration for that particular device (e.g., folders, files to be sync’d).

Remember, though, that this is still a syncing process, not a backup process, until you turn versioning on. Fortunately, Cloud Station gives you the ability to do some versioning control, and I’ll go into that later as it’s appropriate for my workflow.

Synology has put together a video on how to do this, as well.

Cloud Storage

The last piece of the puzzle is the offsite backup. Recently the market trend has been to move to Cloud-based storage, but it’s useful to know some very important caveats.

First, cloud storage is slow. It’s slow getting there, and it’s slow getting back. If you have a single terabyte of data that you need to restore from a cloud storage location, that process can take days or even weeks over a typical home broadband connection, depending on your link speed.
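
To put rough numbers on that, here is a small calculation of my own. The `transfer_days` function and the link speeds are hypothetical examples, and the math assumes the link runs flat out the entire time with no protocol overhead or provider throttling, so treat the results as best cases.

```python
def transfer_days(size_tb, link_mbps, efficiency=1.0):
    """Days needed to move `size_tb` terabytes over a `link_mbps` link.

    `efficiency` is the fraction of the nominal link speed you actually
    get (1.0 means the ideal case, which real connections rarely reach).
    """
    bits = size_tb * 1e12 * 8                        # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)  # megabits per second to bits per second
    return seconds / 86_400                          # seconds in a day


for mbps in (5, 10, 100):
    print(f"{mbps:>3} Mbps: {transfer_days(1, mbps):.1f} days")
```

Under these idealized assumptions a single terabyte takes roughly 18.5 days at 5 Mbps, 9.3 days at 10 Mbps, and just under a day at 100 Mbps. Home upload speeds are often much lower than download speeds, so the initial trip to the cloud can be the slower direction.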

Second, you have no control. Cloud providers can (and will) grant you all the access you can pay for, and some will have you handle your own private encryption keys for added security. Ultimately, though, your data is at the mercy of anonymous people who can do all kinds of nefarious things to it. Likewise, should a government agency come a-knockin’ on their door, they do not need to get a warrant for your data. Many cloud service providers will simply hand over private data to government queries, whether they’re criminal investigations or not.

Third, cloud storage companies can guarantee the longevity of your data only for as long as they can guarantee the longevity of their own ability to stay in business. The more data you have located outside your control, the more difficult it is to migrate. A good lesson is the recent demise of Nirvanix.

Yes, you can encrypt your data and no, no encryption method is fool-proof. For these reasons, you should be careful as to what data you decide to store on cloud storage providers.

The best way to look at cloud storage is as your “last resort” option. I’m not saying that you set up your cloud storage as a last resort, I’m saying that you should have several restoration options before you get to cloud storage – but have some type of offsite storage as a last resort!

It’s the insurance policy you hope you never have to use. Just like an insurance policy, too, you’ll pay for the privilege of that protection, so choose your level of service with careful forethought and planning.

Synology assumes you’ve thought this through already, as well they should. It’s not their responsibility to hand-hold people through the best practices of cloud storage decisions, mostly because there are so many different options to choose from that the appropriate answer to any question is, “it depends.”

Synology does, however, provide built-in tie-ins to a wide array of cloud services, expanded with version 5.2. These include Google Drive, Dropbox, Baidu, OneDrive, Box, hubiC, Amazon S3, iDrive, and several others, including OpenStack Swift. These choices can be bewildering at first, so it pays to do some up-front research.

In a later article I’ll be going over my specific example for cloud storage in detail.

Getting Started

Now that I’ve laid out the foundation for a Hybrid Backup Solution with my Synology setup, it’s time to actually get started. In the next article, I’ll go into specifics of how I managed to get the data from different internal hard drives inside my Mac to the first DiskStation. Following that, we’ll back up everything to the second one, and then finally offload a smaller subset of that data to a cloud storage offering.

[Update 2015.08.26: Added in additional information about Time Backup, corrected naming conventions, minor clarifications.]

[Disclosure: No payment was received for these articles. However, Synology provided me the Diskstation 1813+ (but not the drives), free of charge, for evaluation purposes. I also did get valuable help from Franklin Hua, Synology Sr. Technical Marketing Engineer, to whom I am especially grateful.  Absolutely no editorial guidance was offered or solicited by Synology, other than to ask that I correctly identify their products. 🙂 ]
