Backing up secret keys (and actually getting them back)

This seems to be an easy task at first, but can turn out to be a lot more complicated than you might think, depending on your needs for security and media longevity.

As for keys, I'm mostly concerned with RSA 8192 keys and anything smaller than that. This includes other systems like X25519 and other ECC schemes, but also things like passwords. The total data size is not expected to exceed a few dozen KB.

The chapters below explore different methods, from commonly practiced to obscure, each with its possible data loss scenarios, pros, and cons.

This only concerns storage and not how the data is transmitted.

An ideal solution should have these properties:

Wide availability of hardware
Decades long data retention without intervention
Difficult to accidentally overwrite, but ideally can be overwritten
Preferably no ongoing costs
Operating system independent
Immune to humidity or particles in the air

Strategy 1: #YOLO

It seems like a really bad idea to not perform any backups (also known as "faith based availability"). But it turns out it can actually be a viable solution for keys you don't have to get back. An example is the key for a public website certificate.

Most CAs allow you to re-key your certificate (reissue it with a new key but otherwise identical), and this is usually free if it doesn't happen too often. Some CAs will revoke the old certificate if you do this, meaning that the old key (if rediscovered) becomes useless for certificate purposes, but it will still be usable to decrypt data that was previously encrypted with it.

Data loss scenario

Yes, intentional

Pro

Cheapest solution possible (initial cost and running cost approx. zero)
Easy to implement (can't get any simpler than doing absolutely nothing)

Contra

Manually have to generate a new key and reissue the certificate
Not suitable if data was encrypted using the old key

Conclusion

Not making backups is a viable solution if the cost of having a backup outweighs the cost of working with new keys.

This does not apply to my use case, but before you ask yourself how to back up your stuff, it's worth asking whether you actually need a backup at all.

Strategy 2: The cloud

Lol, no. The goal of this is to back up keys safely, not to hand them over to someone who, per their terms and conditions, takes no responsibility if the key is lost or stolen (a so-called "surprise backup").

If you can get your data back without jumping through hoops, an attacker who manages to break into their systems likely can too.

Data loss scenario

Non-payment or late payment
Forgetting to renew contracts
"Accidental" exposure
Service closure
Mishandling by the provider

Pro

Georedundancy

Contra

Dependency on another party
Ongoing costs
No guarantee that data is kept offline

Conclusion

Cloud based backup is a viable solution if you don't want to pay potentially high upfront costs for a local backup solution, but after a few years it likely will be more expensive due to continuous ongoing costs, which can be minimal for some local solutions. You also have to factor in the potential cost of your provider closing the service, and you having to copy all backups to a new provider.

By properly encrypting them you can keep leaked backups from becoming a liability, but then you're again in the "how to back up the encryption key" territory.

Strategy 3: Using magnetic fields (modern)

This is one of the simpler solutions. Just store the data on a hard drive. Of course this needs to be an external drive so you can easily unplug it and store it somewhere.

Hard drives store data in a magnetized layer on top of a spinning platter. They encode binary data by changing the orientation of the field of the tiny magnetic particles in that layer.

These fields eventually average out with the fields around them and become unreadable. A normal hard drive will occasionally rewrite the data automatically to recreate the fields. This of course doesn't happen when the drive is unplugged.

And although they don't fail that often, they're still fairly low on the reliability scale. You can guard against the odd bad sector by repeatedly writing the data until the disk is full. Of course if the bad sector happens to be one that's critical for the file system, data recovery can become a hassle, especially if the data is fragmented.
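The idea of writing the same data over and over can be sketched in Python. This is a minimal sketch rather than a finished tool: the function names, the copy count, and the file naming scheme are assumptions made for illustration, and it writes the copies as separate files instead of literally filling the disk. A SHA-256 digest stored next to the copies lets you tell an intact copy from a damaged one.

```python
import hashlib
from pathlib import Path


def write_copies(data: bytes, target_dir: Path, copies: int = 100) -> str:
    """Write `copies` identical files plus one digest file; return the digest."""
    target_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(data).hexdigest()
    for i in range(copies):
        (target_dir / f"key-{i:04d}.bin").write_bytes(data)
    (target_dir / "key.sha256").write_text(digest)
    return digest


def read_first_intact(target_dir: Path, digest: str) -> bytes:
    """Return the first copy whose SHA-256 matches; skip unreadable or damaged ones."""
    for path in sorted(target_dir.glob("key-*.bin")):
        try:
            data = path.read_bytes()
        except OSError:
            continue  # e.g. a read error caused by a bad sector
        if hashlib.sha256(data).hexdigest() == digest:
            return data
    raise ValueError("no intact copy found")
```

Since the digest file can be hit by a bad sector too, it is worth noting the digest somewhere else as well.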

Additionally, these drives can fail in many other ways as shown below. Some can be easy to fix. If the USB controller is broken you can likely just remove the drive from the enclosure and plug it into another controller, provided the controller is a separate board and not integrated into the drive itself.

Data loss scenario

Magnetization loss
Drop damage
Failure of moving parts (motor and head)
Failure of the disk electronics
Failure of the communication controller

Pro

Fairly cheap
Easy to use and store
SATA and USB are likely going to be around for a while

Contra

Easy to accidentally erase
Many ways of failure
Require occasional maintenance

Conclusion

Hard drives are commonly available and fairly cheap. They're basically plug & play except for having to be formatted. Being mechanical makes them not quite as reliable as other methods. They're your preferred solution for online backups (such as to a NAS) or bulk storage with faster access times than other bulk media.

They're kinda fragile, which is not ideal for backup media. On the other hand, the cost per byte is very low, meaning you can offset their problems by just buying more of them, and mirroring your backup.
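How much mirroring helps can be estimated with a little arithmetic. The sketch below assumes that drives fail independently of each other, which is optimistic if they share a shelf, a production batch, or a house fire. The 10% failure probability is a made-up number for illustration, not a measured one.

```python
def all_copies_lost(p_single: float, copies: int) -> float:
    """Probability that every one of `copies` independent drives has failed."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if copies < 1:
        raise ValueError("need at least one copy")
    return p_single ** copies


# Hypothetical: each drive has a 10% chance of being unreadable when needed.
for n in (1, 2, 3):
    print(n, all_copies_lost(0.10, n))
```

Under these assumptions every additional mirror multiplies the chance of total loss by the single-drive failure probability.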

External drives usually come with USB, but enclosures you can directly connect to SATA exist too.

Even the smallest drives vastly exceed the storage space needed for my case, so this is not exactly an ideal solution. They're also too easy to overwrite.

Strategy 4: Using magnetic fields (traditional)

Magnetic tape has been around for decades and was pretty much the first bulk storage medium. Tapes are awkward to use, but modern ones offer incredible data density and are intended for long term storage.

The most common tape format, LTO (Linear Tape-Open), has an expected shelf life of 30 to 50 years. This is only reached in dry conditions; tape deteriorates faster in humid air. This would make tapes perfect for backups if they didn't become obsolete so quickly.

LTO is under continuous development. Approximately every other year a new generation comes to market. Tape drives only go two generations back at most, sometimes only one, and often will only read, but not write, the oldest supported tapes. The rule of thumb is that you can read and write the current and previous generation, and you may be able to read one generation further back. You cannot use tapes from a newer generation than your drive.
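That rule of thumb fits in a few lines of Python. The sketch only encodes the rule as stated here; actual compatibility depends on the specific drive, so treat the "maybe" result as exactly that. The function name and the return strings are made up for illustration.

```python
def tape_access(drive_gen: int, tape_gen: int) -> str:
    """Rule-of-thumb access a drive of one LTO generation has to a tape of another."""
    if tape_gen > drive_gen:
        return "none"  # newer tapes don't work in older drives
    age = drive_gen - tape_gen
    if age <= 1:
        return "read/write"  # current and previous generation
    if age == 2:
        return "maybe read-only"  # one generation further back, not guaranteed
    return "none"
```

For example, `tape_access(6, 4)` returns "maybe read-only", and `tape_access(6, 3)` returns "none".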

You can of course keep your old tape drive as long as it lasts, but when it breaks you likely won't find a replacement for it anymore. This effectively means that for long term storage, you need to copy old tapes to new tapes about every 5 years.

The tape shell is made of plastic, and if you drop it, there's a chance it breaks. The tape innards are fairly trivial. The only important components are the single tape reel it contains, and the RFID chip that identifies the tape. You can always dismantle another tape shell and replace the contents with those from the broken tape.

The RFID chip can break, which renders the tape unusable. You can usually weasel yourself around the problem by installing the chip from another tape, but it has to be from the same tape generation.

Tapes have excellent write protection abilities. All tapes have a write protect tab that can be enabled and disabled at the user's convenience. Additionally, WORM tapes (write-once-read-many) can only be written to once. You can always append to, but never overwrite these tapes.

The WORM feature is marked on the RFID chip as well as the tape format. LTO tapes contain a pre-written track used for synchronization. This track also mentions that WORM is in use, which prevents crafty people from swapping chips. In the end, it's still a software feature. There is nothing physically different in the tape formula that would prevent overwriting.

WORM tapes cannot be erased either. To destroy the data, the tape has to be demagnetized. Devices that do this properly exist, but (for how simple they are) are not exactly cheap.

Data loss scenario

Degradation of the tape due to moisture
Drop damage
Failure of the tape electronics
Incompatibility with new drives

Pro

Very reliable
Easy to store
Built-in write protection

Contra

Extremely expensive (into the thousands for a single tape and drive)
Requires special software to read and write
Requires regular upgrades
Usually requires a server disk bus (SAS) instead of SATA

Conclusion

Tape gives the impression of being the ideal format for long term storage, but the continuous deprecation of older tape versions means this (at least LTO) is not something you can store and forget.

I won't spend thousands on a tape drive and a hundred on a cartridge to back up a few KB. And buying a used drive for an obsolete version isn't exactly ideal either. For big amounts of data, this is still the best price/GB solution, but that's not why you're here, is it?

Strategy 5: Trapping electrons in purgatory

Flash based memory has been around for a long time, but due to recent advancements in speed and capacity it is now the state of the art for daily use and has mostly relegated hard drives (internal and external) to the bulk storage department.

Flash comes in two common variants: one is the traditional flash storage you find in your SD card or USB flash drive, the other is commonly known as the SSD (Solid State Drive).

They're both fundamentally the same technology, but SSDs are a lot faster due to various optimizations, including but not limited to:

Parallelizing cells (like RAID 0) to access data faster
Using internal RAM cache for writing to not make the user wait
Wear leveling of cells to make them all degrade at the same rate
Pre-erasing cells marked as free so they're immediately ready for the next write request

Data in the memory is retained by trapping charges. It doesn't rely on an external power source to retain the data. Manufacturers usually specify retention somewhere in the range of 5 to 20 years. It depends greatly on temperature, and storing the memory in a cool place extends the retention time. Calculations are usually made at 25°C.

The retention duration also depends on how often a memory cell has been written. Each write degrades it a little. Most flash based memory has guaranteed write cycles somewhere between 100'000 and 1'000'000, but it can be as low as 10'000 for cheap memory cells.

Additionally, reading a specific memory cell often can eventually program the cells next to it. These neighboring cells need to be rewritten by the controller before this happens to reset the accumulated charge, or the frequently read cell has to occasionally be copied to another location. This effect is known as "read disturb".

The controller does all read/write management in the background.

Flash memory requires much higher voltage for writing and erasing than for reading. This voltage is usually provided internally with a charge pump. If it fails you cannot write to the flash anymore, but you can still read it. This is not a problem if the data is already backed up of course, but if you ever had a USB flash drive that magically became write protected and you could not write to it anymore by any means, it is probably this specific part that's defective.

Data loss scenario

Accelerated cell breakdown due to frequent reads or writes
Heat (Daily direct sun exposure for example)
Controller failure

Pro

Flash cells themselves are immune to many environmental factors. Micro SD cards for example are often waterproof
Easy to store
Easy to use
Slow memory is available at very affordable price ranges

Contra

Not intended for long term storage
Easy to accidentally erase
More expensive than hard drives in terms of price/GB

Conclusion

Small USB flash drives are very cheap to the point where you can easily buy multiple drives to combat hardware failure.

The limited number of write cycles is generally not a problem for backup purposes, even on cheap drives that tend to do poor block management.

Although we're going to leave 99.99% of the storage space unused, they appear to be a good candidate. Maybe a bit too good. You want to make sure you store them where you keep other backup media and label them as such, otherwise someone might one day grab one and accidentally use it.

Strategy 6: Optical discs

Optical discs have been getting less common over the years to the point where even desktop computers often come without a drive. These discs have mostly been replaced with online downloads and streaming offers.

They're slow to read and even slower to write, but they're a true "passive media", containing absolutely no electronics whatsoever.

With Blu-ray discs, data capacity has grown a lot. 100 USD will get you a writer that supports the BDXL format, which stores up to 128 GB on a single disc. Those drives also read and write CDs and DVDs. With CD-R having been introduced approximately 35 years ago, that's a fairly decent amount of compatibility.

The biggest problems are the steady degradation of the media as well as damage to the surface. Shelf life is usually given as 10 years, but can be less. This problem is more prominent in writable discs than factory pressed discs, because writable discs work based on an organic dye layer where the tracks are literally burned into the material with a laser (hence the term "burning a disc"). This layer degrades more easily than that of factory pressed discs.

Data loss scenario

Degrades somewhat quickly over time
Sensitive to sunlight, especially UV
Easy to damage and render parts of it unreadable
A scratch close to the center can render the entire disc unreadable

Pro

Very cheap bulk media
Easy to store
Easy to use
Difficult to overwrite
Passive media
Unlimited read cycles

Contra

Hardware for this is slowly being phased out in favor of online streaming
Easy to damage
Short shelf life if stored under less than perfect conditions

Conclusion

This is the next best thing to tape without the massive costs, but with less capacity. Because this is no longer seen as a backup medium and because of how media has evolved, it may be difficult to find drives for it in the future.

Because it is passive and mostly plastic, it can survive getting wet or covered in muck during disasters, provided you clean it carefully.

Noteworthy alternative

M-DISC (Millennial Disc) exists as an alternative, with a projected lifespan of 1000 years. Those discs however can't be written in regular DVD or Blu-ray drives (but can be read). A disc costs approximately 2-3 times as much as a regular disc, and they do not solve the drive obsolescence problem either.

It may not be a viable business model either, as the company went bankrupt, and M-DISC compatible discs are now sold by other, larger corporations.

Strategy 7: The obsolete way of using magnetic fields

Floppy disks. This is floppy disks. Not only are they not intended for long term storage, the hardware for them is virtually non-existent these days. Disks are also fairly hard to find. They're not protected against magnetic fields like hard drives are.

Data loss scenario

Unavailability of hardware
Damage to the disk by dirt and dust
Magnetic fields

Pro

Passive media
The closest so far in size to what we need
Write protect functionality

Contra

Hardware is difficult to find
Disks are difficult to find
Somewhat easy to damage
Not as reliable as other passive media

Conclusion

Although they come closer to the size budget than anything else so far, they have been phased out for years now, except maybe for old industrial machines, and even for those there is replacement hardware that emulates a floppy drive using a USB flash drive.

A fun little idea but not reliable enough, especially since later disks were sometimes made with inferior materials.

Strategy 8: Wood

Or more precisely, paper. Paper is an interesting medium. Not only is it completely offline, it is not bound to any technological changes. As long as you can read the data, you can get it back. Reliably printing to paper is usually easier than reliably scanning it again.

The simplest form of a backup would be to convert your data into printable text and then simply print it. This works, but restoring would be subject to OCR mistakes. Choosing an OCR optimized font and adding error correcting codes will help in this case.

OCR-A is one such font. It is ugly for humans to read, but works well for machines. Because of this, it's widely used on machine readable sections of documents, usually in a text strip at the bottom. It's very common in the financial sector.

A more readable variant, OCR-B, also exists, but it's slightly harder for machines to read.
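Converting data into printable text with a check value per line can be sketched as follows. This is a minimal illustration and not a standard format: the line layout, the function names, and the choice of Base32 with a CRC-32 per line are all assumptions. Note that a CRC only detects a misread line, it does not correct it, so a real error correcting code (such as Reed-Solomon) would be a further step. Base32 is used because its alphabet has no lowercase letters and leaves out the digits 0 and 1, which are easy to confuse with letters on paper.

```python
import base64
import zlib


def to_lines(data: bytes, bytes_per_line: int = 20) -> list[str]:
    """Encode data as numbered Base32 lines, each ending in a CRC-32 of its bytes."""
    lines = []
    for n, start in enumerate(range(0, len(data), bytes_per_line)):
        chunk = data[start:start + bytes_per_line]
        text = base64.b32encode(chunk).decode("ascii")
        lines.append(f"{n:03d} {text} {zlib.crc32(chunk):08X}")
    return lines


def from_lines(lines: list[str]) -> bytes:
    """Decode lines produced by to_lines; raise ValueError on the first bad line."""
    out = bytearray()
    for expected, line in enumerate(lines):
        number, text, crc = line.split()
        chunk = base64.b32decode(text)
        if int(number) != expected or zlib.crc32(chunk) != int(crc, 16):
            raise ValueError(f"line {expected} is damaged or out of order")
        out += chunk
    return bytes(out)
```

The line number in front lets you find and retype a single damaged line instead of rescanning the whole page.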

Another way is to couple the data to some technological standard, for example a QR code. A single code can hold up to almost 3000 bytes. Or you can use software specifically designed to print files on paper, for example Paperback. Computer readable data ensures you can simply scan the page to get the data back, provided there is still an algorithm implemented somewhere for this. Caution has to be taken when this is done, because these codes can potentially be read from great distance by sufficiently capable camera optics.
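Whether a key fits into a single code is simple arithmetic. The sketch below splits data into chunks that fit the largest QR code (2953 bytes in byte mode at the lowest error correction level) and prefixes each chunk with a two byte header so the pieces can be put back in order. The header layout and the function names are assumptions for illustration, and generating the actual QR images is left to whatever QR software you use.

```python
QR_MAX_BYTES = 2953  # version 40, error correction level L, byte mode
HEADER_BYTES = 2     # hypothetical header: chunk index, total chunk count


def split_for_qr(data: bytes) -> list[bytes]:
    """Split data into chunks of at most QR_MAX_BYTES, including the header."""
    payload = QR_MAX_BYTES - HEADER_BYTES
    total = max(1, -(-len(data) // payload))  # ceiling division
    if total > 255:
        raise ValueError("too much data for a one-byte chunk count")
    return [
        bytes([i, total]) + data[i * payload:(i + 1) * payload]
        for i in range(total)
    ]


def join_from_qr(chunks: list[bytes]) -> bytes:
    """Reassemble chunks from split_for_qr, in whatever order they were scanned."""
    if not chunks:
        raise ValueError("no chunks given")
    ordered = sorted(chunks, key=lambda c: c[0])
    indexes = [c[0] for c in ordered]
    if len(ordered) != ordered[0][1] or indexes != list(range(len(ordered))):
        raise ValueError("missing or duplicate chunk")
    return b"".join(c[2:] for c in ordered)
```

Using the lowest error correction level maximizes capacity; a higher level holds less data per code but tolerates more damage to the printout.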

Paper is subject to easy destruction with liquids or fire.

Data loss scenario

Degradation of the paper due to water, fire, moisture, or chemical spills
Parts of the data can literally be torn off
Accidental disposal
Fading of ink

Pro

Passive media
Not bound to any technical standard as long as printing and scanning in any form exists
Data can be enriched with human readable instructions

Contra

Can be copied by an adversary from great distance if not careful
Easy to accidentally destroy
Very limited data density
Data restore can be fiddly
Possible data recovery from printer or scanner memory by 3rd party

Conclusion

If we're talking about passwords and/or ECC keys, this is not actually a bad idea. This is unbreakable except by a personal visit from somebody if you manage to conceal the paper from the outside world.

Of course if the goal is to hide this from somebody who breaks into your home you may want to evaluate carefully where to hide it, as obvious places like a safe might not be a good idea in that scenario, but too obscure places might make you forget where it is yourself.

Strategy 9: Cryptographic processor

This is not a backup solution. Not only will your backup depend on a lot of cryptography in a small device you likely don't even have the source code to, these things are actively designed to not give you the key back. They're a way to use your key in a safer manner but not for backup purposes.

Data loss scenario

Guaranteed. They're designed to keep the key a secret

Pro

Allow you to safely work with a key without exposing it

Contra

You're not actually getting your key back

Conclusion

If you're looking for a way to safely use your key, this is the way to go. You still need a backup somewhere to handle loss or damage of the device.

Strategy 10: EEPROM

EEPROMs and flash storage are in a similar category. They work in a different manner though, and usually come as a raw IC without any controller. This means that similar to a CD, you need additional hardware to read and write it.

Reading an EEPROM with a microcontroller is trivially easy. In fact, there is no lower end for the clock speed. In a pinch, you can read them with a USB power supply, a bunch of switches to set the address lines and toggle the clock line, and a bunch of LEDs (or a hex display unit) to read the addressed byte.

Because of that, any microcontroller that runs at 5V logic level can read them fairly easily.

PROMs (programmable read only memory) were the first iteration of this technology. They were only programmable once. Not bad for backup purposes, but very annoying in the evaluation phase where you may want to do multiple tries.

EPROMs (Erasable PROM) were an evolution of this technology that allowed the memory to be rewritten. To erase it, you had to shine a strong UV light at a glass window in the chip for a few minutes. To retain data, a sticker was placed on the window to prevent it from being erased by the sun over time.

EEPROMs (Electrically Erasable PROM) are the final step in the evolution, allowing the chip to be erased within milliseconds by using voltage spikes. This makes accidental erasure fairly difficult, because the erase and program voltages are multiple times the normal operating voltage.

EEPROMs usually work on 5V, which makes them compatible with microcontrollers and TTL logic. You don't have to build your own reader though: EEPROM programmers exist for around 50 USD and come with free software to read and write the chips.

EEPROMs themselves are also fairly cheap, usually going for around a dollar a piece. They exist as serial and parallel versions. For this use case this doesn't matter, but if you want to use them on a microcontroller, a serial version might be preferred due to only requiring a few pins, while parallel versions tend to have 20 pins or more.

The amount of data on them is fairly limited, usually a few dozen kilobytes at most. Data retention on the other hand is fantastic, usually at least half a century at room temperature, and the chips can withstand temperatures from around -60°C to +120°C without any damage. They don't suffer negative side effects from reading the data either.

They're completely encased in an IC package and because of this are waterproof for extended time periods. I submerged the one I tested for a day, and after drying it off with a towel it immediately worked again. I also baked it in the oven at 100°C for an hour, and exposed it to gamma radiation for another 24 hours.

No data loss or corruption was observed when reading the data afterwards.

Note: I performed the tests on a Winbond W27C512-45Z. This EEPROM is no longer manufactured. If you're interested in trying this for yourself, use a 24LC512 instead. They have the same capacity and are still in production.

The capacity of these is given in kilobit. You have to divide this by 8 to get kilobytes. EEPROM manufacturers use 1024 for the "kilo" unit rather than 1000.
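As a worked example of that conversion, here is the arithmetic in Python. The function name is made up; the 512 kilobit figure is the size class of the two chips mentioned above.

```python
def eeprom_bytes(kilobits: int) -> int:
    """Convert an EEPROM capacity in kilobits (1 kilobit = 1024 bits) to bytes."""
    return kilobits * 1024 // 8


print(eeprom_bytes(512))  # 65536 bytes, i.e. 64 KB
```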

Data loss scenario

Corrosion of the pins
Degradation of data over decades
Pinout information is lost

Pro

Very long data retention
Unlikely to have compatibility problems in the near future
Very cheap
Around 1'000'000 write cycles
Can usually be completely read within a second

Contra

Requires a special programmer device to write
Requires extra hardware to read
Very little space
Difficult to store multiple files on it
Writing data is a lot slower than reading

Conclusion

They're fantastic long term storage devices, and at their price point, you can easily just buy 20 of them to have 20 backups of your keys.

The programmer is cheap but awkward to use if you want to store multiple keys on it. You either have to create a file that contains them all, or load each key individually and adjust the write offset in the programmer. If you want to read the keys back, you of course have to be able to reverse those steps.

There's a chance you no longer know how to read it in 10 years, so it's a good idea to store the datasheet (or at least the pinout) together with it too.

EEPROM software usually comes with a builtin hex viewer. If you have enough free storage space, you can use it for instructions in ASCII format that you can read inside of the software.
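The "one file that contains them all" approach, with ASCII instructions at the start, can be sketched as below. The layout is an assumption made for illustration (it is not the TinyFS format described next, nor anything the programmer software requires): a plain text header lists the name, offset, and length of each key, the keys follow, and the rest is padded with 0xFF, which is the value of erased cells on most EEPROMs.

```python
IMAGE_SIZE = 64 * 1024  # size of a 512 kilobit chip in bytes
HEADER_SIZE = 1024      # hypothetical: first 1 KB reserved for ASCII instructions


def build_image(keys: dict[str, bytes]) -> bytes:
    """Pack keys into one chip image with a human readable index at the start."""
    index = ["KEY BACKUP. Format: name offset length (decimal), one per line."]
    body = bytearray()
    for name, data in keys.items():
        index.append(f"{name} {HEADER_SIZE + len(body)} {len(data)}")
        body += data
    header = ("\n".join(index) + "\n").encode("ascii")
    if len(header) > HEADER_SIZE or HEADER_SIZE + len(body) > IMAGE_SIZE:
        raise ValueError("keys do not fit on the chip")
    image = header.ljust(HEADER_SIZE, b"\xff") + bytes(body)
    return image.ljust(IMAGE_SIZE, b"\xff")


def read_key(image: bytes, name: str) -> bytes:
    """Look a key up in the ASCII index and return its bytes."""
    header = image[:HEADER_SIZE].rstrip(b"\xff").decode("ascii")
    for line in header.splitlines()[1:]:
        entry, offset, length = line.rsplit(" ", 2)
        if entry == name:
            return image[int(offset):int(offset) + int(length)]
    raise KeyError(name)
```

Because the index is plain text at address zero, it shows up in the programmer's hex viewer, and the keys can be cut out by hand even if this code is long gone.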

If you fancy more creative solutions, this is the way to go, otherwise, just use a few USB flash drives.

TinyFS

I wrote a C# program to bundle files into a small container named TinyFS. It comes with a library, console application, and Windows UI application. The UI is the most comfortable way of managing TinyFS containers, as it supports file copy/paste and drag&drop operations.

The container format is fairly easy to reverse engineer, allowing you to get the data back even if you no longer have the documentation.

Maximum file name length: 255 bytes
Maximum file size: 65535 bytes
Maximum file count: 255
File compression: Optional, gzip
Encryption: Optional, index and file contents, AES-GCM

TinyFS containers are managed from your disk, and can be read from or written to your IC using a programmer.
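Before building a container it can help to check a set of files against the limits listed above. The sketch below only encodes those three limits; it does not read or write TinyFS containers, the function name is made up, and treating the name limit as counting UTF-8 bytes is an assumption.

```python
MAX_NAME_BYTES = 255
MAX_FILE_BYTES = 65535
MAX_FILE_COUNT = 255


def check_limits(files: dict[str, bytes]) -> list[str]:
    """Return a list of problems; an empty list means the files are within the limits."""
    problems = []
    if len(files) > MAX_FILE_COUNT:
        problems.append(f"{len(files)} files, at most {MAX_FILE_COUNT} allowed")
    for name, data in files.items():
        if len(name.encode("utf-8")) > MAX_NAME_BYTES:
            problems.append(f"name too long: {name[:20]}...")
        if len(data) > MAX_FILE_BYTES:
            problems.append(f"{name}: {len(data)} bytes, at most {MAX_FILE_BYTES} allowed")
    return problems
```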

Alternatives

As an alternative, you could write a KeePass database to the IC.