Backblaze provides our unlimited online backup service to individuals, organizations, and businesses in over 140 countries. Key to operating this service is our ability to cost-effectively store data that can be recovered quickly, accurately, and efficiently.
For the past seven years, we’ve used software RAID technology in our Backblaze Storage Pods to provide the file redundancy and reliability needed. When we designed Backblaze Vaults, we took the opportunity to rethink our data storage and recovery strategies, and Backblaze Reed-Solomon erasure coding was born.
Putting Erasure Coding to Work
An erasure code takes a message, such as a data file, and produces a longer message from which the original can be reconstructed even if parts of the longer message are lost.
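To make the idea concrete, here is a minimal sketch of the simplest possible erasure code: a single XOR parity shard appended to the data shards. This is not Reed-Solomon and is not taken from the Backblaze source; the names `XorParityDemo`, `encode`, and `recover` are invented for illustration, and the sketch assumes all shards have the same length.

```java
import java.util.Arrays;

// Illustrative single-parity erasure code (NOT Reed-Solomon, not Backblaze code).
// The "longer message" is the data shards plus one parity shard.
class XorParityDemo {

    // Returns copies of the data shards followed by one parity shard,
    // where the parity shard is the byte-wise XOR of all data shards.
    static byte[][] encode(byte[][] dataShards) {
        int shardLength = dataShards[0].length;
        byte[][] shards = new byte[dataShards.length + 1][];
        byte[] parity = new byte[shardLength];
        for (int i = 0; i < dataShards.length; i++) {
            shards[i] = dataShards[i].clone();
            for (int j = 0; j < shardLength; j++) {
                parity[j] ^= dataShards[i][j];
            }
        }
        shards[dataShards.length] = parity;
        return shards;
    }

    // Rebuilds the one shard marked as lost (null) by XOR-ing the survivors.
    // A single parity shard can repair at most one lost shard.
    static void recover(byte[][] shards, int shardLength) {
        int missing = -1;
        byte[] rebuilt = new byte[shardLength];
        for (int i = 0; i < shards.length; i++) {
            if (shards[i] == null) {
                if (missing != -1) {
                    throw new IllegalArgumentException("only one lost shard can be repaired");
                }
                missing = i;
            } else {
                for (int j = 0; j < shardLength; j++) {
                    rebuilt[j] ^= shards[i][j];
                }
            }
        }
        if (missing != -1) {
            shards[missing] = rebuilt;
        }
    }

    public static void main(String[] args) {
        byte[][] data = { {1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12} };
        byte[][] shards = encode(data);

        shards[1] = null;      // simulate losing one shard
        recover(shards, 4);    // rebuild it from the remaining three

        System.out.println(Arrays.equals(shards[1], data[1]));
    }
}
```

The limitation is visible in `recover`: losing two shards at once is fatal. Codes such as Reed-Solomon generalize this idea so that several parity shards can cover several simultaneous losses.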
Reed-Solomon is an erasure code with exactly the properties we needed for file storage and reliable recovery. It is straightforward to implement and is a reliable, well-proven technique: an entire data element can be recovered even when one or more parts of the stored data element are lost or unavailable.
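The recovery property described above can be sketched with a toy Reed-Solomon-style code. The sketch below is an assumption-laden teaching example, not the Backblaze implementation: it works over the prime field of integers modulo 257 and uses Lagrange interpolation, whereas production implementations typically use the field GF(2^8) so that every symbol fits in one byte. The names `ToyReedSolomon`, `encode`, `decode`, and `interpolate` are hypothetical, and lost symbols are marked with -1.

```java
import java.util.Arrays;

// Toy Reed-Solomon-style erasure code over integers mod 257 (illustration only).
// The k data symbols are treated as values of a polynomial of degree < k at
// x = 0..k-1; parity symbols are that polynomial's values at x = k..n-1.
class ToyReedSolomon {
    static final int P = 257; // prime, so every nonzero element has an inverse

    // Modular exponentiation; used for inverses via Fermat's little theorem.
    static int pow(int base, int exp) {
        long result = 1, b = base % P;
        while (exp > 0) {
            if ((exp & 1) == 1) result = result * b % P;
            b = b * b % P;
            exp >>= 1;
        }
        return (int) result;
    }

    static int inverse(int a) {
        return pow(a, P - 2);
    }

    // Lagrange interpolation: evaluates at x the unique polynomial of degree
    // < xs.length that passes through every point (xs[i], ys[i]).
    static int interpolate(int[] xs, int[] ys, int x) {
        long sum = 0;
        for (int i = 0; i < xs.length; i++) {
            long num = 1, den = 1;
            for (int j = 0; j < xs.length; j++) {
                if (j == i) continue;
                num = num * Math.floorMod(x - xs[j], P) % P;
                den = den * Math.floorMod(xs[i] - xs[j], P) % P;
            }
            sum = (sum + ys[i] * num % P * inverse((int) den)) % P;
        }
        return (int) sum;
    }

    // Appends parityCount parity symbols to the data (values 0..255).
    // Parity symbols may equal 256, which is why this toy uses int, not byte.
    static int[] encode(int[] data, int parityCount) {
        int k = data.length;
        int[] xs = new int[k];
        for (int i = 0; i < k; i++) xs[i] = i;
        int[] codeword = Arrays.copyOf(data, k + parityCount);
        for (int i = k; i < codeword.length; i++) {
            codeword[i] = interpolate(xs, data, i);
        }
        return codeword;
    }

    // Recovers the k data symbols from any k surviving symbols (-1 = lost).
    static int[] decode(int[] received, int k) {
        int[] xs = new int[k], ys = new int[k];
        int found = 0;
        for (int i = 0; i < received.length && found < k; i++) {
            if (received[i] >= 0) {
                xs[found] = i;
                ys[found] = received[i];
                found++;
            }
        }
        if (found < k) throw new IllegalArgumentException("too many symbols lost");
        int[] data = new int[k];
        for (int i = 0; i < k; i++) data[i] = interpolate(xs, ys, i);
        return data;
    }

    public static void main(String[] args) {
        int[] data = {72, 101, 108, 108};
        int[] codeword = encode(data, 2); // 4 data symbols + 2 parity symbols
        codeword[0] = -1;                 // lose one data symbol...
        codeword[3] = -1;                 // ...and another
        System.out.println(Arrays.equals(decode(codeword, 4), data));
    }
}
```

With two parity symbols, any two of the six symbols can be lost and the data is still recoverable, because any four points determine the degree-3 polynomial uniquely.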
The practical application for Backblaze is that in a cloud-scale datacenter, you have to assume that hard drives containing terabytes of data will die on a regular basis. The Backblaze Vault Architecture, using our Reed-Solomon erasure coding implementation, is durable by design so you can trust that your data is safe.
Open Source
We are releasing Backblaze Reed-Solomon as Open Source. The code is licensed under the MIT License, which means that you can use it in your own projects for free, including commercial projects.
The source code is packaged in a ZIP file containing the files listed below.
- LICENSE – The MIT license (Text)
- README.md – A quick overview of the files (Text)
- ./com/erasure/{files} – The Java source code and example files
  - Galois.java
  - Matrix.java
  - ReedSolomon.java
  - SampleDecoder.java
  - SampleEncoder.java
Download Source Code (JavaReedSolomon.zip – 14K ZIP file, 94K on disk)
You can also download the ZIP file from Backblaze on GitHub.
More Information
Check out our blog post on Backblaze Reed-Solomon to learn how Reed-Solomon works. We included an example of how data can be encoded using a coding matrix and then completely recovered even after losing portions of the original data.