Files
A file contains a sequence of bytes. Any file on your computer can be uploaded to B2 and stored in Cloud Storage, as long as it's not too big. Files can range in size from 0 bytes to 5 GiB (5 x 2^30, or 5,368,709,120 bytes).
Once a file is uploaded, you can download it right away, or wait years and then download it. You can download it once, or give the URL to all of your friends and they can all download it.
Uploading the same file name more than once results in multiple versions of the same file. This can be useful for tracking the history of a document. See File Versions for more details.
The API calls related to files are:
- b2_delete_file_version: deletes one version of one file
- b2_download_file_by_id: downloads a specific version of a file
- b2_download_file_by_name: downloads the most recent version of a file
- b2_get_file_info: returns information about a file
- b2_hide_file: hides a file, without deleting its data
- b2_list_file_names: lists the file names in a bucket
- b2_list_file_versions: lists all of the file versions in a bucket
- b2_upload_file: uploads a new file (or version of a file)
Warning: Do not include Protected Health Information (PHI) or Personally Identifiable Information (PII) in bucket names, object/file/folder names, or other metadata. Such metadata is not encrypted in a way that meets Health Insurance Portability and Accountability Act (HIPAA) protection requirements for PHI/PII data and is not generally encrypted in client-side encryption architectures.
File Names
Files have names, which are set when a file is uploaded. Once a file is uploaded, its name cannot be changed. You can then download a file if you know its name.
Names can be pretty much any UTF-8 string up to 1024 bytes long. There are a few picky rules:
- No character codes below 32 are allowed.
- DEL characters (127) are not allowed.
These are all valid file names:
Kitten Videos
users/beatrice/kitten.jpg
自由.txt
When downloading or exporting files, be aware that the file name requirements above are fairly permissive and may allow names that are not compatible with your disk file system.
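The naming rules above are easy to check before you upload. Here is a minimal sketch in Python; the function name is made up for illustration, and it checks only the rules listed in this section (whether B2 applies further rules, such as rejecting empty names, is not covered here).

```python
def is_valid_b2_file_name(name: str) -> bool:
    """Check a name against the rules listed in this section, and only those."""
    try:
        encoded = name.encode("utf-8")
    except UnicodeEncodeError:
        # Not representable as UTF-8 (for example, a lone surrogate).
        return False
    # The limit is on encoded bytes, not on characters.
    if len(encoded) > 1024:
        return False
    # Reject character codes below 32 and the DEL character (127).
    return all(ord(ch) >= 32 and ord(ch) != 127 for ch in name)


if __name__ == "__main__":
    for candidate in ["Kitten Videos", "users/beatrice/kitten.jpg", "自由.txt", "bad\tname"]:
        print(repr(candidate), is_valid_b2_file_name(candidate))
```

Note that the length check counts bytes: a name made of three-byte characters such as 自 runs out of room after 341 characters, not 1024.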
File IDs
In addition to a name, each file uploaded has a unique ID that identifies that specific version of that file. A File ID will not be more than 200 characters. If you want to download an older version of a file, you'll need to know its File ID. File IDs look like this:
4_zb330e285948b7a6d4b1b0712_f000000000000472a_d20140104_m032022_c001_v0000123_t0104
Listing Files
You can call b2_list_file_names to get a list of the files in a bucket, and b2_list_file_versions to list all of the versions of files in a bucket. See File Versions for more details.
Downloading Files
If you have uploaded a file called cats/kitten.jpg to a bucket called cute_pictures, you'll be able to view the file in a browser with a URL that looks like this:
https://f001.backblazeb2.com/file/cute_pictures/cats/kitten.jpg
The first part of the URL is the download URL that you get from the b2_authorize_account call. Then comes /file/, then the bucket name, another /, and then the file name.
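Putting those pieces together in code might look like the sketch below. The function name is hypothetical, and percent-encoding the file name (while leaving "/" alone) is an assumption made here so that names with spaces or non-ASCII characters produce a usable URL; the download URL is whatever b2_authorize_account returned for your account.

```python
from urllib.parse import quote


def download_url_by_name(download_url: str, bucket_name: str, file_name: str) -> str:
    """Join the download URL, /file/, the bucket name, and the file name."""
    # Assumption: percent-encode the UTF-8 bytes of the name, keeping "/" as-is.
    encoded_name = quote(file_name, safe="/")
    return f"{download_url.rstrip('/')}/file/{bucket_name}/{encoded_name}"


if __name__ == "__main__":
    print(download_url_by_name("https://f001.backblazeb2.com", "cute_pictures", "cats/kitten.jpg"))
```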
Folders (There Are No Folders)
A bucket holds files. There is no hierarchy of folders, just one long, flat list of file names. For example, a bucket might have four files in it, with these four names:
cats/cat.jpg
cats/kitten.jpg
dogs/dog.jpg
dogs/puppy.jpg
There are no folders. The name of the first file is cats/cat.jpg, and the name of the second file is cats/kitten.jpg. There is nothing called just cats.
Even though there are no folders, many of the tools that work with files in a bucket act like there are folders. The file browser on the Backblaze web site acts like there are folders, and so does the b2 command-line tool. Under the covers, they both just scan through the flat list of files and pretend. Here's an example of using the command-line tool:
$ b2 ls my_bucket
cats/
dogs/
$ b2 ls my_bucket cats
cat.jpg
kitten.jpg
We recommend that you use "/" to separate folder names, just like you would for files on your computer. (Or just like you would use "\" if you use Windows.) That way the tools can figure out the implied folder structure.
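To see how a tool can "pretend" there are folders, here is a small sketch that derives a folder-style listing from a flat list of names. It is only an illustration of the idea, not the actual implementation of the b2 command-line tool, and the function name is made up.

```python
def list_folder(file_names, prefix=""):
    """Return the immediate children of an implied folder in a flat name list."""
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    entries = set()
    for name in file_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue
        # Keep everything up to and including the first "/", if there is one.
        head, sep, _ = rest.partition("/")
        entries.add(head + sep)  # "cats/" for an implied folder, "cat.jpg" for a file
    return sorted(entries)


if __name__ == "__main__":
    names = ["cats/cat.jpg", "cats/kitten.jpg", "dogs/dog.jpg", "dogs/puppy.jpg"]
    print(list_folder(names))
    print(list_folder(names, "cats"))
```

With the four example names, listing the top level gives cats/ and dogs/, and listing cats gives cat.jpg and kitten.jpg, even though nothing called cats is stored anywhere.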
Checksums
To ensure the integrity of your data, when you upload a file you must provide a SHA1 checksum of the data. This ensures that if any of the data is corrupted in the network on its way to B2, it will be detected before the file is stored. When you download a file, the SHA1 checksum is attached so that you can verify that the data you receive is intact.
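Computing the checksum on your side is straightforward. The sketch below uses Python's standard hashlib module; the helper names are made up for illustration, and how the checksum is passed to B2 is left to the upload call's own documentation.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 of the data as a 40-character hex string."""
    return hashlib.sha1(data).hexdigest()


def matches_checksum(data: bytes, expected_sha1: str) -> bool:
    """Check downloaded bytes against the SHA1 that came with the download."""
    return sha1_hex(data) == expected_sha1.strip().lower()


if __name__ == "__main__":
    payload = b"abc"
    checksum = sha1_hex(payload)
    print(checksum, matches_checksum(payload, checksum))
```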
MIME Types
When you upload a file, you also provide a MIME type for the file, which will be used when a browser downloads the file so that it knows what kind of file it is. For example, if you say that your file kitten.jpg has a MIME type of image/jpeg, then a browser that downloads the file will know that it's an image to be displayed.
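If you don't already know the MIME type, one option is to guess it from the file name before uploading. This sketch uses Python's standard mimetypes module; the fallback value is an assumption chosen for this example, not something B2 prescribes here.

```python
import mimetypes


def guess_mime_type(file_name: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, with a generic fallback."""
    mime_type, _encoding = mimetypes.guess_type(file_name)
    return mime_type or default


if __name__ == "__main__":
    print(guess_mime_type("kitten.jpg"))
```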
File Info
Each file has information associated with it, in addition to the sequence of bytes that the file contains. Every file has a size (the number of bytes in the file), a MIME type, and a SHA1 checksum. You can also add your own custom information.
You can add key/value pairs as custom file information. Each key is a UTF-8 string up to 50 bytes long, and can contain letters, numbers, and the following list of special characters: "-", "_", ".", "`", "~", "!", "#", "$", "%", "^", "&", "*", "'", "|", "+". Each key is converted to lowercase. Names that begin with "b2-" are reserved. There is an overall 7000-byte limit on the headers needed for file name and file info, unless the file is uploaded with Server-Side Encryption, in which case the limit is 2048 bytes. (See HTTP Header Size Limit below.)
For names that don't start with "b2-", there is no limit on the size or content of the values, other than the overall size limit.
Names that start with "b2-" must be in the list of defined "b2-" names and their values must be valid. See the list below for details. B2 rejects any upload request with an unexpected "b2-" file info name. B2 also rejects any upload with a "b2-" file info name whose value doesn't meet the specified format for that name.
b2-content-disposition
optional
If this is present, B2 will use it as the value of the 'Content-Disposition' header when the file is downloaded (unless it's overridden by a value given in the download request). The value must match the grammar specified in RFC 6266. Parameter continuations are not supported. 'Extended-value's are supported for charset 'UTF-8' (case-insensitive) when the language is empty. Note that this file info will not be included in downloads as a x-bz-info-b2-content-disposition header. Instead, it (or the value specified in a request) will be in the Content-Disposition header.
b2-content-language
optional
If this is present, B2 will use it as the value of the 'Content-Language' header when the file is downloaded (unless it's overridden by a value given in the download request). The value must match the grammar specified in RFC 2616. Note that this file info will not be included in downloads as a x-bz-info-b2-content-language header. Instead, it (or the value specified in a request) will be in the Content-Language header.
b2-expires
optional
If this is present, B2 will use it as the value of the 'Expires' header when the file is downloaded (unless it's overridden by a value given in the download request). The value must match the grammar specified in RFC 2616. Note that this file info will not be included in downloads as a x-bz-info-b2-expires header. Instead, it (or the value specified in a request) will be in the Expires header.
b2-cache-control
optional
If this is present, B2 will use it as the value of the 'Cache-Control' header when the file is downloaded (unless it's overridden by a value given in the download request), overriding the value defined at the bucket level. The value must match the grammar specified in RFC 2616. Note that this file info will not be included in downloads as a x-bz-info-b2-cache-control header. Instead, it (or the value specified in a request) will be in the Cache-Control header.
b2-content-encoding
optional
If this is present, B2 will use it as the value of the 'Content-Encoding' header when the file is downloaded (unless it's overridden by a value given in the download request). The value must match the grammar specified in RFC 2616. Note that this file info will not be included in downloads as a x-bz-info-b2-content-encoding header. Instead, it (or the value specified in a request) will be in the Content-Encoding header.
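The key rules above can be checked on the client before an upload is attempted. The sketch below is one way to do that; the function and constant names are made up, "letters" and "numbers" are assumed to mean ASCII letters and digits, and the values of "b2-" keys are not validated against the RFC grammars.

```python
import string

# Assumption: "letters, numbers" means ASCII letters and digits.
ALLOWED_KEY_CHARS = set(string.ascii_letters + string.digits + "-_.`~!#$%^&*'|+")

# The "b2-" names defined in the list above.
DEFINED_B2_KEYS = {
    "b2-content-disposition",
    "b2-content-language",
    "b2-expires",
    "b2-cache-control",
    "b2-content-encoding",
}


def normalize_file_info_key(key: str) -> str:
    """Validate a custom file info key and return its lowercase form."""
    if not key:
        raise ValueError("key must not be empty")
    if len(key.encode("utf-8")) > 50:
        raise ValueError("key is longer than 50 bytes")
    if any(ch not in ALLOWED_KEY_CHARS for ch in key):
        raise ValueError("key contains a character that is not allowed")
    lowered = key.lower()
    if lowered.startswith("b2-") and lowered not in DEFINED_B2_KEYS:
        raise ValueError("unexpected reserved b2- key")
    return lowered


if __name__ == "__main__":
    print(normalize_file_info_key("Src_Last_Modified_Millis"))
```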
You provide the File Info with the b2_upload_file call for regular files, and b2_start_large_file for large files. It is set when the file is uploaded and cannot be changed. The b2_get_file_info call returns the information about a file. The information is also returned in the HTTP headers when you download a file.
Recommended File Info key/value: If the original source of the file being uploaded has a last modified time concept, Backblaze recommends using src_last_modified_millis as the key, and for the value use a string holding the base 10 number of milliseconds since midnight, January 1, 1970 UTC. This fits in a 64-bit integer such as the type "long" in the programming language Java. It is intended to be compatible with Java's time long. For example, it can be passed directly into the Java call Date.setTime(long time).
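For a file on local disk, the value can be derived from the file system's modification time. This is a minimal sketch, assuming the local file's mtime is the "last modified time" you want to record; the function name is made up for illustration.

```python
import os


def src_last_modified_millis(path: str) -> str:
    """Return the file's modification time as a base 10 string of milliseconds."""
    # st_mtime_ns is nanoseconds since the Unix epoch, as an integer.
    return str(os.stat(path).st_mtime_ns // 1_000_000)


if __name__ == "__main__":
    # Build the key/value pair for this script's own source file.
    file_info = {"src_last_modified_millis": src_last_modified_millis(__file__)}
    print(file_info)
```

Using the integer nanosecond field avoids the rounding that can creep in when a floating-point timestamp is multiplied by 1000.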
Recommended File Info key/value: If this is a large file (meaning the caller is using b2_start_large_file) and if the caller knows the SHA1 of the entire large file being uploaded, Backblaze recommends using large_file_sha1 as the key, and for the value use a 40-byte hex string representing the SHA1.
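A large file usually shouldn't be read into memory all at once just to hash it. The sketch below streams the file through Python's hashlib in chunks; the function name and the 1 MiB chunk size are arbitrary choices for this example.

```python
import hashlib


def large_file_sha1(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA1 of a whole file without loading it into memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as source:
        # read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: source.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()  # 40 hex characters


if __name__ == "__main__":
    file_info = {"large_file_sha1": large_file_sha1(__file__)}
    print(file_info)
```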
HTTP Header Size Limit
The file name and file info must fit, along with the other necessary headers, within an 8KB limit imposed by some web servers and proxies. To ensure this, both now and in the future, B2 limits the combined header size for all file info. There are two possible limits depending on the features in use for a file.
- In most cases, B2 limits the combined header size for the file name and all file info to 7,000 bytes. This limit applies to the fully encoded HTTP header line, including the carriage-return and newline. The header line below is counted as 40 bytes.
- Newer features of the B2 API require additional headers. For files encrypted with Server-Side Encryption and/or in Object Lock-enabled buckets, the limit is reduced to 2,048 bytes to ensure sufficient space for additional response headers. This limit is on the file info header names and values only. The header line below is counted as 36 bytes.
X-Bz-File-Name: %E8%87%AA%E7%94%B1.txt\r\n
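The two byte counts for that example line can be reproduced with a few lines of Python. This sketch only mirrors the arithmetic described above for a single header; the function names are made up, and it is not a statement of how B2 measures headers internally.

```python
from urllib.parse import quote


def full_header_line_size(name: str, value: str) -> int:
    """Size of the encoded header line, including ': ' and the trailing CR LF."""
    return len(f"{name}: {value}\r\n".encode("ascii"))


def name_and_value_size(name: str, value: str) -> int:
    """Size of the header name plus the header value only."""
    return len(name.encode("ascii")) + len(value.encode("ascii"))


if __name__ == "__main__":
    header_name = "X-Bz-File-Name"
    header_value = quote("自由.txt")  # %E8%87%AA%E7%94%B1.txt
    print(full_header_line_size(header_name, header_value))  # 40
    print(name_and_value_size(header_name, header_value))    # 36
```

The difference of 4 bytes is the colon, the space, the carriage return, and the newline.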