Bionic Buffalo Tech Note #66: File Identification Headers
reasons:
In some situations, the software may be concerned with one but not the other. For
instance, an application may be concerned about locating its own files for backup or
uninstallation (regardless of the organization), while a database utility might be concerned
about backing up or reorganizing a file without regard to the interpretation of the data
records themselves.
Not all applications, nor all organizations, can be known in advance. For example, when a
file of unknown organization is encountered, the application owning it should still be
able to recognize the file as its own.
An application may allow storage of the same data using several different organizations.
For example, more than one type of indexed file might be possible, depending on the size
of the database or the performance requirements. The application should be able to select
the appropriate access routine by knowing the organization.
It is also important to be able to recognize an application, organization, or owner without knowing
anything about it. In some cases, a program might want to invoke the access routine or application
dynamically, without using complex logic to determine which dynamic routine to use. In other cases, a
human may find it convenient to know about an unfamiliar file, again without having to understand the
details of the file's organization or of the application which created it.
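As a rough illustration of invoking an access routine dynamically without complex selection logic, the sketch below uses a lookup table keyed by an organization identifier (this note uses UUIDs for that purpose). The identifier values, the routine names, and the function open_by_organization are all hypothetical; nothing here is defined by the header format itself.

```python
import uuid

# Hypothetical organization identifiers; real values would be read from a file's header.
INDEXED_SMALL = uuid.UUID("11111111-2222-4333-8444-555555555555")
INDEXED_LARGE = uuid.UUID("66666666-7777-4888-9999-aaaaaaaaaaaa")


def read_small_index(path):
    # Placeholder access routine for one (assumed) organization.
    return f"small-index reader for {path}"


def read_large_index(path):
    # Placeholder access routine for another (assumed) organization.
    return f"large-index reader for {path}"


# The table replaces branching logic: the identifier itself selects the routine.
ACCESS_ROUTINES = {
    INDEXED_SMALL: read_small_index,
    INDEXED_LARGE: read_large_index,
}


def open_by_organization(org_id, path):
    """Return the result of the matching access routine, or None if the organization is unknown."""
    routine = ACCESS_ROUTINES.get(org_id)
    if routine is None:
        return None
    return routine(path)
```

An unknown identifier simply yields None, so a caller can still treat the file as recognized-but-unreadable rather than failing outright.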
Instead of relying on a central repository of known or registered applications, organizations, and
owners, the standard header identifies entities in two ways: UUID and text string. UUIDs, or
“universally unique identifiers”, are 16-octet sequences based on pseudo-random numbers, and are
highly unlikely to collide. A UUID may be created by using the common uuidgen utility, and will
serve to identify an application, organization, or owner without resorting to a central repository. Since an
unknown UUID is practically useless to humans wanting to know something about a file, the header
also contains text strings for the file's application, organization, and owner.
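For readers without uuidgen at hand, the following minimal sketch shows one way to produce such an identifier using Python's standard uuid module. It assumes a random (version 4) UUID is acceptable; uuidgen may produce a different UUID version depending on platform and options. The function name new_identifier and the example label are illustrative only.

```python
import uuid


def new_identifier() -> bytes:
    """Return a freshly generated random (version 4) UUID as 16 octets."""
    return uuid.uuid4().bytes


app_id = new_identifier()

# Pair the opaque identifier with a human-readable label, encoded as UTF-8.
# The label text is a made-up example.
app_label = "Example Application".encode("utf-8")

print(len(app_id))                # number of octets in the identifier
print(uuid.UUID(bytes=app_id))    # conventional hyphenated text form
print(app_label.decode("utf-8"))  # the text a human would read
```

The identifier gives software something exact to compare, while the label gives a person something to read when the UUID is unfamiliar.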
The use of UUIDs and text strings allows the header format to be used by anyone, not only by Bionic
Buffalo. Except for UUID and other parameter values, there should be no difference among headers
defined by Bionic Buffalo and those defined by other developers.
Standard Header Structure
Each of the three aspects of a file is represented by the values of a single data structure, the
identification information. There are three parts to this structure: a fixed part with specific fields, a text
string part (with two text strings), and an unstructured sequence of octets. The text strings are Unicode,
encoded using UTF-8.
The standard identification header has the following components, beginning with the first octet of the
file: