% tar xvzf mird-1.0.tar.gz
Configure for your system:
% ./configure creating cache ./config.cache configuring in /users/mirar/mird/build/your_machine/ creating cache ./config.cache checking for gcc... gcc [...] creating Makefile %
Make and install the library:
% make install building in /users/mirar/mird/build/mistel.idonex.se/ make[1]: Entering directory `/export/home/mirar/mird/build/mistel.idonex.se' gcc -g -O2 -g -Wall -Wno-parentheses -I. -I../../src -c ../../src/blocks.c -o blocks.o [...] installing mird.h in /usr/local/include ln -sf libmird.so.1.0 /usr/local/lib/libmird.so %
Now the library is installed.
If you want it installed in some other path, run
the configure with
The database file generated does not have any machine dependent
data, which means that the database itself can be transferred between
systems with different endian. Everything is stored in network byte
order.
The authors make NO WARRANTY or representation, either express or implied,
with respect to this software, its quality, accuracy, merchantability, or
fitness for a particular purpose. This software is provided "AS IS", and you,
its user, assume the entire risk as to its quality and accuracy.
This software is copyright (C) 1999-2000, Mirar Hagland
All Rights Reserved except as specified below.
Permission is hereby granted to use, copy, modify, and distribute this
software (or portions thereof) for any purpose, without fee, subject to these
conditions:
These conditions apply to any software derived from or based on the
Mird database code, not just to the unmodified library. If you use
this work, you ought to acknowledge it.
I permit and encourage the use of this software as the basis of
commercial products, provided that all warranty or liability claims
are assumed by the product vendor.
The Unix configuration script "configure" was produced with GNU
Autoconf. It is copyright by the Free Software Foundation but is
freely distributable. The same holds for its supporting scripts
(config.guess, config.sub, ltconfig, ltmain.sh). Another support
script, install-sh, is copyright by M.I.T. but is also freely
distributable.
¹ this legalese is stolen from the JPEG license Example:
Also note that the close or sync operations shouldn't be too often.
The operations to find free space are summarily quicker if there is
more data to work on at the same time.
The result of a transaction close can be a conflict. In this case,
the transaction must be cancelled.
Example:
The transaction scope can be used to read the database, too. It
keeps the scope the database had until it's closed (or synced against the current state in any other
way).
Note that the database cannot be synced when there is ongoing
transactions.
The database is table-oriented, so the first thing you
will have to do with a new database, is to create the tables
you need.
When the tables exists, you will want to write down key:value pairs
in them. This is done by calling mird_key_store, or for
string key tables, mird_s_key_store:
Note also that mird_free may not fail.
If you need to read from the database in one and the same state,
you can create a transaction to read in. This will keep the same state
as the database were when the transaction was created, and may be read
from by using mird_transaction_key_lookup and
mird_transaction_s_key_lookup.
Otherwise, you may wish to handle this error in one way of other.
To do this correctly, you may look at the elements in the strucure,
and/or use the functions and macros available to examine an error;
mird_describe_error, mird_perror and the
convinience macro MIRD_IS_CONFLICT.
The returned type MIRD_RES is a pointer to a
structure containing the following elements:
error_no contains the error number the exception had, see
the error list in the reference manual.
s, x, y and z contains different
information depending on the errors nature (from filenames and block
numbers to table identifier and keys in conflicts). This is also
described in the reference manual.
mird_describe_error and mird_perror converts
the error to something a little bit more human-readable.
An error handling situation can look like this:
If you get an exception, but continues to run the application, you
must always free the exception. This is done by calling
mird_free_error (which may not fail).
This chapter is for you out there that really
wants to know how this library works, and is by all means neccesary
to use the library - although it could be enlightening.
The database engine is divided up in several distinct areas, from
low-level storage to handling transactions and tables:
The superblocks contain information of the database, like where to
find information about the tables, what the block size are, and other
various neccesary knowledge to read a database. Every 4^n-1 block is a
superblock (block 0, block 3, block 15, ...). They also contain
information to tell if the database is clean or in dirty mode - the
later means that the journal file must be read back at database open.
The free list blocks contains lists of free blocks, that can be
reused. The superblock has information on how big the file is, so if
the datpabase runs out of blocks to reuse, it makes the file bigger
instead.
The free list is never appended, until the database is closed or
synced. No block can be reused until the database is synchronized with
the disk. Unfortunately, this makes the database grow up to this point
- but beyond this point, the unused space is reused.
The cache system is a closed hash table of blocks, that has been
read to memory. When a block is needed, it first checks the cache for
the block, and if it isn't in the cache, it disposes of a block in the
cache, and reuses it's space. What block is disposed of is random.
When a block is written to, it is also read to the cache, or
allocated in the cache if it's a new block, the cache data pointer is
returned, and the block is marked as dirty. A dirty block that has to be
reused in the cache is saved to disk before reuse.
Later, when a transaction is completed, the transactions blocks still
marked dirty in the cache is flushed to disk.
Any piece of data has an address to either a fragblock or a block
of it's own. Some bits are used to address the block itself, and a few
bits are used to address the data in a a fragblock, per default 27
respective 5 bits. If the later bits are zero, the address is a block
of its own, but any other value is pointing to an address in a frag
block. This gives 2^n-1 possible data pieces in a frag block, per
default 2^5-1 = 31 pieces.
A piece of data can either fit in a fragblock or a big block, or
not fit at all. In this case, the data cell takes up more then one
block (or one or some block and a piece of a fragblock). These blocks
is ordered in a linked list. The allocation system is supposed to make
these blocks allocated in the right order, if possible, so that
readback will be forward, allowing read-cache optimization in the
operative system to work more efficient.
This structure makes it possible to have a huge hashtable (or
integer key table, as it is in reality), without having to rewrite the
whole table for any minor change. In a write to the table, the
database only writes down the nodes that has change, and the new nodes
neccesary to branch the tree.
It also makes a rather fast way of reading back the keys, which
will be in logarithmic time. Per default there is 32 branches per
node, which gives a write or readback time of approximately
c + d·log32 n, where n is the
number of keys in the table, and c and d is constant. This is
efficient enough in about any case.
Example:
We want to find the key 4711 in a database that is using 5 bits per
hashtrie node (32 branches).
If the cell key had been different to 4711, or any branch didn't
lead anywhere, we would know our key wasn't in this table.
This again makes it possible to control the whole database by
knowing just the top node, and that makes it possible to cancel a
transaction, which is neccesary for atomicity and isolation between
transactions, which is quite useful.
I can be contacted either on <mirar@mirar.org> or at
<mirar@lysator.liu.se>.
Mirds homepage is http://www.mirar.org/mird/, where
any new release can be downloaded.
% ./configure --prefix=path
where path is the base path of installation (the prefix to .../lib
and .../include). You can also run
% ./configure --help
to read about the other options to configure.
Supported platforms
The library is configured by using autoconf. Most Unix- or Posix-like
platforms should be supported, since the library itself does very
little system-specific operations.
It has been tested to compile and run on the following systems:
If you succeed to run it in any other environment, please report the
success to the author.
License
The Mird database library may be freely distributed and used
with these limits¹:
In legalese:
(1) If any part of the source code for this software is
distributed, then this license must be included, with this
copyright and no-warranty notice unaltered; and any additions,
deletions, or changes to the original files must be clearly indicated
in accompanying documentation.
(2) If only executable code is distributed, then the accompanying
documentation must state that "this software is utilizing the Mird
database".
(3) Permission for use of this software is granted only if the
user accepts full responsibility for any undesirable consequences; the
authors accept NO LIABILITY for damages of any kind.
Workflow
Opening, syncing and closing the database
Opening the database consists of two steps, first to create the
database object structure, then to create, open or restore1 the database file.
...
...
Please note that you neither need to change or setup the database
parameters, or sync the database. On a long-lived application (a
server, for instance) the mird_sync operation may help to
keep down the size of the files, though, since both the database and
journal file only may grow up to the point where the database is
synced or closed. At that point, the database starts to reuse any free
space generated from the previous operations and the journal file is
restarted.
MIRD_RES res;
struct mird *my_database;
/* initialize the structure */
if ( (res=mird_initialize("my_database_file.mird",&my_database)) )
[...do something with the error...]
/* change the default parameters if necessary */
my_database->flags|=MIRD_NOCREATE;
my_database->cache_size=1024;
/* open the database file */
if ( (res=mird_open(my_database)) )
[...do something with the error...]
[...]
/* upon need, */
/* sync the database and flush the journal file */
if ( (res=mird_sync(my_database)) )
[...do something with the error...]
[...]
/* close the database */
if ( (res=mird_close(my_database)) )
[...do something with the error...]
Transactions
Any change to the database must be in the scope of a transaction. To
change anything in the database, you must therefore first create a
transaction, in which you do the change to the database, and then
close it:
...
...
Several transactions can be going on at the same time. If you doesn't
have more then one transactions at the same time, you can never get
conflicts.
struct mird_transaction *mtr;
/* begin a new transaction */
if ( (res=mird_transaction_new(my_database,&mtr)) )
[...do something with the error...]
[...operate on the transacton ...]
/* close the transaction */
if ( (res=mird_transaction_close(mtr)) )
{
if (!MIRD_IS_CONFLICT(res))
[...do something with the error...]
[...do something with the conflict...]
mird_free_error(res);
if ( (res=mird_transaction_cancel(mtr)) )
[...do something with the error...]
}
Writing to the database
Within a transaction,
changes to the database is possible.
A table may either be a mapping from an integer (32-bit) key to an
any-length string, or from a string to a string.
To create a table, call mird_key_new_table or
mird_s_key_new_table:
Note that if the table pre-exists, this call fails and the resulting
exception will be MIRD_TABLE_EXISTS.
/* create table 17, integer key to string value table */
if ( (res=mird_key_new_table(mtr,17)) )
[...error...]
/* create table 18, string key to string value table */
if ( (res=mird_s_key_new_table(mtr,18)) )
[...error...]
Any key can be written any number of times in a transaction. Conflicts
appear at first when the transaction
is closed.
/* store an integer key : string value in the table 17 */
mird_key_t int_key;
unsigned char *value;
mird_size_t value_len;
if ( (res=mird_key_store(mtr,17,int_key,value,value_len)) )
[...error...]
/* store a string key : string value in the table 18 */
unsigned char *string_key;
mird_size_t key_len;
unsigned char *value;
mird_size_t value_len;
if ( (res=mird_s_key_store(mtr,18,string_key,key_len,value,value_len)) )
[...error...]
Reading from the database
Reading from the database is done in a similar way as writing to the
database, by calling mird_key_lookup or
mird_s_key_lookup, for integer and string keys, respectively.
When the data read is no longer used, it must be freed by calling mird_free.
Note that if the key doesn't exist in the table, the resulting value
pointer will be NULL. If the table does not exist, or
is the wrong type, this will result in the MIRDE_NO_TABLE
respectively MIRDE_WRONG_TABLE exceptions.
unsigned char *value;
mird_size_t value_len;
/* read from an integer key table (table 17) */
if ( (res=mird_key_lookup(my_database,17,int_key,&value,&value_len)) )
[...error...]
if (!value)
[...action if there was no value from that key...]
/* free the resulting value */
if (value) mird_free(value);
[...]
/* read from a string key table (table 18) */
if ( (res=mird_s_key_lookup(my_database,18,string_key,key_len,&value,&value_len)) )
[...error...]
if (!value)
[...action if there was no value from that key...]
/* free the resulting value */
if (value) mird_free(value);
Reading a whole table
Errors and exceptions
On some occations, the database functions will return an exception.
If the return value is NULL, everything went fine.
int error_no;
char *s;
unsigned x,y,z;
/* do something */
if ( (res=mird_operation([...])) )
{
/* check if it is something we expected */
if (res->error_no==MIRD_NO_TABLE)
{
[ ... do something about that ... ]
mird_free_error(res);
}
else
{
/* it isn't something we expected, */
/* print the error and quit immediately */
mird_perror("my_program (doing some operation)",res);
exit(1);
}
}
Threads
The database library is not thread safe. Almost any operation on the
database must lock the whole database, so this is the solution I
recommend to use in a threaded application. The library is thread safe
if you avoid working on the same database structure, so you can have
more then one database open at the same time and open on both of them.
(There is no global variables in the library itself.)
Implementation
+---------------------------------------------------+
| low-level block storage |
|------------------------------------------|--------|
| block cache system | |
| |------------| |
| | free block | super |
|-------------------------| |-|----| list | blocks |
| frag block system | | | | |
|------------------|------| |------|--| |--| |
| hashtries | data cells | | | | |--------------|
|------------------|------------------| |--| | | journal file |
| tables | | | | | |
| |----------------------------------| |----| |------|--------| |
| | transactions | | | |
|--|-----------------------------------------|-|---------------|-----|
| database |
|--------------------------------------------------------------------|
Low-level blocks, superblocks and the free block list
The database is based on blocks. A block can be in four states, it can be
All blocks are of constant size, per default 2048 bytes.
This makes the database less sensitive to fragmentation.
Block cache system
Directly below the low-level block storage, there is a block cache
system; various other levels use this system as both read- and write
cache.
Fragblocks
For small segments of data, a complete block isn't used. More then one
small piece of data can be stored in a special block, called
fragblock. The fragblock has a table of contents at the start.
Data cells
The hashtrie
The integer-key lookup table is a heavily branching tree - I think the
data structures is called hashtrie. The first node branches off on the
lowest bits of the hash key, the next node branches off on the second
lowest bits, etc. In the end, the leaf is always a data cell. The cell
itself contains the hash key it's supposed to map from, so a full tree
is never neccesary.
top node
|
+----+----+-- ... --+
bits value 0 1 2 n-1
| | | |
node node node node
|
+----+----+-- ... --+
0 1 2 n-1
| | | |
node node node node
|
leaf with data cell
(in this example the key has
2*n+1 as the lowest bits)
Writing the key is a bit more obstacled, since we can't use any other
transactions already existing nodes if we have to write in them. (See
below.)
For each node we have to change, we have to allocate some data
space and copy the old node there, before writing to it - if we didn't
create or write in the node already. This of course makes the node
directly above this node affected to change, so the whole chain back
to the top node has to be rewritten.
String key tables
String key tables, which can be quite more useful then the normal
integer key or hash key table, is handled through a special case of
this hashtrie. The string key is converted to a hash key, which is
used in a normal hashtrie table, but the key is stored in the data
cell, and it is allowed to keep more then one key:value pair in the
data cell. This is neccesary; of course more then one string can be
converted to the same hash key (there are more then 2^32 strings
possible, the string can be any length), but not very often - that is
the main function of the string-to-hash-key conversion, so the
database is still very efficient.
Master table
The database can contain more then one of these tables, and that is
controlled by another special case of the above described hashtrie, called
the master table. The only difference in this table is that the leafs
of this table are not data, but table descriptors. The table
descriptor tells us where the root of the table are, and what kind of
table it is.
Journal file and transactions
Conflicts
Freeing blocks
Cleaning dirty databases
Author
The author is Mirar.
I'm a nice guy, please praise me for my work and report bugs so
the database can be even more stable.