The DB-All.e Python bindings provide two levels of access to a DB-All.e database: a complete API similar to the Fortran and C++ APIs, and a high-level API called volnd that automatically exports matrices of data out of the database.
The dballe module has a few module-level functions:
- describe_level(ltype1=None, l1=None, ltype2=None, l2=None)
  Return a string description for a level
- describe_trange(pind=None, p1=None, p2=None)
  Return a string description for a time range
- var(varcode[, default])
  Query the DB-All.e variable table returning a Var, optionally initialized with a value
- varinfo(varcode)
  Query the DB-All.e variable table returning a Varinfo
and several classes, documented in their own sections.
A Var holds a measured value and all available information related to it.
To create a Var, use the function dballe.var.
Its members are:
- code
  variable code
- info
  Varinfo for this variable
- isset
  true if the value is set
- enq()
  get the value of the variable, as int, float or str according to the variable definition
- enqc()
  get the value of the variable, as a str
- enqd()
  get the value of the variable, as a float
- enqi()
  get the value of the variable, as an int
- format(default='')
  format the value of the variable to a string
- get(default=None)
  get the value of the variable, with a default if it is unset
Examples:
    import dballe

    v = dballe.var("B12101", 32.5)

    # v.info returns detailed information about the variable in a Varinfo object.
    print("%s: %s %s %s" % (v.code, str(v), v.info.unit, v.info.desc))
A Varinfo holds all possible information about a variable, such as its measurement unit, description and number of significant digits.
Its members are:
- bit_len
  number of bits used to encode the value in BUFR
- bit_ref
  reference value added after scaling, for BUFR decoding
- desc
  description
- is_string
  true if the value is a string
- len
  number of significant digits
- ref
  reference value added after scaling
- scale
  scale of the value as a power of 10
- unit
  measurement unit
- var
  variable code
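The scale, ref and bit_len members describe how a value is packed into an integer. As a rough illustration, assuming the usual BUFR convention (packed integer = value × 10^scale − reference), the arithmetic can be sketched as follows. The helper names and the sample scale, reference and bit width are assumptions chosen for illustration and are not part of the dballe API; the real values for a variable should be read from its Varinfo.

```python
# Illustrative helpers, not part of dballe: they only mirror the usual
# BUFR packing arithmetic suggested by the scale, ref and bit_len members.

def pack(value, scale, ref):
    """Scale by 10**scale, round to an integer, then subtract the reference."""
    return int(round(value * 10 ** scale)) - ref


def unpack(raw, scale, ref):
    """Inverse of pack(): add the reference back, then undo the scaling."""
    return (raw + ref) / float(10 ** scale)


def fits(raw, bit_len):
    """True if raw is non-negative and can be stored in bit_len bits."""
    return 0 <= raw < 2 ** bit_len


# Assumed sample values for a latitude-like variable (hypothetical).
scale, ref, bit_len = 5, -9000000, 25

raw = pack(44.05, scale, ref)
decoded = unpack(raw, scale, ref)
print(raw, fits(raw, bit_len), decoded)
```

The negative reference in this sample is what lets negative latitudes be stored as non-negative integers.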
A Record holds one or more Var variables, together with a range of metadata key=value pairs. The available metadata pairs are documented in the Fortran API documentation.
A Record is used to query the database and to read the results.
Its members are:
- key
  return a var key from the record
- clear()
  remove all data from the record
- clear_vars()
  remove all variables from the record, leaving the keywords intact
- copy()
  return a copy of the Record
- date_extremes()
  get two datetime objects with the lower and upper bounds of the datetime period in this record
- get(key, default=None)
  look up a value, returning a fallback value (None by default) if unset
- keys()
  return a sequence with all the varcodes of the variables set on the Record. Note that this does not include keys.
- set_from_string(str)
  set values from a 'key=val' string
- set_station_context()
  set the date, level and time range values to match the station data context
- update(**kwargs)
  set many record keys/vars in a single shot, via kwargs
- var(code=None)
  return a variable from the record. If no varcode is given, use record['var']
- vars()
  return a sequence with all the variables set on the Record. Note that this does not include keys.
When creating a new record, keyword arguments can be passed and they are set as if Record.update(**kwargs) had been called.
There are six extra keys available in the Python API, which can be used as shortcuts to get and set many values in one shot:
- date
  a datetime.datetime object
- datemin
  a datetime.datetime object
- datemax
  a datetime.datetime object
- level
  a tuple of integers
- trange
  a tuple of integers
- timerange
  a tuple of integers
Examples:
    import dballe

    rec = dballe.Record(lat=44.05, lon=11.03, B12101=22.1)

    # Metadata and variables can be accessed via normal lookup
    print("%s %s" % (rec["lat"], rec["B12101"]))

    # Iterating a record iterates on variable codes, but not metadata
    for code in rec:
        print("%s %s %s" % (code, rec.get(code, "undefined"), rec.var(code).info.desc))
A DB is used to access the database.
Its members are:
- query_summary
  Query the summary of the results of a query; returns a Cursor
- attr_insert(varcode, attrs, reference_id=None, replace=True)
  Insert new attributes into the database
- attr_remove(varcode, reference_id, attrs=None)
  Remove attributes
- connect(dsn, user='', password='')
  Create a DB connecting to an ODBC source
- connect_from_file(filename)
  Create a DB connecting to a SQLite file
- connect_from_url(url)
  Create a DB as defined in a URL-like string
- connect_test()
  Create a DB for running the test suite, as configured in the test environment
- disappear()
  Remove all our traces from the database, if applicable.
- export_to_file(query, format, filename, generic=False)
  Export data matching a query as bulletins to a named file
- insert(record, can_replace=False, can_add_stations=False)
  Insert a record in the database
- is_url(string)
  Check if a string looks like a DB URL
- query_attrs(varcode, reference_id, attrs=None)
  Query attributes
- query_data(query)
  Query the variables in the database; returns a Cursor
- query_stations(query)
  Query the station archive in the database; returns a Cursor
- remove(query)
  Remove records from the database
- reset([repinfo_filename])
  Reset the database, removing all existing DB-All.e tables and re-creating them empty.
- vacuum()
  Perform database cleanup operations
Examples:
    import datetime

    import dballe

    # Connect to a database and run a query
    db = dballe.DB.connect_from_file("db.sqlite")
    query = dballe.Record(latmin=44.0, latmax=45.0, lonmin=11.0, lonmax=12.0)

    # The result is a dballe.Cursor, which can be iterated to get results as
    # dballe.Record objects.
    # The results always point to the same Record to avoid creating a new one
    # for every iteration: if you need to store them, use Record.copy()
    for rec in db.query_data(query):
        print("%s %s %s %s" % (rec["lat"], rec["lon"], rec["var"], rec.var().format("undefined")))

    # Insert 2 new variables in the database
    rec = dballe.Record(
        lat=44.5, lon=11.4,
        level=(1,),
        trange=(254,),
        date=datetime.datetime(2013, 4, 25, 12, 0, 0),
        B11101=22.4,
        B12103=17.2,
    )
    db.insert(rec)
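The remark about query results always pointing to the same Record is easy to overlook. The sketch below reproduces the effect with plain Python objects and no database: reusing_cursor is a hypothetical stand-in for a dballe.Cursor that keeps yielding the same mutable object, and a dict stands in for the Record.

```python
def reusing_cursor(rows):
    """Hypothetical stand-in for a cursor: yields one shared dict, updated in place."""
    shared = {}
    for row in rows:
        shared.clear()
        shared.update(row)
        yield shared


rows = [{"lat": 44.0, "B12101": 280.1}, {"lat": 44.5, "B12101": 281.3}]

# Storing the yielded object directly keeps two references to the same dict,
# so both entries end up showing the last row.
aliased = [rec for rec in reusing_cursor(rows)]

# Copying at each step keeps independent snapshots (the analogue of Record.copy()).
copied = [rec.copy() for rec in reusing_cursor(rows)]

print([r["lat"] for r in aliased])
print([r["lat"] for r in copied])
```

With a real query the same reasoning applies: anything that must outlive the current iteration should be copied before moving on to the next result.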
A Cursor is the result of a database query. It is generally just iterated rather than used explicitly, but it does have a few members:
- remaining
  number of results still to be returned
- next()
  x.next() -> the next value, or raise StopIteration
- query_attrs(attrs=None)
  Query attributes for the current variable
volnd is an easy way of extracting entire matrices of data out of a DB-All.e database.
This module extracts multidimensional matrices of data given a list of dimension definitions. Every dimension definition defines what kind of data goes along that dimension.
Dimension definitions can be shared across different extracted matrices and multiple extractions, so that different matrices can have indexes with the same meaning.
This example code extracts temperatures in a station by datetime matrix:
    import dballe
    from dballe.volnd import read, AnaIndex, DateTimeIndex

    # db is a dballe.DB, connected as in the DB example
    query = dballe.Record()
    query["var"] = "B12001"
    query["rep_memo"] = "synop"
    query["level"] = (105, 2)
    query["trange"] = (0,)
    vars = read(db.query_data(query), (AnaIndex(), DateTimeIndex()))
    data = vars["B12001"]

    # Data is now a 2-dimensional Masked Array with the data
    #
    # Information about what values correspond to an index in the various
    # directions can be accessed in data.dims, which contains one list per
    # dimension with all the information corresponding to every index.
    print("Ana dimension is %d items long" % len(data.dims[0]))
    print("Datetime dimension is %d items long" % len(data.dims[1]))
    print("First 10 stations along the Ana dimension: %s" % (data.dims[0][:10],))
    print("First 10 datetimes along the DateTime dimension: %s" % (data.dims[1][:10],))
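To make the idea of dimension definitions more concrete, here is a conceptual sketch in plain Python of how records could be arranged into a station by datetime matrix. It is not the volnd implementation: OrderIndex and build_matrix are hypothetical names, tuples stand in for database records, and None stands in for masked (missing) cells.

```python
import datetime


class OrderIndex(object):
    """Hypothetical index: assigns positions to keys in order of first appearance."""

    def __init__(self):
        self.entries = []
        self._pos = {}

    def position(self, key):
        if key not in self._pos:
            self._pos[key] = len(self.entries)
            self.entries.append(key)
        return self._pos[key]


def build_matrix(records, row_index, col_index):
    """Place (row_key, col_key, value) tuples in a 2-D list; missing cells stay None."""
    cells = {}
    for row_key, col_key, value in records:
        cells[(row_index.position(row_key), col_index.position(col_key))] = value
    return [
        [cells.get((r, c)) for c in range(len(col_index.entries))]
        for r in range(len(row_index.entries))
    ]


t0 = datetime.datetime(2013, 4, 25, 0, 0)
t1 = datetime.datetime(2013, 4, 25, 12, 0)
records = [("station_a", t0, 280.1), ("station_a", t1, 284.6), ("station_b", t1, 283.0)]

stations, times = OrderIndex(), OrderIndex()
matrix = build_matrix(records, stations, times)
print(stations.entries)
print(matrix)
```

The entries list of each index plays the role that data.dims plays in the example above: it records what each row or column position means.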
This is the list of dimensions supported by dballe.volnd:
- AnaIndex
Index for stations, as they come out of the database.
The constructor syntax is: AnaIndex(shared=True, frozen=False, start=None).
The index saves all stations as AnaIndexEntry tuples, in the same order as they come out of the database.
- NetworkIndex
Index for networks, as they come out of the database.
The constructor syntax is: NetworkIndex(shared=True, frozen=False, start=None).
The index saves all networks as NetworkIndexEntry tuples, in the same order as they come out of the database.
- LevelIndex
Index for levels, as they come out of the database.
The constructor syntax is: LevelIndex(shared=True, frozen=False, start=None).
The index saves all levels as dballe.Level tuples, in the same order as they come out of the database.
- TimeRangeIndex
Index for time ranges, as they come out of the database.
The constructor syntax is: TimeRangeIndex(shared=True, frozen=False, start=None).
The index saves all time ranges as dballe.TimeRange tuples, in the same order as they come out of the database.
- DateTimeIndex
Index for datetimes, as they come out of the database.
The constructor syntax is: DateTimeIndex(shared=True, frozen=False, start=None).
The index saves all datetime values as datetime.datetime objects, in the same order as they come out of the database.
- IntervalIndex
Index by fixed time intervals: index points are at fixed time intervals, and data is acquired in one point only if it is within a given tolerance from the interval.
The constructor syntax is: IntervalIndex(start, step, tolerance=0, end=None, shared=True, frozen=False).
start is a datetime.datetime object giving the starting time of the time interval of this index.
step is a datetime.timedelta object with the interval between sampling points.
tolerance is a datetime.timedelta object specifying the maximum allowed interval between a datum datetime and the sampling step. If the interval is bigger than the tolerance, the data is discarded.
end is an optional datetime.datetime object giving the ending time of the time interval of the index. If omitted, the index will end at the latest accepted datum coming out of the database.
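The acceptance rule for fixed intervals can be sketched in plain Python. This is an illustration, not the IntervalIndex implementation: interval_position is a hypothetical helper, and it assumes that a datum is assigned to the nearest sampling point and dropped when it lies further than tolerance from that point.

```python
import datetime


def interval_position(when, start, step, tolerance=datetime.timedelta(0), end=None):
    """Return the index of the sampling point nearest to `when`, or None if rejected."""
    offset_s = (when - start).total_seconds()
    pos = int(round(offset_s / step.total_seconds()))
    if pos < 0:
        return None  # before the first sampling point
    nearest = start + pos * step
    if end is not None and nearest > end:
        return None  # past the end of the index
    if abs(when - nearest) > tolerance:
        return None  # too far from any sampling point
    return pos


start = datetime.datetime(2013, 4, 25, 0, 0)
step = datetime.timedelta(hours=1)
tol = datetime.timedelta(minutes=10)

# 02:05 is 5 minutes away from the 02:00 sampling point: accepted at position 2.
accepted = interval_position(start + datetime.timedelta(hours=2, minutes=5), start, step, tol)
# 02:20 is 20 minutes away from the nearest sampling point: discarded.
rejected = interval_position(start + datetime.timedelta(hours=2, minutes=20), start, step, tol)
print(accepted, rejected)
```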
The data objects used by AnaIndex and NetworkIndex are:
- AnaIndexEntry
AnaIndex entry, with various data about a single station.
It is a tuple of 4 values:
  - station id
  - latitude
  - longitude
  - mobile station identifier, or None
- NetworkIndexEntry
NetworkIndex entry, with various data about a single network.
It is a tuple of 2 values:
  - network code
  - network name
The extraction is done using the dballe.volnd.read function:
- read(cursor, dims, filter=None, checkConflicts=True, attributes=None)
cursor is a dballe.Cursor resulting from a dballe query.
dims is the sequence of indexes to use for shaping the data matrices.
filter is an optional filter function that can be used to discard values from the query: if filter is not None, it will be called for every output record and if it returns False, the record will be discarded.
checkConflicts tells if we should raise an exception if two values from the database would fill in the same position in the matrix.
attributes tells if we should read attributes as well: if it is None, no attributes will be read; if it is True, all attributes will be read; if it is a sequence, then it is the sequence of attributes that should be read.
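How filter and checkConflicts affect an extraction can be sketched without a database. The code below is a simplified stand-in, not dballe.volnd.read itself: fill_cells, position and plausible are hypothetical names, records are plain dicts, and ValueError stands in for whatever exception the library raises. Letting the last value win when conflict checking is off is also an assumption of this sketch.

```python
def fill_cells(records, key, record_filter=None, check_conflicts=True):
    """Collect record values into a dict keyed by matrix position."""
    cells = {}
    for rec in records:
        # A filter that returns False discards the record.
        if record_filter is not None and not record_filter(rec):
            continue
        pos = key(rec)
        if check_conflicts and pos in cells:
            raise ValueError("two values for position %r" % (pos,))
        cells[pos] = rec["value"]
    return cells


def position(rec):
    return (rec["station"], rec["hour"])


def plausible(rec):
    return rec["value"] > 0


records = [
    {"station": 1, "hour": 0, "value": 280.1},
    {"station": 1, "hour": 0, "value": 280.4},   # same position as the first
    {"station": 2, "hour": 0, "value": -999.0},  # rejected by the filter
]

cells = fill_cells(records, position, record_filter=plausible, check_conflicts=False)
print(cells)

try:
    fill_cells(records, position, record_filter=plausible)
    conflict = None
except ValueError as exc:
    conflict = str(exc)
print(conflict)
```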
The result of dballe.volnd.read is a dict mapping output variable names to a dballe.volnd.Data object with the results. All the Data objects share their indexes unless the corresponding index definitions have been created with shared=False.
This is the dballe.volnd.Data class documentation:
- Data
Container for collecting variable data. It contains the variable data array and the dimension indexes.
If v is a Data object, you can access the tuple with the dimensions as v.dims, and the masked array with the values as v.vals.
The methods of dballe.volnd.Data are:
- append
Collect a new value from the given dballe record.
You need to call finalise() before the values can be used.
- appendAttrs
Collect attributes to append to the record.
You need to call finalise() before the values can be used.
- finalise
Stop collecting values and create a masked array with all the values collected so far.