Patterns in static

Apophenia

Conversion functions

Functions

gsl_vector * apop_array_to_vector (double *in, int size)
 
gsl_matrix * apop_vector_to_matrix (const gsl_vector *in, char row_col)
 
apop_data * apop_text_to_data (char const *text_file, int has_row_names, int has_col_names, int const *field_ends, char const *delimiters)
 
void apop_data_unpack (const gsl_vector *in, apop_data *d, char use_info_pages)
 
gsl_vector * apop_data_pack (const apop_data *in, gsl_vector *out, char all_pages, char use_info_pages)
 
int apop_text_to_db (char const *text_file, char *tabname, int has_row_names, int has_col_names, char **field_names, int const *field_ends, apop_data *field_params, char *table_params, char const *delimiters, char if_table_exists)
 
int apop_data_to_db (const apop_data *set, const char *tabname, const char output_append)
 

Detailed Description

These functions shunt data between text files, database tables, GSL matrices, and plain old arrays.

Converting from database table to gsl_matrix or apop_data

Use fill_me = apop_query_to_matrix("select * from table_name;"); or fill_me = apop_query_to_data("select * from table_name;"); see apop_query_to_matrix and apop_query_to_data.

Function Documentation

gsl_vector* apop_array_to_vector ( double *  in,
int  size 
)

Just copies a one-dimensional array to a gsl_vector. The input array is undisturbed.

Parameters
in: An array of doubles. (No default. Must not be NULL.)
size: The length of in. If this is zero or omitted, I'll guess using the sizeof(in)/sizeof(in[0]) trick, which will work for most arrays allocated using double [] and won't work for those allocated using double *. (default = auto-guess)
Returns
A gsl_vector (which I will allocate for you).
  • If you send in a NULL array, you get a NULL pointer in return. I warn you of this if apop_opts.verbose >= 1.
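
The caveat about guessing the size comes from how C treats arrays and pointers. The following is a minimal sketch in plain C, without GSL, of why the sizeof trick only works where the array's declaration is in scope. ARRAY_LEN and copy_doubles are hypothetical names used for illustration, not part of Apophenia.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Element count of a true array; gives the wrong answer if `a` has
   already decayed to a pointer (e.g., a double * function argument). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Copy `size` doubles into freshly allocated storage, leaving `in`
   undisturbed. Returns NULL on NULL input, as described above. */
double *copy_doubles(const double *in, size_t size) {
    if (!in) return NULL;
    assert(size > 0);
    double *out = malloc(size * sizeof *out);
    if (out) memcpy(out, in, size * sizeof *out);
    return out;
}
```

With double a[] = {1.5, 2.5}, ARRAY_LEN(a) is 2; with double *p = a, sizeof(p) is only the size of a pointer, so the size has to be passed explicitly.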
gsl_vector* apop_data_pack ( const apop_data *  in,
gsl_vector *  out,
char  all_pages,
char  use_info_pages 
)

This function takes in an apop_data set and writes it as a single column of numbers, outputting a gsl_vector. It is valid to use the out_vector->data element as an array of doubles of size out_vector->size (i.e. its stride==1).

The complement is apop_data_unpack. I.e.,

apop_data_unpack(apop_data_pack(in_data), data_copy)

will return the original data set (stripped of text and names).

Parameters
in: An apop_data set. No default; if NULL, return NULL.
out: If this is not NULL, then put the output here. The dimensions must match exactly. If NULL, then allocate a new vector. Default = NULL.
all_pages: If 'y', then follow the ->more pointer to fill subsequent pages; else fill only the first page. Informational pages will still be ignored, unless you set .use_info_pages='y' as well. Default = 'y'.
use_info_pages: Pages in XML-style brackets, such as <Covariance>, will be ignored unless you set .use_info_pages='y'. Be sure that this is set to the same thing when you both pack and unpack. Default: 'n'.
Returns
A gsl_vector with the vector data (if any), then each row of data (if any), then the weights (if any), then the same for subsequent pages (if any && .all_pages=='y'). If out is not NULL, then this is out.
Exceptions
NULL: If you give me a vector as input, and its size is not correct, returns NULL.
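
The ordering described under Returns (vector, then each matrix row, then weights) can be sketched in plain C. This is an illustration under stated assumptions, not Apophenia's implementation: mini_page, packed_size, and pack_page are hypothetical names, the matrix is assumed to be stored row-major, and only a single page is handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one page of an apop_data set. */
typedef struct {
    const double *vector;  size_t vsize;       /* may be NULL / 0 */
    const double *matrix;  size_t rows, cols;  /* row-major; may be NULL / 0 */
    const double *weights; size_t wsize;       /* may be NULL / 0 */
} mini_page;

/* Length the output must have; a mismatch is the error case noted above. */
size_t packed_size(const mini_page *p) {
    return p->vsize + p->rows * p->cols + p->wsize;
}

/* Write the vector, then the matrix rows, then the weights into `out`.
   Returns the number of doubles written. */
size_t pack_page(const mini_page *p, double *out) {
    assert(p && out);
    size_t k = 0;
    for (size_t i = 0; i < p->vsize; i++) out[k++] = p->vector[i];
    for (size_t i = 0; i < p->rows * p->cols; i++) out[k++] = p->matrix[i];
    for (size_t i = 0; i < p->wsize; i++) out[k++] = p->weights[i];
    return k;
}
```

A caller would allocate packed_size(&page) doubles before calling pack_page, which mirrors the requirement that a non-NULL out match the dimensions exactly.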
int apop_data_to_db ( const apop_data *  set,
const char *  tabname,
const char  output_append 
)

Dump an apop_data set into the database.

This function is basically preempted by apop_data_print. Use that one; this may soon no longer be available.

Column names are inserted if there are any. If there are, all dots are converted to underscores. Otherwise, the columns will be named c1, c2, c3, &c.

  • If apop_opts.db_name_column is not blank (the default is "row_name"), then a so-named column is created, and the row names are placed there.
  • If there are weights, they will be the last column of the table, and the column will be named "weights".
  • If the table exists, I append to it; if it does not exist, I create it. So perhaps call apop_table_exists ("tabname", 'd') first to ensure that any existing table is removed ahead of time.
  • You can also call this via apop_data_print (data, "tabname", .output_type='d', .output_append='w') to overwrite a new table or with .output_append='a' to append.
  • If your data set has zero data (i.e., is just a list of column names or is entirely blank), I return -1 without creating anything in the database.
  • Especially if you are using a pre-2007 version of SQLite, there may be a speed gain to wrapping the call to this function in a begin/commit pair:
apop_query("begin;");
apop_data_print(dataset, .output_name="dbtab", .output_type='d');
apop_query("commit;");
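
The column-naming rule above (dots become underscores; unnamed columns become c1, c2, c3) can be sketched as follows. db_column_name is a hypothetical helper, not an Apophenia function, and it applies the fallback per column, which is an assumption; the text above describes the fallback for a data set with no column names at all.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fill `out` with the table column name for 0-based column `idx`:
   the given name with every '.' replaced by '_', or c1, c2, ... when
   no name is available. */
void db_column_name(const char *name, size_t idx, char *out, size_t outlen) {
    assert(out && outlen > 0);
    if (!name || !*name) {
        snprintf(out, outlen, "c%zu", idx + 1);
        return;
    }
    snprintf(out, outlen, "%s", name);
    for (char *c = out; *c; c++)
        if (*c == '.') *c = '_';
}
```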
Parameters
set: The apop_data set to write to the database
tabname: The name of the db table to be created
output_append: See apop_prep_output.
Returns
0=OK, -1=error
void apop_data_unpack ( const gsl_vector *  in,
apop_data *  d,
char  use_info_pages 
)

This is the complement to apop_data_pack, qv. It writes the gsl_vector produced by that function back to the apop_data set you provide. It overwrites the data in the vector and matrix elements and, if present, the weights (and that's it, so names or text are as before).

Parameters
in: A gsl_vector of the form produced by apop_data_pack. No default; must not be NULL.
d: The data set to be filled. Must be allocated to the correct size. No default; must not be NULL.
use_info_pages: Pages in XML-style brackets, such as <Covariance>, will be ignored unless you set .use_info_pages='y'. Be sure that this is set to the same thing when you both pack and unpack. Default: 'n'.
  • If I get to the end of the first page and have more vector to unpack, and the data to fill has a more element, then I will continue into subsequent pages.
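
The page-following behavior in the note above can be sketched in plain C. This is a hedged illustration, not the library's code: mini_set and unpack_pages are hypothetical names, matrices are assumed row-major, and informational pages are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an apop_data set with a ->more chain. */
typedef struct mini_set {
    double *vector;  size_t vsize;
    double *matrix;  size_t rows, cols;   /* row-major */
    double *weights; size_t wsize;
    struct mini_set *more;                /* next page, or NULL */
} mini_set;

/* Fill the pre-allocated pages from `in` (length n): vector, matrix,
   weights, then on to ->more while input remains.
   Returns the number of doubles consumed. */
size_t unpack_pages(const double *in, size_t n, mini_set *d) {
    assert(in && d);
    size_t k = 0;
    for (; d && k < n; d = d->more) {
        for (size_t i = 0; i < d->vsize && k < n; i++)
            d->vector[i] = in[k++];
        for (size_t i = 0; i < d->rows * d->cols && k < n; i++)
            d->matrix[i] = in[k++];
        for (size_t i = 0; i < d->wsize && k < n; i++)
            d->weights[i] = in[k++];
    }
    return k;
}
```

Only the numeric elements are overwritten, so anything else attached to a page (names or text, in the real structure) would be left as before.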
apop_data* apop_text_to_data ( char const *  text_file,
int  has_row_names,
int  has_col_names,
int const *  field_ends,
char const *  delimiters 
)

Read a delimited text file into the matrix element of an apop_data set.

See Notes on input text file formatting.

Parameters
text_file: The name of the text file to be read in. If "-" (the default), use stdin.
has_row_names: Do the lines of data have row names? (default = 'n')
has_col_names: Is the top line a list of column names? If there are row names, then there should be no first entry in this line like 'row names'. That is, for a 100x100 data set with row and column names, there are 100 names in the top row, and 101 entries in each subsequent row (name plus 100 data points). (default = 'y')
field_ends: If fields have a fixed size, give the end of each field, e.g. {3, 8, 11}.
delimiters: A string listing the characters that delimit fields. default = "|,\t"
Returns
Returns an apop_data set.
Exceptions
out->error=='a': allocation error
out->error=='t': text-reading error

example: See apop_ols.
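
To illustrate what the delimiters argument means, here is a minimal sketch of splitting one line at any of the listed characters. split_fields is a hypothetical helper, and it deliberately ignores the details covered in the notes on input text file formatting (quoting, missing data, and so on), so treat it as a picture of the idea rather than of the library's parser.

```c
#include <assert.h>
#include <string.h>

/* Split `line` in place at any character found in `delims`. Stores up to
   `max` field pointers in `fields` and returns how many were stored. */
size_t split_fields(char *line, const char *delims, char **fields, size_t max) {
    assert(line && delims && fields);
    size_t n = 0;
    char *p = line;
    while (n < max) {
        fields[n++] = p;
        size_t len = strcspn(p, delims);   /* length up to next delimiter */
        if (p[len] == '\0') break;         /* last field on the line */
        p[len] = '\0';                     /* terminate this field */
        p += len + 1;
    }
    return n;
}
```

With the default "|,\t", a line may mix pipes, commas, and tabs and still split into the same fields.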

int apop_text_to_db ( char const *  text_file,
char *  tabname,
int  has_row_names,
int  has_col_names,
char **  field_names,
int const *  field_ends,
apop_data *  field_params,
char *  table_params,
char const *  delimiters,
char  if_table_exists 
)

Read a text file into a database table.

See Notes on input text file formatting.

See the apop_ols page for an example that uses this function to read in sample data (also listed on that page).

Especially if you are using a pre-2007 version of SQLite, there may be a speedup to putting this function in a begin/commit wrapper:

apop_query("begin;");
apop_text_to_db(.text_file="pant_lengths.csv", .tabname="pant_lengths");
apop_query("commit;");
Parameters
text_file: The name of the text file to be read in. If "-", then read from STDIN. (default = "-")
tabname: The name to give the table in the database (default = text_file up to the first dot, e.g., text_file=="pant_lengths.csv" gives tabname=="pant_lengths"; default in Python/R interfaces="t")
has_row_names: Do the lines of data have row names? (default = 0)
has_col_names: Is the top line a list of column names? (default = 1)
field_names: The list of field names, which will be the columns for the table. If has_col_names==1, read the names from the file (and just set this to NULL). If has_col_names == 1 && field_names != NULL, I'll use the names in field_names. (default = NULL)
field_ends: If fields have a fixed size, give the end of each field, e.g. {3, 8, 11}.
field_params: There is an implicit create table in setting up the database. If you want to add a type, constraint, or key, put that here. The relevant part of the input apop_data set is the text grid, which should be $N \times 2$. The first item in each row (your_params->text[n][0], for each $n$) is a regular expression to match against the variable names; the second item (your_params->text[n][1]) is the type, constraint, and/or key (i.e., what comes after the name in the create query). Not all variables need be mentioned; the default type if nothing matches is numeric. I go in order until I find a regex that matches the given field, so if you don't like the default, then set the last row to have name .*, which is a regex guaranteed to match anything that wasn't matched by an earlier row, and then set the associated type to your preferred default. See apop_regex on details of matching.
table_params: There is an implicit create table in setting up the database. If you want to add a table constraint or key, such as not null primary key (age, sex), put that here.
delimiters: A string listing the characters that delimit fields. default = "|,\t"
if_table_exists: What should I do if the table exists?
'n' Do nothing; exit this function. (default)
'd' Retain the table but delete all data; refill with the new data (i.e., call "delete from your_table").
'o' Overwrite the table from scratch, deleting the previous table entirely.
'a' Append new data to the existing table.
Returns
Returns the number of rows on success, -1 on error.
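
The first-match-wins rule described for field_params can be sketched with POSIX regular expressions. This is an assumption-laden illustration: field_rule and column_type are hypothetical names, the rules are held in a plain array rather than an apop_data text grid, and the matching flags used by apop_regex may differ from the ones chosen here.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* One row of the N x 2 text grid: a regex for the field name, and what
   follows the name in the create statement. */
typedef struct { const char *pattern, *type; } field_rule;

/* Return the type of the first rule whose regex matches `field`, or
   "numeric" if no rule matches. A pattern that fails to compile is skipped. */
const char *column_type(const char *field, const field_rule *rules, size_t n) {
    assert(field);
    for (size_t i = 0; i < n; i++) {
        regex_t re;
        if (regcomp(&re, rules[i].pattern, REG_EXTENDED | REG_NOSUB))
            continue;
        int miss = regexec(&re, field, 0, NULL, 0);
        regfree(&re);
        if (!miss) return rules[i].type;
    }
    return "numeric";
}
```

Putting a {".*", "text"} rule last changes the fallback from numeric to text, as suggested in the field_params description.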
gsl_matrix* apop_vector_to_matrix ( const gsl_vector *  in,
char  row_col 
)

Mathematically, a vector of size $N$ and a matrix of size $N \times 1 $ are equivalent, but they're two different types to the GSL. This function copies the data in a vector to a new one-column (or one-row) matrix and returns the newly-allocated and filled matrix.

For the reverse, try apop_data_pack.

Parameters
in: A gsl_vector (No default. If NULL, I return NULL, with a warning if apop_opts.verbose >=1 )
row_col: If 'r', then this will be a row (1 x N) instead of the default, a column (N x 1). (default: 'c')
Returns
a newly-allocated gsl_matrix with one column (or row).
  • If you send in a NULL vector, you get a NULL pointer in return. I warn you of this if apop_opts.verbose >= 1.
  • If gsl_matrix_alloc fails and apop_opts.stop_on_warn=='n', you get a NULL pointer in return.
  • This function uses the Designated initializers syntax for inputs.
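
For readers unfamiliar with the designated-initializer calling style, here is a minimal C99 sketch of the general technique: the arguments are collected into a struct by a variadic macro, so any argument left out is zero and can be given a default. The names shape, v2m_args, v2m_shape_base, and v2m_shape are hypothetical, and this is not a copy of Apophenia's own macros; it only computes the output dimensions for the row_col rule above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { size_t rows, cols; } shape;

/* All arguments live in one struct, so unnamed ones are zero-initialized. */
typedef struct { size_t n; char row_col; } v2m_args;

/* Dimensions of the matrix produced from a vector of length n. */
shape v2m_shape_base(v2m_args a) {
    assert(a.n > 0);
    if (!a.row_col) a.row_col = 'c';     /* default: one column (N x 1) */
    return a.row_col == 'r' ? (shape){1, a.n} : (shape){a.n, 1};
}

/* The wrapper macro turns the call's arguments into a compound literal. */
#define v2m_shape(...) v2m_shape_base((v2m_args){__VA_ARGS__})
```

Both v2m_shape(.n = 5, .row_col = 'r') and the positional v2m_shape(5, 'r') work with this pattern, and v2m_shape(.n = 5) falls back to the column default.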

Autogenerated by doxygen on Sun Oct 26 2014 (Debian 0.999b+ds3-2).