Functions | |
gsl_vector * | apop_array_to_vector (double *in, int size) |
gsl_matrix * | apop_vector_to_matrix (const gsl_vector *in, char row_col) |
apop_data * | apop_text_to_data (char const *text_file, int has_row_names, int has_col_names, int const *field_ends, char const *delimiters) |
void | apop_data_unpack (const gsl_vector *in, apop_data *d, char use_info_pages) |
gsl_vector * | apop_data_pack (const apop_data *in, gsl_vector *out, char all_pages, char use_info_pages) |
int | apop_text_to_db (char const *text_file, char *tabname, int has_row_names, int has_col_names, char **field_names, int const *field_ends, apop_data *field_params, char *table_params, char const *delimiters, char if_table_exists) |
int | apop_data_to_db (const apop_data *set, const char *tabname, const char output_append) |
The functions to shunt data between text files, database tables, GSL matrices, and plain old arrays.
To read a database table into a matrix or data set, run a query such as fill_me = apop_query_to_matrix("select * from table_name;");
or fill_me = apop_query_to_data("select * from table_name;");
[See apop_query_to_matrix; apop_query_to_data.]
gsl_vector* apop_array_to_vector (double *in, int size)
Just copies a one-dimensional array to a gsl_vector. The input array is undisturbed.
in | An array of doubles. (No default. Must not be NULL.) |
size | The length of in. If this is zero or omitted, I'll guess using the sizeof(in)/sizeof(in[0]) trick, which will work for most arrays declared using double [] and won't work for those declared using double *. (default = auto-guess) |
Returns: a gsl_vector (which I will allocate for you).
If you send in a NULL vector, you get a NULL pointer in return. I warn you of this if apop_opts.verbosity >=1.
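The limitation on the auto-guessed size comes from how C treats arrays and pointers. The following is a minimal sketch in plain C, not the library's code; `ARRAY_LEN` and `sum_doubles` are hypothetical names used only for illustration. The macro counts elements only where the array declaration is in scope; once the array is passed to a function it decays to a pointer, so the length has to travel as a separate argument, as `size` does above.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array. Valid only where the array declaration
   is visible; applied to a double* it would divide the size of a pointer
   by the size of a double, which is not the element count. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: inside this function, in is just a pointer, so the
   caller must supply the length explicitly. */
double sum_doubles(const double *in, size_t size) {
    assert(in != NULL);
    double total = 0;
    for (size_t i = 0; i < size; i++)
        total += in[i];
    return total;
}
```

A caller holding `double line[] = {1.0, 2.0, 3.5};` can pass `ARRAY_LEN(line)` as the size; a caller holding a `double *` from `malloc` has to remember the length itself.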
gsl_vector* apop_data_pack (const apop_data *in, gsl_vector *out, char all_pages, char use_info_pages)
This function takes in an apop_data set and writes it as a single column of numbers, outputting a gsl_vector. It is valid to use the out_vector->data element as an array of doubles of size out_vector->size (i.e. its stride==1).
The complement is apop_data_unpack. I.e., unpacking the packed vector into a data set of the same dimensions will return the original data set (stripped of text and names).
in | An apop_data set. No default; if NULL, return NULL. |
out | If this is not NULL, then put the output here. The dimensions must match exactly. If NULL, then allocate a new vector. Default = NULL. |
all_pages | If 'y', then follow the ->more pointer to fill subsequent pages; else fill only the first page. Informational pages will still be ignored, unless you set .use_info_pages='y' as well. Default = 'y'. |
use_info_pages | Pages in XML-style brackets, such as <Covariance>, will be ignored unless you set .use_info_pages='y'. Be sure that this is set to the same thing when you both pack and unpack. Default: 'n'. |
Returns: a gsl_vector with the vector data (if any), then each row of data (if any), then the weights (if any), then the same for subsequent pages (if any && .all_pages=='y'). If out is not NULL, then this is out.
NULL | If you give me a vector as input, and its size is not correct, returns NULL. |
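The packing order described above (vector, then matrix rows, then weights) can be sketched for a single page in plain C. This is an illustration under stated assumptions, not the library's implementation: `toy_page`, `toy_packed_size`, and `toy_pack` are hypothetical stand-ins, the matrix is assumed to be stored row-major, and multiple pages and info pages are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one page of a data set. Any part may be
   absent, signalled by a zero size. */
typedef struct {
    const double *vector;  size_t vsize;
    const double *matrix;  size_t rows, cols;  /* row-major storage */
    const double *weights; size_t wsize;
} toy_page;

/* Number of doubles the packed form needs. */
size_t toy_packed_size(const toy_page *p) {
    return p->vsize + p->rows * p->cols + p->wsize;
}

/* Write the vector, then each matrix row in order, then the weights,
   into out. Returns the number of doubles written. */
size_t toy_pack(const toy_page *p, double *out) {
    assert(p != NULL && out != NULL);
    size_t k = 0;
    for (size_t i = 0; i < p->vsize; i++)
        out[k++] = p->vector[i];
    for (size_t i = 0; i < p->rows * p->cols; i++)
        out[k++] = p->matrix[i];
    for (size_t i = 0; i < p->wsize; i++)
        out[k++] = p->weights[i];
    return k;
}
```

Because the layout is fixed by the dimensions alone, an unpacking routine can walk the same three loops in the same order to restore the numbers, which is why the target of an unpack must already have the correct size.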
int apop_data_to_db (const apop_data *set, const char *tabname, const char output_append)
Dump an apop_data set into the database.
This function is basically preempted by apop_data_print. Use that one; this may soon no longer be available.
Column names are inserted if there are any. If there are, all dots are converted to underscores. Otherwise, the columns will be named c1, c2, c3, &c.
Call apop_table_exists("tabname", 'd') to ensure that the table is removed ahead of time.
Alternatively, use apop_data_print(data, "tabname", .output_type='d', .output_append='w') to overwrite a new table or with .output_append='a' to append.
set | The apop_data set to write to the database |
tabname | The name of the db table to be created |
output_append | See apop_prep_output. |
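The column-naming rule stated above (dots become underscores; unnamed columns become c1, c2, c3, ...) can be sketched in plain C. This is not the library's code: `toy_db_colname` is a hypothetical helper, and it assumes the rule is applied one column at a time, with a NULL name meaning the column is unnamed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write a database-safe column name into out.
   name:  the data set's column name, or NULL if there is none.
   index: zero-based column position, used for the c1, c2, ... fallback. */
void toy_db_colname(const char *name, int index, char *out, size_t outlen) {
    assert(out != NULL && outlen > 0);
    if (name == NULL) {
        snprintf(out, outlen, "c%d", index + 1);
        return;
    }
    snprintf(out, outlen, "%s", name);   /* copy, truncating if needed */
    for (char *c = out; *c; c++)         /* dots are not valid in SQL names */
        if (*c == '.')
            *c = '_';
}
```

The dot replacement matters because SQL reads a dot as a table.column separator, so a name like pant.length would otherwise be misparsed in queries.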
void apop_data_unpack (const gsl_vector *in, apop_data *d, char use_info_pages)
This is the complement to apop_data_pack, qv. It writes the gsl_vector produced by that function back to the apop_data set you provide. It overwrites the data in the vector and matrix elements and, if present, the weights (and that's it, so names or text are as before).
in | A gsl_vector of the form produced by apop_data_pack. No default; must not be NULL. |
d | The data set to be filled. Must be allocated to the correct size. No default; must not be NULL. |
use_info_pages | Pages in XML-style brackets, such as <Covariance>, will be ignored unless you set .use_info_pages='y'. Be sure that this is set to the same thing when you both pack and unpack. Default: 'n'. |
If the data set has a more element, then I will continue into subsequent pages.
apop_data* apop_text_to_data (char const *text_file, int has_row_names, int has_col_names, int const *field_ends, char const *delimiters)
Read a delimited text file into the matrix element of an apop_data set.
See Notes on input text file formatting.
text_file | The name of the text file to be read in. If "-", use stdin. (default = "-") |
has_row_names | Do the lines of data have row names? (default = 'n') |
has_col_names | Is the top line a list of column names? If there are row names, then there should be no first entry in this line like 'row names'. That is, for a 100x100 data set with row and column names, there are 100 names in the top row, and 101 entries in each subsequent row (name plus 100 data points). (default = 'y') |
field_ends | If fields have a fixed size, give the end of each field, e.g. {3, 8, 11}. |
delimiters | A string listing the characters that delimit fields. (default = "|,\t") |
out->error=='a' | allocation error |
out->error=='t' | text-reading error |
example: See apop_ols.
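To make the delimiters argument concrete, here is a simplified sketch in plain C of splitting one line on any character in a delimiter string. It is an illustration only, not the library's parser: `toy_split` is a hypothetical name, and the sketch ignores quoting, escapes, and whatever special handling the library applies to repeated delimiters.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: split line in place on any character found in
   delimiters, storing up to max_fields pointers in fields.
   Returns the number of fields found. Adjacent delimiters yield an
   empty field in this sketch. */
size_t toy_split(char *line, const char *delimiters,
                 char **fields, size_t max_fields) {
    assert(line != NULL && delimiters != NULL && fields != NULL);
    size_t n = 0;
    char *start = line;
    while (n < max_fields) {
        size_t len = strcspn(start, delimiters);  /* length up to next delimiter */
        fields[n++] = start;
        if (start[len] == '\0')
            break;                /* reached the end of the line */
        start[len] = '\0';        /* terminate this field */
        start += len + 1;         /* step past the delimiter */
    }
    return n;
}
```

With the default delimiter string "|,\t", a line may mix pipes, commas, and tabs freely, since each character in the string is treated as a separator on its own.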
int apop_text_to_db (char const *text_file, char *tabname, int has_row_names, int has_col_names, char **field_names, int const *field_ends, apop_data *field_params, char *table_params, char const *delimiters, char if_table_exists)
Read a text file into a database table.
See Notes on input text file formatting.
See the apop_ols page for an example that uses this function to read in sample data (also listed on that page).
Especially if you are using a pre-2007 version of SQLite, there may be a speedup from putting this function in a begin/commit wrapper, i.e., issuing a begin query before the call and a commit query after it.
text_file | The name of the text file to be read in. If "-" , then read from STDIN . (default = "-") |
tabname | The name to give the table in the database (default = text_file up to the first dot, e.g., text_file=="pant_lengths.csv" gives tabname=="pant_lengths" ; default in Python/R interfaces="t") |
has_row_names | Do the lines of data have row names? (default = 0) |
has_col_names | Is the top line a list of column names? (default = 1) |
field_names | The list of field names, which will be the columns for the table. If has_col_names==1, read the names from the file (and just set this to NULL). If has_col_names == 1 && field_names != NULL, I'll use the field names. (default = NULL) |
field_ends | If fields have a fixed size, give the end of each field, e.g. {3, 8, 11}. |
field_params | There is an implicit create table in setting up the database. If you want to add a type, constraint, or key, put that here. The relevant part of the input apop_data set is the text grid, which should be N x 2. The first item in each row (your_params->text[n][0], for each n) is a regex to match against the field names; the second item (your_params->text[n][1]) is the type, constraint, and/or key (i.e., what comes after the name in the create query). Not all variables need be mentioned; the default type if nothing matches is numeric. I go in order until I find a regex that matches the given field, so if you don't like the default, then set the last row to have name .*, which is a regex guaranteed to match anything that wasn't matched by an earlier row, and then set the associated type to your preferred default. See apop_regex for details of matching. |
table_params | There is an implicit create table in setting up the database. If you want to add a table constraint or key, such as not null primary key (age, sex) , put that here. |
delimiters | A string listing the characters that delimit fields. default = "|,\t" |
if_table_exists | What should I do if the table exists? 'n': Do nothing; exit this function (default). 'd': Retain the table but delete all data; refill with the new data (i.e., call "delete * from your_table"). 'o': Overwrite the table from scratch, deleting the previous table entirely. 'a': Append new data to the existing table. |
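The first-match-wins lookup described for field_params can be sketched with POSIX regular expressions. This is a hedged illustration, not the library's code: `toy_param` and `toy_field_type` are hypothetical names, the sketch uses extended, case-sensitive, unanchored matching, and the library's apop_regex may differ on any of those points.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-in for one row of the params text grid: a regex for
   the field name and the SQL text that follows the name in create table. */
typedef struct {
    const char *name_regex;
    const char *sql_type;
} toy_param;

/* Return the type from the first row whose regex matches field, or
   "numeric" (the documented default) if no row matches. */
const char *toy_field_type(const toy_param *params, size_t n,
                           const char *field) {
    assert(field != NULL);
    for (size_t i = 0; i < n; i++) {
        regex_t re;
        if (regcomp(&re, params[i].name_regex, REG_EXTENDED | REG_NOSUB) != 0)
            continue;             /* skip patterns that fail to compile */
        int hit = (regexec(&re, field, 0, NULL, 0) == 0);
        regfree(&re);
        if (hit)
            return params[i].sql_type;
    }
    return "numeric";
}
```

Row order is significant: a catch-all row such as .* must come last, because any row after it can never be reached.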
gsl_matrix* apop_vector_to_matrix (const gsl_vector *in, char row_col)
Mathematically, a vector of size N and a matrix of size N x 1 are equivalent, but they're two different types to the GSL. This function copies the data in a vector to a new one-column (or one-row) matrix and returns the newly-allocated and filled matrix.
For the reverse, try apop_data_pack.
in | A gsl_vector. (No default. If NULL, I return NULL, with a warning if apop_opts.verbosity >=1.) |
row_col | If 'r', then this will be a row (1 x N) instead of the default, a column (N x 1). (default: 'c') |
Returns: a gsl_matrix with one column (or row).
If you send in a NULL vector, you get a NULL pointer in return. I warn you of this if apop_opts.verbosity >=1.
If gsl_matrix_alloc fails and apop_opts.stop_on_warn=='n', you get a NULL pointer in return.