![]() |
|
Functions | |
gsl_rng * | apop_rng_alloc (int seed) |
apop_data * | apop_jackknife_cov (apop_data *in, apop_model *model) |
apop_data * | apop_bootstrap_cov (apop_data *data, apop_model *model, gsl_rng *rng, int iterations, char keep_boots, char ignore_nans) |
Copyright (c) 2006–2007 by Ben Klemens. Licensed under the modified GNU GPL v2; see COPYING and COPYING2.
apop_data* apop_bootstrap_cov | ( | apop_data * | data, |
apop_model * | model, | ||
gsl_rng * | rng, | ||
int | iterations, | ||
char | keep_boots, | ||
char | ignore_nans | ||
) |
Give me a data set and a model, and I'll give you the bootstrapped covariance matrix of the parameter estimates.
data | The data set. An apop_data set where each row is a single data point. (No default) |
model | An apop_model, whose estimate method will be used here. (No default) |
iterations | How many bootstrap draws should I make? (default: 1,000) |
rng | An RNG that you have initialized, probably with apop_rng_alloc . (Default: an RNG from apop_rng_get_thread) |
keep_boots | If 'y', then add a page to the output apop_data set with the statistics calculated for each bootstrap iteration. They are packed via apop_data_pack, so use apop_data_unpack if needed. (Default: 'n') |
ignore_nans | If 'y' and any of the elements in the estimation return NaN , then I will throw out that draw and try again. If 'n' , then I will write that set of statistics to the list, NaN and all. I keep count of throw-aways; if there are more than iterations elements thrown out, then I throw an error and return with estimates using data I have so far. That is, I assume that NaNs are rare edge cases; if they are as common as good data, you might want to rethink how you are using the bootstrap mechanism. (Default: 'n') |
apop_data
set whose matrix element is the estimated covariance matrix of the parameters. out->error=='n' | NULL input data. |
out->error=='N' | too many Nans.
|
apop_data* apop_jackknife_cov | ( | apop_data * | in, |
apop_model * | model | ||
) |
Give me a data set and a model, and I'll give you the jackknifed covariance matrix of the model parameters.
The basic algorithm for the jackknife (with many details glossed over): create a sequence of data sets, each with exactly one observation removed, and then produce a new set of parameter estimates using that slightly shortened data set. Then, find the covariance matrix of the derived parameters.
Jackknife or bootstrap? As a broad rule of thumb, the jackknife works best on models that are closer to linear. The worse a linear approximation does (at the given data), the worse the jackknife approximates the variance.
Sample usage:
in | The data set. An apop_data set where each row is a single data point. |
model | An apop_model, that will be used internally by apop_estimate. |
out->error=='n' | NULL input data. |
apop_data
set whose matrix element is the estimated covariance matrix of the parameters.