Patterns in static

Apophenia

Functions | Variables
apop_update.c File Reference

Functions

double countup (double in)
 
apop_modelapop_update (apop_data *data, apop_model *prior, apop_model *likelihood, gsl_rng *rng)
 

Variables

apop_modelproduct
 

Detailed Description

The apop_update function.

Function Documentation

apop_model* apop_update ( apop_data data,
apop_model prior,
apop_model likelihood,
gsl_rng *  rng 
)

Take in a prior and likelihood distribution, and output a posterior distribution.

This function first checks a table of conjugate distributions for the pair you sent in. If the names match the table, then the function returns a closed-form model with updated parameters. If the parameters aren't in the table of conjugate priors/likelihoods, then it uses Markov Chain Monte Carlo to sample from the posterior distribution, and then outputs a histogram model for further analysis. Notably, the histogram can be used as the input to this function, so you can chain Bayesian updating procedures.

  • If the prior distribution has a p or log_likelihood element, then I use apop_model_metropolis to generate the posterior.
  • If the prior does not have a p or log_likelihood but does have a draw element, then I make draws from the prior and weight them by the p given by the likelihood distribution. This is not a rejection sampling method, so the burnin is ignored.

Here are the conjugate distributions currently defined:

Prior Likelihood Notes
Beta Binomial
Beta Bernoulli
Exponential Gamma Gamma likelihood represents the distribution of $\lambda^{-1}$, not plain $\lambda$
Normal Normal Assumes prior with fixed $\sigma$; updates distribution for $\mu$
Gamma Poisson Uses sum and size of the data
  • The conjugate table is stored using a vtable; see Registering new methods in vtables for details. The typedef new functions must conform to and the hash used for lookups are:
1 typedef apop_model *(*apop_update_type)(apop_data *, apop_model , apop_model);
2 #define apop_update_hash(m1, m2) ((size_t)(m1).draw + (size_t)((m2).log_likelihood ? (m2).log_likelihood : (m2).p)*33)
Parameters
dataThe input data, that will be used by the likelihood function (default = NULL.)
priorThe prior apop_model. If the system needs to estimate the posterior via MCMC, this needs to have a log_likelihood or p method. (No default, must not be NULL.)
likelihoodThe likelihood apop_model. If the system needs to estimate the posterior via MCMC, this needs to have a log_likelihood or p method (ll preferred). (No default, must not be NULL.)
rngA gsl_rng, already initialized (e.g., via apop_rng_alloc). (default: an RNG from apop_rng_get_thread)
Returns
an apop_model struct representing the posterior, with updated parameters.

Here is a test function that compares the output via conjugate table and via Metropolis-Hastings sampling:

#include <apop.h>
//For the test suite.
void distances(gsl_vector *v1, gsl_vector *v2, double tol){
double error = apop_vector_distance(v1, v2, .metric='m');
double updated_size = apop_vector_sum(v1);
Apop_stopif(error/updated_size > tol, exit(1), 0, "The error is %g, which is too big.", error/updated_size);
}
int main(){
gsl_rng *r = apop_rng_alloc(2468);
double binom_start = 0.6;
double beta_start_a = 0.3;
double beta_start_b = 0.5;
double n = 4000;
//First, the easy estimation using the conjugate distribution table.
apop_model *beta = apop_model_set_parameters(apop_beta, beta_start_a, beta_start_b);
apop_model *updated = apop_update(.prior= beta, .likelihood=bin,.rng=r);
//Now estimate via MCMC.
//Requires a one-parameter binomial, with n fixed,
//and a data set of n data points with the right p.
apop_data *bin_draws = apop_data_falloc((1,2), n*(1-binom_start), n*binom_start);
bin = apop_model_fix_params(bcopy);
Apop_settings_add_group(beta, apop_mcmc, .burnin=.2, .periods=1e5);
apop_model *out_h = apop_update(bin_draws, beta, bin, NULL);
apop_model *out_beta = apop_estimate(out_h->data, apop_beta);
//Finally, we can compare the conjugate and Gibbs results:
distances(updated->parameters->vector, out_beta->parameters->vector, 0.01);
//The apop_update function used apop_model_metropolis to generate
//a batch of draws. Let's use apop_model_metropolis_draw to get draws.
int i, draws = 1.3e5;
apop_data *d = apop_data_alloc(draws, 1);
for(i=0; i < draws; i ++)
apop_draw(apop_data_ptr(d, i, 0), r, out_h);
distances(updated->parameters->vector, drawn->parameters->vector, 0.02);
}

Variable Documentation

apop_model* product
Initial value:
= &(apop_model){"product of two models",
.log_likelihood=product_ll, .constraint=product_constraint}
struct apop_model apop_model
Definition: apop.h:99

Autogenerated by doxygen on Sun Oct 26 2014 (Debian 0.999b+ds3-2).