Usage

To use Bayes Logistic Regression in a project:

    import bayes_logistic
Methods
bayes_logistic.bayes_logistic_prob(X, w, H)

    Posterior predictive logistic regression probability. Uses the probit
    approximation to the logistic sigmoid, with overflow prevention via
    exponent truncation.

    Parameters

    X : array-like, shape (N, p)
        array of covariates
    w : array-like, shape (p, )
        array of fitted MAP parameters
    H : array-like, shape (p, p) or (p, )
        array of the log posterior Hessian (inverse covariance matrix of the
        fitted MAP parameters)

    Returns

    pr : array-like, shape (N, )
        moderated (by the full posterior distribution) logistic probability

    References

    Chapter 8 of Murphy, K., 'Machine Learning: A Probabilistic Perspective', MIT Press (2012).
    Chapter 4 of Bishop, C., 'Pattern Recognition and Machine Learning', Springer (2006).
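The moderated probability can be sketched in plain NumPy. This is an illustrative implementation of the probit (MacKay) approximation the docstring describes, not the library's code; it assumes a full (p, p) Hessian, and `moderated_prob` is a hypothetical helper name:

```python
import numpy as np

def moderated_prob(X, w, H):
    """Sketch of the probit-approximated posterior predictive probability.

    Assumes H is the (p, p) Hessian of the negative log posterior, so the
    parameter covariance is its inverse. Hypothetical helper, not the
    library implementation.
    """
    X = np.atleast_2d(X)
    mu = X @ w                                 # predictive mean activation
    cov = np.linalg.inv(H)                     # Laplace covariance of w
    s2 = np.einsum('ij,jk,ik->i', X, cov, X)   # predictive variance per row
    # MacKay's probit approximation: shrink the activation by its variance
    kappa = 1.0 / np.sqrt(1.0 + np.pi * s2 / 8.0)
    z = np.clip(kappa * mu, -30, 30)           # exponent truncation vs overflow
    return 1.0 / (1.0 + np.exp(-z))
```

Because `kappa <= 1`, the moderated probabilities are always pulled toward 0.5 relative to the plain sigmoid of `X @ w`, reflecting the remaining uncertainty in `w`.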
bayes_logistic.get_pvalues(w, H)

    Calculates p-values for the fitted parameters. These can be used for
    variable selection, for example by discarding every parameter whose
    p-value is greater than 0.05 (or some other cutoff).

    Parameters

    w : array-like, shape (p, )
        array of posterior means of the fitted parameters
    H : array-like, shape (p, p) or (p, )
        array of the log posterior Hessian

    Returns

    pvals : array-like, shape (p, )
        array of p-values, one for each fitted parameter

    References

    Chapter 2 of Pawitan, Y., 'In All Likelihood', Oxford University Press (2013).
    See also Gerhard, F., 'Extraction of network topology from multi-electrode recordings: is there a small world effect?', Frontiers in Computational Neuroscience (2011) for a use case of p-value based variable selection.
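A Wald-style version of this computation can be sketched as follows. It assumes the standard errors come from the diagonal of the inverse Hessian and a two-sided normal test; `wald_pvalues` is a hypothetical helper, not necessarily the library's exact implementation:

```python
import numpy as np
from math import erfc, sqrt

def wald_pvalues(w, H):
    """Sketch of Wald-style p-values from a Laplace approximation.

    Assumes H is the (p, p) negative log posterior Hessian; standard
    errors are the square roots of the diagonal of its inverse.
    Hypothetical helper, not the library's exact implementation.
    """
    se = np.sqrt(np.diag(np.linalg.inv(H)))   # standard errors
    z = np.asarray(w) / se                    # z-scores
    # two-sided tail probability under a standard normal
    return np.array([erfc(abs(zi) / sqrt(2.0)) for zi in z])
```

With an identity Hessian (unit standard errors), a parameter of 2.0 gets a p-value of about 0.046 and would survive a 0.05 cutoff, while a parameter of 0.01 would be discarded.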
bayes_logistic.fit_bayes_logistic(y, X, wprior, H, weights=None, solver='Newton-CG', bounds=None, maxiter=100)

    Bayesian logistic regression solver. Assumes a Laplace (Gaussian)
    approximation to the posterior of the fitted parameter vector. Uses
    scipy.optimize.minimize.

    Parameters

    y : array-like, shape (N, )
        array of binary {0, 1} responses
    X : array-like, shape (N, p)
        array of features
    wprior : array-like, shape (p, )
        array of prior means of the parameters to be fit
    H : array-like, shape (p, p) or (p, )
        array of the prior Hessian (inverse covariance of the prior
        distribution of the parameters)
    weights : array-like, shape (N, )
        array of data point weights; each weight should lie in [0, 1]
    solver : string
        scipy.optimize solver to use; one of 'Newton-CG', 'BFGS', or
        'L-BFGS-B'. The default is 'Newton-CG'.
    bounds : iterable of length p
        a length-p list (or tuple) of tuples, each of length 2. Only used
        when the solver is 'L-BFGS-B', in which case a tuple
        (lower_bound, upper_bound) of floats is given for each parameter.
        See the scipy.optimize.minimize docs for further information.
    maxiter : int
        maximum number of iterations for the scipy.optimize.minimize solver

    Returns

    w_fit : array-like, shape (p, )
        posterior parameters (MAP estimate)
    H_fit : array-like, shape like H
        posterior Hessian (Hessian of the negative log posterior evaluated
        at the MAP parameters)

    References

    Chapter 8 of Murphy, K., 'Machine Learning: A Probabilistic Perspective', MIT Press (2012).
    Chapter 4 of Bishop, C., 'Pattern Recognition and Machine Learning', Springer (2006).
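The Laplace-approximation fit can be sketched in plain NumPy with a hand-rolled Newton solver standing in for scipy.optimize.minimize. This is a simplified illustration only: it assumes no per-point weights, no bounds, and a full (p, p) prior Hessian, and `fit_map_logistic` is a hypothetical helper, not the library code:

```python
import numpy as np

def fit_map_logistic(y, X, wprior, Hprior, maxiter=100, tol=1e-8):
    """Sketch of a Laplace-approximation logistic fit.

    Assumes a Gaussian prior N(wprior, inv(Hprior)) and minimizes the
    negative log posterior with full Newton steps (the library delegates
    to scipy.optimize.minimize instead). Hypothetical helper.
    """
    w = np.array(wprior, dtype=float)
    for _ in range(maxiter):
        mu = 1.0 / (1.0 + np.exp(-(X @ w)))               # current probabilities
        grad = X.T @ (mu - y) + Hprior @ (w - wprior)      # neg log posterior gradient
        S = mu * (1.0 - mu)                                # logistic variances
        Hess = (X.T * S) @ X + Hprior                      # neg log posterior Hessian
        step = np.linalg.solve(Hess, grad)
        w -= step
        if np.max(np.abs(step)) < tol:
            break
    # posterior Hessian evaluated at the MAP parameters, as returned by the fit
    mu = 1.0 / (1.0 + np.exp(-(X @ w)))
    S = mu * (1.0 - mu)
    H_fit = (X.T * S) @ X + Hprior
    return w, H_fit
```

The returned pair mirrors the (w_fit, H_fit) contract above: w_fit feeds bayes_logistic_prob and get_pvalues as the MAP parameters, and H_fit as the log posterior Hessian.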