PoissonGAM
- class pygam.pygam.PoissonGAM(terms='auto', max_iter=100, tol=0.0001, callbacks=['deviance', 'diffs'], fit_intercept=True, verbose=False, **kwargs)
Bases: GAM
Poisson GAM.
This is a GAM with a Poisson error distribution, and a log link.
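As a minimal sketch of what a log link means here (plain Python, not pyGAM's implementation): the additive predictor lives on the log scale, and the Poisson mean is recovered by exponentiating it. The names `log_link` and `inverse_log_link` are illustrative only.

```python
import math


def log_link(mu):
    """Map a positive Poisson mean onto the linear-predictor scale."""
    return math.log(mu)


def inverse_log_link(eta):
    """Map a linear predictor back to a positive mean."""
    return math.exp(eta)


# A predictor of 0 corresponds to an expected count of 1, and adding
# log(2) to the predictor doubles the expected count.
base = inverse_log_link(0.0)
doubled = inverse_log_link(0.0 + math.log(2.0))
```

Because effects add on the log scale, each term acts multiplicatively on the expected count.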
- Parameters:
- terms: expression specifying terms to model, optional
By default, a univariate spline term is allocated for each feature.
For example:
>>> GAM(s(0) + l(1) + f(2) + te(3, 4))
will fit a spline term on feature 0, a linear term on feature 1, a factor term on feature 2, and a tensor term on features 3 and 4.
- callbacks: list of str or list of CallBack objects, optional
Names of callback objects to call during the optimization loop.
- fit_intercept: bool, optional
Specifies whether a constant (a.k.a. bias or intercept) should be added to the decision function. Note: the intercept receives no smoothing penalty.
- max_iter: int, optional
Maximum number of iterations allowed for the solver to converge.
- tol: float, optional
Tolerance for stopping criteria.
- verbose: bool, optional
Whether to show pyGAM warnings.
- Attributes:
- coef_: array, shape (n_classes, m_features)
Coefficient of the features in the decision function. If fit_intercept is True, then self.coef_[-1] will contain the bias.
- statistics_: dict
Dictionary containing model statistics like GCV/UBRE scores, AIC/c, parameter covariances, estimated degrees of freedom, etc.
- logs_: dict
Dictionary containing the outputs of any callbacks at each optimization loop.
The logs are structured as
{callback: [...]}
Methods
confidence_intervals(X[, width, quantiles]): Estimate confidence intervals for the model.
deviance_residuals(X, y[, weights, scaled]): Method to compute the deviance residuals of the model.
fit(X, y[, exposure, weights]): Fit the generalized additive model.
generate_X_grid(term[, n, meshgrid]): Create a nice grid of X data.
get_params([deep]): Returns a dict of all of the object's user-facing parameters.
gridsearch(X, y[, exposure, weights, ...]): Performs a grid search over a space of parameters for a given objective.
loglikelihood(X, y[, exposure, weights]): Compute the log-likelihood of the dataset using the current model.
partial_dependence(term[, X, width, ...]): Computes the term functions for the GAM and possibly their confidence intervals.
predict(X[, exposure]): Predict the expected value of the target given the model and input X; often this is done via the expected value of the GAM given input X.
predict_mu(X): Predict the expected value of the target given the model and input X.
sample(X, y[, quantity, sample_at_X, ...]): Simulate from the posterior of the coefficients and smoothing params.
score(X, y[, weights]): Compute the explained deviance for a trained model for a given X data and y labels.
set_params([deep, force]): Sets an object's parameters.
summary(): Produce a summary of the model statistics.
References
Simon N. Wood, 2006. Generalized Additive Models: an introduction with R.
Hastie, Tibshirani, Friedman. The Elements of Statistical Learning.
Paul Eilers, Brian Marx, and Maria Durbán, 2015. Twenty years of P-splines.
- confidence_intervals(X, width=0.95, quantiles=None)
Estimate confidence intervals for the model.
- Parameters:
- X: array-like of shape (n_samples, m_features)
Input data matrix
- width: float on [0,1], optional
- quantiles: array-like of floats in (0, 1), optional
Instead of specifying the prediction width, one can specify the quantiles. So width=.95 is equivalent to quantiles=[.025, .975]
- Returns:
- intervals: np.array of shape (n_samples, 2 or len(quantiles))
Notes
- Wood 2006, section 4.9
Confidence intervals based on section 4.8 rely on large sample results to deal with non-Gaussian distributions, and treat the smoothing parameters as fixed, when in reality they are estimated from the data.
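The stated equivalence between width and quantiles can be sketched as follows. This assumes a central (equal-tailed) interval; `width_to_quantiles` is a hypothetical helper, not part of pyGAM.

```python
def width_to_quantiles(width):
    """Return the [lower, upper] quantiles of a central interval of the given width."""
    if not 0.0 < width < 1.0:
        raise ValueError("width must lie strictly between 0 and 1")
    tail = (1.0 - width) / 2.0  # probability mass left in each tail
    return [tail, 1.0 - tail]


# A 95% interval leaves roughly 2.5% in each tail.
lower, upper = width_to_quantiles(0.95)
```

Passing quantiles directly is mainly useful when more than two cut points are wanted, which is why the returned array may have len(quantiles) columns.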
- deviance_residuals(X, y, weights=None, scaled=False)
Method to compute the deviance residuals of the model.
These are analogous to the residuals of an OLS.
- Parameters:
- X: array-like
Input data array of shape (n_samples, m_features)
- y: array-like
Output data vector of shape (n_samples, )
- weights: array-like shape (n_samples, ) or None, optional
Sample weights. If None, defaults to an array of ones.
- scaled: bool, optional
Whether to scale the deviance by the (estimated) distribution scale.
- Returns:
- deviance_residuals: np.array
with shape (n_samples, )
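For intuition, the textbook Poisson deviance residual is the signed square root of each observation's contribution to the deviance. The sketch below uses NumPy and takes the fitted means `mu` as given; it ignores weights and scaling, and it is not necessarily how pyGAM computes the quantity internally.

```python
import numpy as np


def poisson_deviance_residuals(y, mu):
    """Textbook Poisson deviance residuals: sign(y - mu) * sqrt(d_i)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # y * log(y / mu) is taken to be 0 when y == 0 (its limiting value).
    ratio = np.where(y > 0, y / mu, 1.0)
    unit_deviance = 2.0 * (y * np.log(ratio) - (y - mu))
    # Clip tiny negative values caused by floating-point round-off.
    return np.sign(y - mu) * np.sqrt(np.clip(unit_deviance, 0.0, None))


y = np.array([0, 2, 5])           # hypothetical observed counts
mu = np.array([1.0, 2.0, 3.0])    # hypothetical fitted means
residuals = poisson_deviance_residuals(y, mu)
```

An observation equal to its fitted mean gets a residual of zero, and the sign shows whether the model over- or under-predicted.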
- fit(X, y, exposure=None, weights=None)
Fit the generalized additive model.
- Parameters:
- X: array-like, shape (n_samples, m_features)
Training vectors, where n_samples is the number of samples and m_features is the number of features.
- y: array-like, shape (n_samples, )
Target values (integers in classification, real numbers in regression). For classification, labels must correspond to classes.
- exposure: array-like shape (n_samples, ) or None, default: None
Exposures. If None, defaults to an array of ones.
- weights: array-like shape (n_samples, ) or None, default: None
Sample weights. If None, defaults to an array of ones.
- Returns:
- self: object
Returns the fitted GAM object.
- generate_X_grid(term, n=100, meshgrid=False)
Create a nice grid of X data.
The array is sorted by feature and uniformly spaced, so the marginal and joint distributions are likely wrong.
If term is >= 0, we generate n samples per feature, which results in n^deg samples, where deg is the degree of the interaction of the term.
- Parameters:
- term: int
Which term to process.
- n: int, optional
Number of data points to create.
- meshgrid: bool, optional
Whether to return a meshgrid (useful for 3d plotting) or a feature matrix (useful for inference like partial predictions).
- Returns:
- if meshgrid is False:
np.array with n_features columns and n^m rows, where m is the number of (sub)terms in the requested (tensor)term.
- else:
tuple of len m, where m is the number of (sub)terms in the requested (tensor)term.
Each element in the tuple contains a np.ndarray of size (n)^m.
- Raises:
- ValueError
If the term requested is an intercept since it does not make sense to process the intercept term.
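The n^deg growth described above can be illustrated with plain NumPy. This is a sketch of the idea for a two-feature tensor term, not pyGAM's code; the feature ranges in `edges` are made up.

```python
import numpy as np

n = 4                                   # points per feature
edges = [(0.0, 1.0), (10.0, 20.0)]      # hypothetical (min, max) range per sub-term

# One uniformly spaced axis per sub-term.
axes = [np.linspace(lo, hi, num=n) for lo, hi in edges]

# Meshgrid form: m arrays, each of shape (n, n) when m == 2.
mesh = np.meshgrid(*axes, indexing="ij")

# Feature-matrix form: one row per grid point, one column per sub-term.
flat = np.column_stack([g.ravel() for g in mesh])
```

With m = 2 sub-terms and n = 4 points each, the flattened grid has 4^2 = 16 rows, so n should be kept modest for higher-order interactions.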
- get_params(deep=False)
Returns a dict of all of the object’s user-facing parameters.
- Parameters:
- deep: boolean, default: False
When True, also gets non-user-facing parameters.
- Returns:
- dict
- gridsearch(X, y, exposure=None, weights=None, return_scores=False, keep_best=True, objective='auto', **param_grids)
Performs a grid search over a space of parameters for a given objective.
NOTE: the gridsearch method is lazy and will not remove useless combinations from the search space, e.g.
>>> n_splines=np.arange(5,10), fit_splines=[True, False]
will result in 10 loops, of which 5 are equivalent, because n_splines has no effect when fit_splines==False.
It is not recommended to search over a grid that alternates between known scales and unknown scales, as the scores of the candidate models will not be comparable.
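To see where the 10 loops in the note come from, the sketch below enumerates the search space with the standard library. It assumes the grid is the Cartesian product of the supplied iterables and is not pyGAM's implementation.

```python
from itertools import product

param_grids = {
    "n_splines": [5, 6, 7, 8, 9],       # same values as np.arange(5, 10)
    "fit_splines": [True, False],
}

names = list(param_grids)
combinations = [
    dict(zip(names, values)) for values in product(*param_grids.values())
]

# Every combination is visited, including those that differ only in a
# parameter that has no effect for that setting.
n_loops = len(combinations)
n_without_splines = sum(1 for c in combinations if not c["fit_splines"])
```

Here n_loops is 10 and n_without_splines is 5. The cost of a search is the product of the grid lengths, so each extra parameter multiplies the number of fits.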
- Parameters:
- X: array
Input data of shape (n_samples, m_features)
- y: array
Label data of shape (n_samples, )
- exposure: array-like shape (n_samples, ) or None, default: None
Exposures. If None, defaults to an array of ones.
- weights: array-like shape (n_samples, ) or None, default: None
Sample weights. If None, defaults to an array of ones.
- return_scores: boolean, default False
Whether to return the hyperparameters and score for each element in the grid.
- keep_best: boolean, default True
Whether to keep the best GAM as self.
- objective: string, default: 'auto'
Metric to optimize. Must be in ['AIC', 'AICc', 'GCV', 'UBRE', 'auto']. If 'auto', then grid search will optimize GCV for models with unknown scale and UBRE for models with known scale.
- **param_grids: dict, default {'lam': np.logspace(-3, 3, 11)}
Pairs of parameters and iterables of floats, or parameters and iterables of iterables of floats.
If iterable of iterables of floats, the outer iterable must have length m_features.
The method will make a grid of all the combinations of the parameters and fit a GAM to each combination.
- Returns:
- if return_scores == True:
- model_scores: dict
Contains each fitted model as keys and corresponding objective scores as values
- else:
self, i.e. possibly the newly fitted model
- loglikelihood(X, y, exposure=None, weights=None)
Compute the log-likelihood of the dataset using the current model.
- Parameters:
- X: array-like of shape (n_samples, m_features)
containing the input dataset
- y: array-like of shape (n_samples, )
containing target values
- exposure: array-like shape (n_samples, ) or None, default: None
containing exposures. If None, defaults to an array of ones.
- weights: array-like of shape (n_samples, ) or None, default: None
containing sample weights
- Returns:
- log-likelihood: np.array of shape (n_samples, )
containing log-likelihood scores
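For reference, the Poisson log-probability of a count y with mean mu is y*log(mu) - mu - log(y!). The sketch below evaluates it with the standard library only. It takes the fitted means as given and ignores weights and exposure, so it illustrates the formula rather than reproducing pyGAM's method.

```python
import math


def poisson_log_pmf(y, mu):
    """log P(Y = y) for a Poisson variable with mean mu."""
    # lgamma(y + 1) equals log(y!) for non-negative integers y.
    return y * math.log(mu) - mu - math.lgamma(y + 1)


y = [0, 1, 2]                # hypothetical observed counts
mu = [1.0, 1.0, 1.0]         # hypothetical fitted means

per_sample = [poisson_log_pmf(y_i, mu_i) for y_i, mu_i in zip(y, mu)]
total = sum(per_sample)
```

Higher (less negative) values indicate that the fitted means explain the observed counts better.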
- partial_dependence(term, X=None, width=None, quantiles=None, meshgrid=False)
Computes the term functions for the GAM and possibly their confidence intervals.
If both width=None and quantiles=None, then no confidence intervals are computed.
- Parameters:
- term: int
Term for which to compute the partial dependence functions.
- X: array-like with input data, optional
If meshgrid=False, then X should be an array-like of shape (n_samples, m_features).
If meshgrid=True, then X should be a tuple containing an array for each feature in the term.
If None, an equally spaced grid of points is generated.
- width: float on (0, 1), optional
Width of the confidence interval.
- quantiles: array-like of floats on (0, 1), optional
Instead of specifying the prediction width, one can specify the quantiles, so width=.95 is equivalent to quantiles=[.025, .975]. If None, defaults to width.
- meshgrid: bool, optional
Whether to return and accept meshgrids.
Useful for creating outputs that are suitable for 3D plotting.
Note: for simple terms with no interactions, the output of this function will be the same for meshgrid=True and meshgrid=False, but the inputs will need to be different.
- Returns:
- pdeps: np.array of shape (n_samples, )
- conf_intervals: list of length len(term)
containing np.arrays of shape (n_samples, 2 or len(quantiles))
- Raises:
- ValueError
If the term requested is an intercept, since it does not make sense to process the intercept term.
See also
generate_X_grid: for help creating meshgrids.
- predict(X, exposure=None)
Predict the expected value of the target given the model and input X. Often this is done via the expected value of the GAM given input X.
- Parameters:
- X: array-like of shape (n_samples, m_features)
containing the input dataset
- exposure: array-like shape (n_samples, ) or None, default: None
containing exposures. If None, defaults to an array of ones.
- Returns:
- y: np.array of shape (n_samples, )
containing predicted values under the model
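A rough sketch of the role of exposure, assuming it enters multiplicatively as is standard for Poisson rate models: the model supplies an expected rate per unit of exposure, and the expected count is that rate times the exposure. `expected_counts` and the numbers are hypothetical.

```python
def expected_counts(rates, exposure=None):
    """Scale per-unit-exposure rates into expected counts."""
    if exposure is None:
        exposure = [1.0] * len(rates)   # mirrors the documented default of ones
    if len(exposure) != len(rates):
        raise ValueError("rates and exposure must have the same length")
    return [r * e for r, e in zip(rates, exposure)]


rates = [0.5, 2.0, 1.5]                 # hypothetical expected events per unit of exposure
counts = expected_counts(rates, exposure=[10.0, 1.0, 4.0])
```

With the default exposure of ones, the expected counts and the rates coincide.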
- predict_mu(X)
Predict the expected value of the target given the model and input X.
- Parameters:
- X: array-like of shape (n_samples, m_features)
containing the input dataset
- Returns:
- y: np.array of shape (n_samples, )
containing expected values under the model
- sample(X, y, quantity='y', sample_at_X=None, weights=None, n_draws=100, n_bootstraps=5, objective='auto')
Simulate from the posterior of the coefficients and smoothing params.
Samples are drawn from the posterior of the coefficients and smoothing parameters given the response in an approximate way. The GAM must already be fitted before calling this method; if the model has not been fitted, then an exception is raised. Moreover, it is recommended that the model and its hyperparameters be chosen with gridsearch (with the parameter keep_best=True) before calling sample, so that the result of that gridsearch can be used to generate useful response data and so that the model’s coefficients (and their covariance matrix) can be used as the first bootstrap sample.
These samples are drawn as follows. Details are in the reference below.
1. n_bootstraps many “bootstrap samples” of the response (y) are simulated by drawing random samples from the model’s distribution evaluated at the expected values (mu) for each sample in X.
2. A copy of the model is fitted to each of those bootstrap samples of the response. The result is an approximation of the distribution over the smoothing parameter lam given the response data y.
3. Samples of the coefficients are simulated from a multivariate normal using the bootstrap samples of the coefficients and their covariance matrices.
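The three steps can be sketched in NumPy with a deliberately tiny stand-in for the model. The refit in `fit_intercept_only` is an intercept-only Poisson fit invented for illustration, where a real run would refit the GAM and search over lam. The fitted means `mu` are made up, and splitting n_draws evenly across bootstraps is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bootstraps, n_draws = 5, 100

# Hypothetical fitted means for 50 observations (stand-in for the fitted model).
mu = np.full(50, 3.0)


def fit_intercept_only(y):
    """Stand-in refit: intercept-only Poisson model with a log link."""
    coef = np.array([np.log(y.mean())])
    cov = np.array([[1.0 / y.sum()]])   # inverse Fisher information
    return coef, cov


# Step 1: simulate bootstrap responses from the model's distribution at mu.
y_boot = rng.poisson(mu, size=(n_bootstraps, mu.size))

# Step 2: refit a copy of the model to each bootstrap response.
fits = [fit_intercept_only(y_b) for y_b in y_boot]

# Step 3: draw coefficients from a multivariate normal around each refit.
per_fit = n_draws // n_bootstraps
draws = np.vstack(
    [rng.multivariate_normal(coef, cov, size=per_fit) for coef, cov in fits]
)
```

Each row of `draws` is one simulated coefficient vector. Pooling draws across refits lets the sample reflect uncertainty from re-estimating the model as well as from the coefficients.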
- Parameters:
- X: array of shape (n_samples, m_features)
Empirical input data.
- y: array of shape (n_samples, )
Empirical response vector.
- quantity: {'y', 'coef', 'mu'}, default: 'y'
What quantity to return pseudorandom samples of. If sample_at_X is not None and quantity is either 'y' or 'mu', then samples are drawn at the values of X specified in sample_at_X.
- sample_at_X: array of shape (n_samples_to_simulate, m_features) or None, optional
Input data at which to draw new samples.
Only applies for quantity equal to 'y' or to 'mu'. If None, then sample_at_X is replaced by X.
- weights: np.array of shape (n_samples, )
Sample weights.
- n_draws: positive int, optional (default=100)
The number of samples to draw from the posterior distribution of the coefficients and smoothing parameters.
- n_bootstraps: positive int, optional (default=5)
The number of bootstrap samples to draw from simulations of the response (from the already fitted model) to estimate the distribution of the smoothing parameters given the response data. If n_bootstraps is 1, then only the already fitted model’s smoothing parameter is used, and the distribution over the smoothing parameters is not estimated using bootstrap sampling.
- objective: string, optional (default='auto')
Metric to optimize in grid search. Must be in ['AIC', 'AICc', 'GCV', 'UBRE', 'auto']. If 'auto', then grid search will optimize GCV for models with unknown scale and UBRE for models with known scale.
- Returns:
- draws: 2D array of length n_draws
Simulations of the given quantity using samples from the posterior distribution of the coefficients and smoothing parameter given the response data. Each row is a pseudorandom sample.
If quantity == 'coef', then the number of columns of draws is the number of coefficients (len(self.coef_)).
Otherwise, the number of columns of draws is the number of rows of sample_at_X if sample_at_X is not None or else the number of rows of X.
Notes
A gridsearch is done n_bootstraps many times, so keep n_bootstraps small. Make n_bootstraps < n_draws to take advantage of the expensive bootstrap samples of the smoothing parameters.
References
Simon N. Wood, 2006. Generalized Additive Models: an introduction with R. Section 4.9.3 (pages 198–199) and Section 5.4.2 (pages 256–257).
- score(X, y, weights=None)
Compute the explained deviance for a trained model for a given X data and y labels.
- Parameters:
- X: array-like
Input data array of shape (n_samples, m_features)
- y: array-like
Output data vector of shape (n_samples, )
- weights: array-like shape (n_samples, ) or None, optional
Sample weights. If None, defaults to an array of ones.
- Returns:
- explained deviance score: np.array() (n_samples, )
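Explained deviance is commonly defined as one minus the ratio of the model's deviance to that of an intercept-only (null) model. The sketch below applies that definition to Poisson data using the standard library; it ignores weights and is an illustration of the definition, not pyGAM's code.

```python
import math


def poisson_deviance(y, mu):
    """Total Poisson deviance of observed counts y against means mu."""
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        # y * log(y / mu) is 0 in the limit y -> 0.
        log_term = y_i * math.log(y_i / mu_i) if y_i > 0 else 0.0
        total += 2.0 * (log_term - (y_i - mu_i))
    return total


def explained_deviance(y, mu):
    """1 - D(model) / D(null), with the null model predicting the mean of y."""
    null_mu = [sum(y) / len(y)] * len(y)
    return 1.0 - poisson_deviance(y, mu) / poisson_deviance(y, null_mu)


y = [1, 2, 6]                                       # hypothetical counts
perfect = explained_deviance(y, [1.0, 2.0, 6.0])    # model reproduces y exactly
null = explained_deviance(y, [3.0, 3.0, 3.0])       # model no better than the mean
```

A perfect fit scores 1 and a model that only predicts the overall mean scores 0, so the value reads much like R-squared in ordinary regression.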
- set_params(deep=False, force=False, **parameters)
Sets an object’s parameters.
- Parameters:
- deep: boolean, default: False
When True, also sets non-user-facing parameters.
- force: boolean, default: False
When True, also sets parameters that the object does not already have.
- **parameters: parameters to set
- Returns:
- self