smoothDE documentation
smoothDE is a library for inferring non-parametric probability distributions from data.
smoothDE
Smooth density estimator
Rhys M. Adams 24.11.07
- class smoothDE.estimator.DensityFitter(box=None, dpow=2, n_ts=500, ts=None, tf_from_alpha0=1, nodes=None, edges=None, positions=None, interpolator=None, max_val=100, enforce_pos_def=True, log_prior_coef=None, log_prior=None)
Fit a density estimate.
- Parameters:
box (list of ints, optional) – Bounding box to fit points in. Defaults to None.
dpow (int, optional) – Derivative used for smoothing penalty. Defaults to 2.
n_ts (int, optional) – Number of smoothing penalties to sample. Defaults to 500.
ts (_type_, optional) – Values of t at which smoothing penalties are sampled; the penalty l scales as l ~ e^-t. Defaults to None.
tf_from_alpha0 (float, optional) – Sampling stops when |density * number - tf_from_alpha0| == 0. Defaults to 1. These 3 parameters can be safely ignored unless you want to consider unusual density geometries.
nodes (numpy array, optional) – How much each Laplace point is weighted. Defaults to None.
edges (list of 2-tuples, optional) – How the Laplace points are connected to each other. Defaults to None.
positions (numpy array, optional) – Coordinate of all points to consider. Defaults to None.
interpolator (sklearn interpolator, optional) – Interpolator to guess densities. Defaults to None.
max_val (int, optional) – Clip phi=-log probability to this number. Defaults to 100.
enforce_pos_def (bool, optional) – For high smoothing penalties, the smoothing penalty matrix can be numerically mistaken as not positive definite. This alters the starting smoothing penalty so that this doesn’t happen. Defaults to True.
log_prior_coef (float, optional) – Multiply the log prior of t by this constant. If None, defaults to (n-1).
log_prior (lambda, optional) – Log prior of t: dpow * t / 2.
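The relation l ~ e^-t between the sampled values ts and the smoothing penalty can be illustrated with a small sketch. The helper name sample_penalties, the grid bounds t_min and t_max, and the unit proportionality constant are assumptions made for illustration; they are not taken from smoothDE.

```python
import math

def sample_penalties(n_ts=500, t_min=-10.0, t_max=10.0):
    """Return (ts, penalties) with penalties = exp(-t) on an evenly spaced t grid.

    Hypothetical helper: t_min, t_max and the unit constant are assumptions.
    """
    step = (t_max - t_min) / (n_ts - 1)
    ts = [t_min + i * step for i in range(n_ts)]
    penalties = [math.exp(-t) for t in ts]
    return ts, penalties

# A coarse grid of 5 samples: larger t means a smaller smoothing penalty.
ts, penalties = sample_penalties(n_ts=5)
for t, l in zip(ts, penalties):
    print(f"t = {t:6.2f}  ->  l = {l:.4g}")
```

Sampling evenly in t rather than in l spreads the samples evenly across orders of magnitude of the penalty.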
- export_graph()
Exports all of the elements of the Laplace graph.
- Returns:
Weight of each point, which coordinates connect, coordinates
- Return type:
numpy array, list of 2-tuples, numpy array
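As a rough picture of what such a (weights, edges, coordinates) triple can look like, the sketch below builds a 1-D chain and a standard unweighted graph Laplacian from its edges. The helpers chain_graph and graph_laplacian are hypothetical, plain lists stand in for numpy arrays, and how smoothDE itself turns these elements into its smoothing operator is not described here.

```python
def chain_graph(n_points, spacing=1.0):
    """Build a 1-D chain: uniform weights, neighbour edges, evenly spaced coordinates."""
    nodes = [1.0] * n_points                              # weight of each point
    edges = [(i, i + 1) for i in range(n_points - 1)]     # which points connect
    positions = [[i * spacing] for i in range(n_points)]  # coordinate of each point
    return nodes, edges, positions

def graph_laplacian(n_points, edges):
    """Unweighted graph Laplacian (degree matrix minus adjacency) as nested lists."""
    lap = [[0.0] * n_points for _ in range(n_points)]
    for i, j in edges:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    return lap

nodes, edges, positions = chain_graph(4)
laplacian = graph_laplacian(len(nodes), edges)
for row in laplacian:
    print(row)
```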
- export_interpolator()
Export the interpolator as an sklearn object, so you don’t need to pip install smoothDE to make predictions.
- Returns:
Predicts the fitted negative log probability, phi = -ln P, but without any checks.
- Return type:
interpolator object
- export_predictor()
Export a class capable of predictions, but not fitting. Much more memory efficient.
- Returns:
Predicts the fitted solution without all of the variables and memory held by the fitter.
- Return type:
- fit(points, phi0_fun=None)
Fit a density to empirical points.
- Parameters:
points (numpy array) – Data points used to train the model.
phi0_fun (lambda, optional) – phi0 function not subject to the smoothing penalty. Defaults to None.
- Returns:
Key statistics gathered during the fit: ts (sample points), phis (optimal solutions), dets (determinants of the Hessian of each solution), objs (action of each solution), occams (relative goodness of each solution), best_ind (index of the solution with the highest occam score).
- Return type:
dict
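The relationship between occams and best_ind in the returned dict can be sketched as follows. The dictionary below is a hand-written stand-in with made-up numbers, not output from smoothDE, and pick_best is a hypothetical helper.

```python
# Hypothetical fit summary; the numbers are made up for illustration only.
fit_stats = {
    "ts": [-2.0, 0.0, 2.0, 4.0],
    "occams": [-15.2, -11.7, -12.9, -20.1],
}

def pick_best(stats):
    """Return (index, t) of the entry with the highest occam score."""
    occams = stats["occams"]
    best_ind = max(range(len(occams)), key=occams.__getitem__)
    return best_ind, stats["ts"][best_ind]

best_ind, best_t = pick_best(fit_stats)
print(best_ind, best_t)
```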
- set_fit_request(*, phi0_fun: bool | None | str = '$UNCHANGED$', points: bool | None | str = '$UNCHANGED$') DensityFitter
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
phi0_fun (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for phi0_fun parameter in fit.
points (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for points parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') DensityFitter
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type:
object
Rhys M. Adams 24.11.08
- class smoothDE.estimator_predictor.EstimatorPredictor(box, max_val=100, interpolator=None)
Predicts density, cannot fit a new density.
- Parameters:
box (list) – bounding box of density estimator
max_val (int, optional) – clip phi higher than this number. Defaults to 100.
interpolator (interpolator, optional) – interpolator of coordinates to phi solution. Defaults to None.
- predict(X)
Predict phi based on coordinates
- Parameters:
X (numpy array) – points to find corresponding phi
- Returns:
phi estimates
- Return type:
numpy array
- predict_prob(X)
Predict probability, e^(-phi), based on coordinates
- Parameters:
X (numpy array) – points to predict probability of
- Returns:
probability estimates
- Return type:
numpy array
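The conversion from phi to probability, together with the max_val clipping described above, can be sketched in plain Python. The helper phi_to_prob is hypothetical, and the assumption that clipping is applied to phi from above before exponentiating is mine, not something this page states.

```python
import math

def phi_to_prob(phis, max_val=100):
    """Clip each phi at max_val from above, then return exp(-phi).

    Assumption: clipping happens before exponentiation, so very unlikely
    points get a tiny but non-zero probability of exp(-max_val).
    """
    return [math.exp(-min(phi, max_val)) for phi in phis]

probs = phi_to_prob([0.0, 1.0, 250.0])
print(probs)
```

Clipping keeps the exponent bounded, which avoids underflow to exactly zero for points far from the data.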
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') EstimatorPredictor
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type:
object
Rhys M. Adams 24.11.07
- class smoothDE.make_subfields.MakeSubfields(dpows, n_gridpoints, n_threads=1, paired=False, categories=None, maxent=False)
Fit sub-fields model
- Parameters:
dpows (list of ints) – derivative values for density fitter
n_gridpoints (list of ints) – grid sizes for density fitter
n_threads (int, optional) – Number of parallel threads to use. Defaults to 1.
paired (value of class, optional) – When calculating sub-fields, always subtract this class. Defaults to False.
categories (set, optional) – list of possible classifiers. Defaults to None.
maxent (bool, optional) – Whether the density should start from a high-order maximum entropy model. False means it starts from a uniform distribution. Defaults to False.
- smoothDE.make_subfields.get_density(X, y, category, n_gridpoints, dpow, maxent)
Function for running density fitter, allows for parallelization.
- Parameters:
X (numpy array) – Features to fit.
y (numpy array) – Numpy array of classes corresponding to X
category (set) – set of classes to fit
n_gridpoints (list of ints) – list of gridsizes
dpow (list of ints) – derivative to use for assessing smoothing penalty
maxent (bool) – Start from higher order maximum entropy density estimate?
- Returns:
fit density estimator
- Return type:
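The idea of a module-level function that can be dispatched once per category, as described above, can be sketched with the standard library. Everything here is an assumption for illustration: fit_one is a placeholder that only counts rows instead of fitting a density, fit_all is a hypothetical driver, and whether smoothDE uses threads, processes, or another mechanism is not stated on this page.

```python
from concurrent.futures import ThreadPoolExecutor

def fit_one(category, X, y):
    """Placeholder per-category job: returns the number of rows labelled `category`."""
    return category, sum(1 for label in y if label == category)

def fit_all(X, y, categories, n_threads=1):
    """Run fit_one for every category on a thread pool and collect results in a dict."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(fit_one, c, X, y) for c in sorted(categories)]
        return dict(f.result() for f in futures)

X = [[0.1], [0.4], [0.5], [0.9]]
y = ["a", "b", "a", "a"]
results = fit_all(X, y, categories={"a", "b"}, n_threads=2)
print(results)
```

Keeping the per-category work in a plain top-level function is what makes it easy to hand to a pool of workers.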
Rhys M. Adams 24.11.07
- class smoothDE.interpolator_transformer.InterpolatorTransformer(n_gridpoints, drs=None, paired=None, categories=None)
A class dedicated to predicting sub-fields
- Parameters:
n_gridpoints (list of ints) – list of gridpoints, higher is more accurate but slower
drs (dict, optional) – Dictionary of smoothDE objects. Defaults to None.
paired (value of class, optional) – If not None, always subtract this paired sub-field from current sub-field. Defaults to None.
categories (list, optional) – List of all classifier categories. Defaults to None.
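A literal reading of the paired option is a pointwise subtraction of one class's sub-field from the others. The sketch below shows only that arithmetic; subtract_paired is a hypothetical helper, the class labels and values are made up, and dropping the paired class from the output is an assumption rather than documented behaviour.

```python
def subtract_paired(subfields, paired=None):
    """Return sub-fields with the paired class's values subtracted pointwise.

    Assumption: when `paired` is given, the paired class itself is dropped
    from the result, since it would be identically zero.
    """
    if paired is None:
        return {c: list(v) for c, v in subfields.items()}
    ref = subfields[paired]
    return {
        c: [a - b for a, b in zip(v, ref)]
        for c, v in subfields.items()
        if c != paired
    }

subfields = {"control": [1.0, 2.0], "treated": [1.5, 1.0]}
relative = subtract_paired(subfields, paired="control")
print(relative)
```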
Rhys M. Adams 24.11.09
- class smoothDE.maxent.MultivariateMaxent(degree, center=None, sigma=None, bbox=None)
Class for fitting maximum entropy distributions
- Parameters:
degree (int) – highest moment/polynomial order to fit.
center (numpy array, optional) – center data by this much. Defaults to None.
sigma (numpy array, optional) – rescale data by this much. Defaults to None.
bbox (list, optional) – bounding box of integration. Defaults to None.
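To make the roles of degree, center and sigma more concrete, the sketch below standardizes 1-D data and computes the empirical moments up to a given degree, which are the quantities a maximum entropy fit of that order would match. The helpers standardize and raw_moments are hypothetical, and reading "center by" as subtraction and "rescale by" as division is an assumption.

```python
def standardize(values, center, sigma):
    """Shift by `center` and divide by `sigma` (assumed meaning of the two options)."""
    return [(v - center) / sigma for v in values]

def raw_moments(values, degree):
    """Empirical moments E[x^k] for k = 1..degree."""
    n = len(values)
    return [sum(v ** k for v in values) / n for k in range(1, degree + 1)]

data = [1.0, 2.0, 3.0, 4.0, 5.0]
z = standardize(data, center=3.0, sigma=2.0)
moments = raw_moments(z, degree=2)
print(z, moments)
```

Standardizing first keeps high-order powers of the data in a numerically comfortable range.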