smoothDE documentation

smoothDE is a library for inferring non-parametric probability distributions from data.

smoothDE

Smooth density estimator

Rhys M. Adams 24.11.07

class smoothDE.estimator.DensityFitter(box=None, dpow=2, n_ts=500, ts=None, tf_from_alpha0=1, nodes=None, edges=None, positions=None, interpolator=None, max_val=100, enforce_pos_def=True, log_prior_coef=None, log_prior=None)

Fit a density estimate.

Parameters:
  • box (list of ints, optional) – Bounding box to fit points in. Defaults to None.

  • dpow (int, optional) – Derivative used for smoothing penalty. Defaults to 2.

  • n_ts (int, optional) – Number of smoothing penalties to sample. Defaults to 500.

  • ts (_type_, optional) – Values of t at which smoothing penalties are sampled, where the penalty scales as l ~ e^-t. Defaults to None.

  • tf_from_alpha0 (float, optional) – Sampling stops when |density * number - tf_from_alpha0| == 0. Defaults to 1. The next three parameters (nodes, edges, positions) can be safely ignored unless you want to consider unusual density geometries.

  • nodes (numpy array, optional) – Weight of each Laplace point. Defaults to None.

  • edges (list of 2-tuples, optional) – How the Laplace points are connected to one another. Defaults to None.

  • positions (numpy array, optional) – Coordinate of all points to consider. Defaults to None.

  • interpolator (sklearn interpolator, optional) – Interpolator to guess densities. Defaults to None.

  • max_val (int, optional) – Clip phi = -log(probability) at this value. Defaults to 100.

  • enforce_pos_def (bool, optional) – For high smoothing penalties, numerical error can make the smoothing penalty matrix appear not to be positive definite. This option alters the starting smoothing penalty so this doesn’t happen. Defaults to True.

  • log_prior_coef (float, optional) – Multiply the log prior of t by this constant. If None, defaults to (n-1).

  • log_prior (callable, optional) – Log prior of t. If None, defaults to dpow * t / 2.
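The relation between ts, the smoothing penalty, and the log prior can be sketched in plain Python. This is only an illustration: the helper `make_t_grid` and its endpoints are hypothetical, and smoothDE may choose its own sample points when ts is None.

```python
import math

def make_t_grid(t_min, t_max, n_ts):
    """Return n_ts evenly spaced t values (hypothetical helper, not part of smoothDE)."""
    step = (t_max - t_min) / (n_ts - 1)
    return [t_min + i * step for i in range(n_ts)]

dpow = 2                          # derivative order used by the smoothing penalty
ts = make_t_grid(-5.0, 5.0, 11)   # illustrative endpoints, not smoothDE defaults

# The documentation states l ~ e^-t, so a larger t means a weaker penalty.
penalties = [math.exp(-t) for t in ts]

# Log prior of t as given in the log_prior entry: dpow * t / 2.
log_priors = [dpow * t / 2 for t in ts]
```

Under this reading, sweeping t from low to high moves the fit from heavily smoothed to lightly smoothed solutions.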

export_graph()

Export all elements of the Laplace graph.

Returns:

Weight of each point, which points are connected, and the coordinates of each point

Return type:

numpy array, list of 2-tuples, numpy array
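As a rough picture of what a (weights, edges, coordinates) triple can describe, the sketch below builds a 1-D chain of points and the combinatorial graph Laplacian L = D - A from its edge list. The helpers `chain_graph` and `graph_laplacian` are hypothetical, plain lists stand in for numpy arrays, and the way smoothDE actually uses node weights and edges internally may differ.

```python
def chain_graph(n_points, spacing=1.0):
    """Build a 1-D chain: node weights, neighbour edges, and coordinates."""
    nodes = [spacing] * n_points                          # weight of each point
    edges = [(i, i + 1) for i in range(n_points - 1)]     # which points connect
    positions = [[i * spacing] for i in range(n_points)]  # coordinates
    return nodes, edges, positions

def graph_laplacian(n_points, edges):
    """Dense combinatorial Laplacian L = D - A built from an edge list."""
    lap = [[0.0] * n_points for _ in range(n_points)]
    for i, j in edges:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    return lap

nodes, edges, positions = chain_graph(4)
lap = graph_laplacian(len(nodes), edges)
```

Supplying a custom triple of this kind is what the nodes, edges and positions constructor arguments are for; the default grid covers ordinary use.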

export_interpolator()

Export the interpolator as an sklearn object, so that predictions can be made without installing smoothDE.

Returns:

predicts fit solution of negative log probability, phi=-ln P, but without any checks.

Return type:

interpolator object

export_predictor()

export a class capable of predictions, but not fitting. Much more memory efficient.

Returns:

predicts fit solution without all of the variables and memory

Return type:

EstimatorPredictor

fit(points, phi0_fun=None)

fit density to empirical points

Parameters:
  • points (numpy array) – data points to train model

  • phi0_fun (callable, optional) – phi0 function not subject to smoothing penalty. Defaults to None.

Returns:

Key statistics from the fit: ts, the sample points; phis, the optimal solutions; dets, the determinants of the Hessian of each solution; objs, the action of each solution; occams, the relative goodness of each solution; best_ind, the index of the solution with the highest occam score.

Return type:

dict
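A minimal sketch of reading the returned dictionary, assuming its keys are exactly the names listed above and that each entry is indexed by sample point. The numbers in `mock_stats` are made up and the helper `select_best` is hypothetical; a real fit would supply its own values.

```python
def select_best(fit_stats):
    """Return (t, phi) of the sampled solution with the highest occam score."""
    occams = fit_stats["occams"]
    best = max(range(len(occams)), key=lambda i: occams[i])
    return fit_stats["ts"][best], fit_stats["phis"][best]

# Mock dictionary standing in for the value returned by fit (made-up numbers).
mock_stats = {
    "ts": [0.0, 1.0, 2.0],
    "phis": [[1.2, 0.8], [1.0, 1.0], [0.9, 1.1]],
    "occams": [-3.0, -1.5, -2.2],
    "best_ind": 1,
}

best_t, best_phi = select_best(mock_stats)
```

In practice best_ind already records this index, so `fit_stats["phis"][fit_stats["best_ind"]]` should select the same solution.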

set_fit_request(*, phi0_fun: bool | None | str = '$UNCHANGED$', points: bool | None | str = '$UNCHANGED$') → DensityFitter

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • phi0_fun (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for phi0_fun parameter in fit.

  • points (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for points parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → DensityFitter

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

Rhys M. Adams 24.11.08

class smoothDE.estimator_predictor.EstimatorPredictor(box, max_val=100, interpolator=None)

Predicts density, cannot fit a new density.

Parameters:
  • box (list) – bounding box of density estimator

  • max_val (int, optional) – clip phi higher than this number. Defaults to 100.

  • interpolator (interpolator, optional) – interpolator of coordinates to phi solution. Defaults to None.

predict(X)

Predict phi based on coordinates

Parameters:

X (numpy array) – points to find corresponding phi

Returns:

phi estimates

Return type:

numpy array

predict_prob(X)

Predict probability, e^(-phi) based on coordinates

Parameters:

X (numpy array) – points to predict probability of

Returns:

probability estimates

Return type:

numpy array
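The relationship between predict and predict_prob can be sketched as follows, assuming that phi is clipped at max_val before exponentiation. The helper `phi_to_prob` is hypothetical and uses plain lists in place of numpy arrays; it is not the library's implementation.

```python
import math

def phi_to_prob(phis, max_val=100):
    """Clip phi at max_val, then convert to a probability density e^(-phi)."""
    return [math.exp(-min(phi, max_val)) for phi in phis]

probs = phi_to_prob([0.0, 2.0, 500.0])
```

Clipping keeps the returned density strictly positive even where phi is very large.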

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → EstimatorPredictor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

Rhys M. Adams 24.11.07

class smoothDE.make_subfields.MakeSubfields(dpows, n_gridpoints, n_threads=1, paired=False, categories=None, maxent=False)

Fit sub-fields model

Parameters:
  • dpows (list of ints) – derivative values for density fitter

  • n_gridpoints (list of ints) – grid sizes for density fitter

  • n_threads (int, optional) – Number of parallel threads to use. Defaults to 1.

  • paired (value of class, optional) – When calculating sub-fields, always subtract the sub-field of this class. Defaults to False.

  • categories (set, optional) – Set of possible classes. Defaults to None.

  • maxent (bool, optional) – Whether the density starts from a higher-order maximum entropy model. False means it starts from a uniform distribution. Defaults to False.

fit(X, y)

Fit sub-field model

Parameters:
  • X (numpy array) – Features to fit, transform

  • y (numpy array) – Classifier response

fit_transform(X, y)

Fit the model and transform X in one call, for compatibility with sklearn.

Parameters:
  • X (numpy array) – Features to fit

  • y (numpy array) – class of features

Returns:

sub-fields

Return type:

numpy array

smoothDE.make_subfields.get_density(X, y, category, n_gridpoints, dpow, maxent)

Function for running density fitter, allows for parallelization.

Parameters:
  • X (numpy array) – Numpy array features to fit.

  • y (numpy array) – Numpy array of classes corresponding to X

  • category (set) – set of classes to fit

  • n_gridpoints (list of ints) – list of gridsizes

  • dpow (list of ints) – derivative to use for assessing smoothing penalty

  • maxent (bool) – Start from higher order maximum entropy density estimate?

Returns:

fit density estimator

Return type:

DensityFitter

Rhys M. Adams 24.11.07

class smoothDE.interpolator_transformer.InterpolatorTransformer(n_gridpoints, drs=None, paired=None, categories=None)

A class dedicated to predicting sub-fields

Parameters:
  • n_gridpoints (list of ints) – list of gridpoints, higher is more accurate but slower

  • drs (dict, optional) – Dictionary of smoothDE objects. Defaults to None.

  • paired (value of class, optional) – If not None, always subtract this paired sub-field from current sub-field. Defaults to None.

  • categories (list, optional) – List of all classifier categories. Defaults to None.

predict(X)

Create a dictionary of subfields from the points in X

Parameters:

X (numpy array) – a numpy array of datapoints

Returns:

a dictionary of numpy arrays corresponding to sub-fields

Return type:

dict

transform(X)

Create a numpy array from the points in X

Parameters:

X (numpy array) – a numpy array of datapoints

Returns:

a numpy array corresponding to sub-fields

Return type:

numpy array
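The difference between the dictionary returned by predict and the array returned by transform, together with the effect of paired, can be pictured with the sketch below. It rests on several assumptions: that each sub-field holds one value per data point, that pairing subtracts the paired class's sub-field and drops that class, and that columns are ordered by sorted class label. The helpers `subtract_paired` and `to_rows` are hypothetical and the values are made up.

```python
def subtract_paired(subfields, paired):
    """Subtract the paired class's sub-field from every other sub-field."""
    base = subfields[paired]
    return {
        key: [v - b for v, b in zip(values, base)]
        for key, values in subfields.items()
        if key != paired
    }

def to_rows(subfields):
    """Stack a dict of per-class sub-fields into one row per data point."""
    keys = sorted(subfields)
    return [list(row) for row in zip(*(subfields[k] for k in keys))]

# Made-up sub-field values for three points and classes "a", "b", "c".
fields = {"a": [1.0, 2.0, 3.0], "b": [0.5, 0.5, 0.5], "c": [2.0, 1.0, 0.0]}
rows = to_rows(subtract_paired(fields, paired="b"))
```

The row-per-point layout is the shape that downstream sklearn estimators expect as a feature matrix.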

Rhys M. Adams 24.11.09

class smoothDE.maxent.MultivariateMaxent(degree, center=None, sigma=None, bbox=None)

Class for fitting maximum entropy distributions

Parameters:
  • degree (int) – highest moment/polynomial order to fit.

  • center (numpy array, optional) – center data by this much. Defaults to None.

  • sigma (numpy array, optional) – rescale data by this much. Defaults to None.

  • bbox (list, optional) – bounding box of integration. Defaults to None.

fit(data, params0=None)

fit a maximum entropy solution

Parameters:
  • data (numpy array) – data

  • params0 (numpy array, optional) – initial guess of solution. Defaults to None.

predict(data)

get ln P of maximum entropy solution

Parameters:

data (numpy array) – points to predict

Returns:

ln P, prediction of maximum entropy solution

Return type:

numpy array
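For intuition, a one-dimensional maximum entropy density constrained by moments up to a given degree has the form ln P(x) = sum_k params_k z^k - ln Z, with z the centred and rescaled coordinate and Z the normalising integral over the bounding box. The sketch below evaluates that form numerically. The function `maxent_log_prob`, its parameter layout, and the convention z = (x - center) / sigma are assumptions made for illustration, not the library's implementation.

```python
import math

def maxent_log_prob(x, params, center=0.0, sigma=1.0, bbox=(-5.0, 5.0), n_grid=2001):
    """ln P(x) for a 1-D density proportional to exp(sum_k params[k] * z**(k+1))."""
    def exponent(value):
        z = (value - center) / sigma
        return sum(c * z ** (k + 1) for k, c in enumerate(params))

    # Trapezoidal estimate of the normalising constant over the bounding box.
    lo, hi = bbox
    step = (hi - lo) / (n_grid - 1)
    weights = [math.exp(exponent(lo + i * step)) for i in range(n_grid)]
    log_z = math.log(step * (sum(weights) - 0.5 * (weights[0] + weights[-1])))
    return exponent(x) - log_z

# Degree 2 with params (0, -0.5) approximates a standard normal on a wide box.
log_p0 = maxent_log_prob(0.0, [0.0, -0.5])
```

With degree 2 the solution is Gaussian-shaped; higher degrees add higher-order polynomial terms to the exponent.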