Ensemble

fast_automl.ensemble.ClassifierWeighter

class fast_automl.ensemble.ClassifierWeighter(loss=log_loss) [source]

Trains weights for an ensemble of classifiers.

Parameters: loss : callable, default=log_loss

Loss function to minimize when fitting the weights.

Attributes: coef_ : array-like of shape (n_estimators,)

Weights on the given estimators.

Examples

from fast_automl.ensemble import ClassifierWeighter

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

from copy import deepcopy

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, shuffle=True)

svc = SVC(probability=True).fit(X_train, y_train)
knn = KNeighborsClassifier().fit(X_train, y_train)

cv = StratifiedKFold(random_state=np.random.RandomState(), shuffle=True)
X_meta = np.array([
    cross_val_predict(clf, X_train, y_train, cv=deepcopy(cv), method='predict_proba')
    for clf in (svc, knn)
]).transpose(1, 2, 0)
weighter = ClassifierWeighter().fit(X_meta, y_train)

X_meta_test = np.array([
    clf.predict_proba(X_test) for clf in (svc, knn)
]).transpose(1, 2, 0)
weighter.score(X_meta_test, y_test)
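
As a follow-up to the example above, the fitted weights and the ensemble predictions can be inspected through the documented coef_ attribute and predict method:

# One weight per estimator, in the order (svc, knn) used to build X_meta.
print('estimator weights:', weighter.coef_)
print('first test predictions:', weighter.predict(X_meta_test)[:5])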

Methods

fit(self, X, y) [source]

Parameters: X : array-like of shape (n_samples, n_classes, n_estimators)

X[i, c, e] is the probability that estimator e puts on sample i being in class c.

y : array-like of shape (n_samples,)

Targets.

Returns: self :

predict(self, X) [source]

Predict class labels for samples in X.

Parameters: X : array-like of shape (n_samples, n_classes, n_estimators)

Meta-features, in the same format passed to fit.

Returns: C : array of shape (n_samples,)

Predicted class label for each sample.

predict_proba(self, X) [source]

Probability estimates.

Parameters: X : array-like of shape (n_samples, n_classes, n_estimators)

Meta-features, in the same format passed to fit.

Returns: T : array-like of shape (n_samples, n_classes)

Probability of each class for each sample.
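
Conceptually, the ensemble probability is presumably a coef_-weighted combination of the per-estimator probabilities. The NumPy sketch below illustrates that idea only; it is not the library's internal implementation, and the helper name is made up for illustration:

import numpy as np

def weighted_proba(X_meta, coef):
    # X_meta: (n_samples, n_classes, n_estimators); coef: (n_estimators,).
    # Weighted sum over the estimator axis, then renormalize each row.
    proba = np.tensordot(X_meta, coef, axes=([2], [0]))
    return proba / proba.sum(axis=1, keepdims=True)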

fast_automl.ensemble.BaseStackingCV

class fast_automl.ensemble.BaseStackingCV(estimators, cv=None, shuffle_cv=True, scoring=None, n_jobs=None, verbose=0) [source]

Parameters: estimators : list

Base estimators to be stacked together. Each element of the list is a tuple of a string (the name) and an estimator instance.

cv : int, cross-validation generator, or iterable, default=None

Scikit-learn style cv parameter.

shuffle_cv : bool, default=True

Indicates whether the cross-validator should shuffle observations.

scoring : str, callable, list, tuple, or dict, default=None

Scikit-learn style scoring parameter.

n_jobs : int, default=None

Number of jobs to run in parallel.

verbose : int, default=0

Controls the verbosity.
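
BaseStackingCV is a base class; the subclasses documented below presumably share this constructor signature. For illustration only, a typical configuration (using RFEVotingClassifierCV, documented further down) might look like:

from fast_automl.ensemble import RFEVotingClassifierCV

from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

clf = RFEVotingClassifierCV(
    estimators=[
        ('lr', LogisticRegression(max_iter=1000)),  # (name, estimator) tuples
        ('knn', KNeighborsClassifier())
    ],
    cv=5,                # scikit-learn style cv parameter
    scoring='accuracy',  # scikit-learn style scoring parameter
    n_jobs=-1            # run cross-validation in parallel
)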

Methods

fit(self, X, y, sample_weight=None) [source]

predict(self, X) [source]

Predict outcomes for samples in X.

Parameters: X : array-like of shape (n_samples, n_features)

Samples.

Returns: C : array of shape (n_samples,)

Predicted outcome for each sample.

predict_proba(self, X) [source]

Probability estimates.

Parameters: X : array-like of shape (n_samples, n_features)

Samples.

Returns: T : array-like of shape (n_samples, n_classes)

Probability of each class for each sample, according to the model.

Notes

Only applicable to classifiers.

transform(self, X) [source]

Transforms raw features into a prediction matrix.

Parameters: X : array-like of shape (n_samples, n_features)

Features.

Returns: X_meta : array-like

Prediction matrix of shape (n_samples, n_estimators) for regression and (n_estimators, n_samples, n_classes) for classification.
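
A short usage sketch, assuming the subclasses below inherit transform unchanged (here RFEVotingClassifierCV, documented further down); the resulting prediction matrix can then be fed to a downstream meta-learner:

from fast_automl.ensemble import RFEVotingClassifierCV

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, shuffle=True)

clf = RFEVotingClassifierCV([
    ('lr', LogisticRegression(max_iter=5000)),
    ('knn', KNeighborsClassifier())
]).fit(X_train, y_train)

# For classifiers, the prediction matrix has shape
# (n_estimators, n_samples, n_classes) per the docstring above.
X_meta_test = clf.transform(X_test)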

fast_automl.ensemble.RFEVotingEstimatorCV

Selects estimators using recursive feature elimination. Inherits from BaseStackingCV.
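
In rough outline, recursive elimination starts from the full ensemble and repeatedly drops the estimator whose removal gives the best cross-validated voting score. The sketch below illustrates that general technique only; it is not the library's implementation, and the helper name is made up:

from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score

def rfe_select(estimators, X, y, cv=5):
    # Greedily remove estimators while removal does not hurt the CV score.
    selected = list(estimators)
    best_score = cross_val_score(
        VotingClassifier(selected, voting='soft'), X, y, cv=cv).mean()
    while len(selected) > 1:
        scored = [
            (cross_val_score(
                VotingClassifier([e for e in selected if e is not candidate],
                                 voting='soft'),
                X, y, cv=cv).mean(), candidate)
            for candidate in selected
        ]
        score, dropped = max(scored, key=lambda t: t[0])
        if score < best_score:
            break
        best_score = score
        selected.remove(dropped)
    return selected, best_score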

Attributes: best_estimator_ : estimator

The voting estimator associated with the highest CV score.

best_score_ : scalar

Highest CV score attained by any voting estimator.

weights_ : array-like

Weights the voting estimator places on each of the estimators in its ensemble.

names_ : list

List of estimator names in the best estimator.

Methods

fit(self, X, y, sample_weight=None) [source]

Fit the model.

Parameters: X : array-like of shape (n_samples, n_features)

Training data.

y : array-like of shape (n_samples,)

Target values.

sample_weight : array-like of shape (n_samples,), default=None

Individual weights for each sample.

Returns: self :

fast_automl.ensemble.RFEVotingClassifierCV

Selects classifiers using recursive feature elimination. Inherits from RFEVotingEstimatorCV.

Examples

from fast_automl.ensemble import RFEVotingClassifierCV

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, shuffle=True)

clf = RFEVotingClassifierCV([
    ('rf', RandomForestClassifier()),
    ('knn', KNeighborsClassifier()),
    ('svm', SVC(probability=True))
]).fit(X_train, y_train)
print('CV score: {:.4f}'.format(cross_val_score(clf.best_estimator_, X_train, y_train).mean()))
print('Test score: {:.4f}'.format(clf.score(X_test, y_test)))
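
The attributes inherited from RFEVotingEstimatorCV describe the selected ensemble; continuing the example:

print('selected estimators:', clf.names_)
print('voting weights:', clf.weights_)
print('best CV score: {:.4f}'.format(clf.best_score_))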

fast_automl.ensemble.RFEVotingRegressorCV

Selects regressors using recursive feature elimination. Inherits from RFEVotingEstimatorCV.

Examples

from fast_automl.ensemble import RFEVotingRegressorCV

from sklearn.datasets import load_diabetes  # load_boston was removed in scikit-learn 1.2
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True)

reg = RFEVotingRegressorCV([
    ('rf', RandomForestRegressor()),
    ('knn', KNeighborsRegressor()),
    ('svm', SVR())
]).fit(X_train, y_train)
print('CV score: {:.4f}'.format(cross_val_score(reg.best_estimator_, X_train, y_train).mean()))
print('Test score: {:.4f}'.format(reg.score(X_test, y_test)))

fast_automl.ensemble.StepwiseVotingEstimatorCV

Selects estimators using stepwise addition. Inherits from BaseStackingCV.
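
In rough outline, stepwise addition works in the opposite direction from recursive elimination: it starts from an empty ensemble and greedily adds the estimator that most improves the cross-validated voting score. The sketch below illustrates that general technique only; it is not the library's implementation, and the helper name is made up:

from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score

def stepwise_select(estimators, X, y, cv=5):
    # Greedily add estimators while they keep improving the CV score.
    selected, remaining, best_score = [], list(estimators), float('-inf')
    while remaining:
        scored = [
            (cross_val_score(
                VotingClassifier(selected + [candidate], voting='soft'),
                X, y, cv=cv).mean(), candidate)
            for candidate in remaining
        ]
        score, best = max(scored, key=lambda t: t[0])
        if score <= best_score:
            break
        best_score = score
        selected.append(best)
        remaining.remove(best)
    return selected, best_score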

Attributes: best_estimator_ : estimator

The voting estimator associated with the highest CV score.

best_score_ : scalar

Highest CV score attained by any voting estimator.

weights_ : array-like

Weights the voting estimator places on each of the estimators in its ensemble.

names_ : list

List of estimator names in the best estimator.

Methods

fit(self, X, y, sample_weight=None) [source]

Fit the model.

Parameters: X : array-like of shape (n_samples, n_features)

Training data.

y : array-like of shape (n_samples,)

Target values.

sample_weight : array-like of shape (n_samples,), default=None

Individual weights for each sample.

Returns: self :

fast_automl.ensemble.StepwiseVotingClassifierCV

Selects classifiers using stepwise addition. Inherits from StepwiseVotingEstimatorCV.

Examples

from fast_automl.ensemble import StepwiseVotingClassifierCV

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, shuffle=True)

clf = StepwiseVotingClassifierCV([
    ('rf', RandomForestClassifier()),
    ('knn', KNeighborsClassifier()),
    ('svm', SVC(probability=True))
]).fit(X_train, y_train)
print('CV score: {:.4f}'.format(cross_val_score(clf.best_estimator_, X_train, y_train).mean()))
print('Test score: {:.4f}'.format(clf.score(X_test, y_test)))
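
Which estimators the stepwise procedure kept, and how they are weighted, can be read off the inherited attributes:

print('estimators kept by stepwise addition:', clf.names_)
print('voting weights:', clf.weights_)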

fast_automl.ensemble.StepwiseVotingRegressorCV

Selects regressors using stepwise addition. Inherits from StepwiseVotingEstimatorCV.

Examples

from fast_automl.ensemble import StepwiseVotingRegressorCV

from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True)

reg = StepwiseVotingRegressorCV([
    ('rf', RandomForestRegressor()),
    ('knn', KNeighborsRegressor()),
    ('svm', SVR())
]).fit(X_train, y_train)
print('CV score: {:.4f}'.format(cross_val_score(reg.best_estimator_, X_train, y_train).mean()))
print('Test score: {:.4f}'.format(reg.score(X_test, y_test)))