Automated machine learning
fast_automl.automl.make_cv_regressors
def fast_automl.automl.make_cv_regressors()
Returns:
- cv_regressors : list
  List of default CV regressors.
fast_automl.automl.make_cv_classifiers
def fast_automl.automl.make_cv_classifiers()
Returns:
- cv_classifiers : list
  List of default CV classifiers.
fast_automl.automl.AutoEstimator
class fast_automl.automl.AutoEstimator(cv_estimators=[], preprocessors=[], ensemble_method='auto', max_ensemble_size=50, n_ensembles=1, n_iter=10, n_jobs=None, verbose=False, cv=None, scoring=None)
Parameters:
- cv_estimators : list of CVEstimators, default=[]
  If an empty list, a default list of CVEstimators will be created.
- preprocessors : list, default=[]
  List of preprocessing steps before data is fed to the [...]
- ensemble_method : default='auto'
  If [...]
- max_ensemble_size : default=50
  The maximum number of estimators to consider adding to the ensemble.
- n_ensembles : int, default=1
  Number of ensembles to create using different CV splits. These ensembles get equal votes in a meta-ensemble.
- n_iter : int, default=10
  Number of iterations to run randomized search for the CVEstimators.
- n_jobs : int or None, default=None
  Number of jobs to run in parallel.
- verbose : bool, default=False
  Controls the verbosity.
- cv : int, cross-validation generator, or iterable, default=None
  Scikit-learn style cv parameter.
- scoring : str, callable, list, tuple, or dict, default=None
  Scikit-learn style scoring parameter. By default, a regressor ensemble maximizes R-squared and a classifier ensemble maximizes ROC AUC.

Attributes:
- best_estimator_ : estimator
  Ensemble or meta-ensemble associated with the best CV score.
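The n_ensembles parameter says that ensembles built on different CV splits get equal votes in a meta-ensemble. One plausible reading of "equal votes" is an unweighted average of each ensemble's predictions. The sketch below illustrates that reading in plain Python; the function `equal_vote_average` and the toy predictions are hypothetical, are not part of fast_automl, and may differ from how the library combines ensembles internally.

```python
from statistics import fmean


def equal_vote_average(ensemble_predictions):
    """Average per-sample predictions across ensembles with equal weight.

    ensemble_predictions: list of lists, one inner list per ensemble,
    each holding one prediction per sample.
    """
    n_samples = len(ensemble_predictions[0])
    if any(len(preds) != n_samples for preds in ensemble_predictions):
        raise ValueError("every ensemble must predict the same number of samples")
    # zip(*...) groups the predictions made for each sample across ensembles.
    return [fmean(sample_preds) for sample_preds in zip(*ensemble_predictions)]


# Hypothetical predictions from three ensembles on four samples.
preds = [
    [1.0, 2.0, 3.0, 4.0],
    [1.5, 2.5, 2.5, 4.0],
    [0.5, 1.5, 3.5, 4.0],
]
meta_prediction = equal_vote_average(preds)
print(meta_prediction)
```

Each ensemble contributes the same weight, so adding more ensembles smooths out the effect of any single CV split.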
Methods
fit(self, X, y, sample_weight=None)
Fit the model.
Parameters:
- X : array-like of shape (n_samples, n_features)
  Training data.
- y : array-like of shape (n_samples,)
  Target values.
- sample_weight : array-like of shape (n_samples,), default=None
  Individual weights for each sample.

Returns:
- self
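The sample_weight argument assigns an individual weight to each training sample. This page does not describe how fast_automl passes these weights to its underlying estimators, so the following is only a generic illustration of what per-sample weights do to a loss. The helper `weighted_mse` is hypothetical and is not a fast_automl function.

```python
def weighted_mse(y_true, y_pred, sample_weight=None):
    """Mean squared error where each sample contributes in proportion to its weight."""
    if sample_weight is None:
        # No weights given: every sample counts equally.
        sample_weight = [1.0] * len(y_true)
    if not (len(y_true) == len(y_pred) == len(sample_weight)):
        raise ValueError("y_true, y_pred and sample_weight must have the same length")
    total_weight = sum(sample_weight)
    if total_weight <= 0:
        raise ValueError("sample weights must sum to a positive value")
    weighted_squared_error = sum(
        w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, sample_weight)
    )
    return weighted_squared_error / total_weight


y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]

unweighted = weighted_mse(y_true, y_pred)
# A zero weight removes the badly predicted third sample from the loss.
downweighted = weighted_mse(y_true, y_pred, sample_weight=[1.0, 1.0, 0.0])
print(unweighted, downweighted)
```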
predict(self, X)
Predict class labels for samples in X.
Parameters:
- X : array-like of shape (n_samples, n_features)
  Samples.

Returns:
- C : array of shape (n_samples,)
  Predicted class label for each sample.
predict_proba(self, X)
Probability estimates.
Parameters:
- X : array-like of shape (n_samples, n_features)
  Samples.

Returns:
- T : array-like of shape (n_samples, n_classes)
  Probability of the sample for each class in the model, ordered by [...]
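predict_proba returns one row per sample and one column per class. A common way to turn such a matrix into class labels is to pick the column with the highest probability in each row. The sketch below shows that step in plain Python. It assumes the column order matches a known list of classes; the `classes` list and the helper `labels_from_probabilities` are hypothetical, and this page does not confirm that predict works this way in fast_automl.

```python
def labels_from_probabilities(probabilities, classes):
    """Return the most probable class label for each row of probabilities."""
    labels = []
    for row in probabilities:
        if len(row) != len(classes):
            raise ValueError("each row needs one probability per class")
        # Index of the largest probability in this row (first one wins on ties).
        best_index = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best_index])
    return labels


# Hypothetical class order and probability matrix of shape (n_samples, n_classes).
classes = ["cat", "dog", "bird"]
proba = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
labels = labels_from_probabilities(proba, classes)
print(labels)
```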
fast_automl.automl.AutoClassifier
Automatic classifier. Inherits from AutoEstimator.
Examples
from fast_automl.automl import AutoClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score, train_test_split
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True, stratify=y)
clf = AutoClassifier(ensemble_method='stepwise', n_jobs=-1, verbose=True).fit(X_train, y_train)
print('CV score: {:.4f}'.format(cross_val_score(clf.best_estimator_, X_train, y_train).mean()))
print('Test score: {:.4f}'.format(clf.score(X_test, y_test)))
This runs for about 6-7 minutes and typically achieves a test accuracy of 96-99% and ROC AUC above .999.
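The figures above refer to two different metrics: accuracy and ROC AUC. As a reminder of what each one measures, here is a minimal plain-Python sketch for the binary case. The digits task is multiclass, where ROC AUC is usually averaged over classes, so this is a simplification rather than the exact computation behind the numbers quoted above. The helpers `accuracy` and `binary_roc_auc` and the toy data are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
# Threshold the scores at 0.5 to obtain hard labels for accuracy.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = binary_roc_auc(y_true, scores)
print(acc, auc)
```

Accuracy depends on the chosen threshold, while ROC AUC depends only on how the scores rank positives against negatives, which is why the two can differ noticeably on the same model.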
fast_automl.automl.AutoRegressor
Automatic regressor. Inherits from AutoEstimator.
Examples
from fast_automl.automl import AutoRegressor
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score, train_test_split
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True)
reg = AutoRegressor(n_jobs=-1, verbose=True).fit(X_train, y_train)
print('CV score: {:.4f}'.format(cross_val_score(reg.best_estimator_, X_train, y_train).mean()))
print('Test score: {:.4f}'.format(reg.score(X_test, y_test)))
This runs for about 30 seconds and typically achieves a test R-squared of .47-.53.
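The test score quoted above is R-squared, the default metric for regressor ensembles. For reference, the standard definition is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it in plain Python; the helper `r_squared` and the toy values are hypothetical and are not taken from fast_automl.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after using the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean target.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
score = r_squared(y_true, y_pred)
print("R-squared: {:.4f}".format(score))
```

A score of 1 means perfect predictions and a score of 0 means the model does no better than predicting the mean target.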