Fast AutoML
Most AutoML packages aim for exceptional performance but need an exceptional amount of training time. Fast-AutoML aims for reasonable performance in a reasonable amount of time.
Fast-AutoML also includes additional utilities, such as tools for comparing model performance by repeated cross-validation.
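To give a rough idea of what comparing models by repeated cross-validation involves, here is a minimal sketch in plain scikit-learn. It does not use Fast-AutoML's own comparison utilities, whose API is not shown in this README. The helper name `compare_by_repeated_cv`, the iris dataset, and the two candidate models are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier


def compare_by_repeated_cv(estimators, X, y, n_splits=5, n_repeats=3, random_state=0):
    """Score every estimator on the same repeated stratified splits.

    Hypothetical helper for illustration; not part of Fast-AutoML.
    Returns a dict mapping each name to an array of per-fold scores.
    """
    # An integer random_state makes the splitter yield identical folds
    # for every estimator, so the scores are directly comparable.
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=random_state
    )
    return {name: cross_val_score(est, X, y, cv=cv) for name, est in estimators.items()}


X, y = load_iris(return_X_y=True)
scores = compare_by_repeated_cv(
    {
        'logistic': LogisticRegression(max_iter=1000),
        'tree': DecisionTreeClassifier(random_state=0),
    },
    X,
    y,
)
for name, fold_scores in scores.items():
    print('{}: {:.4f} +/- {:.4f}'.format(name, fold_scores.mean(), fold_scores.std()))
```

Repeating the cross-validation several times yields more score samples per model (15 here), which makes the comparison less sensitive to a single lucky or unlucky split.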
Installation
$ pip install fast-automl
Quickstart
from fast_automl.automl import AutoClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score, train_test_split
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True, stratify=y)
clf = AutoClassifier(ensemble_method='stepwise', n_jobs=-1, verbose=True).fit(X_train, y_train)
print('CV score: {:.4f}'.format(cross_val_score(clf.best_estimator_, X_train, y_train).mean()))
print('Test score: {:.4f}'.format(clf.score(X_test, y_test)))
This runs for about 6-7 minutes and typically achieves a test accuracy of 96-99% and a ROC AUC above 0.999.
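The snippet above prints accuracy only. One way to check both accuracy and a multiclass ROC AUC on the test set is sketched below. It assumes the fitted classifier exposes scikit-learn-style `predict` and `predict_proba` methods, which this README does not confirm for `AutoClassifier`. The helper name `evaluate_classifier` is hypothetical, and a scaled logistic regression stands in for the fitted model so the example is self-contained.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate_classifier(estimator, X_test, y_test):
    """Return test accuracy and one-vs-rest ROC AUC for a fitted classifier.

    Hypothetical helper; assumes `estimator` implements predict and predict_proba.
    """
    proba = estimator.predict_proba(X_test)
    return {
        'accuracy': accuracy_score(y_test, estimator.predict(X_test)),
        # multi_class='ovr' averages the one-vs-rest AUC over the classes.
        'roc_auc': roc_auc_score(y_test, proba, multi_class='ovr'),
    }


X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, shuffle=True, stratify=y, random_state=0
)

# Stand-in model (an assumption, not Fast-AutoML): swap in the fitted
# classifier from the quickstart if it supports predict_proba.
stand_in = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
stand_in.fit(X_train, y_train)

metrics = evaluate_classifier(stand_in, X_test, y_test)
print('Test accuracy: {:.4f}'.format(metrics['accuracy']))
print('Test ROC AUC: {:.4f}'.format(metrics['roc_auc']))
```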
from fast_automl.automl import AutoRegressor
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score, train_test_split
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True)
reg = AutoRegressor(n_jobs=-1, verbose=True).fit(X_train, y_train)
print('CV score: {:.4f}'.format(cross_val_score(reg.best_estimator_, X_train, y_train).mean()))
print('Test score: {:.4f}'.format(reg.score(X_test, y_test)))
This runs for about 30 seconds and typically achieves a test R-squared of 0.47-0.53.
Citation
@software{bowen2021fast-automl,
author = {Dillon Bowen},
title = {Fast-AutoML},
url = {https://dsbowen.github.io/fast-automl/},
date = {2021-02-05},
}
License
Users must cite this package in any publication that uses it.
It is licensed under the MIT License.
Acknowledgments
This package and its documentation draw heavily on scikit-learn.