Welcome to imbalanced-ensemble documentation!

[GitHub] [Gallery] [PyPI] [Changelog] [Source] [Download] [Zhihu] [Chinese README] [arXiv]

Date: Nov 30, 2021 Version: 0.1.6

Paper: IMBENS: Ensemble Class-imbalanced Learning in Python

Citing us

If you find IMBENS helpful in your work or research, we would greatly appreciate citations to the following paper [PDF]:

@article{liu2021imbens,
   title={IMBENS: Ensemble Class-imbalanced Learning in Python},
   author={Liu, Zhining and Wei, Zhepei and Yu, Erxin and Huang, Qiang and Guo, Kai and Yu, Boyang and Cai, Zhaonian and Ye, Hangting and Cao, Wei and Bian, Jiang and Wei, Pengfei and Jiang, Jing and Chang, Yi},
   journal={arXiv preprint arXiv:2111.12776},
   year={2021}
}

imbalanced-ensemble (IMBENS, imported as imbalanced_ensemble) is a Python toolbox for quickly implementing, modifying, evaluating, and visualizing ensemble learning algorithms for class-imbalanced data. It is built on top of scikit-learn and imbalanced-learn. IMBENS includes more than 15 ensemble imbalanced learning (EIL) algorithms, ranging from the classical SMOTEBoost (2003) and RUSBoost (2010) to the more recent SPE (2020), and from resampling-based methods to cost-sensitive ensemble learning.
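
To make the distinction between resampling-based and cost-sensitive methods concrete, here is a minimal pure-Python sketch. It is not how IMBENS implements any of its algorithms: the helper names `undersample_balanced` and `balanced_class_weights` and the toy labels are hypothetical, introduced only for illustration. The weighting rule shown is the common "balanced" heuristic, n_samples / (n_classes * class_count).

```python
import random
from collections import Counter


def undersample_balanced(y, rng):
    """Return indices of a class-balanced subset via random under-sampling.

    Every class is cut down to the size of the rarest class.
    """
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    chosen = []
    for idxs in by_class.values():
        chosen.extend(rng.sample(idxs, n_min))
    return sorted(chosen)


def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Toy labels with a 9:1 imbalance (hypothetical data, for illustration only).
y = [0] * 90 + [1] * 10
rng = random.Random(0)

# Resampling view: each ensemble member would train on its own balanced subset.
subsets = [undersample_balanced(y, rng) for _ in range(3)]
subset_counts = [Counter(y[i] for i in s) for s in subsets]

# Cost-sensitive view: keep all samples, but make minority-class errors cost more.
weights = balanced_class_weights(y)

print(subset_counts[0])
print(weights)
```

In the resampling view, every subset holds 10 samples of each class, so each base learner sees balanced data and the ensemble as a whole still covers much of the majority class. In the cost-sensitive view, no sample is discarded; the minority class simply receives a larger weight.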

IMBENS features:

  • Unified, easy-to-use APIs, detailed documentation and examples.

  • Out-of-the-box support for multi-class imbalanced (long-tailed) learning.

  • Optimized performance, with parallelization via joblib where possible.

  • Powerful, customizable, interactive training logger and visualizer.

  • Full compatibility with other popular packages like scikit-learn and imbalanced-learn.

API Demo:

>>> from imbalanced_ensemble.ensemble import SelfPacedEnsembleClassifier
>>> from imbalanced_ensemble.datasets import generate_imbalance_data
>>> from imbalanced_ensemble.utils import evaluate_print
>>> from imbalanced_ensemble.visualizer import ImbalancedEnsembleVisualizer
>>>
>>> X_train, X_test, y_train, y_test = generate_imbalance_data(
...     n_samples=200, weights=[.9, .1], test_size=.5)
>>>
>>> clf = SelfPacedEnsembleClassifier()                    # initialize ensemble
>>> clf.fit(X_train, y_train)
>>>
>>> y_test_pred = clf.predict(X_test)                      # predict labels
>>> y_test_proba = clf.predict_proba(X_test)               # predict probabilities
>>>
>>> evaluate_print(y_test, y_test_pred, "SPE")             # performance evaluation
SPE balanced Acc: 0.972 | macro Fscore: 0.886 | macro Gmean: 0.972
>>>
>>> visualizer = ImbalancedEnsembleVisualizer()            # initialize visualizer
>>> visualizer.fit({'SPE': clf})
>>>
>>> visualizer.performance_lineplot()                      # performance visualization
>>> visualizer.confusion_matrix_heatmap()                  # prediction visualization
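
The three scores printed by evaluate_print in the demo can be reproduced by hand, which helps when interpreting them. The sketch below is written in plain Python and is not the library's implementation. It assumes that "macro Gmean" means the per-class one-vs-rest value sqrt(sensitivity * specificity) averaged over classes, and that every predicted label also occurs in the true labels. The function name `imbalance_scores` and the confusion counts are hypothetical: the counts were chosen to be consistent with a 90/10 test split and with the rounded scores shown above, since the demo's actual predictions are not given.

```python
from math import sqrt


def imbalance_scores(y_true, y_pred):
    """Return (balanced accuracy, macro F1, macro G-mean).

    Assumes every label in y_pred also occurs in y_true.
    """
    labels = sorted(set(y_true))
    recalls, f1s, gmeans = [], [], []
    for c in labels:
        # One-vs-rest confusion counts for class c.
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        tn = len(y_true) - tp - fp - fn

        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)

        recalls.append(recall)
        f1s.append(f1)
        gmeans.append(sqrt(recall * specificity))

    n = len(labels)
    return sum(recalls) / n, sum(f1s) / n, sum(gmeans) / n


# Hypothetical test-set outcome: 90 majority samples (85 correct, 5 wrong)
# and 10 minority samples (all correct).
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 85 + [1] * 5 + [1] * 10

bal_acc, macro_f1, macro_gmean = imbalance_scores(y_true, y_pred)
print(f"balanced Acc: {bal_acc:.3f} | macro Fscore: {macro_f1:.3f} "
      f"| macro Gmean: {macro_gmean:.3f}")
# prints: balanced Acc: 0.972 | macro Fscore: 0.886 | macro Gmean: 0.972
```

Plain accuracy on the same predictions would be 0.95, and a classifier that always predicted the majority class would still reach 0.90. Class-averaged scores such as these avoid rewarding that behavior.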