Evaluate classification by compiling a report

Specific metrics have been developed to evaluate classifiers trained on imbalanced data. imbalanced_ensemble provides a classification report (imbalanced_ensemble.metrics.classification_report_imbalanced()) similar to sklearn's, with additional metrics specific to imbalanced learning problems.
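To make the "additional metrics" concrete, here is a minimal sketch of how recall, specificity, and their geometric mean can be computed for one class from raw labels. It uses only the standard library; the helper name binary_imbalance_metrics and the toy labels are hypothetical and are not part of imbalanced_ensemble.

```python
import math


def binary_imbalance_metrics(y_true, y_pred, positive=1):
    """Recall, specificity and their geometric mean, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return {
        "rec": recall,
        "spe": specificity,
        "geo": math.sqrt(recall * specificity),
    }


# Hypothetical toy labels: 2 minority samples (class 0), 8 majority samples (class 1)
y_true = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0]

metrics = binary_imbalance_metrics(y_true, y_pred, positive=0)
print(metrics)
```

In this toy case 8 of 10 predictions are correct, yet recall on the minority class is only 0.5. Plain accuracy hides that gap, which is why the report includes specificity and the geometric mean.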

Out:

C:\Softwares\Anaconda3\lib\site-packages\sklearn\svm\_base.py:985: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  warnings.warn("Liblinear failed to converge, increase "
                   pre       rec       spe        f1       geo       iba       sup

          0       0.41      0.84      0.87      0.55      0.85      0.73       123
          1       0.98      0.87      0.84      0.92      0.85      0.73      1127

avg / total       0.92      0.87      0.84      0.89      0.85      0.73      1250
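The column headers abbreviate precision (pre), recall (rec), specificity (spe), F1 score (f1), geometric mean (geo), index balanced accuracy (iba), and support (sup). As a rough consistency check, the sketch below recomputes geo and iba from the rounded rec and spe values printed above. It assumes the report follows the imbalanced-learn convention it was adapted from: geo = sqrt(rec * spe), and iba = (1 + alpha * (rec - spe)) * geo^2 with alpha = 0.1. The helper name geo_and_iba is hypothetical.

```python
import math

# (rec, spe) pairs copied from the report above, keyed by class label
report_rows = {0: (0.84, 0.87), 1: (0.87, 0.84)}

# Assumption: weighting factor of 0.1, the default in imbalanced-learn
ALPHA = 0.1


def geo_and_iba(rec, spe, alpha=ALPHA):
    """Geometric mean and index balanced accuracy of the squared geometric mean."""
    geo = math.sqrt(rec * spe)
    iba = (1 + alpha * (rec - spe)) * geo ** 2
    return geo, iba


checked = {
    label: tuple(round(value, 2) for value in geo_and_iba(rec, spe))
    for label, (rec, spe) in report_rows.items()
}
print(checked)
```

Because the inputs are already rounded to two decimals, this only confirms that the printed columns are mutually consistent under the assumed definitions. It does not reproduce the library's exact computation.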

# Adapted from imbalanced-learn
# Authors: Guillaume Lemaitre <g.lemaitre58@gmail.com>
# License: MIT

from sklearn import datasets
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

from imbalanced_ensemble.sampler import over_sampling as os
from imbalanced_ensemble import pipeline as pl
from imbalanced_ensemble.metrics import classification_report_imbalanced

print(__doc__)

RANDOM_STATE = 42

# sphinx_gallery_thumbnail_path = '../../docs/source/_static/thumbnail.png'

# Generate a dataset
X, y = datasets.make_classification(
    n_classes=2,
    class_sep=2,
    weights=[0.1, 0.9],
    n_informative=10,
    n_redundant=1,
    flip_y=0,
    n_features=20,
    n_clusters_per_class=4,
    n_samples=5000,
    random_state=RANDOM_STATE,
)

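# Build a pipeline that oversamples the training data with SMOTE,
# then fits a linear SVM on the resampled data
pipeline = pl.make_pipeline(
    os.SMOTE(random_state=RANDOM_STATE), LinearSVC(random_state=RANDOM_STATE)
)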

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=RANDOM_STATE)

# Train the classifier with balancing
pipeline.fit(X_train, y_train)

# Test the classifier and get the prediction
y_pred_bal = pipeline.predict(X_test)

# Show the classification report
print(classification_report_imbalanced(y_test, y_pred_bal))

Total running time of the script: ( 0 minutes 8.242 seconds)

Estimated memory usage: 17 MB
