Generate an imbalanced dataset
An illustration of using the generate_imbalance_data() function to create an imbalanced dataset.
# Authors: Zhining Liu <zhining.liu@outlook.com>
# License: MIT
print(__doc__)
from imbens.datasets import generate_imbalance_data
from imbens.utils._plot import plot_2Dprojection_and_cardinality
from collections import Counter
Generate the dataset
X_train, X_test, y_train, y_test = generate_imbalance_data(
    n_samples=1000,
    weights=[0.7, 0.2, 0.1],
    test_size=0.5,
    kwargs={'n_informative': 3},
)
print("Train class distribution: ", Counter(y_train))
print("Test class distribution: ", Counter(y_test))
Train class distribution: Counter({np.int64(0): 344, np.int64(1): 103, np.int64(2): 53})
Test class distribution: Counter({np.int64(0): 343, np.int64(1): 103, np.int64(2): 54})
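As a quick sanity check, the printed counts can be converted to class proportions and compared against the requested weights=[0.7, 0.2, 0.1]. The sketch below is not part of the original example: it hard-codes the counts from the output above so that it runs without imbens installed, and the helper name class_proportions is hypothetical.

```python
from collections import Counter

# Counts copied from the output printed above (hard-coded for illustration).
train_counts = Counter({0: 344, 1: 103, 2: 53})
test_counts = Counter({0: 343, 1: 103, 2: 54})


def class_proportions(counts):
    """Return each class's share of the total, rounded to three decimals."""
    total = sum(counts.values())
    return {label: round(n / total, 3) for label, n in sorted(counts.items())}


train_props = class_proportions(train_counts)
test_props = class_proportions(test_counts)

# Ratio of the largest class to the smallest class in the training split.
imbalance_ratio = max(train_counts.values()) / min(train_counts.values())

print("Train proportions:", train_props)
print("Test proportions: ", test_props)
print("Train imbalance ratio (majority/minority):", round(imbalance_ratio, 2))
```

Both splits land close to the requested 0.7/0.2/0.1 weights, and each split contains 500 samples, as expected from n_samples=1000 with test_size=0.5.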
Plot the generated (training) data
plot_2Dprojection_and_cardinality(X_train, y_train)

(<Figure size 1000x400 with 2 Axes>, (<Axes: title={'center': 'Dataset (2D projection by KernelPCA)'}>, <Axes: title={'center': 'Class Distribution'}, xlabel='Class'>))