Describe the bug
I have a data set where I have tried to optimise the hyperparameters with FLAML, and it seems that the model keeps getting worse the longer I let it run. Here is a simple example of the code I have for the model I am trying to optimise:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.utils.class_weight import compute_sample_weight
from sklearn.metrics import f1_score, confusion_matrix, classification_report, precision_score, recall_score
from flaml import AutoML
import numpy as np
import joblib


def create_and_train_pipeline(X_train, y_train, X_test, y_test,
                              numerical_features, categorical_features, time_budget=60):
    """
    Creates and trains a pipeline without requiring a custom wrapper class.
    """
    # First, create and fit the preprocessor
    numeric_transformer = Pipeline(steps=[
        ('scaler', StandardScaler())
    ])
    categorical_transformer = Pipeline(steps=[
        ('onehot', OneHotEncoder(handle_unknown='ignore'))
    ])
    preprocessor = ColumnTransformer(
        transformers=[
            ('num', numeric_transformer, numerical_features),
            ('cat', categorical_transformer, categorical_features)
        ],
        remainder='drop',
        sparse_threshold=0
    )

    # Fit the preprocessor first
    X_train_transformed = preprocessor.fit_transform(X_train)

    # Train AutoML on the transformed data
    automl = AutoML()
    settings = {
        "time_budget": time_budget,
        "task": "classification",
        "estimator_list": ['lgbm', 'rf'],
        "eval_method": "cv",
        "metric": "f1",
        "n_splits": 5,
        "split_type": "stratified"
    }
    automl.fit(X_train_transformed, y_train, **settings)

    # Create the final pipeline with the best model
    final_pipeline = Pipeline([
        ('preprocessor', preprocessor),
        ('classifier', automl.model.estimator)  # Use the best model directly
    ])

    # Print training results
    print("Best ML model:")
    print(automl.model.estimator)
    print("\nBest hyperparameter configuration:")
    print(automl.best_config)
    print("\nBest score on validation data: {:.4f}".format(automl.best_loss))

    # Generate and print test metrics
    y_pred = final_pipeline.predict(X_test)
    print("\nTraining Set Metrics:")
    print("\nClassification Report:")
    print(classification_report(y_test, y_pred))
    print("\nConfusion Matrix:")
    print(confusion_matrix(y_test, y_pred))

    # Save the pipeline
    joblib.dump(final_pipeline, 'full_prediction_pipeline.joblib')
    return final_pipeline, automl


if __name__ == "__main__":
    categorical_features = ['created_on', 'dex_id', 'price_confidence']
    numerical_features = [col for col in X_train.columns if col not in categorical_features]
    pipeline, automl = create_and_train_pipeline(
        X_train=X_train,
        y_train=y_train,
        X_test=X_test,
        y_test=y_test,
        numerical_features=numerical_features,
        categorical_features=categorical_features,
        time_budget=35
    )
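(As an aside, compute_sample_weight is imported above because weighting the minority class is something I have been looking at as well. The snippet below is only a rough sketch of what I mean; it assumes automl.fit accepts a sample_weight keyword that it forwards to the underlying estimators, which I have not verified.)

# Rough sketch (not part of the run above): weight each training row inversely
# to its class frequency, then hand the weights to FLAML during tuning.
sample_weights = compute_sample_weight(class_weight='balanced', y=y_train)
automl.fit(
    X_train_transformed,
    y_train,
    sample_weight=sample_weights,  # assumption: AutoML.fit accepts and forwards sample_weight
    **settings
)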
This gives a minority-class F1 of 0.37 and a majority-class F1 of 0.96 with a budget of 35 seconds:
Best score on validation data: 0.5886
Training Set Metrics:
Classification Report:
              precision    recall  f1-score   support

           0       0.97      0.95      0.96       930
           1       0.32      0.45      0.37        49

    accuracy                           0.92       979
   macro avg       0.64      0.70      0.67       979
weighted avg       0.94      0.92      0.93       979
Confusion Matrix:
[[883 47]
[ 27 22]]
If I increase the budget to 60 seconds I get a minority-class F1 of 0.34 and a majority-class F1 of 0.96:
Best score on validation data: 0.5815
Training Set Metrics:
Classification Report:
              precision    recall  f1-score   support

           0       0.97      0.95      0.96       930
           1       0.30      0.39      0.34        49

    accuracy                           0.92       979
   macro avg       0.63      0.67      0.65       979
weighted avg       0.93      0.92      0.93       979
Confusion Matrix:
[[885 45]
[ 30 19]]
And after 120 seconds, a minority-class F1 of 0.33 and a majority-class F1 of 0.96:
Training Set Metrics:
Classification Report:
              precision    recall  f1-score   support

           0       0.97      0.95      0.96       930
           1       0.29      0.39      0.33        49

    accuracy                           0.92       979
   macro avg       0.63      0.67      0.65       979
weighted avg       0.93      0.92      0.93       979
Confusion Matrix:
[[884 46]
[ 30 19]]
I am wondering why this is happening. The loss reported in the logs keeps decreasing, yet the resulting model is worse. The same thing happens even when I define my own custom metric (and negate its output, of course, since FLAML minimises): as the negative value is minimised (its absolute value grows), the final confusion matrix still gets worse. What am I doing wrong here? Thanks a lot.
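For reference, the custom metric I mention is along these lines (a minimal sketch following FLAML's documented custom-metric interface; minority_f1_metric is just an illustrative name and the real function I used differs only in details):

from sklearn.metrics import f1_score

def minority_f1_metric(X_val, y_val, estimator, labels,
                       X_train, y_train, weight_val=None, weight_train=None,
                       *args, **kwargs):
    # F1 on the minority class (label 1) of the validation fold
    y_pred = estimator.predict(X_val)
    minority_f1 = f1_score(y_val, y_pred, pos_label=1)
    # FLAML minimises the first return value, so the score is negated;
    # the dict holds extra values that get logged alongside it
    return -minority_f1, {"minority_f1": minority_f1}

# then passed in via the settings dict: settings["metric"] = minority_f1_metric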
Steps to reproduce
No response
Model Used
No response
Expected Behavior
No response
Screenshots and logs
No response
Additional Information
No response