[Bug]: best_model_for_estimator returns inconsistent feature_importances_ compared to automl.model when model_history=True #1422

Stickic-cyber · 2025-04-14T10:19:22Z

Describe the bug

While testing Issue #1398, I found an inconsistency related to feature_importances_:

When I manually set model_history=True, best_model_for_estimator(...).feature_importances_ returns a value (with the default model_history=False it is None).

However, the feature_importances_ from automl.best_model_for_estimator(automl.best_estimator) and from automl.model are not the same.

This leads me to suspect that best_model_for_estimator may not be returning the actual best model selected by AutoML.
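One way to probe this suspicion is to compare the hyperparameters of the two models, not only their importances. The sketch below is a hypothetical helper (`diff_params` is not a FLAML function) that reports which keys differ between two plain dictionaries of scalar values. It assumes both sets of hyperparameters can be obtained as dictionaries, for example from `automl.best_config` and from the returned model's `get_params()`, if that object exposes it.

```python
def diff_params(params_a, params_b):
    """Return {key: (value_a, value_b)} for every key whose values differ.

    Hypothetical helper for this report; not part of FLAML.
    A key present on only one side is reported with None on the other side.
    Assumes values are plain scalars or strings (comparable with !=).
    """
    missing = object()  # sentinel so that a real None value is not confused with "absent"
    differences = {}
    for key in sorted(set(params_a) | set(params_b)):
        value_a = params_a.get(key, missing)
        value_b = params_b.get(key, missing)
        if value_a is missing or value_b is missing or value_a != value_b:
            differences[key] = (
                None if value_a is missing else value_a,
                None if value_b is missing else value_b,
            )
    return differences
```

An empty result would suggest the two objects were configured identically; any reported key points at where they diverge.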

Steps to reproduce

from flaml import AutoML
import pandas as pd
import numpy as np

np.random.seed(41) 
n_samples = 50

X = pd.DataFrame({
    'feature1': np.random.rand(n_samples) * 10, 
    'feature2': np.random.randint(0, 5, n_samples),
    'feature3': np.random.normal(5, 2, n_samples)
})

y_train = 2 * X['feature1'] + 3 * X['feature2'] - 0.5 * X['feature3'] + np.random.randn(n_samples) * 2

settings = {
    "time_budget": 60,
    "estimator_list": ["extra_tree", "xgboost"],
    "task": "regression",
    "log_file_name": "test_ensemble.log",
    "seed": 41,
    "ensemble": False,
    "n_concurrent_trials": 1,
    "verbose": 1,
    "metric": "rmse",
    "mlflow_logging": True,
    "model_history": True
}

automl = AutoML()
automl.fit(X_train=X, y_train=y_train, **settings)

best_model = automl.best_model_for_estimator(automl.best_estimator)
print(automl.best_estimator)

feature_importances1 = best_model.feature_importances_  # None when "model_history" is left at its default (False)
print("Feature importances1:", feature_importances1)
feature_importances2 = automl.model.feature_importances_  # not None
print("Feature importances2:", feature_importances2)
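To make the mismatch explicit rather than comparing two printed arrays by eye, a small check can be appended to the script above. This is a minimal sketch under stated assumptions: `importances_match` is a hypothetical helper name, not a FLAML API, and the tolerances are arbitrary. It treats `None` (what the first call returns when `model_history` is left at its default) as a mismatch against any real array.

```python
import math


def importances_match(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if two feature-importance vectors agree element-wise.

    Hypothetical helper for this report; not part of FLAML.
    Accepts lists, tuples, 1-D NumPy arrays, or None.
    """
    # None means the model exposed no importances at all.
    if a is None or b is None:
        return a is None and b is None
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    return all(
        math.isclose(float(x), float(y), rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )
```

If both attributes referred to the same fitted model, `importances_match(feature_importances1, feature_importances2)` would be expected to return `True`; given the behavior described in this report, it should return `False`.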

Screenshots and logs

Additional Information

FLAML Version: 2.3.4
Python Version: 3.11
Operating System: Windows


Stickic-cyber added the bug label Apr 14, 2025

thinkall assigned murunlin May 13, 2025

murunlin mentioned this issue May 13, 2025

fix: best_model_for_estimator returns inconsistent feature_importances_ compared to automl.model #1429

Merged

