[Bug]: best_model_for_estimator returns inconsistent feature_importances_ compared to automl.model when model_history=True #1422

Open
Stickic-cyber opened this issue Apr 14, 2025 · 0 comments
Labels
bug Something isn't working

Stickic-cyber commented Apr 14, 2025

Describe the bug

While testing Issue #1398, I found an inconsistency related to feature_importances_:

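With the default model_history=False, automl.best_model_for_estimator(automl.best_estimator).feature_importances_ is None. When I explicitly set model_history=True, the same call returns a populated feature_importances_.

However, the feature_importances_ from automl.best_model_for_estimator(automl.best_estimator) and the feature_importances_ from automl.model are not the same, even though both should refer to the best estimator.

This leads me to suspect that best_model_for_estimator may not be returning the actual best model selected by AutoML.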

Steps to reproduce

from flaml import AutoML
import pandas as pd
import numpy as np

np.random.seed(41) 
n_samples = 50

X = pd.DataFrame({
    'feature1': np.random.rand(n_samples) * 10, 
    'feature2': np.random.randint(0, 5, n_samples),
    'feature3': np.random.normal(5, 2, n_samples)
})

y_train = 2 * X['feature1'] + 3 * X['feature2'] - 0.5 * X['feature3'] + np.random.randn(n_samples) * 2

settings = {
    "time_budget": 60,
    "estimator_list": ["extra_tree", "xgboost"],
    "task": "regression",
    "log_file_name": "test_ensemble.log",
    "seed": 41,
    "ensemble": False,
    "n_concurrent_trials": 1,
    "verbose": 1,
    "metric": "rmse",
    "mlflow_logging": True,
    "model_history": True
}

automl = AutoML()
automl.fit(X_train=X, y_train=y_train, **settings)

best_model = automl.best_model_for_estimator(automl.best_estimator)
print(automl.best_estimator)

feature_importances1 = best_model.feature_importances_  # None when "model_history" is left at its default of False
print("Feature importances1:", feature_importances1)
feature_importances2 = automl.model.feature_importances_  # not None
print("Feature importances2:", feature_importances2)
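
To make the mismatch explicit rather than comparing printed arrays by eye, the two results can be checked element-wise. The following is a minimal sketch, not part of FLAML: `importances_match` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the two models so the snippet runs on its own. It only assumes that each object may expose a `feature_importances_` attribute, as in the script above.

```python
import math
from types import SimpleNamespace


def importances_match(model_a, model_b, rel_tol=1e-9, abs_tol=1e-12):
    """Return True only if both objects expose equal feature_importances_.

    Hypothetical helper for this report; a missing or None attribute on
    either side counts as a mismatch.
    """
    imp_a = getattr(model_a, "feature_importances_", None)
    imp_b = getattr(model_b, "feature_importances_", None)
    if imp_a is None or imp_b is None:
        return False
    # Convert to plain floats so lists and NumPy arrays are handled alike.
    a = [float(v) for v in imp_a]
    b = [float(v) for v in imp_b]
    if len(a) != len(b):
        return False
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


# Stand-in objects with made-up values, used only to exercise the helper.
same_1 = SimpleNamespace(feature_importances_=[0.6, 0.3, 0.1])
same_2 = SimpleNamespace(feature_importances_=[0.6, 0.3, 0.1])
different = SimpleNamespace(feature_importances_=[0.5, 0.4, 0.1])
missing = SimpleNamespace(feature_importances_=None)

print(importances_match(same_1, same_2))     # True
print(importances_match(same_1, different))  # False
print(importances_match(same_1, missing))    # False
```

In the reproduction script, the call would be `importances_match(best_model, automl.model)`; a result of `False` with `model_history=True` corresponds to the inconsistency reported here.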

Screenshots and logs

Image

Additional Information

FLAML Version: 2.3.4
Python Version: 3.11
Operating System: Windows
