Error while fitting #510

Open
GDGauravDutta opened this issue Apr 9, 2022 · 3 comments

@GDGauravDutta

AssertionError Traceback (most recent call last)
/tmp/ipykernel_33/4176855811.py in <module>
----> 1 automl.fit(X_train, y_train, task="regression",metric='rmse',time_budget=3600)

/opt/conda/lib/python3.7/site-packages/flaml/automl.py in fit(self, X_train, y_train, dataframe, label, metric, task, n_jobs, log_file_name, estimator_list, time_budget, max_iter, sample, ensemble, eval_method, log_type, model_history, split_ratio, n_splits, log_training_metric, mem_thres, pred_time_limit, train_time_limit, X_val, y_val, sample_weight_val, groups_val, groups, verbose, retrain_full, split_type, learner_selector, hpo_method, starting_points, seed, n_concurrent_trials, keep_search_state, early_stop, append_log, auto_augment, min_sample_size, use_ray, metric_constraints, **fit_kwargs)
2089
2090 self._validate_data(
-> 2091 X_train, y_train, dataframe, label, X_val, y_val, groups_val, groups
2092 )
2093 self._search_states = {} # key: estimator name; value: SearchState

/opt/conda/lib/python3.7/site-packages/flaml/automl.py in _validate_data(self, X_train_all, y_train_all, dataframe, label, X_val, y_val, groups_val, groups)
887 assert isinstance(y_train_all, np.ndarray) or isinstance(
888 y_train_all, pd.Series
--> 889 ), "y_train_all must be a numpy array or a pandas series."
890 assert (
891 X_train_all.size != 0 and y_train_all.size != 0

AssertionError: y_train_all must be a numpy array or a pandas series.

Info about data

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15129 entries, 0 to 15128
Data columns (total 28 columns):

# Column Non-Null Count Dtype
0 bedrooms 15129 non-null uint8
1 bathrooms 15129 non-null float32
2 sqft_living 15129 non-null uint16
3 sqft_lot 15129 non-null uint32
4 floors 15129 non-null float32
5 waterfront 15129 non-null uint8
6 view 15129 non-null uint8
7 condition 15129 non-null uint8
8 grade 15129 non-null uint8
9 sqft_above 15129 non-null uint16
10 sqft_basement 15129 non-null uint16
11 yr_built 15129 non-null uint16
12 yr_renovated 15129 non-null uint16
13 lat 15129 non-null float32
14 long 15129 non-null float32
15 dateyear 15129 non-null uint16
16 datequarter 15129 non-null uint8
17 datemonth 15129 non-null uint8
18 dateday 15129 non-null uint8
19 dateday_of_week 15129 non-null uint8
20 dateday_of_year 15129 non-null uint16
21 dateweekofyear 15129 non-null uint8
22 dateis_month_end 15129 non-null uint8
23 dateis_month_start 15129 non-null uint8
24 dateis_quarter_end 15129 non-null uint8
25 dateis_quarter_start 15129 non-null uint8
26 dateis_year_end 15129 non-null uint8
27 dateis_weekend 15129 non-null uint8
dtypes: float32(4), uint16(7), uint32(1), uint8(16)

@qingyun-wu
Contributor

Hi @GDGauravDutta, is your y_train_all a pandas Series? Please make sure it is a pandas Series (or a NumPy array, which the assertion also accepts). Let me know if that does not resolve the issue. Thank you!
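
One way to run that check, as a minimal sketch: the assertion in the traceback accepts only a numpy.ndarray or a pandas.Series, so a single-column DataFrame (for example one selected with double brackets) would fail it. The thread does not show how y_train was built, so the column name "price", the sample values, and the helper as_series below are all hypothetical and not part of FLAML.

```python
import numpy as np
import pandas as pd


def as_series(y):
    """Return y as a type that passes the isinstance check in the traceback.

    Hypothetical helper for illustration; not part of FLAML.
    """
    if isinstance(y, (np.ndarray, pd.Series)):
        return y
    if isinstance(y, pd.DataFrame) and y.shape[1] == 1:
        # A one-column DataFrame holds the same values as a Series.
        return y.iloc[:, 0]
    raise TypeError(f"unsupported label type: {type(y).__name__}")


# Made-up data; "price" is an assumed label column name.
df = pd.DataFrame(
    {"bedrooms": [3, 2, 4], "price": [221900.0, 180000.0, 510000.0]}
)

y_frame = df[["price"]]  # double brackets select a DataFrame
y_series = df["price"]   # single brackets select a Series

print(type(y_frame).__name__, type(y_series).__name__)

y_train = as_series(y_frame)
print(type(y_train).__name__)
```

If type(y_train) already reports Series or ndarray before the call to fit, the cause lies elsewhere and this conversion would not help.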

@GDGauravDutta
Author

All are series

@qingyun-wu
Contributor

Hi @GDGauravDutta, can you share a copy or a sub-sample of your data so that I can reproduce the error? Thank you!
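
A small sub-sample for a reproduction could be drawn along these lines. This is only a sketch on made-up data: the helper make_subsample, the label name "price", and the values are assumptions, not taken from the reporter's dataset. Since a CSV file does not record whether the labels were a Series or a DataFrame, the sketch also prints the Python types, which are worth reporting alongside the data.

```python
import pandas as pd


def make_subsample(X, y, n=100, seed=0):
    """Draw the same n random rows from X and y (hypothetical helper)."""
    n = min(n, len(X))
    X_small = X.sample(n=n, random_state=seed)
    # Select labels by index so the rows stay aligned with X_small.
    y_small = y.loc[X_small.index]
    return X_small, y_small


# Made-up stand-ins for the real training data.
X_train = pd.DataFrame(
    {"bedrooms": [3, 2, 4, 3], "sqft_living": [1180, 770, 1960, 1680]}
)
y_train = pd.Series([221900.0, 180000.0, 604000.0, 510000.0], name="price")

X_small, y_small = make_subsample(X_train, y_train, n=2)

# Features and label combined into one CSV text that can be pasted into a comment.
csv_text = X_small.assign(price=y_small).to_csv(index=False)
print(csv_text)

# The types passed to fit are what the assertion actually inspects.
print(type(X_train).__name__, type(y_train).__name__)
```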
