Support initializing matrices with Patsy? #145
Comments
I like the idea, but just want to add a word of caution from my previous experience using patsy. Patsy seems to be focused on non-regularized models. For instance, it's rather cumbersome to specify a one-hot-encoded variable in patsy without dropping a column. I'm sure we could adapt patsy to our needs though. While thinking about this, I found https://github.com/matthewwardrop/formulaic, which seems to fix some of patsy's issues and would be easier to integrate into tabmat (since it has sparse matrix support built in).
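To make the one-hot point concrete, here is a minimal pure-Python sketch of the two codings. It is only an illustration under stated assumptions: `one_hot` and the `g[...]` column names are hypothetical and are not part of patsy, formulaic, or tabmat.

```python
def one_hot(values, drop_first=False):
    """Encode category labels as 0/1 indicator columns.

    Hypothetical helper for illustration only; not a patsy or tabmat API.
    """
    levels = sorted(set(values))
    if drop_first:
        # Reference-level coding: omit the first level so the indicator
        # columns are not collinear with an intercept column.
        levels = levels[1:]
    columns = [f"g[{level}]" for level in levels]
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return columns, rows


groups = ["a", "b", "c", "a"]

# Reduced coding: one column per level except the reference level "a".
reduced_cols, reduced_rows = one_hot(groups, drop_first=True)

# Full coding: one column per level, so no level is singled out as baseline.
full_cols, full_rows = one_hot(groups)
```

With the reduced coding, rows for the reference level are all zeros, which is fine for an unregularized fit with an intercept. With a penalty, the choice of reference level affects the result, which is why keeping every column is often preferred there.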
As info, patsy has issues with pickle, see pydata/patsy#26.
Addressed by #286.
I think we've discussed this, but I don't remember the conclusion and can't find an issue now.
We recommend from_pandas as the way "most users" should construct tabmat objects. from_pandas then guesses which columns should be treated as categorical. I think it would be really nice to have Patsy-like formulas as an alternative.
I'm not sure how feasible this would be, since Patsy is a sizable library that allows for fairly sophisticated formulas and it would be quite an endeavor to replicate all of the functionality. A few ways of doing this would be:
pd.DataFrame.
That would not be any more efficient than (1), but would just save the user a little typing and the need to install patsy. On the downside, it adds a dependency and may force creation of a very large dense matrix.