Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(api): add FieldNotFoundError #10412

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

NickCrews
Copy link
Contributor

@NickCrews NickCrews commented Nov 1, 2024

I have been getting sick of typing some_table_or_struct.field_that_doesnt_exist_or_has_a_small_typo and then getting a useless error message. It also is annoying to do some_table.select(doesnt_exist, also_doesnt_exist), and you only get an error for the very first one. It would be better if you got back info on ALL the values that failed to get resolved.

This PR makes that UX much better.

Still need to add tests, but I wanted to get this up here for some initial thoughts before I invested more time. Is this something we want to pursue?

@NickCrews NickCrews force-pushed the field-not-found-error branch from 2fae0d4 to 8be679f Compare November 1, 2024 19:21
values = []
errs = []
# bind positional arguments
for arg in args:
Copy link
Contributor Author

@NickCrews NickCrews Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat orthogonal to the FieldNotFoundError stuff. But if I have multiple bogus columns, eg table.select("bogus", "also_bogus") I only see the first error. I want to see them all.

if len(bindings) != 1:
raise com.IbisInputError(
"Keyword arguments cannot produce more than one value"
)
(value,) = bindings
values.append(value.name(key))
if errs:
raise com.IbisError(
Copy link
Contributor Author

@NickCrews NickCrews Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if IbisError is the best type for this.

@@ -739,8 +757,9 @@ def __getattr__(self, key: str) -> ir.Column:
"""
try:
return ops.Field(self, key).to_expr()
except com.IbisTypeError:
pass
except com.FieldNotFoundError as e:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a slight difference in ux here that I'd love to unify

  • if I do Table.totally_bogus, then I get AttributeError: 'Table' object has no attribute 'bogus'
  • if I do Table["totally_bogus"], then I get FieldNotFoundError: 'bogus' not found in Table object. Possible options: {'x'}

I'm not sure which is better. If we say Possible options:... then that only includes the field names, and misses all the Table methods. But, all methods are 1. in the docs and 2. should have tab completion in many cases, so I bet typos are a lot more likely on column typos than method typos. So I think the FieldNotFoundError with the suggestion might be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also a difference between Tables and Structs. I'd love for them to have the same behavior too.

@@ -205,7 +205,7 @@ def __getitem__(self, name: str) -> ir.Value:
KeyError: 'foo_bar'
"""
if name not in self.names:
raise KeyError(name)
raise FieldNotFoundError(self, name, self.names)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's OK to be breaking here and not raise a KeyError anymore?

Copy link
Contributor Author

@NickCrews NickCrews Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KeyError seems semantically a little wrong. I think KeyError should be for collections with a dynamic set of keys, such as a vanilla python dict. But structs have a static set of keys, so FieldNotFoundError, as a subclass of AttributeError, seems better to me.

I have been getting sick of typing some_table_or_struct.field_that_doesnt_exist_or_has_a_small_typo and then getting a useless error message. This PR makes that UX much better.

Still need to add tests, but I wanted to get this up here for some initial thoughts before I invested more time. Is this something we want to pursue?
@NickCrews NickCrews force-pushed the field-not-found-error branch from 8be679f to c5f58d5 Compare November 1, 2024 19:51
@NickCrews
Copy link
Contributor Author

@cpcloud does this look like an idea worth considering? Any behavior requirements that you think I would need to make an implementation satisfy?

@NickCrews NickCrews marked this pull request as ready for review December 5, 2024 00:52
@NickCrews
Copy link
Contributor Author

@cpcloud should I invest more time in this or do you think this whole idea is not the right direction?

@cpcloud
Copy link
Member

cpcloud commented Jan 24, 2025

I'm not opposed to this but I think the behavior should be different for attribute versus getitem syntax.

It all comes down to what we believe the user's intent is, to the best of our knowledge.

For attribute misspellings, I think we should leave them as plain old AttributeError and do nothing fancy. We have no idea whether the user meant a method or a column, and we should really refuse the temptation to guess.

For [] syntax, the only thing that could possibly be is a field access (in both the column and the struct field case), so upgrading the UX with FieldNotFound seems like a nice improvement.

@NickCrews
Copy link
Contributor Author

That sounds like good behavior to me to return an AttributeError for .dot accesses. I think that makes sense to "refuse the temptation to guess" and raise an AttributeError and not a FieldNotFoundError, because that is better to be caught programmatically. But, what about being a little bit fancy/helpful, and returning a TableAttributeError, which is a subclass of AttributeError, which has the helpful "did you mean this column?" error message? Then this is the best of both worlds?

There is one bit of nuance on attribute access though. If I do t.bogus, I agree that should just be the above behavior, it is impossible to guess if the user meant a method or a column, so just return an AttributeError (or subclass of it as I suggest above). However, if I do t.select(_.bogus), then this is unambiguous that they wanted a Column, and so returning a FieldNotFoundError seems better.

In pseudocode:

# Raised on t.bogus
# Includes helpful suggestion "did you mean <best match among all columns AND attributes/methods>"
class TableAttributeError(AttributeError, IbisError): ...

# Raised on my_struct.bogus
# Includes helpful suggestion "did you mean <best match among all fields AND attributes/methods>"
# Maybe should get consolidated with TableAttributeError
class StructAttributeError(AttributeError, IbisError): ...

# Raised on table_or_struct["bogus", "bogus2"] and table_or_struct.select(_.bogus, _.bogus2) and my_struct["bogus"]
# Includes helpful suggestion "did you mean <best match among ONLY columns/fields>"
# Note that this contains ALL not found fields.
# I think that would be simpler to just have one error, and not one for single FieldNotFound.
# Should this inherit from AttributeError and/or KeyError?
# I think AttributeError since 1. these are statically known and 2. then a user can `catch AttributeError`
# and have all of these errors caught.
class FieldsNotFoundError(AttributeError, IbisError): ...

What do you think of this spec?

@NickCrews
Copy link
Contributor Author

and do nothing fancy
we should really refuse the temptation to guess

So I understand your goals, is your reasoning here so that 1. our maintenance is easier or 2. so we don't accidentally do the wrong thing for the user?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants