-
Notifications
You must be signed in to change notification settings - Fork 108
feat: Add push_validity_into_children methods to StructArray #5826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Add methods to push struct-level validity into child fields: - push_validity_into_children(preserve_struct_validity: bool) - push_validity_into_children_default() - convenience method with preserve=false The functionality propagates null information from struct level down to individual fields, with options to preserve or remove the struct-level validity. Includes comprehensive tests covering all scenarios: - preserve_struct_validity = true - preserve_struct_validity = false (default) - no nulls edge case Signed-off-by: amorynan <amorywang111@gmail.com>
Codecov Report✅ All modified and coverable lines are covered by tests. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
connortsui20
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good first step, though I'm trying to figure out how we want to implement this behavior for the new operator world we are migrating to.
@gatesn mentioned this in the original issue:
Since this issue was filed, we now have a "Mask" expression that essentially performs intersection with validity.
So you could take a look at how you might take the validity from the struct array itself, and wrap up each child field in a mask expression (see builtins.rs), before constructing a new StructArray with or without the previous validity. These two are different behaviors.
This PR implements this feature correctly for the old (current) world, but we will likely want to implement the behavior described above very very soon, so soon that it might not even be worth merging this right now? @gatesn do you have any thoughts?
| pub fn push_validity_into_children_default(&self) -> VortexResult<Self> { | ||
| self.push_validity_into_children(false) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to have this method, the one below should be sufficient since a the bool impl of Default is already false.
|
|
||
|
|
||
| pub fn push_validity_into_children(&self, preserve_struct_validity: bool) -> VortexResult<Self> { | ||
| use crate::compute::mask; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this import to the top?
| pub fn push_validity_into_children(&self, preserve_struct_validity: bool) -> VortexResult<Self> { | ||
| use crate::compute::mask; | ||
|
|
||
| // Get the struct-level validity mask |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: periods at the ends of sentences.
For solving : #3859
Add methods to push struct-level validity into child fields:
The functionality propagates null information from struct level down to individual fields, with options to preserve or remove the struct-level validity.
Includes comprehensive tests covering all scenarios: