Insufficient rules on custom information to allow data merging #137

Open
Ichoran opened this issue Mar 9, 2017 · 3 comments

@Ichoran
Contributor

Ichoran commented Mar 9, 2017

We don't have a precise specification for custom data. This is a problem if we want to be able to merge and split time series--what do you do with custom data?

I propose we include the following specification for custom fields in the data section:

  • If the data is arrayed, each value for an associated custom key must be either an array whose length equals the number of timepoints, or a single value that is assumed to apply to every timepoint.
  • If the data is not arrayed, there are no restrictions on values.
  • When data is split by time, the custom values that are arrays are split at the same indices.
  • When data is merged, arrays are concatenated. Constant values are collapsed into a single value if they are the same, and are duplicated to every timepoint if they differ. If a key is present for some timepoints and not others, the missing timepoints are filled in with JSON null.

This way the custom JSON data behaves the same way as the time series numeric data. (In particular, like the origin data where you can set a single origin for an arrayed time series.)
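
A minimal sketch of the rules above, in Python, just to make the intended behavior concrete. The names `split_custom` and `merge_custom` are hypothetical and are not part of any existing reader. The sketch assumes that custom values have already been parsed from JSON, that any list is a per-timepoint array, and that a constant is copied unchanged into both halves of a split (the proposal only spells out the array case).

```python
from typing import Any, Dict, List, Tuple

Custom = Dict[str, Any]  # custom key -> per-timepoint list or single constant


def _expand(value: Any, n: int) -> List[Any]:
    """Return one entry per timepoint: lists pass through, constants repeat."""
    if isinstance(value, list):
        if len(value) != n:
            raise ValueError(f"expected {n} values, got {len(value)}")
        return list(value)
    return [value] * n


def split_custom(custom: Custom, n: int, index: int) -> Tuple[Custom, Custom]:
    """Split custom data for n timepoints at the given time index."""
    left: Custom = {}
    right: Custom = {}
    for key, value in custom.items():
        if isinstance(value, list):
            if len(value) != n:
                raise ValueError(f"{key!r}: expected {n} values, got {len(value)}")
            # Arrays are split at the same index as the time series.
            left[key], right[key] = value[:index], value[index:]
        else:
            # Assumption: a constant applies to every timepoint on both sides.
            left[key] = right[key] = value
    return left, right


def merge_custom(a: Custom, n_a: int, b: Custom, n_b: int) -> Custom:
    """Merge custom data from two consecutive pieces with n_a and n_b timepoints."""
    merged: Custom = {}
    for key in list(a) + [k for k in b if k not in a]:
        in_a, in_b = key in a, key in b
        va, vb = a.get(key), b.get(key)
        both_constant = (
            in_a and in_b
            and not isinstance(va, list)
            and not isinstance(vb, list)
        )
        if both_constant and va == vb:
            # Identical constants collapse to a single value.
            merged[key] = va
        else:
            # Otherwise expand to one entry per timepoint and concatenate;
            # a missing key becomes None (JSON null) for its timepoints.
            left = _expand(va, n_a) if in_a else [None] * n_a
            right = _expand(vb, n_b) if in_b else [None] * n_b
            merged[key] = left + right
    return merged


# Example: two pieces with 2 and 1 timepoints.
a = {"temp": [20, 21], "strain": "N2", "who": "x"}
b = {"temp": [22], "strain": "N2", "who": "y", "note": "late"}
merged = merge_custom(a, 2, b, 1)
# merged == {"temp": [20, 21, 22], "strain": "N2",
#            "who": ["x", "x", "y"], "note": [None, None, "late"]}
```

One consequence visible in the sketch: a custom value that is itself meant to be a constant array cannot be told apart from a per-timepoint array, so a full specification would need to say how that case is handled.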

@Ichoran
Contributor Author

Ichoran commented Mar 16, 2017

I have tried to write something along these lines in the documentation in #146 but it's not yet implemented by readers.

@MichaelCurrie

I really like this idea. For now the Python parser just drops any custom fields as soon as the file is read, leaving it to other more specialized readers to handle the custom fields.

Being able to merge them in a way that makes sense would make the readers more useful for labs. It would also mean labs wouldn't have to specialize the readers at all; they could just deal with the custom fields they are interested in once the object is in memory.

So now all that's left is implementing it!

@MichaelCurrie MichaelCurrie added this to the python_1.2.0 milestone Mar 20, 2017
@Ichoran
Contributor Author

Ichoran commented Jun 5, 2017

This is fully implemented in Scala (save for bugs) in #152.


2 participants