[Feature] Process consideration to keeping consistent data formatting #390

Open
@chadwpetersen

Description

Is your feature request related to a problem? Please describe.

Changes to CSV file headers can cause issues for services that rely on the CSV formats provided by this awesome repo. Downstream services that expect data in a specific format, i.e. data at a particular column index and perhaps under particular column header names, might break.

To help give the community some data guarantees, it would be great to keep these changes to a minimum where possible, and at least to manage them with some lean process. :)

Describe the solution you'd like

  • We could first consider making column changes to a file append-only. So if you want to add something new to a file, consider appending it as a new column after the existing ones, so that it does not break any existing column indexes others currently rely on.

  • We could also consider appending a version suffix to the file name when we want to introduce a backwards-incompatible change, i.e. reordering or renaming columns. The new file might get a _v2.csv suffix. That way people have time to upgrade to the new file before we eventually deprecate the old one.
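The two suggestions above could be checked mechanically, for example in a pre-merge script. What follows is only a minimal sketch: the function names (`read_header`, `is_append_only`, `versioned_name`) and the column names in the usage note are made up for illustration and are not part of this repo.

```python
import csv
import io
from pathlib import PurePosixPath


def read_header(csv_text: str) -> list[str]:
    """Return the first row of a CSV document as a list of column names."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def is_append_only(old_header: list[str], new_header: list[str]) -> bool:
    """True when every existing column keeps its name and its position.

    New columns are allowed only after the existing ones.
    """
    return new_header[: len(old_header)] == old_header


def versioned_name(filename: str, version: int) -> str:
    """Build a name such as 'districts_v2.csv' for an incompatible change."""
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}_v{version}{path.suffix}"))
```

With a hypothetical header `date,district,cases`, adding a `recovered` column after the existing ones would pass `is_append_only`, while swapping `date` and `district` would fail it. In that case the change would go into the file named by `versioned_name("data/districts.csv", 2)`, i.e. `data/districts_v2.csv`.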

Describe alternatives you've considered

  • We could add a note to the README that these file formats can break at any time, as they are not yet stable.

  • We could also consider having a means of agreeing on a format before it is used, e.g. having the community vote (but I think this might be a bit too much).

Additional context

The reason I ask is that these types of changes should be considered backwards incompatible: downstream services that rely on these files can break if they expect the values to be in certain columns with certain headings, which is not a great experience when things like this change.

I have experienced a few breakages related to the district data files and was hoping a simple, lean process could be considered when maintaining these data files so as to give the community some data structure guarantees. :)
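For illustration, here is roughly what a downstream consumer can do on its own side to soften this kind of breakage: look values up by header name instead of by column index, and fail loudly when an expected header is gone. This is a sketch under assumptions. The column names `district` and `cases` and the `load_rows` helper are hypothetical, not taken from the actual data files.

```python
import csv
import io

# Hypothetical column names a downstream service might depend on.
REQUIRED_COLUMNS = ("district", "cases")


def load_rows(csv_text: str) -> list[dict[str, str]]:
    """Read rows by header name so appended or reordered columns are harmless."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = set(reader.fieldnames or [])
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    if missing:
        # A renamed or removed column is still a breaking change; report it early.
        raise ValueError(f"missing expected columns: {missing}")
    return [{name: row[name] for name in REQUIRED_COLUMNS} for row in reader]
```

Reading by name survives appended and reordered columns, but not renamed ones, which is why a version suffix or some other notice is still needed for those changes.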

Metadata

Labels

enhancement (New feature or request)
