Skip to content

add support for maps with list values #86

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 21, 2019

Conversation

sfmontyo
Copy link
Contributor

Avro has support for maps with lists of strings. I've modified the AvroJsonSchemafulRecordConverter.java to handle this case.

@hpgrahsl
Copy link
Owner

First of all THX for you contribution. Can you maybe add to this and provide the following?

  • 1 example for each: thus a map and a list AVRO schema that you can support with this change and how it differs from the current behaviour or schema that is now supported?
  • also it would be very helpful in case you can add to the existing unit test for this class so that we have this new behaviour reflected in the test suite as well.

Many THX again for your efforts and looking into this @sfmontyo

@sfmontyo
Copy link
Contributor Author

So I've modified the RecordConverterTest to test for maps with lists of ints or strings in addition to the other data types it was checking and the tests pass.
I wasn't certain what to do about the "examples". Do you mean for me to edit the project Readme to have an example of a map field with list of values?

@hpgrahsl
Copy link
Owner

Ok I see. Honestly by reading the AVRO spec back then when I implemented this I wasn't aware that this is supported. Because the docs actually seem to not mention what you suggest here.

Irrespective of your current implementation, have you tried to write an AVRO schema file containing a map which in turn has an array as values? Also what about the other complex types of AVRO, can they also be used as value types in maps? Happy to hear your thoughts!

@hpgrahsl hpgrahsl self-assigned this Jun 19, 2019
@sfmontyo
Copy link
Contributor Author

sfmontyo commented Jun 19, 2019

Ah yes. We have data being shipped across an avro topic and consumed by the standard elasticsearch connector which blew up when we tried to use the mongo connector. The avro schema was handled fine by the standard elasticsearch connector just not by the mongo one. As far as I know, the only requirement for maps is that their key is a string.

@hpgrahsl
Copy link
Owner

Great that you found this! Coming back to my other question: Also what about the other complex types of AVRO, can they also be used as value types in maps?

And it would be nice - like you assumed anyway - if you can adapt the README in the section near the top with the sample AVRO schema to reflect this change e.g. by adding a map with list of string/int respectively.

If I understood your PR correctly the current behaviour is not affected and still works e.g. maps with string keys and any primitive type, right? Looking forward to get this merged after making some final checks and maybe small adaptions in the tests.

THX again for your contribution!

@sfmontyo
Copy link
Contributor Author

Here is our avro spec for the class.
{
"type": "record",
"name": "mystruct",
"namespace": "com.xxxx.avro_generated",
"fields": [
{ "name": "p", "type": [ "null", "int" ], "default": null },
{ "name": "at",
"type": [ "null", { "type" : "map", "values" : { "type": "array", "items": "string" } } ], "default": null },
{ "name": "pu", "type": [ "null", "string" ], "default": null }
]
}

@sfmontyo
Copy link
Contributor Author

sfmontyo commented Jun 19, 2019

So I've modified the example to have an optional map field that if present, will be a Map<String,List> .
Also, you are correct that my changes do not affect the existing behavior - they merely support allowing map values to be arrays in addtion to the the primitives and structs.

Oh and btw, thank you for maintaining this connector!

@sfmontyo
Copy link
Contributor Author

sfmontyo commented Jun 19, 2019

btw - my reading of the spec is that the values can be any valid schema including primitives, arrays, maps and structs. My changes support arrays, thus allowing map values to be primitives, structs or arrays. Looking at the code base, it doesn't look like it would handle a map having values that are also maps.

@hpgrahsl
Copy link
Owner

... My changes support arrays allowing map values to be primitives, structs or arrays.

sounds good!

Looking at the code base, it doesn't look like it would handle a map having values that are also maps.

you think it would make sense to support maps in maps? I haven't seen any use case for it so far which is why this whole issue never popped up until now that you ran into it :)

@sfmontyo
Copy link
Contributor Author

We don't need maps of maps and given no one has requested it, it doesn't seem of great importance. I only mentioned it as to answer your query what other complex Avro types are supported. That wasn't a request of mine :)

@@ -82,6 +84,8 @@ public static void initializeTestData() {
.build())
)
.field("mySubDoc2", SchemaBuilder.map(Schema.STRING_SCHEMA, Schema.INT32_SCHEMA).build())
.field( "myMapOfStrings", SchemaBuilder.map(Schema.STRING_SCHEMA, SchemaBuilder.array(Schema.STRING_SCHEMA).build()).build())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi again. sorry it took me some time to get back on this...
I'm really looking forward to merging this. Let me suggest one more small aspect which is that I think for consistency reasons of this test we should also add the two maps to the JSON_STRING_1 above at right before the "myBytes" field

see 0c0ad67#diff-ab414b445c07f6db6f112afe2ca89aafL63

could you maybe please do it directly in your branch? then it's the least effort I guess to get it merged :)

@sfmontyo
Copy link
Contributor Author

Ok I modified both the OBJ_MAP_1 and JSON_STRING_1 to have the same fields as OBJ_SCHEMA_1 .

@hpgrahsl hpgrahsl merged commit 2f43344 into hpgrahsl:master Jun 21, 2019
@hpgrahsl
Copy link
Owner

THX @sfmontyo for this contribution. Highly appreciated! As always I'd be happy to learn about your concrete use cases for my project. If possible, please share :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants