reboot - Map and key insertion order

Stackoverflow question is here http://stackoverflow.com/questions/31635076/scala-immutable-map-like-datastructure-that-has-constant-effective-constant-look

@eed3si9n raised a potential issue, and that is the fact that `JObject` uses a `Map`, which means that there is no specified ordering on keys. The `JSON` specification says that you shouldn't rely on the object having any specific order, however this does have practical implications.

Quoting what @eed3si9n said in gitter

> suppose I want to construct a JSON object { "a": 1, "b": 2 }
> and this represents some serialized value
> for caching I might just take the SHA1 of it
> if I use your Map version of JObject to implement it, the SHA1 could change depending on the weather
> so the behavior of JObject becomes non-idempotent (I am not sure if this is the right term)
> this is not a what-if scenario as sbt currently uses json4s to serialize case classes via Scala Pickling - https://github.com/sbt/serialization

The problem is, that `Scala` `stdlib` does not have any immutable map like data structure which preserves key ordering that also has `c` or `eC` lookup time. There is `LinkedMap`, which is a `Map` backed by a `LinkedList`, however that has Linear lookup time. There is `LinkedHashMap` which has a constant lookup time, however its `mutable`. There are some solutions
1. Roll our own datastructure (which inherits from `Scala`'s collection library, so it can be treated to be exactly the same as a normal `Map`) which has all of the properties we need
2. Internally maintain a `Vector` as well as a `Map` so we can have some concept of ordering
3. Just use a `SortedMap`. We still can't guarantee that that the key order will be the same according to insertion, however we can guarantee that the keys will be `Sorted`, which means that if you `SHA-1` two `JObject`'s that have the same contents, they will be equal. It also means that when you serialize, we can guarantee that it will always be ordered. EDIT: Default implementation of `SortedMap` is a `TreeMap`, which has `log` lookup time.
4. Converted the `Map` to a `SortedMap` by some `Ordering`, which means that if you do something like what @eed3si9n above, you would convert your `JObject`'s `Map` to a `SortedMap` before doing a `SHA-1`.  The nicest way to implement this is by having some `sorted` method on a `JValue`, which will replace all internal `JObject`s to something like `JObjectSorted` and return a `JValueSorted`. You can then serialize a `JValueSorted` if you need this sought of functionality.

Taking into account all theoretical/practical/time considerations, I think 4 is most realistic. It fixes @eed3si9n issue indirectly, while still maintaining immutability and `c`/`eC` lookup time.

EDIT: The implementation for 3 (which uses a `SortedMap`), has only one implementation in `Scala`'s `stdlib`, which is a `TreeMap`. `TreeMap` has a `log` lookup time, which is definitely is not as bad as `n`, but its not a `c`/`eC`. Looks like we may need to roll our own data structure, if possible.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

reboot - Map and key insertion order #8

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

reboot - Map and key insertion order #8

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions