Skip to content

Performance issue for large data / kotlinx-io #217

@westnordost

Description

@westnordost

TLDR: Library user experiences performance issues and requests support for kotlinx-io streaming primitives.


As one part of making an app multiplatform, I replaced Java HttpUrlConnection + XML pull parsing with ktor-client + xmlutil + kotlinx-serialization. Unfortunately, the performance at least on parsing large data (tested with ~10MB) got 3-6 times worse. (see StreetComplete#5686 for a short analysis and comparison).

I suspect that using xmlutil's streaming parser instead of deserializing the whole data structure with kotlinx-serialization might improve this somewhat because theoretically, xmlutil should be able start reading the bytes as they come through the wire and in a separate thread than the thread that receives the bytes. (But ultimately not knowing the internals, I can only guess. If you have any other suspicions for the cause of the performance issue, let me know!)

The documentation is a bit thin on streaming, but I understand I need to call xmlStreaming.newGenericReader(Reader) to get an XmlReader which is a xml pull parser interface. However, I need to supply a (Java) Reader, so currently it seems there is no interface for stream parsing on multiplatform.

I understand that byte streaming support and consequently text streaming built on top of that is a bit higgledy-piggledy right now in the Kotlin ecosystem because a replacement for InputSteam etc. has never been available in the Kotlin standard library. So, every library that does something with IO implemented their own thing, if anything at all - sometimes based on okio, sometimes an own implementation.

However, now it seems like things are about to get better: Both ktor and apparently kotlinx-serialization are being migrated to use kotlinx-io and thus the common interfaces like Sink, Source and Buffer, which are replacements for InputStream et al.

So, in case you'd agree that most likely my performance issue with large data stirs from the lack of XML streaming, I guess my request would be to move with kotlinx-serialization and ktor to support a common interface for streaming bytes and text.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions