Add withData #448

Merged
merged 5 commits into from
Dec 2, 2024
1 change: 1 addition & 0 deletions docs/kd.tree
@@ -57,6 +57,7 @@
<toc-element topic="API.md">
<toc-element topic="Plot-API.md"/>
<toc-element toc-title="Data Manipulation">
<toc-element topic="WithData-API.md"/>
<toc-element topic="GroupBy-API.md"/>
<toc-element toc-title="statistics">
<toc-element topic="StatBin-API.md"/>
1 change: 1 addition & 0 deletions docs/topics/API.md
@@ -17,6 +17,7 @@
[layout](Layout-API.md)
[tooltips](Tooltips-API.md)
## Data manipulation
[withData](WithData-API.md)
[groupBy](GroupBy-API.md)
### Statistics
[statBin](StatBin-API.md)
1 change: 1 addition & 0 deletions docs/topics/Getting-Started.md
@@ -161,6 +161,7 @@ providing a quick reference to assist you in building your visualizations.
<list>
<li>Data Manipulation
<list>
<li><a href="WithData-API.md">withData</a></li>
<li><a href="GroupBy-API.md">groupBy</a></li>
<li><a href="StatBin-API.md">statBin</a></li>
<li><a href="StatDensity-API.md">statDensity</a></li>
34 changes: 34 additions & 0 deletions docs/topics/apiRef/WithData-API.md
@@ -0,0 +1,34 @@
# withData

<tldr>
<p><format style="bold" color="GoldenRod">
withData&lt;<a href="#t"><format color="Blue">T</format></a>></format>(
<a href="#dataset"><format style="bold" color="CadetBlue">dataset</format></a>:
<emphasis>DataFrame&lt;T> | Map&lt;String, List&lt;*>></emphasis>) <format style="italic">{ this: DataFrameScope&lt;T> -></format></p>

<format style="italic">}</format>
</tldr>

The `withData` function creates a new plotting context that uses the provided [dataset](#dataset).
All layers created in this context use this dataset and do not inherit mappings defined at the plot level.
If a `DataFrame` is provided as the dataset, its columns are accessible in this context.

## Arguments

### T

<p>Type of DataFrame</p>

### dataset

<p>
<format style="superscript" color="Red">Required</format>
</p>
<p>
<format style="superscript" color="#E8488B">DataFrame&lt;T></format>
<format style="superscript" color="#E8488B">Map&lt;String, List&lt;*>></format>
</p>

<p>
The dataset used by layers created in the new context.
</p>
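
A usage sketch, modelled on the test added in this PR, may help here. It plots points from the main dataset and a line from a second dataset supplied through `withData`. The wildcard imports are an assumption about where `points`, `line`, `x`, `y`, and `color` live in the Lets-Plot flavour of Kandy; adjust them to your setup.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.column
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.kandy.dsl.plot
import org.jetbrains.kotlinx.kandy.letsplot.*        // assumed location of x / y / color
import org.jetbrains.kotlinx.kandy.letsplot.layers.* // assumed location of points / line

fun main() {
    // Main dataset: used by layers declared directly inside plot { }.
    val mainData = mapOf(
        "x" to listOf(1.0, 2.0, 3.0),
        "y" to listOf(3f, 12f, 5.5f),
    ).toDataFrame()
    val srcX = column<Double>("x")
    val srcY = column<Float>("y")

    // Secondary dataset: used only by layers declared inside withData { }.
    val secondaryData = mapOf(
        "width" to listOf(1.0, 2.0, 3.0, 3.0),
        "height" to listOf(3f, 12f, 5.5f, 8f),
        "type" to listOf("A", "B", "A", "B"),
    ).toDataFrame()
    val width = column<Double>("width")
    val height = column<Float>("height")
    val type = column<String>("type")

    val plot = mainData.plot {
        x(srcX)               // plot-level mapping, inherited by `points`
        points { y(srcY) }
        withData(secondaryData) {
            line {
                x(width)      // must be mapped again: plot-level mappings are not inherited here
                y(height)
                color(type)
            }
        }
    }
    println(plot)
}
```

Because layers inside `withData` do not inherit plot-level mappings, the `line` layer maps `x` explicitly even though the plot already has an `x` mapping.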
2 changes: 1 addition & 1 deletion gradle/libs.versions.toml
@@ -11,7 +11,7 @@ dataframe = "0.14.1"
serialization = "1.6.3"
datetime = "0.6.0"
html = "0.11.0"
- statistics = "0.4.0-dev-7"
+ statistics = "0.4.0-dev-8"
Collaborator:

I think we need to separate our builds for release and dev

Collaborator (Author):

What do you mean?

Collaborator:

Approximately what is done in dataframe:

  • For stable releases or for M, RC versions, we manually change the version and run a release publish on the build server
  • For dev versions, we run publish with the dev flag (or explicitly differentiate between publish and dev publish), and a dev suffix with the build number is automatically added

This will eliminate the need for us to manually control dev versions
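
The scheme described in this comment could be sketched roughly as follows. `resolveVersion` and its parameters are hypothetical names chosen for illustration, not part of Kandy's or dataframe's build logic; only the `-dev-<build number>` suffix format comes from the version string in this diff.

```kotlin
// Hypothetical helper: derives the published version from a base version.
// Stable, M, and RC releases keep the manually chosen version;
// dev builds get a "-dev-<buildNumber>" suffix appended automatically.
fun resolveVersion(baseVersion: String, isDev: Boolean, buildNumber: Int?): String {
    if (!isDev) return baseVersion
    requireNotNull(buildNumber) { "A dev build needs a build number" }
    return "$baseVersion-dev-$buildNumber"
}

fun main() {
    println(resolveVersion("0.4.0", isDev = true, buildNumber = 8))     // prints 0.4.0-dev-8
    println(resolveVersion("0.4.0", isDev = false, buildNumber = null)) // prints 0.4.0
}
```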

letsPlot = "4.7.3"
letsPlotImage = "4.3.3"
mockk = "1.13.10"
@@ -55,7 +55,7 @@ public abstract class MultiLayerPlotBuilder internal constructor() : LayerCreato
*
* @return new dataset builder index in [datasetBuilders].
*/
@PublishedApi
internal abstract fun addDataset(dataset: TableData, initialBuilder: DatasetBuilder? = null): Int
internal abstract fun addEmptyDataset(): Int

}
@@ -0,0 +1,13 @@
package org.jetbrains.kotlinx.kandy.dsl.internal

/**
* Marks the DSL for creating and configuring plots within the Kandy library.
*
* The `PlotDslMarker` annotation is used to restrict the scope of DSL functions to prevent unintentional
* interference between different DSL builders. By applying this marker, we ensure a clear and structured
* separation of concerns within the DSL context, leading to safer and more predictable DSL designs.
*
* Currently applied only to the scopes involved in `DataFramePlotBuilder.withData {}`.
*/
@DslMarker
internal annotation class PlotDslMarker
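
To illustrate what a `@DslMarker` annotation restricts, here is a minimal standalone sketch. All names in it (`SketchDslMarker`, `OuterBuilder`, `NestedScope`) are hypothetical and unrelated to Kandy's classes; it only shows the general Kotlin mechanism.

```kotlin
// Hypothetical marker and builders; not Kandy's implementation.
@DslMarker
annotation class SketchDslMarker

@SketchDslMarker
class OuterBuilder {
    val log = mutableListOf<String>()

    fun title(text: String) { log += "title:$text" }

    fun layer(name: String) { log += "outer:$name" }

    // Opens a nested scope, similar in spirit to withData { }.
    fun nested(block: NestedScope.() -> Unit) {
        NestedScope(log).block()
    }
}

@SketchDslMarker
class NestedScope(private val log: MutableList<String>) {
    fun layer(name: String) { log += "nested:$name" }
}

fun buildSketch(): List<String> {
    val builder = OuterBuilder()
    builder.apply {
        layer("points")
        nested {
            layer("line")            // resolves to NestedScope.layer
            // title("oops")         // would not compile: the outer receiver is not implicitly available
            this@apply.title("demo") // an explicit receiver is still allowed
        }
    }
    return builder.log
}

fun main() {
    println(buildSketch()) // prints [outer:points, nested:line, title:demo]
}
```

Because both classes carry the same marker, members of the outer builder cannot be called implicitly from inside the nested lambda, which is the kind of accidental cross-scope call the KDoc above refers to.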
@@ -4,8 +4,10 @@ import org.jetbrains.kotlinx.dataframe.*
import org.jetbrains.kotlinx.dataframe.api.GroupBy
import org.jetbrains.kotlinx.dataframe.api.getColumns
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.dataframe.columns.ColumnReference
import org.jetbrains.kotlinx.kandy.dsl.internal.DatasetBuilder
import org.jetbrains.kotlinx.kandy.dsl.internal.PlotDslMarker

/**
* Represents a standard plotting context initialized with a [DataFrame] as its primary dataset.
@@ -17,6 +19,7 @@ import org.jetbrains.kotlinx.kandy.dsl.internal.DatasetBuilder
*
* @param T the type of the DataFrame.
*/
@PlotDslMarker
public class DataFramePlotBuilder<T> @PublishedApi internal constructor(
@PublishedApi
internal val dataFrame: DataFrame<T>,
@@ -39,11 +42,37 @@ public class DataFramePlotBuilder<T> @PublishedApi internal constructor(
*/
public fun <C> columns(vararg columns: String): List<AnyCol> = dataFrame.getColumns(*columns)

/**
* Creates and initializes a new layer creator scope with the given dataframe as a dataset.
*
* @param dataFrame The DataFrame to be used as a dataset within the scope.
* @param block layer creator scope with a new dataset.
*/
public inline fun <T> withData(

Comment:

Could it be reflected somehow in tests?

Collaborator (Author):

Will add tests later

Collaborator:

In my opinion, at this stage of the library development, we should write tests right away. This will help us catch bugs faster, verify the API if it’s a public one, and reduce issues in the future

Collaborator (Author):

Added

dataFrame: DataFrame<T>,
block: DataFrameScope<T>.() -> Unit
) {
DataFrameScope(dataFrame, this, addDataset(NamedData(dataFrame), null)).apply(block)
}

/**
* Creates and initializes a new layer creator scope with the given map as a dataset.
*
* @param map The map to be used as a dataset within the scope.
* @param block layer creator scope with a new dataset.
*/
public inline fun withData(
map: Map<String, List<*>>,
block: DataFrameScope<*>.() -> Unit
) {
withData(map.toDataFrame(), block)
}
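
The two `withData` overloads above register the dataset with the plot builder and hand the returned index to the new scope, so that layers created there point at the right dataset. A simplified, standalone sketch of that bookkeeping, using hypothetical stand-in classes rather than Kandy's real builders:

```kotlin
// Hypothetical stand-ins; Kandy's real builders carry much more state.
class SketchPlotBuilder {
    val datasets = mutableListOf<Map<String, List<Any?>>>()
    val layers = mutableListOf<Pair<Int, String>>() // (dataset index, layer kind)

    // Appends the dataset and returns its position, like addDataset in this diff.
    fun addDataset(dataset: Map<String, List<Any?>>): Int {
        datasets.add(dataset)
        return datasets.lastIndex
    }
}

// A scope remembers which dataset its layers belong to.
class SketchDataScope(
    private val builder: SketchPlotBuilder,
    private val datasetIndex: Int,
) {
    fun layer(kind: String) {
        builder.layers += datasetIndex to kind
    }
}

fun main() {
    val builder = SketchPlotBuilder()

    val mainIndex = builder.addDataset(mapOf("x" to listOf(1.0, 2.0, 3.0)))
    SketchDataScope(builder, mainIndex).layer("point")

    val secondaryIndex = builder.addDataset(mapOf("width" to listOf(1.0, 2.0, 3.0, 3.0)))
    SketchDataScope(builder, secondaryIndex).layer("line")

    println(builder.layers) // prints [(0, point), (1, line)]
}
```

This matches the shape asserted in the PR's test, where the first layer refers to dataset `0` and the layer created inside `withData` refers to dataset `1`.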

/**
* Creates and initializes a new context with the dataframe grouped by the specified column names.
*
* @param columns the column names to group the dataframe by.
- * @param block a lambda with receiver block that configures the new grouped context.
+ * @param block layer creator scope with a new dataset.
*/
public inline fun groupBy(
columns: Iterable<String>,
@@ -61,7 +90,7 @@ public class DataFramePlotBuilder<T> @PublishedApi internal constructor(
* Creates and initializes a new context with the dataframe grouped by the specified column names.
*
* @param columns the column names to group the dataframe by.
- * @param block a lambda with receiver block that configures the new grouped context.
+ * @param block layer creator scope with a new dataset.
*/
public inline fun groupBy(
vararg columns: String,
@@ -72,7 +101,7 @@ public class DataFramePlotBuilder<T> @PublishedApi internal constructor(
* Creates and initializes a new context with the dataframe grouped by the given column references.
*
* @param columnReferences references to the columns to group by.
- * @param block a lambda with receiver block that configures the new grouped context.
+ * @param block layer creator scope with a new dataset.
*/
public inline fun groupBy(
vararg columnReferences: ColumnReference<*>,
@@ -83,7 +112,7 @@ public class DataFramePlotBuilder<T> @PublishedApi internal constructor(
* Creates and initializes a new context with the dataframe grouped by the given column references.
*
* @param columnReferences a list of references to the columns to group by.
- * @param block a lambda with receiver block that configures the new grouped context.
+ * @param block layer creator scope with a new dataset.
*/
public inline fun groupBy(
columnReferences: List<ColumnReference<*>>,
@@ -0,0 +1,22 @@
package org.jetbrains.kotlinx.kandy.dsl.internal.dataframe

import org.jetbrains.kotlinx.dataframe.ColumnsContainer
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.kandy.dsl.internal.LayerCreatorScope
import org.jetbrains.kotlinx.kandy.dsl.internal.MultiLayerPlotBuilder
import org.jetbrains.kotlinx.kandy.dsl.internal.PlotDslMarker

/**
* Represents a plot builder data scope with a new dataset
* created by [DataFramePlotBuilder.withData].
*
* @param T The type of the DataFrame.
*/
@PlotDslMarker
public class DataFrameScope<T> @PublishedApi internal constructor(
dataFrame: DataFrame<T>,
override val plotBuilder: MultiLayerPlotBuilder,
override val datasetIndex: Int,
) : LayerCreatorScope(), ColumnsContainer<T> by dataFrame {
override val layersInheritMappings: Boolean = false
}
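
The `ColumnsContainer<T> by dataFrame` clause above is what makes the DataFrame's columns reachable from inside the scope: Kotlin class delegation forwards every interface member to the wrapped object. A minimal standalone sketch of that mechanism, with hypothetical `ColumnSource`, `SimpleTable`, and `SketchScope` types standing in for the real DataFrame API:

```kotlin
// Hypothetical stand-ins for ColumnsContainer / DataFrame; not the real DataFrame API.
interface ColumnSource {
    fun columnNames(): List<String>
    fun column(name: String): List<Any?>
}

class SimpleTable(private val data: Map<String, List<Any?>>) : ColumnSource {
    override fun columnNames(): List<String> = data.keys.toList()

    override fun column(name: String): List<Any?> =
        data[name] ?: error("No column named '$name'")
}

// The scope implements ColumnSource by forwarding every call to `table`,
// so column accessors are available on the scope itself.
class SketchScope(table: SimpleTable, val datasetIndex: Int) : ColumnSource by table

fun main() {
    val table = SimpleTable(
        mapOf("width" to listOf(1.0, 2.0), "type" to listOf("A", "B"))
    )
    val scope = SketchScope(table, datasetIndex = 1)
    with(scope) {
        println(columnNames())  // prints [width, type]
        println(column("type")) // prints [A, B]
    }
}
```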
@@ -6,6 +6,7 @@ import org.jetbrains.kotlinx.kandy.dsl.internal.MultiLayerPlotBuilder
import org.jetbrains.kotlinx.kandy.ir.data.TableData

public abstract class MultiLayerPlotBuilderImpl : MultiLayerPlotBuilder() {
@PublishedApi
override fun addDataset(dataset: TableData, initialBuilder: DatasetBuilder?): Int {
datasetBuilders.add(DatasetBuilderImpl(dataset, initialBuilder as DatasetBuilderImpl?))
return datasetBuilders.lastIndex
108 changes: 108 additions & 0 deletions kandy-api/src/test/kotlin/org/jetbrains/kotlinx/kandy/dsl/withData.kt
@@ -0,0 +1,108 @@
/*
* Copyright 2020-2023 JetBrains s.r.o. Use of this source code is governed by the Apache 2.0 license.
*/

package org.jetbrains.kotlinx.kandy.dsl

import org.jetbrains.kotlinx.dataframe.api.column
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.kandy.dsl.impl.*
import org.jetbrains.kotlinx.kandy.dsl.internal.dataframe.NamedData
import org.jetbrains.kotlinx.kandy.ir.Layer
import org.jetbrains.kotlinx.kandy.ir.Plot
import org.jetbrains.kotlinx.kandy.ir.bindings.NonPositionalMapping
import org.jetbrains.kotlinx.kandy.ir.bindings.PositionalMapping
import org.jetbrains.kotlinx.kandy.ir.scale.PositionalContinuousScale
import org.jetbrains.kotlinx.kandy.util.color.Color
import kotlin.test.Test
import kotlin.test.assertEquals

class WithDataTest {

@Test
fun withDataTest() {
val datasetMain = mapOf(
"x" to listOf(1.0, 2.0, 3.0),
"y" to listOf(3F, 12F, 5.5F),
).toDataFrame()
val srcX = column<Double>("x")
val srcY = column<Float>("y")

val datasetSecondary = mapOf(
"width" to listOf(1.0, 2.0, 3.0, 3.0),
"height" to listOf(3F, 12F, 5.5F, 8F),
"type" to listOf("A", "B", "A", "B"),
).toDataFrame()

val width = column<Double>("width")
val height = column<Float>("height")
val type = column<String>("type")

val plot = datasetMain.plot {
x(srcX)
points {
y(srcY)
}
withData(datasetSecondary) {
line {
x(width)
y(height) {
scale = continuous(1f..15f)
}
color(type)
}
}
}

assertEquals(
Plot(
listOf(NamedData(datasetMain), NamedData(datasetSecondary)),
listOf(
Layer(
0,
POINT,
mappings = mapOf(
Y to PositionalMapping<Float>(
Y, srcY.name(), CommonPositionalMappingParametersContinuous()
),
),
settings = emptyMap(),
emptyMap(),
emptyMap(),
),
Layer(
1,
LINE,
mappings = mapOf(
X to PositionalMapping<Double>(
X, width.name(), CommonPositionalMappingParametersContinuous()
),
Y to PositionalMapping(
Y, height.name(), CommonPositionalMappingParametersContinuous(
PositionalContinuousScale(1f, 15f, null, null)
)
),
COLOR to NonPositionalMapping<String, Color>(
COLOR, type.name(), CommonNonPositionalMappingParametersContinuous()
),
),
emptyMap(),
emptyMap(),
emptyMap(),
inheritsBindings = false
)
),
mapOf(
X to PositionalMapping<Double>(
X, srcX.name(), CommonPositionalMappingParametersContinuous()
),
),
emptyMap(),
emptyMap(),
emptyMap()
),
plot
)
}
}