Sync plugin support as of 2024-12-31 #1478

Merged 5 commits on Jan 22, 2025

@cindyyuanjiang cindyyuanjiang commented Dec 31, 2024

Fixes #1452

This PR syncs plugin support as of 2024-12-31.

The changes include:

This report documents the differences between the tools' existing CSV files and those processed from the plugin.
    Notes:
      1. For new data source/exec/expression from plugin, the first column with supported level will be updated to 'TNEW' for future testing.
      2. Rows marked as "is removed" will be preserved in the final output.
      3. The "Notes" column for rows with "S" for "Supported" will be updated to "None" in the final output.
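
The three notes above can be read as a small merge rule. The following is a minimal sketch of that rule, not the actual sync script: the function name `merge_support_rows` and the column names `Expression`, `Supported`, and `Notes` are assumptions made for illustration, and rows are matched by name only.

```python
def merge_support_rows(tools_rows, plugin_rows, name_field="Expression"):
    """Merge plugin rows into the tools rows following the three notes.

    Rows are plain dicts. The column names "Supported" and "Notes" are
    assumed for illustration and may differ in the real CSV files.
    """
    tools_names = {row[name_field] for row in tools_rows}
    plugin_names = {row[name_field] for row in plugin_rows}

    merged = []
    for row in plugin_rows:
        out = dict(row)
        if out[name_field] not in tools_names:
            # Note 1: entries that are new in the plugin are marked TNEW.
            out["Supported"] = "TNEW"
        elif out["Supported"] == "S":
            # Note 3: supported rows carry no note in the final output.
            out["Notes"] = "None"
        merged.append(out)

    # Note 2: rows that no longer exist in the plugin are preserved.
    for row in tools_rows:
        if row[name_field] not in plugin_names:
            merged.append(dict(row))
    return merged


tools = [
    {"Expression": "JsonToStructs", "Supported": "NS", "Notes": "beta"},
    {"Expression": "DecimalSum", "Supported": "S", "Notes": "None"},
]
plugin = [
    {"Expression": "JsonToStructs", "Supported": "S", "Notes": "beta"},
    {"Expression": "TruncDate", "Supported": "S", "Notes": "None"},
]
result = merge_support_rows(tools, plugin)
```

In this sketch `JsonToStructs` ends up as `S` with its note cleared, `TruncDate` is marked `TNEW`, and `DecimalSum` is kept even though the plugin no longer lists it.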


**supportedDataSource.csv (FROM TOOLS TO PLUGIN)**
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    BOOLEAN: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    BYTE: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    SHORT: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    INT: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    LONG: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    FLOAT: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    DOUBLE: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    DATE: CO -> PS
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    TIMESTAMP: CO -> PS
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    STRING: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    DECIMAL: CO -> S
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    NULL: CO -> NA
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    BINARY: CO -> NS
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    CALENDAR: CO -> NA
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    ARRAY: CO -> PS
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    MAP: CO -> NS
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    STRUCT: CO -> PS
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    UDT: CO -> NS
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    DAYTIME: CO -> NA
Row is changed: JSON, read, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO, CO
    YEARMONTH: CO -> NA

**supportedExecs.csv (FROM TOOLS TO PLUGIN)**
Row is removed: CustomShuffleReaderExec, S, None, Input/Output, S, S, S, S, S, S, S, S, PS, S, S, S, S, NS, PS, PS, PS, NS, NS, NS
Row is removed: RunningWindowFunctionExec, S, None, Input/Output, S, S, S, S, S, S, S, S, PS, S, S, S, NS, NS, PS, PS, PS, NS, NS, NS

**supportedExprs.csv (FROM TOOLS TO PLUGIN)**
Row is changed: HiveHash, S, `hive-hash`, None, project, input, S, S, S, S, S, S, S, S, PS, S, NS, S, NS, NS, NS, NS, NS, NS, NS, NS
    ARRAY: NS -> PS
Row is changed: HiveHash, S, `hive-hash`, None, project, input, S, S, S, S, S, S, S, S, PS, S, NS, S, NS, NS, NS, NS, NS, NS, NS, NS
    STRUCT: NS -> PS
Row is changed: JsonToStructs, NS, `from_json`, This is disabled by default because it is currently in beta and undergoes continuous enhancements. Please consult the [compatibility documentation](../compatibility.md#json-supporting-types) to determine whether you can enable this configuration for your use case, project, jsonStr, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
    Supported: NS -> S
Row is changed: JsonToStructs, NS, `from_json`, This is disabled by default because it is currently in beta and undergoes continuous enhancements. Please consult the [compatibility documentation](../compatibility.md#json-supporting-types) to determine whether you can enable this configuration for your use case, project, jsonStr, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
    Notes: This is disabled by default because it is currently in beta and undergoes continuous enhancements. Please consult the [compatibility documentation](../compatibility.md#json-supporting-types) to determine whether you can enable this configuration for your use case -> None
Row is changed: JsonToStructs, NS, `from_json`, This is disabled by default because it is currently in beta and undergoes continuous enhancements. Please consult the [compatibility documentation](../compatibility.md#json-supporting-types) to determine whether you can enable this configuration for your use case, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, PS, PS, NA, NA, NA
    Supported: NS -> S
Row is changed: JsonToStructs, NS, `from_json`, This is disabled by default because it is currently in beta and undergoes continuous enhancements. Please consult the [compatibility documentation](../compatibility.md#json-supporting-types) to determine whether you can enable this configuration for your use case, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, PS, PS, NA, NA, NA
    Notes: This is disabled by default because it is currently in beta and undergoes continuous enhancements. Please consult the [compatibility documentation](../compatibility.md#json-supporting-types) to determine whether you can enable this configuration for your use case -> None
Row is added: MonthsBetween, TNEW, `months_between`, None, project, timestamp1, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: MonthsBetween, TNEW, `months_between`, None, project, timestamp2, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: MonthsBetween, TNEW, `months_between`, None, project, round, PS, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: MonthsBetween, TNEW, `months_between`, None, project, result, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: TruncDate, TNEW, `trunc`, None, project, date, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: TruncDate, TNEW, `trunc`, None, project, format, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: TruncDate, TNEW, `trunc`, None, project, result, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: TruncTimestamp, TNEW, `date_trunc`, None, project, format, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: TruncTimestamp, TNEW, `date_trunc`, None, project, date, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is added: TruncTimestamp, TNEW, `date_trunc`, None, project, result, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Row is changed: XxHash64, S, `xxhash64`, None, project, input, S, S, S, S, S, S, S, S, PS, S, S, S, NS, NS, NS, NS, NS, NS, NS, NS
    ARRAY: NS -> PS
Row is changed: XxHash64, S, `xxhash64`, None, project, input, S, S, S, S, S, S, S, S, PS, S, S, S, NS, NS, NS, NS, NS, NS, NS, NS
    MAP: NS -> PS
Row is changed: XxHash64, S, `xxhash64`, None, project, input, S, S, S, S, S, S, S, S, PS, S, S, S, NS, NS, NS, NS, NS, NS, NS, NS
    STRUCT: NS -> PS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, str, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NS, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, pos, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, len, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NS, NA, NA, NA, NA, NA, NS, NS
Row is removed: DecimalSum, S, `decimalsum`, None, project, input, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
Row is removed: DecimalSum, S, `decimalsum`, None, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NA, NA, NA, NA, NS, NS
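
For readers who want to reproduce a report in this shape, here is a hedged sketch of how the "Row is changed / added / removed" lines could be derived from two sets of rows. It is not the script that generated the report above: the function name `diff_rows`, the key columns, and the `Supported` column name are all assumptions, and the ordering of lines may differ from the real job.

```python
def diff_rows(tools_rows, plugin_rows, key_fields, level_field="Supported"):
    """Return report lines describing how plugin rows differ from tools rows.

    Rows are dicts of strings; key_fields identifies a row (hypothetical
    choice, for example the expression name plus the parameter name).
    """
    def key(row):
        return tuple(row[f] for f in key_fields)

    def fmt(row):
        return ", ".join(row.values())

    tools_by_key = {key(r): r for r in tools_rows}
    plugin_by_key = {key(r): r for r in plugin_rows}

    lines = []
    for k, new in plugin_by_key.items():
        old = tools_by_key.get(k)
        if old is None:
            # New rows are reported with the supported level set to TNEW.
            added = dict(new)
            added[level_field] = "TNEW"
            lines.append(f"Row is added: {fmt(added)}")
            continue
        # One "Row is changed" entry per differing column, showing the old row.
        for col, old_val in old.items():
            if new.get(col, old_val) != old_val:
                lines.append(f"Row is changed: {fmt(old)}")
                lines.append(f"    {col}: {old_val} -> {new[col]}")

    for k, old in tools_by_key.items():
        if k not in plugin_by_key:
            lines.append(f"Row is removed: {fmt(old)}")
    return lines


tools = [
    {"Expression": "XxHash64", "Supported": "S", "Params": "input", "ARRAY": "NS", "MAP": "NS"},
    {"Expression": "DecimalSum", "Supported": "S", "Params": "input", "ARRAY": "NA", "MAP": "NA"},
]
plugin = [
    {"Expression": "XxHash64", "Supported": "S", "Params": "input", "ARRAY": "PS", "MAP": "PS"},
    {"Expression": "TruncDate", "Supported": "S", "Params": "date", "ARRAY": "NA", "MAP": "NA"},
]
report = diff_rows(tools, plugin, key_fields=("Expression", "Params"))
```

Repeating the full old row once per changed column mirrors the layout of the report above, which is why the same `Row is changed` line appears several times for a single expression.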

@amahussein amahussein left a comment

Thanks @cindyyuanjiang for working on that.
It is important to stay up-to-date with the plugin since many of the new operators are affecting our customers.
I updated the issue description with the latest sync-plugin job, so it would be nice to include the remaining execs/expressions if they are easy to knock off.

@amahussein amahussein changed the title from "Add support for new expressions" to "Sync plugin support as of 2024-12-31" Dec 31, 2024
@cindyyuanjiang cindyyuanjiang marked this pull request as ready for review January 17, 2025 01:46
@cindyyuanjiang cindyyuanjiang self-assigned this Jan 17, 2025
@cindyyuanjiang cindyyuanjiang added the "feature request" (New feature or request) and "core_tools" (Scope the core module (scala)) labels Jan 17, 2025
@parthosa parthosa left a comment

LGTME. Minor comment in the test suite.

parthosa previously approved these changes Jan 21, 2025
amahussein previously approved these changes Jan 22, 2025
@amahussein amahussein left a comment

Signed-off-by: cindyyuanjiang <[email protected]>
@cindyyuanjiang cindyyuanjiang dismissed stale reviews from amahussein and parthosa via 54f75be January 22, 2025 19:20
@cindyyuanjiang

I updated this PR to have a parametrized unit test for testing expressions supported in ProjectExec. cc: @parthosa @amahussein

Signed-off-by: cindyyuanjiang <[email protected]>
@amahussein amahussein left a comment

Thanks @cindyyuanjiang
I like the new tests! Well done!

@parthosa parthosa left a comment

Thanks @cindyyuanjiang. LGTME.

@cindyyuanjiang cindyyuanjiang merged commit f2a6d62 into NVIDIA:dev Jan 22, 2025
13 checks passed
@cindyyuanjiang cindyyuanjiang deleted the spark-rapids-tools-1452 branch January 22, 2025 22:54
Labels
core_tools Scope the core module (scala) feature request New feature or request
Development

Successfully merging this pull request may close these issues.

[FEA] Sync supported ops with RAPIDS plugin Dec 2024
4 participants