Releases: NVIDIA/spark-rapids-tools
v25.02.3
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/25.02.3/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/25.02.3/
Changes
User Tools
- Package the tools JAR with the Python package (#1634)
- Integrate Qualx train_and_evaluate into spark_rapids CLI with docs (#1660)
- Add 'Not Recommended Reason' to qualification summary (#1649)
- Use config file and add docs for Qualx hash_util (#1657)
- Add more qualx unit tests (#1654)
- Qualx pipeline API (#1645)
Core
- Fix Bootstrap to avoid carrying forward CPU run memory configs when insufficient for GPU runs (#1663)
- AutoTuner: Fix incorrect memoryOverheadFactor recommendation for YARN and k8s and add default master in unit tests (#1659)
- Improve unit handling for memory-related configs in AutoTuner (#1652)
- Improve memory allocations in ProfileResults classes (#1642)
- Fix filtered eventlogs in Profiling tool (#1639)
v25.02.2
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/25.02.2/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/25.02.2/
Changes
User Tools
- Disable assertions in core-tools (#1635)
v25.02.1
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/25.02.1/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/25.02.1/
Changes
User Tools
- Remove Compare/Combined modes from Profiling tool (#1619)
- Disable dropping sqlIDs with failures during qualx prediction (#1615)
- Qualx model updates from weekly KPI run 2025-03-30 (#1604)
- Fix unused global warnings from new flake8 7.2.0 (#1609)
- Reduce qualx logging noise (#1603)
v25.02.0
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/25.02.0/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/25.02.0/
Changes
User Tools
- Revert "Update version by jenkins-spark-rapids-tools-auto-release-111" (#1600)
- Qualx unit tests (#1599)
- Qualx model updates from weekly KPI run 2025-03-09 (#1584)
- Add new qualx features for custom model training (#1573)
- Qualx model updates from weekly KPI run 2025-02-23 (#1559)
Core
- Revert "[FEA] Add filtered diagnostic output for GPU slowness in Profiler tool (#1548)" (#1602)
- Revert "Update version by jenkins-spark-rapids-tools-auto-release-111" (#1600)
- [FEA] Add filtered diagnostic output for GPU slowness in Profiler tool (#1548)
- AutoTuner recommends increasing shuffle partitions when shuffle stages have OOM failures on YARN (#1593)
- Add insertIntoHiveTable to the WriteOps report (#1591)
- Fix AutoTuner unit test with dynamic plugin JAR URL value (#1592)
- Adjust maxPartitionBytes if the table scan stage had OOM task failures (#1578)
- Add aggregation across metrics for failed/succeeded and non-completed stages (#1558)
- Support Bin and Slice expressions in Qual tool (#1581)
- Add kryo related settings to the AutoTuner's Bootstrap conf (#1574)
- Add sqlID column to failed_jobs.csv (#1567)
- Pretty print FileFormat in the write_operations file (#1562)
v24.12.4
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.12.4/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.12.4/
Changes
User Tools
- Add comment for each expected_raw_feature indicating CSV source (#1547)
Core
- Calculate task metric aggregates on-the-fly to reduce memory usage (#1543)
- Generate a detailed report for the write ops (#1544)
Miscellaneous
- Disable dataproc enhanced optimizer configs (#1554)
v24.12.3
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.12.3/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.12.3/
Changes
User Tools
- Add support for configurable qualx label column (#1528)
- Merge Distributed Qualification Tools CLI (#1516)
Core
- AutoTuner/Bootstrapper should recommend Dataproc Spark performance enhancements (#1539)
- Disable Per-SQL summary text output (#1530)
- Use a stub to store Spark StageInfo (#1525)
- Recommend G6 instead of G5 on EMR (#1523)
- Generate a separate file to list bootstrap properties (#1517)
Miscellaneous
- Disable diagnostics pytests (#1532)
v24.12.2
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.12.2/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.12.2/
Changes
User Tools
- Revert "Follow Up: Make '--platform' argument mandatory in CLI (#1473)" (#1498)
- Fix Hadoop JAR Download Timeouts in Behave Tests (#1503)
Core
- AutoTuner: Set recommendation for spark.task.resource.gpu.amount to a very low value (#1514)
- [FEA] Add IO diagnostic output for GPU slowness in Profiler tool (#1451)
- [BUG] Qual tool should convert time units at stage/job/sql level (#1511)
- Fix string comparison for memory overhead in pinned pool size recommendation in AutoTuner (#1508)
- Update core tools rules to allow cross-build between 2.12 and 2.13 (#1510)
- Sync plugin support as of 2024-12-31 (#1478)
- Add stringType and binaryType to the dataType map (#1506)
- [BUG] Remove duplicated executor CPU time and runtime metric from SQLTaskAggMetricsProfileResult (#1504)
- Improve AutoTuner cluster configuration recommendations for GPU runs (#1501)
Miscellaneous
- Use common add-to-project action [skip ci] (#1505)
v24.12.1
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.12.1/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.12.1/
Changes
User Tools
- Add compute_precision_recall utility function (#1500)
- Fix additional FutureWarning issues (#1499)
- Qualx model updates from weekly KPI run 2025-01-10 (#1495)
- Fix future warnings for pandas>=2.2 (#1494)
- Pin scikit-learn dependency for shap (#1491)
- Make spill heuristic 1 TB by default (#1488)
- Support Python 3.9-3.12 (#1486)
- Update models for latest code/datasets (#1485)
Core
- Improve scalastyle rules to detect spaces (#1493)
- Improve shuffle manager recommendation in AutoTuner with version validation (#1483)
- Support group-limit optimization for ROW_NUMBER in Qualification (#1487)
- Bump minimum Spark version to 3.2.0 and improve AutoTuner unit tests for multiple Spark versions (#1482)
- Fix inconsistent shuffle write time sum results in Profiler output (#1450)
- Refine Qualification AutoTuner recommendations for shuffle partitions for CPU event logs (#1479)
- Split AutoTuner for Profiling and Qualification and Override BATCH_SIZE_BYTES (#1471)
v24.12.0
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.12.0/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.12.0/
Changes
User Tools
- Make '--platform' argument mandatory in qualification and profiling CLI to prevent incorrect behavior (#1463)
Core
- Skip processing apps with invalid platform and spark runtime configurations (#1421)
- Improve implementation of finding median in StatisticsMetrics (#1474)
- Optimize implementation of getAggregateRawMetrics in core-tools (#1468)
- Add Spark 3.5.2 support in AutoTuner for EMR (#1466)
- Mark RunningWindowFunction as supported in Qual tool (#1465)
- Deduplicate calls to aggregateSparkMetricsBySql (#1464)
Miscellaneous
- Follow Up: Make '--platform' argument mandatory in CLI (#1473)
v24.10.3
Packages
- Maven Release: https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-tools_2.12/24.10.3/
- PyPI Package: https://pypi.org/project/spark-rapids-user-tools/24.10.3/
Changes
User Tools
- Fix dataframe handling of column-types (#1458)