Skip to content

Remove Compare/Combined modes from Profiling tool #1619

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 9, 2025

Conversation

amahussein
Copy link
Collaborator

@amahussein amahussein commented Apr 8, 2025

Fixes #1582

Removes the two modes (compare/combine). This yields to removal of AppIndex column from all the output files.

This pull request includes several changes to the core module, primarily focusing on removing redundant parameters and simplifying the codebase. The most significant changes involve the removal of the appIndex parameter from various methods and classes.

  • The CSV files containing Spark/Rapids properties, the column appIndex_1 was renamed to propertyValue.
  • For all remaining CSV files, appIndex column was removed.
  • I filed a new issue Remove unused code of ViewableTrait #1617 to remove classes that were introduced to handle generating views for Apps along with their index.
  • I filed a new issue on the internal-docs to remove the combined/compare modes from the Profiling cmd.

Codebase simplification:

Trait adjustments:

  • Simplified the ProfAppIndexMapperTrait by extending QualAppIndexMapperTrait instead of implementing its own method.

These changes collectively aim to streamline the code and remove unnecessary complexity.

Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

Fixes NVIDIA#1582

Cleanup processApp to use better scope of variables

Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

createAppAndProcess should not take a Seq

Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

Fix qualX code removing appIndex

Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
@amahussein amahussein added feature request New feature or request usability track issues related to the Tools's user experience API change A change affecting the output (add/remove/rename files, add/remove/rename columns) labels Apr 8, 2025
@amahussein amahussein requested a review from Copilot April 8, 2025 03:23
@amahussein amahussein self-assigned this Apr 8, 2025
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 40 out of 60 changed files in this pull request and generated no comments.

Files not reviewed (20)
  • core/src/main/scala/com/nvidia/spark/rapids/tool/EventLogPathProcessor.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppIndexMapperTrait.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSQLPlanAnalyzer.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSparkMetricsAggTrait.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/analysis/AppSparkMetricsAnalyzer.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/CompareApplications.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/CompareSummaryInfo.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/GenerateDot.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/HealthCheck.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileArgs.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileClassWarehouse.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileMain.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileOutputWriter.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/Profiler.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/qualification/QualSQLPlanAnalyzer.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/qualification/Qualification.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/tuning/TunerContext.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/views/AggMetricsResultSorter.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/views/AppStageMetricsView.scala: Language not supported
  • core/src/main/scala/com/nvidia/spark/rapids/tool/views/DataSourceView.scala: Language not supported

@github-actions github-actions bot added user_tools Scope the wrapper module running CSP, QualX, and reports (python) core_tools Scope the core module (scala) labels Apr 8, 2025
@amahussein
Copy link
Collaborator Author

@leewyang can you please take to the changes in Python?

@wjxiz1992 FYI, this PR removes appIndex column from the CSV files.

Copy link
Collaborator

@sayedbilalbari sayedbilalbari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !

@@ -1,3 +0,0 @@
appIndex,jobId,numTasks,Duration,diskBytesSpilled_sum,duration_sum,duration_max,duration_min,duration_avg,executorCPUTime_sum,executorDeserializeCPUTime_sum,executorDeserializeTime_sum,executorRunTime_sum,input_bytesRead_sum,input_recordsRead_sum,jvmGCTime_sum,memoryBytesSpilled_sum,output_bytesWritten_sum,output_recordsWritten_sum,peakExecutionMemory_max,resultSerializationTime_sum,resultSize_max,sr_fetchWaitTime_sum,sr_localBlocksFetched_sum,sr_localBytesRead_sum,sr_remoteBlocksFetched_sum,sr_remoteBytesRead_sum,sr_remoteBytesReadToDisk_sum,sr_totalBytesRead_sum,sw_bytesWritten_sum,sw_recordsWritten_sum,sw_writeTime_sum
1,0,213,2569,0,26735,1598,10,125.5,6608,3531,12095,13414,0,0,336,0,0,0,0,8,8075,0,2600,80279908,0,0,0,80279908,80279908,2600,1001
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why has this test file been removed ? I am assuming this has to do with the removed compare mode test

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removed files used to represent the combined view of multiple eventlogs.
Since those unit tests are removed there is no need to keep those expected CSV files

Copy link
Collaborator

@leewyang leewyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python edits LGTM

@amahussein amahussein merged commit 17335fd into NVIDIA:dev Apr 9, 2025
15 checks passed
@amahussein amahussein deleted the rapids-tools-1582 branch April 9, 2025 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change A change affecting the output (add/remove/rename files, add/remove/rename columns) core_tools Scope the core module (scala) feature request New feature or request usability track issues related to the Tools's user experience user_tools Scope the wrapper module running CSP, QualX, and reports (python)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Remove Compare/Combined modes from the Profiling tool
3 participants