Description
Describe the bug
The StageModel references a StageInfo field to get the details of the stage.
The problem with this design is that it keeps a deep-level pointer to data that the core tools do not currently need.
@DeveloperApi
class StageInfo(
    val stageId: Int,
    private val attemptId: Int,
    val name: String,
    val numTasks: Int,
    val rddInfos: Seq[RDDInfo],
    val parentIds: Seq[Int],
    val details: String,
    val taskMetrics: TaskMetrics = null,
    private[spark] val taskLocalityPreferences: Seq[Seq[TaskLocation]] = Seq.empty,
    private[spark] val shuffleDepId: Option[Int] = None,
    val resourceProfileId: Int,
    private[spark] var isPushBasedShuffleEnabled: Boolean = false,
    private[spark] var shuffleMergerCount: Int = 0) {
  /** When this stage was submitted from the DAGScheduler to a TaskScheduler. */
  var submissionTime: Option[Long] = None
  /** Time when the stage completed or when the stage was cancelled. */
  var completionTime: Option[Long] = None
  /** If the stage failed, the reason why. */
  var failureReason: Option[String] = None
  /**
   * Terminal values of accumulables updated during this stage, including all the user-defined
   * accumulators.
   */
  val accumulables = HashMap[Long, AccumulableInfo]()
Ideally, we should have a stub class that copies only the fields we need.
We did that before in #1206, but had to roll it back in #1260 for compatibility with various Spark implementations.
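
To make the idea concrete, here is a rough sketch of what such a stub could look like. It is not the code from #1206. `FullStageInfo` is a hypothetical stand-in for Spark's `StageInfo` so that the snippet compiles without Spark on the classpath, and `StageInfoStub`, its `from` factory, and the particular set of copied fields are assumptions chosen for illustration only.

```scala
// Hypothetical stand-in for org.apache.spark.scheduler.StageInfo.
// Only a subset of the fields quoted above is reproduced here.
final class FullStageInfo(
    val stageId: Int,
    val name: String,
    val numTasks: Int,
    val details: String) {
  var submissionTime: Option[Long] = None
  var completionTime: Option[Long] = None
  var failureReason: Option[String] = None
}

// Hypothetical stub: an immutable snapshot of the fields the tools are
// assumed to need. The real list of required fields may differ.
final case class StageInfoStub(
    stageId: Int,
    name: String,
    numTasks: Int,
    details: String,
    submissionTime: Option[Long],
    completionTime: Option[Long],
    failureReason: Option[String]) {
  // Elapsed time in milliseconds, defined only when both timestamps are set.
  def duration: Option[Long] =
    for (start <- submissionTime; end <- completionTime) yield end - start
}

object StageInfoStub {
  // Copies values out of the source object and keeps no reference to it,
  // so the heavier object can be garbage collected afterwards.
  def from(info: FullStageInfo): StageInfoStub =
    StageInfoStub(
      stageId = info.stageId,
      name = info.name,
      numTasks = info.numTasks,
      details = info.details,
      submissionTime = info.submissionTime,
      completionTime = info.completionTime,
      failureReason = info.failureReason)
}
```

With something like this, StageModel would hold a `StageInfoStub` instead of the full object. The compatibility problem that forced the rollback in #1260 would still need to be handled separately, for example by isolating the field access behind a per-platform shim.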