This repository was archived by the owner on Mar 3, 2023. It is now read-only.

Add support for Pod Templates which are read from a ConfigMap. The ConfigMap name can be passed in as a config-property #3707

Closed
nicknezis opened this issue Aug 30, 2021 · 4 comments · Fixed by #3710 or #3752

Comments

nicknezis commented Aug 30, 2021

This feature request is to enhance the Kubernetes Scheduler to add support for Pod Templates (similar to Spark's Kubernetes feature)

By providing a Pod Template as a ConfigMap, the Heron Kubernetes Scheduler can retrieve the template and provide it as a base starting point for Pod definition.
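For illustration, the ConfigMap carrying such a pod template might look like this (all names here are hypothetical):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: heron-pod-template   # hypothetical name passed in via the config property
data:
  pod-template.yaml: |
    metadata:
      labels:
        app: heron
    spec:
      containers:
        - name: heron-node
          image: apache/heron:latest   # hypothetical image
```

The scheduler would then read the `pod-template.yaml` entry and use the embedded template as the base pod definition.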

surahman commented Aug 30, 2021

Hi @nicknezis, I have started poking around in the code base for Heron again and I am trying to wrap my head around what needs to happen here. Again, I do not have any experience in the codebase.

By providing a Pod Template as a ConfigMap, the Heron Kubernetes Scheduler can retrieve the template and provide it as a base starting point for Pod definition.

This appears to be setting up the Config on the Scheduler:

public void initialize(Config config, Config runtime) {
  // validate the topology name before moving forward
  if (!topologyNameIsValid(Runtime.topologyName(runtime))) {
    throw new RuntimeException(getInvalidTopologyNameMessage(Runtime.topologyName(runtime)));
  }
  // validate that the image pull policy has been set correctly
  if (!imagePullPolicyIsValid(KubernetesContext.getKubernetesImagePullPolicy(config))) {
    throw new RuntimeException(
        getInvalidImagePullPolicyMessage(KubernetesContext.getKubernetesImagePullPolicy(config))
    );
  }
  final Config.Builder builder = Config.newBuilder()
      .putAll(config);
  if (config.containsKey(Key.TOPOLOGY_BINARY_FILE)) {
    builder.put(Key.TOPOLOGY_BINARY_FILE,
        FileUtils.getBaseName(Context.topologyBinaryFile(config)));
  }
  this.configuration = builder.build();
  this.runtimeConfiguration = runtime;
  this.controller = getController();
  this.updateTopologyManager =
      new UpdateTopologyManager(configuration, runtimeConfiguration,
          Optional.<IScalable>of(this));
}

The following code block appears to be creating the YAML config based on the required record structures:
https://github.com/apache/spark/blob/de59e01aa4853ef951da080c0d1908d53d133ebe/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/PodTemplateConfigMapStep.scala#L39-L66

I am trying to figure out what my starting point for this should be on the Heron side, any ideas? What functionality is available as far as building the YAML record structures (Config.Builder)?

The following utility builds a map of per-component <key, value> pairs parsed out of the topologyConfig:

protected static Map<String, String> getComponentConfigMap(TopologyAPI.Topology topology,
    String key) throws RuntimeException {
  List<TopologyAPI.Config.KeyValue> topologyConfig = topology.getTopologyConfig().getKvsList();
  Map<String, String> configMap = new HashMap<>();
  // Get the set of component names to make sure the config only specifies valid component names
  Set<String> componentNames = getComponentParallelism(topology).keySet();
  // Parse the config value
  String mapStr = getConfigWithDefault(topologyConfig, key, (String) null);
  if (mapStr != null) {
    String[] mapTokens = mapStr.split(",");
    // Each token should be in this format: component:value
    for (String token : mapTokens) {
      if (token.trim().isEmpty()) {
        continue;
      }
      String[] componentAndValue = token.split(":");
      if (componentAndValue.length != 2) {
        throw new RuntimeException("Malformed component config " + key);
      }
      if (!componentNames.contains(componentAndValue[0])) {
        throw new RuntimeException("Invalid component. " + componentAndValue[0] + " not found");
      }
      configMap.put(componentAndValue[0], componentAndValue[1]);
    }
  }
  return configMap;
}
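For context, the value this routine parses is a comma-separated list of `component:value` tokens, e.g. `"spout:1,bolt:2"`. A minimal standalone sketch of the same parsing logic (hypothetical class name, and without the component-name validation that needs the topology object):

```java
import java.util.HashMap;
import java.util.Map;

public class ComponentConfigParser {
  // Parses a comma-separated list of "component:value" tokens,
  // e.g. "spout:1,bolt:2", into a component -> value map.
  public static Map<String, String> parse(String mapStr) {
    Map<String, String> configMap = new HashMap<>();
    for (String token : mapStr.split(",")) {
      if (token.trim().isEmpty()) {
        continue;  // skip empty tokens such as trailing commas
      }
      String[] componentAndValue = token.split(":");
      if (componentAndValue.length != 2) {
        throw new RuntimeException("Malformed component config " + token);
      }
      configMap.put(componentAndValue[0], componentAndValue[1]);
    }
    return configMap;
  }
}
```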

surahman commented Sep 2, 2021

A quick review of the code in the heron/schedulers/src/java/org/apache/heron/scheduler/kubernetes/ directory, and some other related code:

  • KubernetesConstants: namespaced constants for configurations.
  • KubernetesContext: child class of Context with Kubernetes specific keys used to lookup values (no "setters", only "getters") stored in the supplied Config object.
  • KubernetesController: the basic interface with Kubernetes to recover info about the topology configs as well as submit/kill/restart them.
  • KubernetesLauncher: used to submit a topology to the Kubernetes cluster.
  • KubernetesScheduler: the interface to manage the cluster, get info, kill, restart, resize, and update the cluster size based on the topology. It contains a KubernetesController object which facilitates the comms with the cluster.
  • KubernetesUtils: various logging utilities.
  • V1Controller: child class of the KubernetesController which contains the actual logic to interface with the Kubernetes cluster. The PackingPlan for the instances is central to operations with configurations stored in a ContainerPlan object.
  • Volumes: utilities used to configure the cluster's storage volumes.
  • PackingPlan: heron/spi/src/java/org/apache/heron/spi/packing/ has the classes for PackingPlan, InstancePlan and ContainerPlan.

I have very quickly compiled a basic Pod Config:

# POD CONFIG WITH TEMPLATE:
apiVersion: apps/v1
kind: Deployment  # replicas/selector/template are Deployment-level fields, not Pod fields
metadata:
  name: heron-node
spec:
  replicas: 2
  selector:
    matchLabels:
      app: heron-node
  template:
    metadata:
      labels:
        app: heron-node
    spec:
      containers:
      - name: heron-node
        image: centos:8
        command: ['sh', '-c', 'echo "Heron Kubernetes node." && sleep 3600']
        resources:
          requests:
            memory: "1Gi"  # Kubernetes expects Mi/Gi suffixes, not "Gb"
            cpu: "250m"
          limits:
            memory: "5Gi"
            cpu: "500m"
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add: ["NET_ADMIN", "SYS_TIME"]
      restartPolicy: Always  # Deployment pods only permit restartPolicy: Always

The code snippet from Spark is appending the following to the pod config, which creates a Volume mount pointing to the pod config template:

  volumes:
    - name: pod-template-name  # from <POD_TEMPLATE_VOLUME>.
      configMap:
        name: configmap-name  # from <configmapName>.
        items:
        - key: pod-template-key  # from <POD_TEMPLATE_KEY>.
          path: executor-pod-spec-template-file-name # from <EXECUTOR_POD_SPEC_TEMPLATE_FILE_NAME>.

This code snippet mounts that volume into the container at the template path (a volumeMounts entry on the container, rather than a second volumes entry):

  volumeMounts:
    - name: pod-template-name  # from <POD_TEMPLATE_VOLUME>.
      mountPath: executor-pod-spec-template-mount-path # from <EXECUTOR_POD_SPEC_TEMPLATE_MOUNTPATH>.
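Put together, following Spark's convention, the pod spec carries both the volume and its mount (all names are placeholders from the snippets above):

```yaml
spec:
  containers:
    - name: executor
      volumeMounts:
        - name: pod-template-name                       # from <POD_TEMPLATE_VOLUME>.
          mountPath: executor-pod-spec-template-mount-path
  volumes:
    - name: pod-template-name
      configMap:
        name: configmap-name                            # from <configmapName>.
        items:
          - key: pod-template-key                       # from <POD_TEMPLATE_KEY>.
            path: executor-pod-spec-template-file-name
```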

The constants (caps naming convention) are defined in Spark's Kubernetes config/constants sources. If we are to follow what Spark is doing, we will need to declare equivalent constants as conventions and document them. The most prudent place for them is KubernetesConstants, as they will not be required outside the scheduler. The configmapName is likewise looked up from the Spark config.

The remaining two routines in the Spark code appear to be extracting information from the configs - they are not modifying them.

The question now is what functionality exists within Heron to modify the Config object, i.e. to append the YAML elements to the Config, and where within the Config the actual pod/container configs are stored. Another question is whether both container and pod configs are supported, and what their structure within the Config object is. Once we have an understanding of these things I can hammer out a solution.

I have never used Heron and am learning about the workflow and setup as I go, so please bear with me. I am also still getting familiar with the vast codebase.

@surahman
Copy link
Member

surahman commented Sep 8, 2021

Hi @nicknezis, I have a plan put together 🤔 and will start to hammer things out. I shall generate a WIP PR into the Heron repo once I have some of the more substantial components put together.
