[New Scheduler] Run scheduler #5194
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -148,6 +148,58 @@ ansible-playbook -i environments/$ENVIRONMENT prereq.yml | |
|
||
**Hint:** During playbook execution the `TASK [prereq : check for pip]` can show as failed. This is normal if no pip is installed. The playbook will then move on and install pip on the target machines. | ||
|
||
### [Optional] Enable the new scheduler | ||
|
||
You can optionally enable the new OpenWhisk scheduler. | ||
Enabling it deploys one additional component, the "scheduler", along with etcd. | ||
|
||
#### Configure service providers for the scheduler | ||
You can update service providers for the scheduler as follows. | ||
|
||
**common/scala/src/main/resources** | ||
``` | ||
whisk.spi { | ||
Review comment: shouldn't a provider for the scheduler db be required here? i.e.
Reply: Do we have the |
||
ArtifactStoreProvider = org.apache.openwhisk.core.database.CouchDbStoreProvider | ||
Review comment: MongoDB is technically supported for this now too, right? It's just a documentation example, so not a big deal.
Reply: Yes, I think so; there is no specific dependency on CouchDB. |
||
ActivationStoreProvider = org.apache.openwhisk.core.database.ArtifactActivationStoreProvider | ||
MessagingProvider = org.apache.openwhisk.connector.kafka.KafkaMessagingProvider | ||
ContainerFactoryProvider = org.apache.openwhisk.core.containerpool.docker.DockerContainerFactoryProvider | ||
LogStoreProvider = org.apache.openwhisk.core.containerpool.logging.DockerToActivationLogStoreProvider | ||
LoadBalancerProvider = org.apache.openwhisk.core.loadBalancer.FPCPoolBalancer | ||
EntitlementSpiProvider = org.apache.openwhisk.core.entitlement.FPCEntitlementProvider | ||
AuthenticationDirectiveProvider = org.apache.openwhisk.core.controller.BasicAuthenticationDirective | ||
InvokerProvider = org.apache.openwhisk.core.invoker.FPCInvokerReactive | ||
InvokerServerProvider = org.apache.openwhisk.core.invoker.FPCInvokerServer | ||
DurationCheckerProvider = org.apache.openwhisk.core.scheduler.queue.ElasticSearchDurationCheckerProvider | ||
} | ||
. | ||
. | ||
. | ||
``` | ||
|
||
#### Enable the scheduler | ||
- Make sure you enable the scheduler by configuring `scheduler_enable`. | ||
|
||
**ansible/environments/local/group_vars** | ||
```yaml | ||
scheduler_enable: true | ||
``` | ||
|
||
#### [Optional] Enable ElasticSearch Activation Store | ||
When you use the new scheduler, it is recommended to use ElasticSearch as an activation store. | ||
|
||
**ansible/environments/local/group_vars** | ||
```yaml | ||
db_activation_backend: ElasticSearch | ||
elastic_cluster_name: <your elasticsearch cluster name> | ||
elastic_protocol: <your elasticsearch protocol> | ||
elastic_index_pattern: <your elasticsearch index pattern> | ||
elastic_base_volume: <your elasticsearch volume directory> | ||
elastic_username: <your elasticsearch username> | ||
elastic_password: <your elasticsearch password> | ||
Review comment: Can we use an external ElasticSearch?
Reply: Yes. |
||
``` | ||
|
||
You can also refer to this guide to [deploy OpenWhisk using ElasticSearch](https://github.com/apache/openwhisk/blob/master/ansible/README.md#using-elasticsearch-to-store-activations). | ||
|
||
### Deploying Using CouchDB | ||
- Make sure your `db_local.ini` file is [setup for](#setup) CouchDB then execute: | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -127,6 +127,8 @@ jmx: | |
rmiBasePortController: 16000 | ||
basePortInvoker: 17000 | ||
rmiBasePortInvoker: 18000 | ||
basePortScheduler: 21000 | ||
rmiBasePortScheduler: 22000 | ||
user: "{{ jmxuser | default('jmxuser') }}" | ||
pass: "{{ jmxuser | default('jmxpass') }}" | ||
jvmCommonArgs: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.password.file=/home/owuser/jmxremote.password -Dcom.sun.management.jmxremote.access.file=/home/owuser/jmxremote.access" | ||
|
@@ -221,6 +223,8 @@ invoker: | |
keystore: | ||
password: "{{ invoker_keystore_password | default('openwhisk') }}" | ||
name: "{{ __invoker_ssl_keyPrefix }}openwhisk-keystore.p12" | ||
container: | ||
creationMaxPeek: "{{ container_creation_max_peek | default(500) }}" | ||
reactiveSpi: "{{ invokerReactive_spi | default('') }}" | ||
serverSpi: "{{ invokerServer_spi | default('') }}" | ||
|
||
|
@@ -278,6 +282,9 @@ db: | |
invoker: | ||
user: "{{ db_invoker_user | default(lookup('ini', 'db_username section=invoker file={{ playbook_dir }}/db_local.ini')) }}" | ||
pass: "{{ db_invoker_pass | default(lookup('ini', 'db_password section=invoker file={{ playbook_dir }}/db_local.ini')) }}" | ||
scheduler: | ||
user: "{{ db_scheduler_user | default(lookup('ini', 'db_username section=scheduler file={{ playbook_dir }}/db_local.ini')) }}" | ||
pass: "{{ db_scheduler_pass | default(lookup('ini', 'db_password section=scheduler file={{ playbook_dir }}/db_local.ini')) }}" | ||
artifact_store: | ||
backend: "{{ db_artifact_backend | default('CouchDB') }}" | ||
activation_store: | ||
|
@@ -435,8 +442,9 @@ metrics: | |
|
||
user_events: "{{ user_events_enabled | default(false) | lower }}" | ||
|
||
durationChecker: | ||
timeWindow: "{{ duration_checker_time_window | default('1 d') }}" | ||
zeroDowntimeDeployment: | ||
enabled: "{{ zerodowntime_deployment_switch | default(true) }}" | ||
solution: "{{ zerodowntime_deployment_solution | default('apicall') }}" | ||
Review comment: It seems Another deployment solution is
Reply: Currently, this configuration does not take effect due to some missing parts.
Reply: Got it. |
||
|
||
etcd: | ||
version: "{{ etcd_version | default('v3.4.0') }}" | ||
|
@@ -463,13 +471,63 @@ etcd_connect_string: "{% set ret = [] %}\ | |
{% endfor %}\ | ||
{{ ret | join(',') }}" | ||
|
||
|
||
__scheduler_blackbox_fraction: 0.10 | ||
|
||
watcher: | ||
eventNotificationDelayMs: "{{ watcher_notification_delay | default('5000 ms') }}" | ||
|
||
durationChecker: | ||
timeWindow: "{{ duration_checker_time_window | default('1 d') }}" | ||
|
||
enable_scheduler: "{{ scheduler_enable | default(false) }}" | ||
|
||
scheduler: | ||
protocol: "{{ scheduler_protocol | default('http') }}" | ||
dir: | ||
become: "{{ scheduler_dir_become | default(false) }}" | ||
confdir: "{{ config_root_dir }}/scheduler" | ||
basePort: 14001 | ||
grpc: | ||
basePort: 13001 | ||
tls: "{{ scheduler_grpc_tls | default(false) }}" | ||
maxPeek: "{{ scheduler_max_peek | default(128) }}" | ||
heap: "{{ scheduler_heap | default('2g') }}" | ||
arguments: "{{ scheduler_arguments | default('') }}" | ||
instances: "{{ groups['schedulers'] | length }}" | ||
username: "{{ scheduler_username | default('scheduler.user') }}" | ||
password: "{{ scheduler_password | default('scheduler.pass') }}" | ||
akka: | ||
provider: cluster | ||
cluster: | ||
basePort: 25520 | ||
host: "{{ groups['schedulers'] | map('extract', hostvars, 'ansible_host') | list }}" | ||
bindPort: 3551 | ||
# at this moment all schedulers are seed nodes | ||
seedNodes: "{{ groups['schedulers'] | map('extract', hostvars, 'ansible_host') | list }}" | ||
loglevel: "{{ scheduler_loglevel | default(whisk_loglevel) | default('INFO') }}" | ||
extraEnv: "{{ scheduler_extraEnv | default({}) }}" | ||
dataManagementService: | ||
retryInterval: "{{ scheduler_dataManagementService_retryInterval | default('1 second') }}" | ||
inProgressJobRetentionSecond: "{{ scheduler_inProgressJobRetentionSecond | default('20 seconds') }}" | ||
managedFraction: "{{ scheduler_managed_fraction | default(1.0 - (scheduler_blackbox_fraction | default(__scheduler_blackbox_fraction))) }}" | ||
blackboxFraction: "{{ scheduler_blackbox_fraction | default(__scheduler_blackbox_fraction) }}" | ||
queueManager: | ||
maxSchedulingTime: "{{ scheduler_maxSchedulingTime | default('20 second') }}" | ||
maxRetriesToGetQueue: "{{ scheduler_maxRetriesToGetQueue | default(13) }}" | ||
queue: | ||
# Timeout for the Running state: if no activation arrives while the queue is Running, the queue changes from Running to Idle and the decision algorithm actor is deleted | ||
idleGrace: "{{ scheduler_queue_idleGrace | default('20 seconds') }}" | ||
Review comment: I thought you said the default was to remove an idle queue after 24 hours?
Reply: I think 24 hours does not fit all cases. I am OK with changing the default value, but I believe each downstream will choose its own timeout rather than rely on the default. The default configuration here means a queue becomes idle after 20 seconds and is terminated after another 20 seconds. |
||
# Timeout for the Idle state: if no activation arrives while the queue is Idle, the queue changes from Idle to Removed | ||
stopGrace: "{{ scheduler_queue_stopGrace | default('20 seconds') }}" | ||
# Timeout for the Paused state: if no activation arrives while the queue is Paused, the queue changes from Paused to Removed | ||
flushGrace: "{{ scheduler_queue_flushGrace | default('60 seconds') }}" | ||
gracefulShutdownTimeout: "{{ scheduler_queue_gracefulShutdownTimeout | default('5 seconds') }}" | ||
maxRetentionSize: "{{ scheduler_queue_maxRetentionSize | default(10000) }}" | ||
maxRetentionMs: "{{ scheduler_queue_maxRetentionMs | default(60000) }}" | ||
maxBlackboxRetentionMs: "{{ scheduler_queue_maxBlackboxRetentionMs | default(300000) }}" | ||
throttlingFraction: "{{ scheduler_queue_throttlingFraction | default(0.9) }}" | ||
durationBufferSize: "{{ scheduler_queue_durationBufferSize | default(10) }}" | ||
deployment_ignore_error: "{{ scheduler_deployment_ignore_error | default('False') }}" | ||
dataManagementService: | ||
retryInterval: "{{ scheduler_dataManagementService_retryInterval | default('1 second') }}" |
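
Two defaults in the hunk above are easier to follow with a worked example: the `managedFraction` / `blackboxFraction` pair, and the `idleGrace` / `stopGrace` timeline described in the review reply. The following is a minimal Python sketch, not the scheduler's implementation. The function names are hypothetical, `None` stands in for an undefined Ansible variable, and the exact boundary behavior (strictly less than versus less than or equal) is an assumption.

```python
from typing import Optional, Tuple

# Mirrors __scheduler_blackbox_fraction in the hunk above.
DEFAULT_BLACKBOX_FRACTION = 0.10


def resolve_fractions(blackbox: Optional[float] = None,
                      managed: Optional[float] = None) -> Tuple[float, float]:
    """Return (managedFraction, blackboxFraction) as the Jinja defaults would.

    None models an Ansible variable that is not set (hypothetical convention).
    """
    effective_blackbox = DEFAULT_BLACKBOX_FRACTION if blackbox is None else blackbox
    # managedFraction falls back to 1.0 minus the effective blackbox fraction.
    effective_managed = (1.0 - effective_blackbox) if managed is None else managed
    return effective_managed, effective_blackbox


def queue_state(seconds_without_activation: float,
                idle_grace: float = 20.0,
                stop_grace: float = 20.0) -> str:
    """State of a queue that has received no activation for the given time.

    Assumes the queue starts in Running and that the boundaries are exclusive.
    """
    if seconds_without_activation < idle_grace:
        return "Running"
    if seconds_without_activation < idle_grace + stop_grace:
        return "Idle"
    return "Removed"


if __name__ == "__main__":
    print(resolve_fractions())               # defaults only
    print(resolve_fractions(blackbox=0.25))  # managed follows blackbox
    for t in (10, 30, 40):
        print(t, queue_state(t))
```

With the defaults, a queue that sees no traffic is removed roughly 40 seconds after its last activation, which matches the "idle after 20 seconds, terminated after another 20 seconds" description. Note that setting `scheduler_managed_fraction` explicitly bypasses the `1.0 - blackbox` fallback, so the two fractions are then not guaranteed to sum to 1.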
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
--- | ||
# Remove scheduler containers. | ||
|
||
- name: get scheduler name | ||
set_fact: | ||
scheduler_name: "{{ name_prefix ~ host_group.index(inventory_hostname) }}" | ||
|
||
- name: remove scheduler | ||
docker_container: | ||
name: "{{ scheduler_name }}" | ||
state: absent | ||
ignore_errors: "True" | ||
|
||
- name: remove scheduler log directory | ||
file: | ||
path: "{{ whisk_logs_dir }}/{{ scheduler_name }}" | ||
state: absent | ||
become: "{{ logs.dir.become }}" | ||
|
||
- name: remove scheduler conf directory | ||
file: | ||
path: "{{ scheduler.confdir }}/{{ scheduler_name }}" | ||
state: absent | ||
become: "{{ scheduler.dir.become }}" |
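
The `get scheduler name` task above builds each container name by concatenating `name_prefix` with the host's position in `host_group`. A minimal Python sketch of that expression follows. The prefix `scheduler` and the host addresses are hypothetical values chosen for illustration, not taken from the PR.

```python
from typing import List


def scheduler_name(name_prefix: str, host_group: List[str], inventory_hostname: str) -> str:
    """Python equivalent of the Jinja expression
    `name_prefix ~ host_group.index(inventory_hostname)`.
    """
    # list.index raises ValueError if the host is not a member of the group.
    return f"{name_prefix}{host_group.index(inventory_hostname)}"


if __name__ == "__main__":
    hosts = ["10.0.0.5", "10.0.0.6"]  # hypothetical 'schedulers' inventory group
    for host in hosts:
        print(host, "->", scheduler_name("scheduler", hosts, host))
```

Because the index comes from the host's position in the group, reordering hosts in the inventory changes which name each host maps to, and therefore which container and directories the clean-up tasks target.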
Review comment: Will there be any guide on setting up etcd?
Reply: If you enable the scheduler by configuring `enable_scheduler: true`, etcd will be deployed automatically by this: https://github.com/apache/openwhisk/pull/5194/files#diff-2356bb62c87e471ef37b7973eb51e82282ef1131ee7ab4b62d909102de96967cR23