Async jobs add endtime #15

ernjvr · 2018-06-21T10:22:37Z

Description

Overview:
CloudStack executes certain API calls asynchronously when they take a long time to complete. Such a call immediately returns the id of the job that is responsible for executing the command. This job id can be used to query the status of the job through the queryAsyncJobResult API call.
Among other fields, the result of this API query returns the 'created' field, which is the timestamp of when the asynchronous job started. There is currently no mechanism that captures or persists the time at which the job finished. As a result, the queryAsyncJobResult API does not return an 'end_time' field.
QueryAsyncJobResult API changes:
The requirement outlined here is for CloudStack to capture the timestamp at which an asynchronous job finishes and to store it in the existing 'removed' column of the async_job table. Note that the 'removed' field is not currently used anywhere else in the CloudStack code, and the 'removed' database column is not currently populated by any other process. A new response tag called 'end_time' should be added to the queryAsyncJobResult API. When a queryAsyncJobResult API request is made, the value of the 'removed' database column should be mapped to this 'end_time' response tag. The response tag should be called 'end_time' rather than 'removed' because that name is more descriptive to an API user.
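The column-to-tag mapping described above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `AsyncJobRow` and `AsyncJobResponseSketch` are hypothetical stand-ins, and omitting 'end_time' while a job is still running is an assumption rather than something stated in the requirement.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for one row of the async_job table (not a CloudStack class).
final class AsyncJobRow {
    final String uuid;
    final Instant created;
    final Instant removed; // null while the job has not finished

    AsyncJobRow(String uuid, Instant created, Instant removed) {
        this.uuid = uuid;
        this.created = created;
        this.removed = removed;
    }
}

final class AsyncJobResponseSketch {
    // Builds the response fields, exposing the 'removed' column under the 'end_time' tag.
    static Map<String, Object> toResponse(AsyncJobRow row) {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("jobid", row.uuid);
        response.put("created", row.created);
        if (row.removed != null) {
            // Assumption: the tag is left out entirely until the job has an end time.
            response.put("end_time", row.removed);
        }
        return response;
    }
}
```

The database column keeps its existing name; only the name shown to the API user changes.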
Management server process changes:
When an asynchronous job completes, it is marked as complete by updating its finished status in the database. New functionality should be added to also update the 'removed' field with the timestamp of when the asynchronous job completed.
In addition, when a CloudStack management server is stopped and started again (gracefully or ungracefully), neither this management server nor the other running management servers know the true status of its asynchronous jobs while it was down. The statuses of those jobs are not tracked or updated in the database by any management server during this time, regardless of whether the jobs are still running, finished successfully or finished with an error. Currently, there is no mechanism to notify other running management servers that a specific management server is stopping, or to tell a specific running management server to start monitoring the running asynchronous jobs that belong to the management server being stopped. When a CloudStack management server starts up, it does not do a blanket delete of the asynchronous jobs it owns. Instead, it finds in the database all the asynchronous jobs it owns whose status is 'in_progress' and updates them to a 'failed' status. At the same time, it should now also mark them as 'removed' by updating the 'removed' field with the current timestamp.
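The two update paths described in this section can be illustrated with a small in-memory sketch. It is hedged throughout: `JobRecord`, `JobLifecycleSketch` and the `ownerMsid` field are hypothetical names introduced for illustration, and the real change operates on the async_job table rather than on a list of objects.

```java
import java.time.Instant;
import java.util.List;

// Hypothetical in-memory model of an async job record (not a CloudStack class).
final class JobRecord {
    enum Status { IN_PROGRESS, SUCCEEDED, FAILED }

    final long ownerMsid;               // assumed: id of the owning management server
    Status status = Status.IN_PROGRESS;
    Instant removed;                    // end time; null until the job finishes

    JobRecord(long ownerMsid) {
        this.ownerMsid = ownerMsid;
    }
}

final class JobLifecycleSketch {
    // Normal completion: record the final status and the end time together.
    static void complete(JobRecord job, JobRecord.Status finalStatus, Instant now) {
        job.status = finalStatus;
        job.removed = now;
    }

    // Startup cleanup: fail the jobs this server owns that were left in progress,
    // stamping 'removed' with the current time. Returns how many jobs were updated.
    static int failOrphanedJobs(List<JobRecord> jobs, long msid, Instant now) {
        int updated = 0;
        for (JobRecord job : jobs) {
            if (job.ownerMsid == msid && job.status == JobRecord.Status.IN_PROGRESS) {
                complete(job, JobRecord.Status.FAILED, now);
                updated++;
            }
        }
        return updated;
    }
}
```

Jobs owned by other management servers, and jobs that already finished, are left untouched, so an end time that was recorded earlier is never overwritten at startup.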
Garbage collection:
CloudStack also has a garbage collector that does database clean-up of asynchronous jobs, whereby it periodically deletes unfinished and completed job records from the async_job database table. It uses the configurable global setting 'job.cancel.threshold.minutes' to cancel jobs that are still in the queue. It also uses a configurable global setting 'job.expire.minutes' that allows a user to specify how long in minutes to keep asynchronous jobs that have not been processed yet, as well as completed asynchronous job records in the database before deleting them. Unfinished jobs that haven't been processed yet and that are older than this expiry time will be expired and deleted by the garbage collector.
Asynchronous jobs that have completed before the timeout threshold will also be deleted. When these jobs complete they are marked as complete by updating their finished status in the database. Currently, the garbage collector uses the 'created' database column to find completed asynchronous job records that are older than the 'job.expire.minutes' global setting's timestamp. It should no longer use the 'created' column to do this. It should use the 'removed' column instead.
Because the garbage collector deletes job records, any reporting on asynchronous jobs should be done before the garbage collector starts its clean-up task.
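The effect of switching the expiry check for completed jobs from 'created' to 'removed' can be shown with a short sketch. The names `FinishedJob` and `ExpirySketch` are hypothetical, and the filter only mirrors the idea of comparing 'removed' against a cutoff derived from 'job.expire.minutes'; it is not the garbage collector's actual query.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical view of a completed async_job record (not a CloudStack class).
final class FinishedJob {
    final String uuid;
    final Instant created;
    final Instant removed; // end time of the job

    FinishedJob(String uuid, Instant created, Instant removed) {
        this.uuid = uuid;
        this.created = created;
        this.removed = removed;
    }
}

final class ExpirySketch {
    // Selects completed jobs whose end time is older than the expiry window.
    static List<FinishedJob> expired(List<FinishedJob> jobs, Instant now, long expireMinutes) {
        Instant cutoff = now.minus(Duration.ofMinutes(expireMinutes));
        return jobs.stream()
                .filter(job -> job.removed != null && job.removed.isBefore(cutoff))
                .collect(Collectors.toList());
    }
}
```

With this comparison, a long-running job that finished only minutes ago is kept for the full expiry window, whereas a check against 'created' could delete it as soon as it completes.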

Types of changes

Breaking change (fix or feature that would cause existing functionality to change)
New feature (non-breaking change which adds functionality)
Bug fix (non-breaking change which fixes an issue)
Enhancement (improves an existing feature and functionality)
Cleanup (Code refactoring and cleanup, that may add test cases)

GitHub Issue/PRs

Screenshots (if appropriate):

How Has This Been Tested?

Start a virtual machine to initiate an asynchronous job. Query the cloud.async_job table to find the record that was created in response to the 'start virtual machine' job. When the job finishes, verify that the 'removed' column has been correctly updated with the end timestamp of the relevant record.
Using cloudmonkey, make an API call to queryAsyncJobResult using the uuid as the jobid and verify that the 'endtime' field returns the correct timestamp value of the relevant database record's 'removed' column.
Verify that the garbage collector deletes the correct expired asynchronous job records from the cloud.async_job table.
Hypervisor: KVM

Checklist:

I have read the CONTRIBUTING document.
My code follows the code style of this project.
My change requires a change to the documentation.
I have updated the documentation accordingly.
Testing
I have added tests to cover my changes.
All relevant new and existing integration tests have passed.
A full integration test suite with all tests that can run on my environment has passed.

rohityadavcloud · 2018-06-22T09:25:45Z

api/src/main/java/org/apache/cloudstack/api/response/AsyncJobResponse.java

@@ -75,6 +75,10 @@
    @Param(description = "  the created date of the job")
    private Date created;

+    @SerializedName(ApiConstants.END_TIME)
+    @Param(description = "  the removed date of the job")
+    private Date removed;


Consider - finished, completed?

…need to be hidden (#15)

https://shapeblue.atlassian.net/browse/FRO-184

This adds a new API argument, accessible only to admins, that specifies whether a network's IP address usage should be hidden. The setting is saved in the network_details table for the network, and the listUsageRecords API response creator checks whether each IP address should be exported or skipped/hidden. The setting is available only to the root admin via the listNetworks API response (details key). The root admin can also update existing networks by calling the updateNetwork API and passing hideipaddressusage=true|false.

The UI adds this checkbox only for admins creating a shared network (screenshot from 2019-01-17 17-11-39).

Note: IP address usage can also be skipped for all other kinds of networks via the API.

Fixes #15 Signed-off-by: Rohit Yadav <[email protected]>

* Fixed: IP address of Shared Network VR is not persistent when VR is destroyed/recreated
* Clean up DHCP/DNS config on VM expunge and on detaching a VM
* increase DHCP lease to ~infinite
* To handle deletion w/ expunge of VM with interfaces in Isolated and shared networks
* client: fix for jetty session timeout
* ui: fix migrate host form no host popup
* removed the cloud-plugin-network-vsp dependency
* Refactored code
* Removed unused imports
* Refactored the code
* Usage event to store zone id while uploading template
* refactored code

ernjvr added 3 commits June 20, 2018 12:03

initial commit: api, service, db access

00ba36a

add AsyncJobJoinDaoTest

b2fa1e3

added licence to AsyncJobJoinDaoTest

687a5c6

ernjvr requested review from rohityadavcloud and DaanHoogland June 21, 2018 10:22

refactor AsyncJobJoinDaoTest

dc1de82

rohityadavcloud reviewed Jun 22, 2018

ernjvr closed this Jun 25, 2018

ernjvr deleted the async_jobs_add_endtime branch June 25, 2018 14:55

rohityadavcloud added a commit that referenced this pull request Jan 20, 2021

Implement resource pagination on table #15

1748e79

Fixes #15 Signed-off-by: Rohit Yadav <[email protected]>
