Async jobs add endtime #15

Closed
wants to merge 4 commits into from

@ernjvr ernjvr commented Jun 21, 2018

Description

Overview:
CloudStack executes certain long-running API calls asynchronously. Such a call returns immediately with the id of the job that is responsible for executing the command. This job id can be used to query the status of the job through the queryAsyncJobResult API call.
Among other fields, the result of this API query returns the 'created' field, which is the timestamp of when the asynchronous job started. There is currently no mechanism that captures or persists the time at which the job finished. As a result, the queryAsyncJobResult API does not return an 'end_time' field.
QueryAsyncJobResult API changes:
The requirement outlined here is for CloudStack to capture the timestamp at which an asynchronous job finishes and to store it in the existing 'removed' column of the async_job table. Please note that the 'removed' field is not currently used anywhere else in the CloudStack code, and the 'removed' database column is not currently populated by any other process. A new response tag called 'end_time' should be added to the queryAsyncJobResult API. When a queryAsyncJobResult API request is made, the value of the 'removed' database column should be mapped to this 'end_time' response tag. The response tag should be called 'end_time' rather than 'removed' because that name is more descriptive to an API user.
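As a rough illustration of the mapping described above, here is a minimal, self-contained sketch. It is not the CloudStack response class: the class name `AsyncJobResponseSketch`, the method `toResponse` and the 'jobid' key are hypothetical, and the tag name 'end_time' follows this description (the testing notes below write it as 'endtime').

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

public class AsyncJobResponseSketch {

    /**
     * Builds the response fields for one job (hypothetical helper).
     * The value of the 'removed' column is exposed under the 'end_time' tag.
     */
    static Map<String, String> toResponse(String jobId, Instant created, Instant removed) {
        Map<String, String> response = new LinkedHashMap<>();
        response.put("jobid", jobId);
        response.put("created", created.toString());
        // 'removed' is null while the job is still running, so the tag is omitted.
        if (removed != null) {
            response.put("end_time", removed.toString());
        }
        return response;
    }

    public static void main(String[] args) {
        Map<String, String> finished = toResponse(
                "example-job-uuid",
                Instant.parse("2018-06-21T10:00:00Z"),
                Instant.parse("2018-06-21T10:05:00Z"));
        System.out.println(finished);
    }
}
```

Omitting the tag for running jobs is an assumption of this sketch; returning an explicit empty value would be an equally valid design.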
Management server process changes:
When an asynchronous job completes, it is marked as complete by updating its finished status in the database. New functionality should be added to also set the 'removed' field to the timestamp at which the job completed.
In addition, when a CloudStack management server is stopped and started again (gracefully or ungracefully), neither that management server nor the other running management servers know the true status of the asynchronous jobs it owned while it was down. No management server tracks or updates these statuses in the database during that time, regardless of whether the jobs are still running, finished successfully or finished with an error status. There is currently no mechanism to notify other running management servers that a specific management server is stopping, or to tell a specific running management server to start tracking the running asynchronous jobs that belong to the management server being stopped. When a CloudStack management server starts up, it does not do a blanket delete of the asynchronous jobs it owns. Instead, it finds in the database all the asynchronous jobs it owns that are in an 'in_progress' state and updates them to a 'failed' status. At the same time, it should now also mark them as 'removed' by setting the 'removed' field to the current timestamp.
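The two management server changes above can be sketched with an in-memory model. This is only an illustration under stated assumptions: the `Job` class, the `Status` enum, and the methods `complete` and `failOrphanedJobs` are hypothetical names, and a list stands in for the async_job table.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class AsyncJobLifecycleSketch {

    enum Status { IN_PROGRESS, SUCCEEDED, FAILED }

    /** Hypothetical stand-in for one async_job row. */
    static final class Job {
        final long ownerMsid;              // id of the owning management server
        Status status = Status.IN_PROGRESS;
        Instant removed;                   // end timestamp, null until the job ends

        Job(long ownerMsid) {
            this.ownerMsid = ownerMsid;
        }
    }

    /** Marks a job as finished and records its end time in 'removed'. */
    static void complete(Job job, Status finalStatus, Instant now) {
        job.status = finalStatus;
        job.removed = now;
    }

    /**
     * Start-up clean-up: every in-progress job owned by this management
     * server is set to FAILED and stamped with the current time.
     * Returns the number of jobs that were updated.
     */
    static int failOrphanedJobs(List<Job> jobs, long msid, Instant now) {
        int updated = 0;
        for (Job job : jobs) {
            if (job.ownerMsid == msid && job.status == Status.IN_PROGRESS) {
                complete(job, Status.FAILED, now);
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job(1L));
        jobs.add(new Job(2L));
        int updated = failOrphanedJobs(jobs, 1L, Instant.now());
        System.out.println("jobs marked failed: " + updated);
    }
}
```

Jobs owned by other management servers are left untouched, which matches the description that a server only updates the jobs it is the owner of.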
Garbage collection:
CloudStack also has a garbage collector that cleans up asynchronous jobs in the database by periodically deleting unfinished and completed job records from the async_job table. It uses the configurable global setting 'job.cancel.threshold.minutes' to cancel jobs that are still in the queue. It also uses the configurable global setting 'job.expire.minutes', which specifies how long, in minutes, to keep asynchronous jobs that have not been processed yet, as well as completed asynchronous job records, in the database before deleting them. Unfinished jobs that have not been processed yet and that are older than this expiry time will be expired and deleted by the garbage collector.
Asynchronous jobs that have completed before the timeout threshold will also be deleted. When these jobs complete, they are marked as complete by updating their finished status in the database. Currently, the garbage collector uses the 'created' database column to find completed asynchronous job records that are older than the cutoff derived from the 'job.expire.minutes' global setting. It should no longer use the 'created' column for this. It should use the 'removed' column instead.
Because the garbage collector deletes these records, any reporting on asynchronous jobs should be done before the garbage collector starts its clean-up task.
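To make the effect of switching from 'created' to 'removed' concrete, here is a small sketch of the expiry check for completed jobs only. The class `AsyncJobGcSketch` and its methods are hypothetical, and the cutoff is assumed to be the current time minus 'job.expire.minutes'.

```java
import java.time.Duration;
import java.time.Instant;

public class AsyncJobGcSketch {

    /** Cutoff timestamp: anything that ended before this is eligible for deletion. */
    static Instant cutoff(Instant now, long expireMinutes) {
        return now.minus(Duration.ofMinutes(expireMinutes));
    }

    /**
     * A completed job is expired when its end time ('removed') is older
     * than the cutoff. A job without an end time is not matched here.
     */
    static boolean isExpired(Instant removed, Instant now, long expireMinutes) {
        return removed != null && removed.isBefore(cutoff(now, expireMinutes));
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2018-06-21T12:00:00Z");
        Instant created = Instant.parse("2018-06-21T10:00:00Z");
        Instant removed = Instant.parse("2018-06-21T11:30:00Z");
        long expireMinutes = 60;

        // Measured from 'created' the job is 120 minutes old and would be deleted.
        System.out.println("by created: " + created.isBefore(cutoff(now, expireMinutes)));
        // Measured from 'removed' it ended 30 minutes ago and is kept.
        System.out.println("by removed: " + isExpired(removed, now, expireMinutes));
    }
}
```

In this example a long-running job keeps its record for the full retention period after it finishes, instead of being deleted shortly after completion because it started long ago.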

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

GitHub Issue/PRs

Screenshots (if appropriate):

How Has This Been Tested?

  • Start a virtual machine to initiate an asynchronous job. Query the cloud.async_job table to find the record that was created in response to the 'start virtual machine' job. When the job finishes, verify that the 'removed' column has been correctly updated with the end timestamp of the relevant record.
  • Using cloudmonkey, make an API call to queryAsyncJobResult using the uuid as the jobid and verify that the 'endtime' field returns the correct timestamp value of the relevant database record's 'removed' column.
  • Verify that the garbage collector deletes the correct expired asynchronous job records from the cloud.async_job table.
  • Hypervisor: KVM

Checklist:

  • I have read the CONTRIBUTING document.
  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
    Testing
  • I have added tests to cover my changes.
  • All relevant new and existing integration tests have passed.
  • A full integration test suite with all tests that can run on my environment has passed.

@@ -75,6 +75,10 @@
@Param(description = " the created date of the job")
private Date created;

@SerializedName(ApiConstants.END_TIME)
@Param(description = " the removed date of the job")
private Date removed;

Consider - finished, completed?

@ernjvr ernjvr closed this Jun 25, 2018
@ernjvr ernjvr deleted the async_jobs_add_endtime branch June 25, 2018 14:55
PaulAngus pushed a commit that referenced this pull request Jan 31, 2019
…need to be hidden (#15)

https://shapeblue.atlassian.net/browse/FRO-184

This adds a new API arg only accessible to admins to specify if they
want the network's ip address usage hidden. This then saves this setting
in the network_details table for a network, and the listUsageRecords
API response creator checks for an IP address if it needs to be exported
or skipped/hidden.

The setting is available only to root admin via the listNetworks API response (details key).
Root admin can also update existing networks by using updateNetwork API and passing hideipaddressusage=true|false

UI screenshot, that adds this checkbox only for admins creating a shared network:
screenshot from 2019-01-17 17-11-39

Note: it's possible for all other kinds of networks to have their IP address usage skipped as well, available via the API.
rohityadavcloud added a commit that referenced this pull request Jan 20, 2021
shwstppr pushed a commit that referenced this pull request Jul 27, 2021
* Fixed: IP address of Shared Network VR is not persistent when VR is destroyed/recreated

* Clean up DHCP/DNS config on VM expunge and on detaching a VM

* increase DHCP lease to ~infinite

* To handle deletion w/ expunge of VM with interfaces in Isolated and shared networks

* client: fix for jetty session timeout

* ui: fix migrate host form no host popup

* removed the cloud-plugin-network-vsp dependency

* Refactored code

* Removed unused imports

* Refactored the code

* Usage event to store zone id while uploading template

* refactored code
2 participants