Skip to content

fix: filemanager secret cache #928

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 4, 2025
Merged

Conversation

mmalenic
Copy link
Member

Closes #925

Changes

  • Adds caching layer to all get_secret_value calls that persists within a Lambda execution environment.

@mmalenic mmalenic self-assigned this Mar 27, 2025
@mmalenic mmalenic added filemanager an issue relating to the filemanager fix labels Mar 27, 2025
@reisingerf
Copy link
Member

Sorry, not a Rust guy... perhaps @brainstorm should have a look a this?

Also, how does this compare to solutions documented by AWS like:
https://github.com/aws/aws-secretsmanager-agent/tree/main/aws_secretsmanager_caching

@mmalenic mmalenic requested a review from brainstorm March 27, 2025 04:37
@mmalenic
Copy link
Member Author

Also, how does this compare to solutions documented by AWS like:
https://github.com/aws/aws-secretsmanager-agent/tree/main/aws_secretsmanager_caching

That one is using RwLock underneath rather than Mutex which I think is the better option. Otherwise behaviour-wise I think it should be fairly similar as all the time-out and max cache entries logic wouldn't be used. I.e. there is one cache entry and it should live for the whole length of the Lambda execution.

It's probably worth switching to that crate anyway as it already implements the locking.

@@ -28,6 +29,7 @@ async fn main() -> Result<(), Error> {
Arc::new(config),
Arc::new(s3::Client::with_defaults().await),
Arc::new(sqs::Client::with_defaults().await),
Arc::new(Mutex::new(secrets_manager::Client::with_defaults().await)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this where the secret is fetched or the client is initialised?
This code is inside the main function. Does that mean it runs inside the handler?

If so, would simply putting this outside the handler function be easier / better?

Take advantage of execution environment reuse to improve the performance of your function. Initialize SDK clients and database connections outside of the function handler...

(from: https://docs.aws.amazon.com/lambda/latest/dg/rust-handler.html#rust-best-practices)

Copy link
Member Author

@mmalenic mmalenic Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's where the client gets initialized. This is put into the AppState struct which is shared for all requests.

This code is inside the main function. Does that mean it runs inside the handler?

Everything inside the main function before run(app).await should only be initialized once when the Lambda execution loads (i.e. it's already outside of the handler). app in this context is the handler, and it's an axum Router, which is used to process requests. This state is cloned for each request, but that's okay because it still uses the same cache variable which is stored on heap inside a reference counting pointer - Arc.

The Mutex locks the client exclusively so that it's mutable. This is required to modify the state as it has to insert an entry when fetching a secret (or get an existing entry).

The aws_secretsmanager_caching crate does a similar thing except the lock is inside that crate and it's a RwLock. I think this should be better as it doesn't lock exclusively for reads - allowing multiple at the same time.

As an aside, this pattern (i.e. Arc<Mutex<T>> or Arc<RwLock<T>>) allows cloning/modifying shared state across threads from a regular & reference, rather than requiring &mut. E.g. noting the function that aws_secretsmanager_caching makes public has a &self rather than &mut self, even though that function does modify the state of self (through the RwLock). This normally wouldn't be allowed in Rust. I think Mutex/RwLock uses some unsafe code underneath to convince the compiler that this is okay.

Hopefully that makes sense, let me know!

@mmalenic mmalenic added this pull request to the merge queue Apr 4, 2025
Merged via the queue into main with commit 75a44d6 Apr 4, 2025
6 checks passed
@mmalenic mmalenic deleted the fix/filemanager-secret-cache branch April 4, 2025 02:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
filemanager an issue relating to the filemanager fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fix(filemanager): cache secret value for presign user
4 participants