-
Notifications
You must be signed in to change notification settings - Fork 767
Add support for automatic client authentication via TLS certificate fields #1920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Conversation
…g based TLS certifiacte field Signed-off-by: Omkar Mestry <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## unstable #1920 +/- ##
============================================
+ Coverage 71.02% 71.05% +0.02%
============================================
Files 123 123
Lines 65683 65683
============================================
+ Hits 46653 46668 +15
+ Misses 19030 19015 -15
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I didn't do a complete review. For example I didn't look at the tests at all yet.
I'm not sure if the config should allow any field. Maybe only CN makes sense ever and then the config should maybe not be selecting a field. But let's discuss it and think about what we can do.
I found the corresponding features in ETCD and Kubernetes. The both use only common name as username:
-
ETCD: https://etcd.io/docs/v3.4/op-guide/authentication/#using-tls-common-name
As of version v3.2 if an etcd server is launched with the option --client-cert-auth=true, the field of Common Name (CN) in the client’s TLS cert will be used as an etcd user.
-
Kubernetes: https://kubernetes.io/docs/reference/access-authn-authz/authentication/#x509-client-certificates
If a client certificate is presented and verified, the common name of the subject is used as the user name for the request. As of Kubernetes 1.4, client certificates can also indicate a user's group memberships using the certificate's organization fields.
@@ -3361,6 +3361,7 @@ standardConfig static_configs[] = { | |||
createBoolConfig("tls-prefer-server-ciphers", NULL, MODIFIABLE_CONFIG, server.tls_ctx_config.prefer_server_ciphers, 0, NULL, applyTlsCfg), | |||
createBoolConfig("tls-session-caching", NULL, MODIFIABLE_CONFIG, server.tls_ctx_config.session_caching, 1, NULL, applyTlsCfg), | |||
createStringConfig("tls-cert-file", NULL, VOLATILE_CONFIG | MODIFIABLE_CONFIG, EMPTY_STRING_IS_NULL, server.tls_ctx_config.cert_file, NULL, NULL, applyTlsCfg), | |||
createStringConfig("tls-auth-clients-field", NULL, VOLATILE_CONFIG | MODIFIABLE_CONFIG, EMPTY_STRING_IS_NULL, server.tls_ctx_config.client_auth_field, NULL, NULL, applyTlsCfg), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config should probably be an enum config, so we can validate that only values that we can handle are allowed.
if (!strcasecmp(field, "CN")) | ||
nid = NID_commonName; | ||
else if (!strcasecmp(field, "O")) | ||
nid = NID_organizationName; | ||
// Add more mappings here as needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using "O" makes no sense to me, just like "C" (Country) makes no sense. Fields like that are not unique. If we start supporting it, I suspect we might regret it in the future. @madolson you mentioned "O" in a comment. Was it a serious suggestion or just an example?
Out of the top-level fields, only CN (common name) makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone asked for O once inside AWS, but I don't know if they would have been okay with just CN as well. We can always just start with "CN" and see if anyone asks for more in the future? The enum just allows us to add more in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enum just allows us to add more in the future.
Yes, I agree enums in general are good for extensibility, but I doubt what we want to add in the future are top-level fields. It's more likely that we want to add
- subject alt name: DNS name
- an option to fail the handshake if the client cert doesn't match an existing user
But if you were asked once to use O for username, then I'm OK with it. It's an actual ask from a user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed up with the folks, they did indeed ask for "O". They sign their certificates by team, and they want the auth to happen at the team level (imagine a bunch of apps with multiple services, they want all services to have the same permission). They said "CN" would be fine, they would just need to provision more users, but that would be "OK" for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, I think we shouldn't name the config tls-auth-clients-field
. I prefer we just do like ETCD and use a boolean config. They have a config called client-cert-auth
(true or false) for exactly the same feature: https://etcd.io/docs/v3.4/op-guide/authentication/#using-tls-common-name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about boolean config tls-auth-clients-common-name
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to focus on the fact we are authenticating the user of the client than on what field is being used. tls-auth-clients-user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tls-auth-clients-user
SGTM
|
||
X509 *cert = SSL_get_peer_certificate(conn->ssl); | ||
if (!cert) { | ||
serverLog(LL_WARNING, "TLS: No peer certificate found after handshake"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WARNING is a serious log level. It means the server is in trouble and needs an administrator's help.
Missing cert is not an error. Client certificate can be configured to be optional. I think we shouldn't log anything here, not even a notice.
|
||
char field_value[256]; | ||
if (!getCertFieldByName(cert, field, field_value, sizeof(field_value))) { | ||
serverLog(LL_WARNING, "TLS: Failed to extract field %s from certificate", field); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The should probably just be a NOTICE.
serverLog(LL_NOTICE, "TLS: ACL lookup user: %s", u ? u->name : "NOT FOUND"); | ||
if (u) { | ||
c->user = u; | ||
c->flag.authenticated = true; | ||
serverLog(LL_NOTICE, "TLS: Auto-authenticated client as %s", u->name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging twice here is too verbose IMO. Let's log only the second one, when it has been found and authenticated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If many clients connect successfully, are we're flooding the logs too much if we log here?
Is it better to log only in the NOT FOUND case? It's the exception case that shouldn't happen often. @madolson WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also mark u->name as sensitive in other places, we should add it behind the hide user data flag. I would drop the log level to verbose if we want to keep the second one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. If we redact this log message, there isn't much to log at all. Shall we just skip logging here?
If we want it, it would be something like this:
serverLog(LL_VERBOSE,
"TLS: Auto-authenticated client as %s",
server.hide_user_data_from_log ? "*redacted*" : u->name);
@@ -138,6 +138,7 @@ struct connection { | |||
ConnectionCallbackFunc conn_handler; | |||
ConnectionCallbackFunc write_handler; | |||
ConnectionCallbackFunc read_handler; | |||
sds cert_user; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be on the underlying tlsConnection, not the higher level abstraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm not sure this needs to be on the struct, looks like it could just be stack allocated inside the function itself.
static void TLSAccept(void *_conn) { | ||
tls_connection *conn = (tls_connection *)_conn; | ||
ERR_clear_error(); | ||
int ret = SSL_accept(conn->ssl); | ||
if (ret > 0) { | ||
conn->flags |= TLS_CONN_FLAG_ACCEPT_SUCCESS; | ||
autoAuthenticateClientFromCert(conn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't do this here, it's executed from the IO threads and the ACL code is not thread safe, we probably need to grab the user and stuff it on the connection to evaluate later.
This PR implements support for automatic client authentication based on a configurable field in the client's TLS certificate.
🔒 Feature Overview:
Introduces a new configuration directive: tls-auth-clients-field.
Allows specifying a certificate field (e.g., CN, O) to map to a Valkey user.
If a matching user is found, the client is automatically authenticated as that user upon TLS handshake.
Falls back to the default user if no match is found.
⚙️ Implementation Details:
Added getCertFieldByName() utility to extract fields from peer certificates.
Added autoAuthenticateClientFromCert() to handle automatic login logic post-handshake.
Integrated automatic authentication into the TLSAccept function after handshake completion.
Updated test suite (tests/integration/tls.tcl) to validate the feature.
✅ Benefits:
Enables smoother integration with mTLS-based infrastructures.
Reduces the need for clients to manually send AUTH commands.
Enhances security by tightly coupling certificate identity with Valkey access control.
🔬 Example:
This will authenticate the client as the user matching the certificate's Common Name.
#1866