Skip to content

Add support for automatic client authentication via TLS certificate fields #1920

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: unstable
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/config.c
Original file line number Diff line number Diff line change
Expand Up @@ -3361,6 +3361,7 @@ standardConfig static_configs[] = {
createBoolConfig("tls-prefer-server-ciphers", NULL, MODIFIABLE_CONFIG, server.tls_ctx_config.prefer_server_ciphers, 0, NULL, applyTlsCfg),
createBoolConfig("tls-session-caching", NULL, MODIFIABLE_CONFIG, server.tls_ctx_config.session_caching, 1, NULL, applyTlsCfg),
createStringConfig("tls-cert-file", NULL, VOLATILE_CONFIG | MODIFIABLE_CONFIG, EMPTY_STRING_IS_NULL, server.tls_ctx_config.cert_file, NULL, NULL, applyTlsCfg),
createStringConfig("tls-auth-clients-field", NULL, VOLATILE_CONFIG | MODIFIABLE_CONFIG, EMPTY_STRING_IS_NULL, server.tls_ctx_config.client_auth_field, NULL, NULL, applyTlsCfg),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config should probably be an enum config, so we can validate that only values that we can handle are allowed.

createStringConfig("tls-key-file", NULL, VOLATILE_CONFIG | MODIFIABLE_CONFIG, EMPTY_STRING_IS_NULL, server.tls_ctx_config.key_file, NULL, NULL, applyTlsCfg),
createStringConfig("tls-key-file-pass", NULL, MODIFIABLE_CONFIG | SENSITIVE_CONFIG, EMPTY_STRING_IS_NULL, server.tls_ctx_config.key_file_pass, NULL, NULL, applyTlsCfg),
createStringConfig("tls-client-cert-file", NULL, VOLATILE_CONFIG | MODIFIABLE_CONFIG, EMPTY_STRING_IS_NULL, server.tls_ctx_config.client_cert_file, NULL, NULL, applyTlsCfg),
Expand Down
1 change: 1 addition & 0 deletions src/connection.h
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,7 @@ struct connection {
ConnectionCallbackFunc conn_handler;
ConnectionCallbackFunc write_handler;
ConnectionCallbackFunc read_handler;
sds cert_user;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be on the underlying tlsConnection, not the higher level abstraction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not sure this needs to be on the struct, looks like it could just be stack allocated inside the function itself.

};

#define CONFIG_BINDADDR_MAX 16
Expand Down
1 change: 1 addition & 0 deletions src/server.h
Original file line number Diff line number Diff line change
Expand Up @@ -1475,6 +1475,7 @@ typedef struct serverTLSContextConfig {
char *client_cert_file; /* Certificate to use as a client; if none, use cert_file */
char *client_key_file; /* Private key filename for client_cert_file */
char *client_key_file_pass; /* Optional password for client_key_file */
char *client_auth_field; /* Field to be used for Authomatic TLS Authenticaton */
char *dh_params_file;
char *ca_cert_file;
char *ca_cert_dir;
Expand Down
58 changes: 58 additions & 0 deletions src/tls.c
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@

#include <openssl/conf.h>
#include <openssl/ssl.h>
#include <openssl/x509.h>
#include <openssl/err.h>
#include <openssl/rand.h>
#include <openssl/pem.h>
Expand Down Expand Up @@ -677,12 +678,69 @@ static void updateSSLState(connection *conn_) {
updatePendingData(conn);
}

static int getCertFieldByName(X509 *cert, const char *field, char *out, size_t outlen) {
if (!cert || !field || !out) return 0;

int nid = -1;

if (!strcasecmp(field, "CN"))
nid = NID_commonName;
else if (!strcasecmp(field, "O"))
nid = NID_organizationName;
// Add more mappings here as needed
Comment on lines +686 to +690
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using "O" makes no sense to me, just like "C" (Country) makes no sense. Fields like that are not unique. If we start supporting it, I suspect we might regret it in the future. @madolson you mentioned "O" in a comment. Was it a serious suggestion or just an example?

Out of the top-level fields, only CN (common name) makes sense to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone asked for O once inside AWS, but I don't know if they would have been okay with just CN as well. We can always just start with "CN" and see if anyone asks for more in the future? The enum just allows us to add more in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enum just allows us to add more in the future.

Yes, I agree enums in general are good for extensibility, but I doubt what we want to add in the future are top-level fields. It's more likely that we want to add

  • subject alt name: DNS name
  • an option to fail the handshake if the client cert doesn't match an existing user

But if you were asked once to use O for username, then I'm OK with it. It's an actual ask from a user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed up with the folks, they did indeed ask for "O". They sign their certificates by team, and they want the auth to happen at the team level (imagine a bunch of apps with multiple services, they want all services to have the same permission). They said "CN" would be fine, they would just need to provision more users, but that would be "OK" for them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, I think we shouldn't name the config tls-auth-clients-field. I prefer we just do like ETCD and use a boolean config. They have a config called client-cert-auth (true or false) for exactly the same feature: https://etcd.io/docs/v3.4/op-guide/authentication/#using-tls-common-name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about boolean config tls-auth-clients-common-name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to focus on the fact we are authenticating the user of the client than on what field is being used. tls-auth-clients-user

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tls-auth-clients-user SGTM

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zuiderkwast so we will still be now using the field name right like CN and O, your suggestion is to just change the name of the config to tls-auth-clients-user from tls-auth-clients-field right ?


if (nid == -1) return 0;

X509_NAME *subject = X509_get_subject_name(cert);
if (!subject) return 0;

return X509_NAME_get_text_by_NID(subject, nid, out, outlen) > 0;
}

static void autoAuthenticateClientFromCert(tls_connection *conn) {
if (!conn || !SSL_is_init_finished(conn->ssl)) return;

const char *field = server.tls_ctx_config.client_auth_field;
if (!field) return;

serverLog(LL_NOTICE, "TLS: Config Auto Auth Field = %s", field);

X509 *cert = SSL_get_peer_certificate(conn->ssl);
if (!cert) {
serverLog(LL_WARNING, "TLS: No peer certificate found after handshake");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING is a serious log level. It means the server is in trouble and needs an administrator's help.

Missing cert is not an error. Client certificate can be configured to be optional. I think we shouldn't log anything here, not even a notice.

return;
}

char field_value[256];
if (!getCertFieldByName(cert, field, field_value, sizeof(field_value))) {
serverLog(LL_WARNING, "TLS: Failed to extract field %s from certificate", field);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The should probably just be a NOTICE.

X509_free(cert);
return;
}

serverLog(LL_NOTICE, "TLS: Config Auto Auth %s = %s", field, field_value);
conn->c.cert_user = sdsnew(field_value);
X509_free(cert);

client *c = connGetPrivateData(&conn->c);
if (!c) return;

user *u = ACLGetUserByName(conn->c.cert_user, sdslen(conn->c.cert_user));
serverLog(LL_NOTICE, "TLS: ACL lookup user: %s", u ? u->name : "NOT FOUND");
if (u) {
c->user = u;
c->flag.authenticated = true;
serverLog(LL_NOTICE, "TLS: Auto-authenticated client as %s", u->name);
Comment on lines +729 to +733
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging twice here is too verbose IMO. Let's log only the second one, when it has been found and authenticated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If many clients connect successfully, are we're flooding the logs too much if we log here?

Is it better to log only in the NOT FOUND case? It's the exception case that shouldn't happen often. @madolson WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also mark u->name as sensitive in other places, we should add it behind the hide user data flag. I would drop the log level to verbose if we want to keep the second one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. If we redact this log message, there isn't much to log at all. Shall we just skip logging here?

If we want it, it would be something like this:

        serverLog(LL_VERBOSE,
                  "TLS: Auto-authenticated client as %s",
                  server.hide_user_data_from_log ? "*redacted*" : u->name);

}
}

static void TLSAccept(void *_conn) {
tls_connection *conn = (tls_connection *)_conn;
ERR_clear_error();
int ret = SSL_accept(conn->ssl);
if (ret > 0) {
conn->flags |= TLS_CONN_FLAG_ACCEPT_SUCCESS;
autoAuthenticateClientFromCert(conn);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't do this here, it's executed from the IO threads and the ACL code is not thread safe, we probably need to grab the user and stuff it on the connection to evaluate later.

Copy link
Author

@omanges omanges Apr 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking of moving the authentication logic to clientAcceptHandler in networking.c and for that we would need to have cert_user variable set in connection object which can be used in clientAcceptHandler

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be on the tls connection object, since it's not relevant for TLS disabled. Other than that sounds OK.

Copy link
Author

@omanges omanges Apr 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can only add the cert_user field on the connection object level, rather than TLS because as you said we need to make the ACL call in main thread and we need to make sure the TLS handshake is completed and client is ready, so I think we can that in clientAcceptHandler safely in networking.c but the problem is we don't have access to tls_connectionoutside TLS logic hence we need it at connection level. Does that sound correct or am I understanding something incorrectly ?

} else if (handleSSLReturnCode(conn, ret)) {
conn->flags |= TLS_CONN_FLAG_ACCEPT_ERROR;
}
Expand Down
30 changes: 30 additions & 0 deletions tests/unit/tls.tcl
Original file line number Diff line number Diff line change
Expand Up @@ -154,5 +154,35 @@ start_server {tags {"tls"}} {
r config set tls-key-file-pass 1234
r config set tls-key-file $keyfile_encrypted
}

test {TLS: Auto-authenticate using tls-auth-clients-field (CN)} {
# Ensure the user exists
r ACL SETUSER client on >clientpass allcommands allkeys

# Ensure feature is off by default
r CONFIG SET tls-auth-clients-field ""

# Client should not be authenticated automatically
set s [valkey [srv 0 host] [srv 0 port] 0 1]
::tls::import [$s channel]
$s AUTH client clientpass
assert_match {OK} [$s PING]
$s close

# Enable feature
r CONFIG SET tls-auth-clients-field CN

# With correct client cert having CN=client and an ACL user named 'client'
set s [valkey [srv 0 host] [srv 0 port] 0 1]
::tls::import [$s channel]

# Now no explicit AUTH is required
assert_equal "PONG" [$s PING]
$s close

# Reset config
r CONFIG SET tls-auth-clients-field ""
}

}
}
Loading