Custom TLS verification logic #30

tegefaulkes · 2023-06-09T03:20:48Z

Specification

We need the ability to override the TLS verification logic. This is essential for Polykey to check the identity of a connection by verifying the cert chain. We need to fail any connection that can't prove it's the Node we're trying to connect to.

The boring SslContextBuilder has a method called set_verify_callback which we can use to set a verify function. We need to translate a rust function into an async call we provide in the js space.

The method we provide it will be given the normal verification result as well as a x509 store reference we can use to inspect the cert chain. We return a bool corresponding to our verification result. This result will override the normal verification result.

The set_verify_callback also sets the verification mode. I think this would override what set_verify sets for the mode.

Additional context

Tasks

1. Work out how to call an js async function from the rust code with napi.
2. Work out the details of how set_verify_callback works.
3. Add a way to set the verification callback when creating a quiche config.
4. Add tests for custom verification.

The text was updated successfully, but these errors were encountered:

tegefaulkes · 2023-06-09T04:51:04Z

Few observations about the verify_peer_callback

The callback is called for each cert in the chain.
We're given the cert store containing all the certs to verify
The current cert is gotten with current_cert()
The chain is gotten with chain()
The order of the certs seems to be root cert to leaf. So if a CA cert was provided, that would be checked first and then the certs the other side provided.

Since the verify callback will be called multiple times, we'll have to handle that some how. I think Polykey verification logic expects to be run once over the whole chain. We can shoe horn that in by caching the result of the first run and returning that for the subsequent calls.

CMCDragonkai · 2023-06-09T06:10:36Z

Remember that callback will have to be synchronous function. NAPI will be executing it from the rust side.

tegefaulkes · 2023-06-14T04:48:07Z

I've been prototyping how to handle the callback. It has been a frustrating process since none of this is documented very well. I've identified 4-ish ways of calling a callback.

Defining a callback with fn some_fun<T: Fn() -> Result<String>(callback: T){} as the type. This was the most promising.
Defining a callback with fn some_fun(env: Env, callback: JsFunction){}.
Using the CallContext with fn some_fun(ctx: CallContext){}.
Using a ThreadsafeFunction with method 2.

Method 1.

This should be the simplest method. The callback can be defined and used as so.

#[napi]
fn test_callback<T: Fn(String, String) -> Result<bool>>(callback: T) {
  let result = callback("Hello".to_string(), "olleh".to_string()).unwrap();
  println!("result: {}", result);
}

It will work synchronously and should be thread-safe? I'm not sure. The problem with this however is that it doesn't work in a class factory method. @CMCDragonkai has made an issue for it at napi-rs/napi-rs#1374. So I don't think we can use this method here.

Method 2.

Method 2 is similar to 1 but uses JsValues when calling the callback. It depends on env to create new values e.g. env.create_string("hello"). This is not thread safe due to it's dependence on Env to create the JsValues.

We need thread safety to call the callback within the boring validate function.

#[napi]
pub fn callback_two(env: Env, callback: JsFunction) -> napi::Result<()> {
  let result = callback.call(None, &vec![env.create_string("hello").unwrap()]);
  let val = result
    .unwrap()
    .coerce_to_number()
    .unwrap()
    .get_int64()
    .unwrap();
  println!("result was: {}", val);
  Ok(())
}

Method 3.

I can't create a working example of this. The documentation gives examples of this using functions using the #[js_function] macro. But I can't find any info on napi functions defined this way. Defining a function this way doesn't result in a method on the quiche native module.

I don't think this method would work, I don't think it's thread-safe?

Method 4.

I think due to the thread safety, we have to use a ThreadsafeFunction when using the callback. A ThreadsafeFunction can be created and used as such.

#[napi]
pub fn call_threadsafe_function(callback: JsFunction) -> Result<()> {
  let tsfn: ThreadsafeFunction<(u32, String), ErrorStrategy::CalleeHandled> = callback
    .create_threadsafe_function(0, |ctx| {
      let(num, s) = ctx.value;
      println!("value: {}", num);
      println!("value: {}", s);
      ctx.env.create_string("Hello!").map(|v| vec![v])
    })?;
  println!("Making native call");
  thread::spawn(move || {
    tsfn.call(
      Ok((100, "hello!".to_string())),
      ThreadsafeFunctionCallMode::Blocking,
    );
    println!("Result!: {}", result.unwrap());
  });
  Ok(())
}

We should be able to use this within boring's validate_callback function and it will be thread safe. But with the current version of napi there is no way to get the result of calling the ThreadsafeFunction.

The tsfn.call() only returns a status and no value. Looking at the rust docs, the newer version provides call_async and call_with_return_value which both return the result. But we'll need to update the version of napi-rs we're using.

tegefaulkes · 2023-06-14T05:23:06Z

Using tsfn.call_async I can get the result of the callback. But unfortunately I can't do this synchronously. I think it's possible to call an async function synchronously using tokio. but that will have it's own caveats.

CMCDragonkai · 2023-06-14T05:27:26Z

Ideally we should not introduce any rust based IO runtimes like Tokio.

If it cannot be used in a factory method, perhaps it should be set after construction of the rust struct? This is supposed into the config object right? Then why not call a method on the config object or does it have to be part of the boring SSL context?

tegefaulkes · 2023-06-14T07:56:50Z

Acording to https://napi.rs/docs/compat-mode/concepts/thread-safe-function, any function has to be converted to a ThreadsafeFunction if we want to call it between threads. And a ThreadsafeFunction can only be called asynchronously.

This means it's not possible to use any js callback functions within the verify_callback for the following reasons.

We can only use a ThreadsafeFunction across threads.
We can only get the result of a ThreadsafeFunction asynchronously.
The verify_callback we provide to set_verify_callback can only be synchronously.

Note that this is only a problem if we want to get the result of the callback. It is possible to call the callback synchronously and provide information such as the cert PEMs and the internal validation result.

So now I think we need to think about a work-around. Strictly we only need to override the verify failing due to a missing CA cert. The cert chain can be verified on the application level and any errors and connection rejection can be handled there.

The callback can used to signal the secured event and provide the certificate information.

So we can side-step this and move forward with the following steps.

Hard code a verify function to allow self-signed cert chains.
Have the hard-coded verify function call a callback to trigger the securedP event.
Have the application handle any application specific cert validation and manually shutdown the connection with an app level error.

tegefaulkes · 2023-06-15T06:40:50Z

It's not ideal but I have something workable.

I've added a flag to withBoringSslCtx called verify_allow_fail. If both verify_peer and verify_allow_fail are set to true then the peer must send it's certs, but any specific problems with certs are ignored. So if no certs are provided then we get a TLSFail. Providing a self-signed or incomplete chain will not result in a failure.
Added a verify_callback which is called with the details of each cert. It will be called in order of root -> inter -> leaf. It will be called separately for each cert in the chain. This will be useful for knowing when the remote certs have been received. We can get most of this information from alternative means but it's a good event for knowing when the connection is secured.

The information being provided to the callback has the following format. A depth of 0 is the last cert to be verified in the chain.

  {
    preSuccess: true,
    errorMessage: 'ok',
    depth: 0,
    length: 2,
    cert: '-----BEGIN CERTIFICATE-----\n' +
      'MIIDLjCCAhagAwIBAgIRAOP1Hv5+GXJMxyBvMi6IFSAwDQYJKoZIhvcNAQELBQAw\n' +
      'EDEOMAwGA1UEAxMFcnNhQ0EwHhcNMjMwNDE5MDU0ODU4WhcNMjQwNDE4MDU0ODU3\n' +
      'WjAPMQ0wCwYDVQQDEwRyc2ExMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKC\n' +
      'AQEAu9DbhhyDmXQZhcUxmJMbUPd8w1YPnZjNTIDREWMtjRaLmeKylPL1cuL8Wmsj\n' +
      'wvFMbx2FhDjSXsQDzjPmFNIbT3F+i2VmolAvKUHG4JqUyfcN3E+rty1Lytvu3vDV\n' +
      '3OC8i1o0Y89QTXiRJIjfSBktPy3y8hskKJ6nAAbL0MQKRyKpzIV2k6epLd1Xx6dl\n' +
      'RtA4QHihcD5HedbD8OaVRLBbeJ/nx38Kb/vW1G6Uivd9yVdLOi306epKlks60bSW\n' +
      '1U5Ffy/2MkVW35/8JSyoAExOHBNix7LYaNB2ajCjLCjcrQI3AFcGuep8OPIS0MvD\n' +
      'y/srxR/PR1nWl3muhClGChkP+wIDAQABo4GDMIGAMA4GA1UdDwEB/wQEAwIFoDAd\n' +
      'BgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwHQYDVR0OBBYEFFFzBNyAicB+\n' +
      'yUT8xh+isRcSixzkMB8GA1UdIwQYMBaAFPl6JPRYyNrrxk30v1MV8LTYwqLLMA8G\n' +
      'A1UdEQQIMAaCBHJzYTEwDQYJKoZIhvcNAQELBQADggEBAEQydSNhxW1v+f5J6v0K\n' +
      'vhWSqJmwTwhxhAvQctnophPPJb2cuyfcc0X1x+JlZFJD4eEHu5qvaYMVv7EY4Is8\n' +
      'xliZbQj4L9x0246q7j44XuRlK67rj4moIoT/hLBMwRM300LIABVUHvZjAy0Fl+hs\n' +
      'OuVl7m1iASHoJ07mTHoKkh4tdSjIYiMoF3Nbag4hu01iJYJIK48NzvrPxCzOb4zh\n' +
      'eMhQG5C/2bijWMSPI112ksCV1PT9skuj4/R0bCuMzdJ/RWxOl1sF9pXACwpi/y6M\n' +
      'dEPebTEWIOJtHB05IMb4nLy5DoOwW6o8ZrCjVZqdNps3LCJG2PrMkYVeKYYC3ecU\n' +
      'Pug=\n' +
      '-----END CERTIFICATE-----\n',
    chain: [
      '-----BEGIN CERTIFICATE-----\n' +
        'MIIDLjCCAhagAwIBAgIRAOP1Hv5+GXJMxyBvMi6IFSAwDQYJKoZIhvcNAQELBQAw\n' +
        'EDEOMAwGA1UEAxMFcnNhQ0EwHhcNMjMwNDE5MDU0ODU4WhcNMjQwNDE4MDU0ODU3\n' +
        'WjAPMQ0wCwYDVQQDEwRyc2ExMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKC\n' +
        'AQEAu9DbhhyDmXQZhcUxmJMbUPd8w1YPnZjNTIDREWMtjRaLmeKylPL1cuL8Wmsj\n' +
        'wvFMbx2FhDjSXsQDzjPmFNIbT3F+i2VmolAvKUHG4JqUyfcN3E+rty1Lytvu3vDV\n' +
        '3OC8i1o0Y89QTXiRJIjfSBktPy3y8hskKJ6nAAbL0MQKRyKpzIV2k6epLd1Xx6dl\n' +
        'RtA4QHihcD5HedbD8OaVRLBbeJ/nx38Kb/vW1G6Uivd9yVdLOi306epKlks60bSW\n' +
        '1U5Ffy/2MkVW35/8JSyoAExOHBNix7LYaNB2ajCjLCjcrQI3AFcGuep8OPIS0MvD\n' +
        'y/srxR/PR1nWl3muhClGChkP+wIDAQABo4GDMIGAMA4GA1UdDwEB/wQEAwIFoDAd\n' +
        'BgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwHQYDVR0OBBYEFFFzBNyAicB+\n' +
        'yUT8xh+isRcSixzkMB8GA1UdIwQYMBaAFPl6JPRYyNrrxk30v1MV8LTYwqLLMA8G\n' +
        'A1UdEQQIMAaCBHJzYTEwDQYJKoZIhvcNAQELBQADggEBAEQydSNhxW1v+f5J6v0K\n' +
        'vhWSqJmwTwhxhAvQctnophPPJb2cuyfcc0X1x+JlZFJD4eEHu5qvaYMVv7EY4Is8\n' +
        'xliZbQj4L9x0246q7j44XuRlK67rj4moIoT/hLBMwRM300LIABVUHvZjAy0Fl+hs\n' +
        'OuVl7m1iASHoJ07mTHoKkh4tdSjIYiMoF3Nbag4hu01iJYJIK48NzvrPxCzOb4zh\n' +
        'eMhQG5C/2bijWMSPI112ksCV1PT9skuj4/R0bCuMzdJ/RWxOl1sF9pXACwpi/y6M\n' +
        'dEPebTEWIOJtHB05IMb4nLy5DoOwW6o8ZrCjVZqdNps3LCJG2PrMkYVeKYYC3ecU\n' +
        'Pug=\n' +
        '-----END CERTIFICATE-----\n',
      '-----BEGIN CERTIFICATE-----\n' +
        'MIIC8DCCAdigAwIBAgIRAJCwiiQ3TTulJhS9EK55cXUwDQYJKoZIhvcNAQELBQAw\n' +
        'EDEOMAwGA1UEAxMFcnNhQ0EwHhcNMjMwNDE5MDU0ODEyWhcNMjQwNDE4MDU0ODEy\n' +
        'WjAQMQ4wDAYDVQQDEwVyc2FDQTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoC\n' +
        'ggEBAOPvjy4GlvFbX7iCAyt/jd0+eHlEFx/zMFwBPLZmqIA3HPJwj2cp5TY5NEWk\n' +
        'MM42qQLzCOmO/K4f9m6WUZw1J9tKGIEc7b+RQ/JisHz8iiFpEDayROrV4Cs/Nd0r\n' +
        'wvTautTXl6bd37rI4V8o0yHB0E/WcazAUJQDwvl7bVL942qB86LQqsEfxMIM7Ddq\n' +
        'S+9dyr67RG1MMw5Qm0889l3yDKkKOKy3I6w4dAdfCZSeZFWXl7GeIVPj1Rc6Fxt8\n' +
        'Tg79Cm5lgMnmqzGBZUS4eitJq6BhTyQfoylHayCkpqhcojVDbcN5D1bH/YTUh/38\n' +
        'klDaOwqCajpdu6VUr1eRGrl55A0CAwEAAaNFMEMwDgYDVR0PAQH/BAQDAgEGMBIG\n' +
        'A1UdEwEB/wQIMAYBAf8CAQEwHQYDVR0OBBYEFPl6JPRYyNrrxk30v1MV8LTYwqLL\n' +
        'MA0GCSqGSIb3DQEBCwUAA4IBAQDIACzucf/ePefGgLVTWMVvZdfW9cXuf0Ib3zWC\n' +
        '6RYbA+cBBIgn0SGToY613vIeKV1uPhFpzYJd+3eFVj9x1gVHTK7Hk6o5Tf780ZKN\n' +
        'd4TkoTp9WER/boaOB+77nH9WopvE+ien1GM2HTgOBUrClPGCZ2OAJBq4xxDDeU2T\n' +
        'rxlRD06692CdVh5uSUtCIsqoCUTi7s+C9p5WG1gOVIkafFmyR2i28E2n0rKlNzXQ\n' +
        'UTo3936xFx8cdIJVzvPFOvMhKrOTzCCcMD8NiXXHlX51HMy3fua16W8dfLKhJp4N\n' +
        '1PFD57chAacE/HKlnB4LugDLCl+MBhpuyxt+Ifk0geqeV15p\n' +
        '-----END CERTIFICATE-----\n'
    ]
  }

The callback has the type (err, data) => void and will not override any verification result.

Now to do verification on the application level we can do the following.

start connection with verify_peer: true, verify_allow_fail: true to force certs but allow failure.
wait for established
wait for secured signified by the callback been called with depth === 0 indicating the whole cert chain should be available. (server certs are available immediately, client certs on a delay.)
Verify the peer's cert chain.
On failure, force close the connection with custom error.

I think this is very similar to the validation process the proxy did.

tegefaulkes · 2023-06-16T02:29:25Z

One problem with this is that the connection can establish before TLS validation is completed. On the application level we need to make sure not to start any streams until the following conditions.

Connection is established
Connection is secured, signalled by the verificationCallback being called,
Verify the remote certificate, force the connection close if verification fails.
Wait for the peer to verify the connection and potentially close.

I'm not sure of a good way to know that the peer has verified, possibly the first stream can signal this?

CMCDragonkai · 2023-06-16T02:34:40Z

We talked about this before. There is a trick about the exact quic packets that is received. On the whiteboard somewhere.

tegefaulkes · 2023-06-16T02:53:07Z

It seems that the verify callback is called for each error in the chain. At minimum that means the callback the called at least once for each cert in the chain if it succeeded with error message 'ok'. Or multiple times if the cert had an error.

This means that it's not clear that any callback is the last one to be made on the chain. Unless it's depth 0 and succeeded with an 'ok' message. I'll need to add a done flag or done callback to signal it. But I need a way in the internal callback to know. I'll do some more digging.

This can be very useful. It means we can override specific verification errors such as 'self signed certificate in certificate chain', 'unable to get local issuer certificate' or 'unable to verify the first certificate'.

I think given this, we can re-create our Polykey custom verification using a combination of two primitive overrides that we can hard code.

Override the result of any specific error code to force it to succeed. Errors such as self signed cert can be allowed.
Provide an optional array of public keys (NodeIds) and fail if the chain does not contain any keys in that list.

This should make verification failures part of quiches internal logic. but it doesn't solve a timing problem with the client connections. The client doesn't know when the server has finished verifying it. So in reality for any connections there are 3 events for when the connection can be considered fully established.

Connection established returns true from quiche.
Server TLS validation completes.
Client TLS validation completes.
We won't expect and TLSFAIL events here and can consider the connection successfully opened.

If I recall, the servers certs are verified early in the connection and happens before the connection is established?
But the client certs are verified after the connection is established and can result in the connection failing with TLSFail after we think the connection is open.

So we still need a way to know from both sides that both TLS verification steps have completed. This is a problem weather we hard-code the verification logic or handle it on the application level. Worst case we'll need to communicate this somehow on the application level. This could just be opening the first stream after verifying locally.

tegefaulkes · 2023-06-16T03:02:13Z

We talked about this before. There is a trick about the exact quic packets that is received. On the whiteboard somewhere.

Ah I remember that. Ok so we'll have a way of knowing that event on both ends.

There is one thing I need your feedback on. I have outlined two was of handling the verification now.

Application level where we allow the hard coded verification to fail, do our own verification and force the connection to fail with an error.
hard-coded level where we can configure some overrides in the boring verification_callback and re-create our application level verification logic.
Combination of the two where we have the freedom of the hard-coded overrides and the application level.

Option 2. likely works better with the connection lifecycle events. but it's simple enough to do option 3.

tegefaulkes · 2023-06-16T03:11:19Z

Useful reference for verification error messages.
https://github.com/google/boringssl/blob/e1b8685770d0e82e5a4a3c5d24ad1602e05f2e83/crypto/x509/x509_txt.c

Error codes are also defined in https://github.com/google/boringssl/blob/e1b8685770d0e82e5a4a3c5d24ad1602e05f2e83/include/openssl/x509.h#L2654

So an error such as 'self signed certificate in certificate chain' refers to...
'self signed certificate in certificate chain' | X509_V_ERR_SELF_SIGNED_CERT_IN_CHAIN | 19.

tegefaulkes · 2023-06-16T04:07:34Z

Based on the discussion just now. We're going to cut out the verify_callback and allow failure override and handle the custom TLS verification on the application level. It needs the following changes

Connection needs to take a verification callback, this will be run as soon as the remote certs are available.
The connection needs to know when the remote certs should be available. There should be a type of incoming packet where they should be available.
If the verification callback fails, we need to force close the connection with an error resembling a TLSFail.

I'll need to look at the new native tests to divine the stage when certs should be available. Hopefully this doesn't vary based on certificate type or chain size.

tegefaulkes · 2023-06-16T05:39:25Z

Client certs are not sent unless we try to verify the client. So verifyPeer for the server must be true for certs to be sent. So we'll have to override the verify callback anyway so we can set verifyPeer to true so that the client's certs are requested, but override the results.

CMCDragonkai · 2023-06-16T05:42:32Z

How about this? Set verify peer to be true but then override the callback to return true if the peer's certificate is fine without the identity check. Then apply the application level check on top. Only do this if one gives a custom JS callback in the first place.

CMCDragonkai · 2023-06-16T05:43:03Z

That way the library is compatible outside PK and still works for what we need inside PK.

CMCDragonkai · 2023-06-16T05:43:33Z

Client certs are not sent unless we try to verify the client. So verifyPeer for the server must be true for certs to be sent. So we'll have to override the verify callback anyway so we can set verifyPeer to true so that the client's certs are requested, but override the results.

Wait I don't get this. How does the client know not to send the client server if server doesn't enable verify peer?

tegefaulkes · 2023-06-16T07:03:48Z

After reviewing the tests connection lifecycle tests I can see that After each side verifies the TLS and it succeeds, the send a short frame. If the TLS failed, it sends a handshake frame with the error instead. This means we can use receiving a short frame as an indication that the peer succeeded all TLS checks.

The peer certs are available after receiving the initial frame. So the custom TLS verification can be done after receiving the initial frame. This needs to be done synchronously before sending the next frame out.

If a connection receives a short frame then we can reasonably assume the peer will not fail any verification and the connection is fully established.

Two changes need to be done

Add the verifyCallback to the connection creation. This must be called after receiving the initial frame from the peer. Any failure of this callback should immediately close the connection with a descriptive error.
Add a secured promise that resolves when a short frame is first received. or reject with any error such as the TLS fail.

tegefaulkes · 2023-06-19T08:56:56Z

Ok, The custom TLS verification callback has been implemented. It is a simple callback that is called with the remote cert chain in PEM format (certs: Array<string>) => void. If the TLS fails then we throw an error. This is caught and translated into closing the connection.

Now that I think about it, It could just return {code: number, message: string} | undefined. No need to do anything with catching errors here. In any case I need to review how this will work with mapping errors to codes and vice versa.

We have a working secured event that triggers when the peer has verified without error. It is triggered by the first short header received that doesn't trigger the draining state. Creating the client awaits the secure event before resolving, resulting in an error if it failed.

I need to update this so it only triggers a short frame has be received AND sent. Otherwise the server side connection can trigger the secured event even when client verification fails. In this case the client accepts the server's certs and sends the short frame. Ideally the connection is secure when both sides give the thumbs up. It's a simple change.

I added some tests for demonstrating custom TLS success and failure. I also fixed two skipped tests that depended on the secured event.

Some things that still need to be done.

Secure event should only trigger after both sides have secured without error.
use a proper error code for failing TLS verification. Either pretend to be a normal TLS failure or use custom codes and error mapping. I think we just use the normal TLS fail code here? either 304 or 372?
General review.

One last thing. I've been working off of the staging branch here but the changes need to be applied to the PR. There are two ways I can go about this.

Apply the change to the staging branch so we have a working commit with the new feature. This could be pre-published and used. I'll have to re-base the PR on the new feature.
Transplant the changes into the PR directly.

I'm leaning towards option 1 here. It keeps better history but is a little more annoying to PR. Not too bad though.

tegefaulkes · 2023-06-19T08:58:17Z

When this is done I can go two ways.

Proceed with custom TLS on Polykey.
Take over the js-quic refactoring PR with the monitor and life cycle changes.

Extra note: I need to update js-async-init version for locks dependency, same for polykey. async-init should manually publish.

tegefaulkes · 2023-06-20T03:09:27Z

There may be a problem with OKP certs. If you fail the custom TLS verification callback on the client side, the server side doesn't respect the close frame with the error and will time out instead. I'm not sure why this happens, but it may be another issue with quiche.

I'll dig into it more after transplanting the changes into the main PR.

CMCDragonkai · 2023-06-20T03:44:28Z

Can you check that the normal quiche based TLS failure works and examine the received failure code on the close frame and ensure that you are doing the exact code and whatever other information when you force a close frame from JS-based TLS failure. Based on my native tests the OKP works fine when quiche fails TLS normally.

CMCDragonkai · 2023-06-20T03:45:22Z

Can you check that the normal quiche based TLS failure works and examine the received failure code on the close frame and ensure that you are doing the exact code and whatever other information when you force a close frame from JS-based TLS failure. Based on my native tests the OKP works fine when quiche fails TLS normally.

The only difference would be what the close frame/packet looks like to the other side. And I'm sure that if this is made exactly equal, we the same behaviour is expected since programs are deterministic.

tegefaulkes · 2023-06-20T04:09:32Z

Unless we manually craft the packet with the error we can't get the same result. A normal TLS failure results in a handshake frame containing the failure. Us closing the connection with the same error code results in a CC frame to close the connection.

CMCDragonkai · 2023-06-20T05:26:50Z

That's strange, I thought during TLS failure we still get a CC frame? Did you check this on Wireshark?

If truly there's no CC frame, then you want to force close if that what is happening. Check the native tests that I've done, there is certain conditions where it does wait to timeout.

Alternatively we can craft the packet if we have all the information to send into the UDP socket.

CMCDragonkai · 2023-06-20T05:27:32Z

It's also because when the other side checks the actual error object, it gets the TLS code and other information. Where does information come from if not from a close frame?

tegefaulkes · 2023-06-20T07:44:52Z

I need to test some expectation using the native connection tests. Based on the discussion, it seems that the handshake needs to complete before we can close the connection.

What I was doing was closing the connection immediately after receiving the peer certs and verifying them. What I should do is wait for the handshake to complete before verifying and closing the connection.

This means receiving and sending the initial frames.

I'll need to expand the native tests to check this expectation for each type of cert.

tegefaulkes · 2023-06-21T03:00:11Z

Curious, I wasn't aware of this but it seems that connection.timeout() is actually aware of time passing. Calling it multiple times will return an updated time. Calling connection.onTimeout() will not cause a state transition unless the time has actually passed.

  console.log
    
    ----client before timeout----
    established: true,
    draining: true,
    closed: false,
    resumed: false,
    earlyData: false,
    peerCerts: Avaliable,
    timeout: 404,

      at Object.log (tests/native/quiche.tls.test.ts:2732:15)

  console.log
    
    ----client after sleeping 50ms----
    established: true,
    draining: true,
    closed: false,
    resumed: false,
    earlyData: false,
    peerCerts: Avaliable,
    timeout: 353,

      at Object.log (tests/native/quiche.tls.test.ts:2735:15)

  console.log
    
    ----client after sleeping timeout----
    established: true,
    draining: true,
    closed: true,
    resumed: false,
    earlyData: false,
    peerCerts: Avaliable,
    timeout: null,

      at Object.log (tests/native/quiche.tls.test.ts:2738:15)

tegefaulkes · 2023-06-21T04:40:06Z

I've updated the native TLS tests in feature-key-gen branch. The earliest point we can close the connection from the application using connection.close is when the following conditions have been met.

Connection is established.
Connection has sent after establishing.

As an example, for a RSA connection the events for client <-> server will be

initial <-
initial ->
handshake <-
client established
handshake ->
server established
client or server can close connection

For a Ed25519

initial <-
client established
initial ->
server established
client or server can close connection

CMCDragonkai · 2023-06-21T05:29:30Z

I think that is why it has to be called multiple times. And also even when 0, you still must call it. 21 June 2023 13:00:24 Brian Botha ***@***.***>:

…

Curious, I wasn't aware of this but it seems that *connection.timeout()* is actually aware of time passing. Calling it multiple times will return an updated time. Calling *connection.onTimeout()* will not cause a state transition unless the time has actually passed. * console.log ----client before timeout---- established: true, draining: true, closed: false, resumed: false, earlyData: false, peerCerts: Avaliable, timeout: 404, at Object.log (tests/native/quiche.tls.test.ts:2732:15) console.log ----client after sleeping 50ms---- established: true, draining: true, closed: false, resumed: false, earlyData: false, peerCerts: Avaliable, timeout: 353, at Object.log (tests/native/quiche.tls.test.ts:2735:15) console.log ----client after sleeping timeout---- established: true, draining: true, closed: true, resumed: false, earlyData: false, peerCerts: Avaliable, timeout: null, at Object.log (tests/native/quiche.tls.test.ts:2738:15) * — Reply to this email directly, view it on GitHub[#30 (comment)], or unsubscribe[https://github.com/notifications/unsubscribe-auth/AAE4OHMRZOECMB2SYIL7SYLXMJPUPANCNFSM6AAAAAAZACLQJU]. You are receiving this because you were mentioned.[Tracking image][https://github.com/notifications/beacon/AAE4OHOCMBB2JVAWWXHLJ5TXMJPUPA5CNFSM6AAAAAAZACLQJWWGG33NNVSW45C7OR4XAZNMJFZXG5LFINXW23LFNZ2KUY3PNVWWK3TUL5UWJTS7LYJNM.gif]

CMCDragonkai · 2023-06-21T05:30:18Z

In particular you have to call it wait 0, and call `onTimeout` possibly multiple times. 21 June 2023 15:29:22 Roger Qiu ***@***.***>:

…

I think that is why it has to be called multiple times. And also even when 0, you still must call it. 21 June 2023 13:00:24 Brian Botha ***@***.***>: > > Curious, I wasn't aware of this but it seems that *connection.timeout()* is actually aware of time passing. Calling it multiple times will return an updated time. Calling *connection.onTimeout()* will not cause a state transition unless the time has actually passed. > > * console.log > > ----client before timeout---- > established: true, > draining: true, > closed: false, > resumed: false, > earlyData: false, > peerCerts: Avaliable, > timeout: 404, > > at Object.log (tests/native/quiche.tls.test.ts:2732:15) > > console.log > > ----client after sleeping 50ms---- > established: true, > draining: true, > closed: false, > resumed: false, > earlyData: false, > peerCerts: Avaliable, > timeout: 353, > > at Object.log (tests/native/quiche.tls.test.ts:2735:15) > > console.log > > ----client after sleeping timeout---- > established: true, > draining: true, > closed: true, > resumed: false, > earlyData: false, > peerCerts: Avaliable, > timeout: null, > > at Object.log (tests/native/quiche.tls.test.ts:2738:15) > > > * > — > Reply to this email directly, view it on GitHub[#30 (comment)], or unsubscribe[https://github.com/notifications/unsubscribe-auth/AAE4OHMRZOECMB2SYIL7SYLXMJPUPANCNFSM6AAAAAAZACLQJU]. > You are receiving this because you were mentioned.[Tracking image][https://github.com/notifications/beacon/AAE4OHOCMBB2JVAWWXHLJ5TXMJPUPA5CNFSM6AAAAAAZACLQJWWGG33NNVSW45C7OR4XAZNMJFZXG5LFINXW23LFNZ2KUY3PNVWWK3TUL5UWJTS7LYJNM.gif] >

tegefaulkes · 2023-06-21T08:38:38Z

Ok, Looks like everything is working now. The main changes were.

custom verification is run after the first outgoing packet where established is true and certs are available. This ensures all the required handshake negotiation has completed.
The securedP event is resolved after the first short is sent and received, connection is not draining and at least 1 packet has been received after the previous conditions. The first received packet after the shorts should be a CC or ping frame.
A ping frame is sent after verification succeeds to ensure a frame is always sent after a short frame. There are some cases with certain certs where that was not the case.

I'll need to review and clean the changes but I think this is addressed now. Changes are still in the custom_verification branch just to have a commit for reference when transplanting the changes to the PR.

tegefaulkes · 2023-06-22T01:28:10Z

Just looked at the spec, I really need to update it with details of the new changes.

CMCDragonkai · 2023-07-06T06:44:39Z

Closed by #26.

tegefaulkes added the development Standard development label Jun 9, 2023

tegefaulkes self-assigned this Jun 9, 2023

tegefaulkes mentioned this issue Jun 27, 2023

Create QUIC library that can be exposed to JS and uses the Node dgram module #1

Closed

7 tasks

CMCDragonkai closed this as completed Jul 6, 2023

CMCDragonkai added the r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices label Jul 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Custom TLS verification logic #30

Custom TLS verification logic #30

tegefaulkes commented Jun 9, 2023 •

edited

Loading

tegefaulkes commented Jun 9, 2023 •

edited

Loading

CMCDragonkai commented Jun 9, 2023

tegefaulkes commented Jun 14, 2023

tegefaulkes commented Jun 14, 2023

CMCDragonkai commented Jun 14, 2023

tegefaulkes commented Jun 14, 2023

tegefaulkes commented Jun 15, 2023

tegefaulkes commented Jun 16, 2023

CMCDragonkai commented Jun 16, 2023

tegefaulkes commented Jun 16, 2023 •

edited

Loading

tegefaulkes commented Jun 16, 2023 •

edited

Loading

tegefaulkes commented Jun 16, 2023 •

edited

Loading

tegefaulkes commented Jun 16, 2023

tegefaulkes commented Jun 16, 2023

CMCDragonkai commented Jun 16, 2023

CMCDragonkai commented Jun 16, 2023

CMCDragonkai commented Jun 16, 2023

tegefaulkes commented Jun 16, 2023

tegefaulkes commented Jun 19, 2023

tegefaulkes commented Jun 19, 2023 •

edited

Loading

tegefaulkes commented Jun 20, 2023

CMCDragonkai commented Jun 20, 2023

CMCDragonkai commented Jun 20, 2023

tegefaulkes commented Jun 20, 2023

CMCDragonkai commented Jun 20, 2023

CMCDragonkai commented Jun 20, 2023

tegefaulkes commented Jun 20, 2023

tegefaulkes commented Jun 21, 2023

tegefaulkes commented Jun 21, 2023

CMCDragonkai commented Jun 21, 2023 via email

CMCDragonkai commented Jun 21, 2023 via email

tegefaulkes commented Jun 21, 2023

tegefaulkes commented Jun 22, 2023

CMCDragonkai commented Jul 6, 2023 •

edited

Loading

Custom TLS verification logic #30

Custom TLS verification logic #30

Comments

tegefaulkes commented Jun 9, 2023 • edited Loading

Specification

Additional context

Tasks

tegefaulkes commented Jun 9, 2023 • edited Loading

CMCDragonkai commented Jun 9, 2023

tegefaulkes commented Jun 14, 2023

Method 1.

Method 2.

Method 3.

Method 4.

tegefaulkes commented Jun 14, 2023

CMCDragonkai commented Jun 14, 2023

tegefaulkes commented Jun 14, 2023

tegefaulkes commented Jun 15, 2023

tegefaulkes commented Jun 16, 2023

CMCDragonkai commented Jun 16, 2023

tegefaulkes commented Jun 16, 2023 • edited Loading

tegefaulkes commented Jun 16, 2023 • edited Loading

tegefaulkes commented Jun 16, 2023 • edited Loading

tegefaulkes commented Jun 16, 2023

tegefaulkes commented Jun 16, 2023

CMCDragonkai commented Jun 16, 2023

CMCDragonkai commented Jun 16, 2023

CMCDragonkai commented Jun 16, 2023

tegefaulkes commented Jun 16, 2023

tegefaulkes commented Jun 19, 2023

tegefaulkes commented Jun 19, 2023 • edited Loading

tegefaulkes commented Jun 20, 2023

CMCDragonkai commented Jun 20, 2023

CMCDragonkai commented Jun 20, 2023

tegefaulkes commented Jun 20, 2023

CMCDragonkai commented Jun 20, 2023

CMCDragonkai commented Jun 20, 2023

tegefaulkes commented Jun 20, 2023

tegefaulkes commented Jun 21, 2023

tegefaulkes commented Jun 21, 2023

CMCDragonkai commented Jun 21, 2023 via email

CMCDragonkai commented Jun 21, 2023 via email

tegefaulkes commented Jun 21, 2023

tegefaulkes commented Jun 22, 2023

CMCDragonkai commented Jul 6, 2023 • edited Loading

tegefaulkes commented Jun 9, 2023 •

edited

Loading

tegefaulkes commented Jun 9, 2023 •

edited

Loading

tegefaulkes commented Jun 16, 2023 •

edited

Loading

tegefaulkes commented Jun 16, 2023 •

edited

Loading

tegefaulkes commented Jun 16, 2023 •

edited

Loading

tegefaulkes commented Jun 19, 2023 •

edited

Loading

CMCDragonkai commented Jul 6, 2023 •

edited

Loading