-
Notifications
You must be signed in to change notification settings - Fork 131
Cubic #1074
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cubic #1074
Conversation
neqo-transport/src/cc/classic_cc.rs
Outdated
let bytes = | ||
self.cc_algorithm | ||
.on_packets_acked(self.congestion_window, new_acked, min_rtt, now); | ||
// If enough creadit has been accumulated already, apply them gently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how the linux kernel is handling this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// If enough creadit has been accumulated already, apply them gently. | |
// If enough credit has been accumulated already, apply them gradually. |
while self.tcp_acked_bytes > tcp_cnt { | ||
self.tcp_acked_bytes -= tcp_cnt; | ||
self.estimated_tcp_cwnd += MAX_DATAGRAM_SIZE_F64; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have decided to count bytes instead of a proportional increase.
(side effect: it was easier to calculate rounds in tcp phase in tests.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but this really is the dark arts.
neqo-transport/src/cc/cubic.rs
Outdated
|
||
// Limit increas to max 1 MSS per 2 ack packets. | ||
cnt = cnt.max(2.0 * MAX_DATAGRAM_SIZE_F64); | ||
cnt as usize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This 'as' can be fix when FloatToInt is available in rust stable. Other way to avoid this is to count bytes in f64 as well(in classic_cc).
neqo-transport/src/cc/tests/cubic.rs
Outdated
// from this n = sqrt(CUBIC_ALPHA^3/ (CUBIC_C * RTT^3)). | ||
let num_tcp_increases = (CUBIC_ALPHA.powi(3) / (CUBIC_C * RTT.as_secs_f64().powi(3))) | ||
.sqrt() | ||
.round() as u64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to leave it as 'as u64'. This can be avoid by using while loop, but I think it is not worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a constant value, right? And is this .round() or .floor() ?
0b1ff62
to
4e9211b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plumbing here looks pretty good, but I have some questions about the tests and the main Cubic function.
I haven't reviewed the cubic-specific tests in detail. I want to understand what is going on more first.
@@ -76,8 +76,18 @@ impl State { | |||
} | |||
|
|||
pub trait WindowAdjustment: Display + Debug { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add commentary explaining what each of these functions are expected to do. Particularly what the return values are expected to produce.
neqo-transport/src/cc/classic_cc.rs
Outdated
let bytes = | ||
self.cc_algorithm | ||
.on_packets_acked(self.congestion_window, new_acked, min_rtt, now); | ||
// If enough creadit has been accumulated already, apply them gently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// If enough creadit has been accumulated already, apply them gently. | |
// If enough credit has been accumulated already, apply them gradually. |
neqo-transport/src/cc/classic_cc.rs
Outdated
if self.acked_bytes >= bytes { | ||
self.acked_bytes = 0; | ||
self.congestion_window += MAX_DATAGRAM_SIZE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not knowing what "bytes" is here, this is not something I can understand from context.
neqo-transport/src/cc/classic_cc.rs
Outdated
if cc.cwnd() == CWND_INITIAL / 2 | ||
|| cc.cwnd() == CWND_INITIAL * CUBIC_BETA_USIZE_QUOTIENT / CUBIC_BETA_USIZE_DIVISOR | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have the algorithm, so calculate the expected value:
if cc.cwnd() == CWND_INITIAL / 2 | |
|| cc.cwnd() == CWND_INITIAL * CUBIC_BETA_USIZE_QUOTIENT / CUBIC_BETA_USIZE_DIVISOR | |
{ | |
let expected_cwnd = match cc_alg { | |
CongestionControlAlgorithm::NewReno => CWND_INITIAL / 2, | |
CongestionControlAlgorithm::Cubic => CWND_INITIAL * CUBIC_BETA_USIZE_QUOTIENT / CUBIC_BETA_USIZE_DIVISOR, | |
}; | |
if cc.cwnd() == expected_cwnd { |
neqo-transport/src/cc/classic_cc.rs
Outdated
assert!(persistent_congestion( | ||
CongestionControlAlgorithm::Cubic, | ||
&[lost(1, true, ZERO), lost(2, false, RTT), lost(3, true, PC),] | ||
)); | ||
assert!(persistent_congestion( | ||
CongestionControlAlgorithm::Cubic, | ||
&[lost(1, true, ZERO), lost(2, true, RTT), lost(3, true, PC),] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like you are just repeating the inputs with different CC. Can you move that to a loop inside persistent_congestion()
instead?
while self.tcp_acked_bytes > tcp_cnt { | ||
self.tcp_acked_bytes -= tcp_cnt; | ||
self.estimated_tcp_cwnd += MAX_DATAGRAM_SIZE_F64; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but this really is the dark arts.
neqo-transport/src/cc/cubic.rs
Outdated
let mut cnt = if target > curr_cwnd_f64 { | ||
curr_cwnd_f64 / (target - curr_cwnd_f64) * MAX_DATAGRAM_SIZE_F64 | ||
} else { | ||
100.0 * curr_cwnd_f64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this constant come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from the linux implementation. It is a corner case, I do not know how to trigger it. This should cause a very small increase in cwnd.
I will explain cnt than this will be more understandable:
cnt in NewReno is always cwnd, i.e. a congestion controller need to receive cwnd of acked bytes to increase cwnd by 1 MSS. For cubic it is calculated depending on W_cubic and tcp estimate, but is still "how many acked bytes it needs to receive to increase cnwd by 1 MSS.
This is leaned on the linux implementation. I may change it to be more similar to the RFC, I think that will make increase slightly slower(tiny, tiny slower).
neqo-transport/src/cc/cubic.rs
Outdated
} | ||
} | ||
|
||
// Limit increas to max 1 MSS per 2 ack packets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Limit increas to max 1 MSS per 2 ack packets. | |
// Limit increase to max 1 MSS per 2 ack packets. |
neqo-transport/src/cc/cubic.rs
Outdated
let mut cnt = if target > curr_cwnd_f64 { | ||
curr_cwnd_f64 / (target - curr_cwnd_f64) * MAX_DATAGRAM_SIZE_F64 | ||
} else { | ||
100.0 * curr_cwnd_f64 | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really following the point of cnt
here. As we get larger windows that amount is going to be small. So the resulting value is going to be quite large on the upper branch. How does 100 relate to the MTU?
If this is the increase in the window, this seems to be backwards. RFC 8312 says that you increase the window by (W_cubic(t+RTT) - cwnd)/cwnd
, but this equation is the inverse of that: cwnd/(W_cubic(t+RTT) - cwnd)
.
See https://ntap.github.io/rfc8312bis/draft-eggert-tcpm-rfc8312bis.html#name-convex-region
BTW, if cnt
is the increase to the congestion window, then call it cwnd_increase
or something like that. As the output of this function is not documented, a more descriptive variable would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is generally looking solid. I have a bunch of comments, but mostly this will be about cleanup and making the code easier to understand. Cubic is pretty simple, but it has a history of terrible code and documentation. I'd rather not have that continue in our code.
neqo-transport/src/cc/classic_cc.rs
Outdated
} | ||
self.acked_bytes += new_acked; | ||
if self.acked_bytes >= byte_cnt { | ||
let d = self.acked_bytes / byte_cnt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably needs a comment to explain that you are deliberately counting the number of whole multiples of byte_cnt.
let byte_cnt = | ||
self.cc_algorithm | ||
.on_packets_acked(self.congestion_window, new_acked, min_rtt, now); | ||
// If enough credit has been accumulated already, apply them gradually. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not skip this tweak and just cap the value of 'd' then throw out any significant excess? For instance, two increases per ACK should be plenty:
self.acked_bytes += new_acked;
if self.acked_bytes > bytes_for_increase {
self.acked_bytes -= bytes_for_increase;
self.congestion_window += MAX_DATAGRAM_SIZE; // or is this the current MTU?
}
// The number of bytes we require can go down over time with Cubic.
// That might result in an excessive rate of increase, so limit the number of unused
// acknowledged bytes after increasing the congestion window twice.
self.acked_bytes = min(bytes_for_increase, self.acked_bytes);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer leaving at is this. I find it more understandable: If a rate suddenly changes the old accumulative bytes should increase rate at most by 1MSS. Newly accumulated byte are respected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with this code is that it is unclear that it limits the increase. It looks like it increases the rate by far more than 1MSS. If there is left over bytes, you get 1 increase. And then you have d*MSS. You have to look at the congestion controller to understand that this is never going to be anything other than 0 or 1.
(You don't need two iterations of the acked_bytes > bytes_for_increase
in the above. One is enough. I've edited for that.)
So I find the above far clearer about the important property: that we only increase by at most 1MSS.
neqo-transport/src/cc/classic_cc.rs
Outdated
for cc_alg in &[ | ||
CongestionControlAlgorithm::NewReno, | ||
CongestionControlAlgorithm::Cubic, | ||
] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other way to do this is:
persistent_congestion_by_algorithm(CongestionControlAlgorithm::NewReno, lost_packets, persistent_expected, CWND_INITIAL / 2);
persistent_congestion_by_algorithm(CongestionControlAlgorithm::Cubic, lost_packets, persistent_expected, CWND_INITIAL * CUBIC_BETA_USIZE_QUOTIENT / CUBIC_BETA_USIZE_DIVISOR);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave as it is. I implemented something similar before, but some of the test functions that use the function got a bit hard to read because the function is called with a different cc algorithm and multiple different configurations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we aren't talking past each other, so I will propose a change.
neqo-transport/src/cc/cubic.rs
Outdated
} | ||
|
||
// Limit increase to max 1 MSS per 2 ack packets. | ||
cnt = cnt.max(2.0 * MAX_DATAGRAM_SIZE_F64); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this last time around, but it might be worth pointing out that this limit is a direct result of the limit of 1.5cwnd in the target calculation. You can draw a straight line from that to this, but it's non-obvious. Also, I'm not sure that 2 is necessarily the only value we need to consider here, so it needs a constant.
I say that because there are cases where an early loss causes the connection to go into congestion avoidance early. In those cases, I don't see why you wouldn't want to return to slow start proper once you have confirmed that the loss is truly not the result of hitting path throughput limits. That argues for a constant value of 1 here.
BTW, this changes the function into an exponential function. The increase goes from linear (the TCP region), to cubic (the W_max calculation), to exponential. It's not really accurate to call it cubic, other than to note that a cubic function is used to smooth the transition from linear to exponential.
- cwnd does not need to be recalculated when the controller is app-limited, - For Cubic, ca_epoch_start needs to be reset, otherwise it will continue to grow.
0762285
to
04dd00e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do this. I have some suggestions, but nothing that can block landing.
Land once you are happy with it.
let byte_cnt = | ||
self.cc_algorithm | ||
.on_packets_acked(self.congestion_window, new_acked, min_rtt, now); | ||
// If enough credit has been accumulated already, apply them gradually. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with this code is that it is unclear that it limits the increase. It looks like it increases the rate by far more than 1MSS. If there is left over bytes, you get 1 increase. And then you have d*MSS. You have to look at the congestion controller to understand that this is never going to be anything other than 0 or 1.
(You don't need two iterations of the acked_bytes > bytes_for_increase
in the above. One is enough. I've edited for that.)
So I find the above far clearer about the important property: that we only increase by at most 1MSS.
neqo-transport/src/cc/classic_cc.rs
Outdated
for cc_alg in &[ | ||
CongestionControlAlgorithm::NewReno, | ||
CongestionControlAlgorithm::Cubic, | ||
] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we aren't talking past each other, so I will propose a change.
neqo-transport/src/cc/classic_cc.rs
Outdated
fn persistent_congestion(lost_packets: &[SentPacket], persistent_expected: bool) { | ||
for cc_alg in &[ | ||
CongestionControlAlgorithm::NewReno, | ||
CongestionControlAlgorithm::Cubic, | ||
] { | ||
let mut cc = congestion_control(*cc_alg); | ||
for p in lost_packets { | ||
cc.on_packet_sent(p); | ||
} | ||
|
||
cc.on_packets_lost(Some(now()), None, PTO, lost_packets); | ||
|
||
let expected_cwnd_no_persistent_congestion = match cc_alg { | ||
CongestionControlAlgorithm::NewReno => CWND_INITIAL / 2, | ||
CongestionControlAlgorithm::Cubic => { | ||
CWND_INITIAL * CUBIC_BETA_USIZE_QUOTIENT / CUBIC_BETA_USIZE_DIVISOR | ||
} | ||
}; | ||
let persistent = if cc.cwnd() == expected_cwnd_no_persistent_congestion { | ||
false | ||
} else if cc.cwnd() == CWND_MIN { | ||
true | ||
} else { | ||
panic!("unexpected cwnd"); | ||
}; | ||
assert_eq!(persistent, persistent_expected); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn persistent_congestion(lost_packets: &[SentPacket], persistent_expected: bool) { | |
for cc_alg in &[ | |
CongestionControlAlgorithm::NewReno, | |
CongestionControlAlgorithm::Cubic, | |
] { | |
let mut cc = congestion_control(*cc_alg); | |
for p in lost_packets { | |
cc.on_packet_sent(p); | |
} | |
cc.on_packets_lost(Some(now()), None, PTO, lost_packets); | |
let expected_cwnd_no_persistent_congestion = match cc_alg { | |
CongestionControlAlgorithm::NewReno => CWND_INITIAL / 2, | |
CongestionControlAlgorithm::Cubic => { | |
CWND_INITIAL * CUBIC_BETA_USIZE_QUOTIENT / CUBIC_BETA_USIZE_DIVISOR | |
} | |
}; | |
let persistent = if cc.cwnd() == expected_cwnd_no_persistent_congestion { | |
false | |
} else if cc.cwnd() == CWND_MIN { | |
true | |
} else { | |
panic!("unexpected cwnd"); | |
}; | |
assert_eq!(persistent, persistent_expected); | |
} | |
} | |
fn persistent_congestion_by_algorithm(cc_alg: CongestionControlAlgorithm, reduced_cwnd: usize, lost_packets: &[SentPacket], persistent_expected: bool) { | |
let mut cc = congestion_control(*cc_alg); | |
for p in lost_packets { | |
cc.on_packet_sent(p); | |
} | |
cc.on_packets_lost(Some(now()), None, PTO, lost_packets); | |
let persistent = if cc.cwnd() == reduced_cwnd { | |
false | |
} else if cc.cwnd() == CWND_MIN { | |
true | |
} else { | |
panic!("unexpected cwnd"); | |
}; | |
assert_eq!(persistent, persistent_expected); | |
} | |
fn persistent_congestion(lost_packets: &[SentPacket], persistent_expected: bool) { | |
persistent_congestion_by_algorithm(CongestionControlAlgorithm::NewReno, CWND_INITIAL / 2, lost_packets, persistent_expected); | |
persistent_congestion_by_algorithm(CongestionControlAlgorithm::Cubic, CWND_INITIAL * CUBIC_BETA_USIZE_QUOTIENT / CUBIC_BETA_USIZE_DIVISOR, lost_packets, persistent_expected); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were talking past each other. I will take this suggestion.
now: Instant, | ||
) -> usize; | ||
/// This function is called when a congestion event has beed detected and it | ||
/// returns new (decreased) values of `curr_cwnd` and `acked_bytes`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// returns new (decreased) values of `curr_cwnd` and `acked_bytes`. | |
/// returns new (decreased) values of `curr_cwnd` and `acked_bytes`. | |
/// This value can be very small; the calling code is responsible for ensuring that the | |
/// congestion window doesn't drop below the minimum of `CWND_MIN`. |
neqo-transport/src/cc/cubic.rs
Outdated
/// occurs before reaching the previous `W_max`. | ||
pub const CUBIC_FAST_CONVERGENCE: f64 = 0.85; // (1.0 + CUBIC_BETA) / 2.0; | ||
|
||
const MIN_INCREASE: f64 = 2.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const MIN_INCREASE: f64 = 2.0; | |
/// The minimum number of multiples of the datagram size that need | |
/// to be received to cause an increase in the congestion window. | |
/// When there is no loss, Cubic can return to exponential increase, but | |
/// this value reduces the magnitude of the resulting growth by a constant factor. | |
/// A value of 1.0 would mean a return to the rate used in slow start. | |
const EXPONENTIAL_GROWTH_REDUCTION: f64 = 2.0; |
neqo-transport/src/cc/cubic.rs
Outdated
// Limit increase to max 1 MSS per MIN_INCREASE ack packets. | ||
// This is basicaly limiting target to (1 + 1/MIN_INCREASE)cwnd. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Limit increase to max 1 MSS per MIN_INCREASE ack packets. | |
// This is basicaly limiting target to (1 + 1/MIN_INCREASE)cwnd. | |
// Limit increase to max 1 MSS per EXPONENTIAL_GROWTH_REDUCTION ack packets. | |
// This effectively limits target_cwnd to `(1 + 1/EXPONENTIAL_GROWTH_REDUCTION)*cwnd`. |
neqo-transport/src/cc/cubic.rs
Outdated
let curr_cwnd_f64 = convert_to_f64(curr_cwnd); | ||
// Fast Convergence | ||
// If congestion event occurs before the maximum congestion window before the last congestion event, | ||
// we reduce the the maximum congestion window and therby w_max. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// we reduce the the maximum congestion window and therby w_max. | |
// we reduce the the maximum congestion window and thereby W_max. |
or w_max()
.
neqo-transport/src/cc/tests/cubic.rs
Outdated
assert_eq!(cubic.cwnd(), CWND_INITIAL_10); | ||
} | ||
|
||
fn assert_within<T: Sub<Output = T> + PartialOrd>(value: T, expected: T, margin: &T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can make T: Sub<...> + PartialOrd + Copy
and avoid taking a reference for margin
.
neqo-transport/src/cc/tests/cubic.rs
Outdated
|
||
let cwnd_rtt_start = cubic.cwnd(); | ||
// cwnd_rtt_start has change, therefore calculate new time_increase (the time | ||
// between acks if they are ideally paced over a RTT.). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// between acks if they are ideally paced over a RTT.). | |
// between acks if they are ideally paced over a RTT). |
} | ||
|
||
#[test] | ||
fn tcp_phase() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that I can live with this. It's not ideal, but it's also not that long relative to some of the other tests.
No description provided.