Skip to content

Introduce OffersMessageFlow #3412

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

shaavan
Copy link
Member

@shaavan shaavan commented Nov 18, 2024

This PR introduces a new data structure to separate the functions and fields related to Offers Message from ChannelManager.

The change ensures a clear separation of responsibilities for BOLT12 messages, improving modularity and maintainability.

TODO:

  • Refine commit messages
  • Add detailed documentation

@shaavan
Copy link
Member Author

shaavan commented Nov 19, 2024

Updated from pr3412.01 to pr3412.02 (diff):

Changes:

  1. Fix ci
  2. Update commit messages

@jkczyz jkczyz self-requested a review November 19, 2024 15:36
Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I reviewed this offline with @shaavan. So there will be some changes to the OffersMessageFlow trait parameterization to keep ChannelManager specifics from leaking.

return Err(Bolt12SemanticError::UnsupportedChain);
}

let _persistence_guard = PersistenceNotifierGuard::notify_on_drop(self);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheBlueMatt Do you know why we need the PersistenceNotifierGuard here? This code sends an invoice for a refund (i.e., an inbound payment). IIUC, we don't persist pending_offers_messages, so it seems we don't need it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like we need one to me either. There's no state changed at all, so serialization cannot possibly be required.

@shaavan
Copy link
Member Author

shaavan commented Nov 29, 2024

Updated from pr3412.02 to pr3412.03 (diff):

Addressed @jkczyz comments:

Changes:

  1. Update & Expand Documentation.
  2. Move dnssec code in flow.rs
  3. Simplified OffersMessageFlow trait parametrisation.
  4. Move OffersMessageFlow only function from offersmessagecommons to flow.rs

@shaavan
Copy link
Member Author

shaavan commented Nov 30, 2024

Updated from pr3412.03 to pr3412.04 (diff):

Changes:

  1. Rebase on main.
  2. Fix CI.

@shaavan
Copy link
Member Author

shaavan commented Dec 6, 2024

Updated from pr3412.04 to pr3412.05 (diff):

Addressed @jkczyz's (offline) comments.

Changes:

  1. Moved the OffersMessageCommons trait definition to flow.rs, as it is a more logical location for its definition.
  2. Temporarily removed async Bolt12 message support, with the reasoning documented in the corresponding commit message.
  3. Introduced network_pubkey and extended_key fields in OffersMessageFlow to streamline the overall refactoring by reducing code movement.

@shaavan shaavan force-pushed the offersmessageflow branch 2 times, most recently from 4911f9f to 9d0c511 Compare December 6, 2024 16:00
@shaavan
Copy link
Member Author

shaavan commented Dec 6, 2024

Updated from pr3412.05 to pr3412.06 (diff):

Changes:

  1. Rebase on main to resolve merge conflicts
  2. Fix CI

Copy link

codecov bot commented Dec 6, 2024

Codecov Report

Attention: Patch coverage is 83.94161% with 132 lines in your changes missing coverage. Please review.

Project coverage is 88.63%. Comparing base (ec19ba1) to head (928e741).

Files with missing lines Patch % Lines
lightning/src/offers/flow.rs 80.31% 110 Missing and 15 partials ⚠️
lightning/src/ln/channelmanager.rs 95.38% 3 Missing ⚠️
lightning/src/ln/offers_tests.rs 96.73% 3 Missing ⚠️
lightning/src/onion_message/offers.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3412      +/-   ##
==========================================
- Coverage   88.65%   88.63%   -0.02%     
==========================================
  Files         149      150       +1     
  Lines      116689   116764      +75     
  Branches   116689   116764      +75     
==========================================
+ Hits       103447   103491      +44     
- Misses      10734    10765      +31     
  Partials     2508     2508              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@shaavan
Copy link
Member Author

shaavan commented Dec 11, 2024

Updated from pr3412.06 to pr3412.07 (diff):
Addressed @TheBlueMatt comment

Changes:

  1. Corrected code for cfg[(async_payments)]
  2. Remove stale todo. reference
  3. Fix CI

@shaavan
Copy link
Member Author

shaavan commented Dec 12, 2024

Updated from pr3412.07 to pr3412.08 (diff):

Changes:

  1. Rebase on main, to resolve merge conflicts.

@shaavan
Copy link
Member Author

shaavan commented Dec 12, 2024

Updated from pr3412.08 to pr3412.09 (diff):

Changes:

  1. Introduce create_blinded_paths in flow.rs
  2. This allows removing additional function from OffersMessageCommons trait, and hence simplifying it.

@shaavan
Copy link
Member Author

shaavan commented Dec 16, 2024

Updated from pr3412.09 to pr3412.10 (diff):

Changes:

  1. Rebase on main to resolve merge conflicts.

@shaavan
Copy link
Member Author

shaavan commented Dec 17, 2024

Updated from pr3412.10 to pr3412.11 (diff):

Changes:

  1. Remove redundant send_payment_for_bolt12_invoice function.
  2. Revert mistakenly added accept_mpp_keysend

Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned offline yesterday, but we should try to arrange the commits such that changes introduced to the trait in on commit aren't later reverted in another. It might not be entirely possible but we should at least minimize this.

Comment on lines 54 to 106
fn sign_bolt12_invoice(
&self, invoice: &UnsignedBolt12Invoice,
) -> Result<schnorr::Signature, ()>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should OffersMessageFlow take a NodeSigner instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial reasoning was that since OffersMessageFlow only uses one function from NodeSigner (sign_bolt12_invoice), I opted to call it from the Commons trait. With this approach, I aimed to keep OffersMessageFlow construction easier, by keeping limited parameterisation.

But do let me know, what you think of this reasoning. And let me know if we should still experiment with NodeSigner parameterisation. I would love to give it a shot!

Comment on lines 9969 to 9972
let first_hops = self.list_usable_channels();
let payee_node_id = self.get_our_node_id();
let max_cltv_expiry = self.best_block.read().unwrap().height + CLTV_FAR_FAR_AWAY
+ LATENCY_GRACE_PERIOD_BLOCKS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a method on the trait for this instead of create_blinded_payment_paths?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to understand your point better. Are you suggesting we could parameterize Flow with Router, or is there a different approach you’re pointing to?

Comment on lines 10029 to 9612
fn create_blinded_paths(&self, context: MessageContext) -> Result<Vec<BlindedMessagePath>, ()> {
let recipient = self.get_our_node_id();
let secp_ctx = &self.secp_ctx;

let peers = self.per_peer_state.read().unwrap()
.iter()
.map(|(node_id, peer_state)| (node_id, peer_state.lock().unwrap()))
.filter(|(_, peer)| peer.is_connected)
.filter(|(_, peer)| peer.latest_features.supports_onion_messages())
.map(|(node_id, _)| *node_id)
.collect::<Vec<_>>();

self.message_router
.create_blinded_paths(recipient, context, peers, secp_ctx)
.and_then(|paths| (!paths.is_empty()).then(|| paths).ok_or(()))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here. That would let us move the MessageRouter parameter from ChannelManager to OffersMessageFlow. Are there any other uses of MessageRouter in ChannelManager that would prevent this? If so, we should consider modifying the new trait to avoid that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With pr3412.12, I updated the trait to call ChannelManager’s implementation of create_blinded_paths instead of moving the function.

That said, I noticed create_blinded_paths is still required in ChannelManager for initiate_async_payment, so removing MessageRouter parameterization from ChannelManager might not be feasible.

@shaavan
Copy link
Member Author

shaavan commented Feb 3, 2025

Updated from pr3412.12 to pr3412.13 (diff):

Changes:

  1. Rebase on main, to resolve merge conflicts.
  2. Moved two new added functions create_async_receive_offer_builder and create_static_invoice_builder to flow.rs

@shaavan
Copy link
Member Author

shaavan commented Feb 6, 2025

Updated from pr3412.13 to pr3412.14 (diff):
Addressed @jkczyz's (offline) comments:

  • Moved AsyncPaymentMessageHandler to flow.rs.
  • Freed ChannelManager from MessageRouter usage.

Note:
I haven’t removed MessageRouter as a parameter yet, as that would be a bigger change. I’d love to discuss the change before making that update!

@jkczyz
Copy link
Contributor

jkczyz commented Feb 11, 2025

Let's add flow.rs to the rustfmt_excluded_files such that it's easier to review each commit. Then we can remove and run rustfmt for the last commit.

@jkczyz
Copy link
Contributor

jkczyz commented Feb 11, 2025

Looks like there are some merge conflicts, so let's rebase, too.

@shaavan
Copy link
Member Author

shaavan commented Feb 15, 2025

Updated from pr3412.14 to pr3412.15 (diff):
Addressed @jkczyz's comments:

  1. Restructure commits.
  2. Apply rust formatting at the end of all movement, to make it easier to verify changes.
  3. Restructure Commons function to be more logical in ordering.

This commit temporarily removes support for async BOLT12
message handling to enable a smoother transition in the
upcoming refactor.

The current implementation of async handling is abrupt,
as it requires delving into the synchronous case and
generating an event mid-flow based on the `manual_handling`
flag. This approach adds unnecessary complexity and coupling.

A later commit will introduce a new struct, `OffersMessageFlow`,
designed to handle and create offer messages. This new struct
will support async handling in a more structured way by allowing
users to implement a parameterized trait for asynchronous message
handling.

Removing the existing async support now ensures a cleaner and more
seamless migration of offer-related code from `ChannelManager` to
`OffersMessageFlow`.
…essageHandler`

To decouple offers and onion message-related logic from `ChannelManager`,
this commit introduces the `message_received` function in
`Offer/OnionMessageHandler`. Previously, `message_received` existed in
`ChannelMessageHandler`, but its sole responsibility was handling invoice
request retries.

By moving this function to `Offer/OnionMessageHandler`, we ensure a cleaner
separation of concerns and align message processing with the appropriate
handler. This transition allows invoice request retries to be managed
directly within the offers/onion message context, laying the groundwork
for further enhancements to message handling.
Following commits will move a lot of code from ChannelManager to Flow.rs
So to make it easier to review the changes, we temporarily introduce
flow.rs to the rustfmt exclusion list.
This commit introduces a new struct, `OffersMessageFlow`,
to extract all offers message-related code out of `ChannelManager`.

By moving this logic into a dedicated struct, it creates
a foundation for separating responsibilities and sets up
a base for further code restructuring in subsequent commits.
A new trait, `OffersMessageCommons`, is introduced to encapsulate
functions that are commonly used by both BOLT12-related functionality
and other parts of `ChannelManager`.

This enables a clean transition of BOLT12 code to `OffersMessageFlow`
by moving shared functions into the new trait, ensuring they remain
accessible to both `ChannelManager` and the refactored BOLT12 code.
This commit introduces the `OffersMessageHandler` implementation
for `OffersMessageFlow`, enabling direct access to offer-specific
functionality through `OffersMessageFlow`.

With `OffersMessageFlow` now serving as the source of `OffersMessageHandler`
implementation, the implementation in `ChannelManager` is no longer needed
and has been safely removed.
1. Previously, `initiate_async_payment` handled both the outbound payment
state update (marking a static invoice as received) and the message
queuing for async payments. This tightly coupled two distinct
responsibilities within `ChannelManager`.

2. This refactor **separates concerns** by moving message queuing into
a separate function, allowing `initiate_async_payment` to focus solely
on updating `pending_outbound_payments`. This paves the way for moving
message queuing to `flow.rs` in upcoming commits, ensuring
`MessageRouter` logic remains independent of `ChannelManager`.

3. **This introduces a behavior change:** Since `pending_outbound_payments`
is now handled separately, we drop PersistenceNotifierGuard before queuing
messages. As a result, `NotifyOption` is no longer called for the message
queuing process.

4. **Why is this safe?** Because `pending_async_payments_messages` is
**not under `total_consistency_lock`**, meaning it does not require
persistence notifications. This aligns with the lock order tree and
maintains correct locking discipline.

By making this change, we improve modularity, enforce a cleaner separation
between `ChannelManager` and `MessageRouter`, and remove unnecessary
persistence dependencies.
@shaavan
Copy link
Member Author

shaavan commented Feb 15, 2025

Updated from pr3412.15 to pr3412.16 (diff):
Addressed @jkczyz's comments:

Changes:

  1. Rebase on main to resolve conflicts

/// The [`ExpandedKey`] for use in encrypting and decrypting inbound payment data.
///
/// [`ExpandedKey`]: crate::ln::inbound_payment::ExpandedKey
pub inbound_payment_key: inbound_payment::ExpandedKey,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bleh, can we avoid this?

});
fn create_inbound_payment(&self, min_value_msat: Option<u64>, invoice_expiry_delta_secs: u32,
min_final_cltv_expiry_delta: Option<u16>) -> Result<(PaymentHash, PaymentSecret), ()> {
self.create_inbound_payment(min_value_msat, invoice_expiry_delta_secs, min_final_cltv_expiry_delta)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to just make the user copy the inbound key into the flow to avoid adding this to the trait?

let payment_secret = inbound_payment::create_for_spontaneous_payment(
&self.inbound_payment_key, amount_msat, relative_expiry_secs, created_at.as_secs(), None
).map_err(|()| Bolt12SemanticError::InvalidAmount)?;
fn add_new_awaiting_invoice(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bleh, I'm really not a fan of exposing the low-level payment state management calls. Instead, can we keep the top-level send initiation functions in ChannelManager, which then insert the pending payment and delegate to the flow handler. From there, the flow handler can bubble results up to the ChannelManager via some kind of OfferEvents (basically just "invoice received"). We'd possibly have a bit of duplicate state across ChannelManager and the flow handler (if the flow handler needs to track pending payments, but honestly it could also maybe be stateless, with the ChannelManager deciding to ignore spurious/duplicate invoices).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means, basically, storing the OffersMessageHandler in ChannelManager but wrapping it in a trait so that the user can intercept the events (and modify them if they want). We'd need to continue having a currency conversion lookup trait that OffersMessageFlow calls into but we'd be able to keep the state in ChannelManager entirely, I think. eg something like

enum OfferEvents {
  InvoiceReceivedAndPayable,
  ...?
}
trait Flow {
  fns handle_offers_messages...
  fn get_and_clear_events() -> Vec<OfferEvent>;
  fn initiate_offers_payment(offer: Offer) -> Result<InvoiceRequest>;
}
trait CurrencyLookup {
  async? fn lookup_currency_conversion_acceptable_range(currency: &str) -> (min, max)
}
struct OffersMessageFlow<C: CurrencyLookup> {
  conv: C,
}
impl Flow for OffersMessageFlow<C> {
  ...
}

struct ChannelManager<..., F: Flow> {
  ...
  flow: F,
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shaavan and I walked through the alternative approach this morning. He's going to take a shot at playing around with it. I'm not 100% convinced it is any better and is possibly worse, TBH. It seems we need to pass in closures to the utility functions for creating blinded paths (and what not) and then need to figure out what will own the message queues.

If ChannelManager owns the queues, then we could return the message to enqueue. But that leaves open how to do this for the OffersMessageHandler implementation. If both ChannelManager and OffersMessageFlow implement that trait (with the former delegating to the latter), then I suppose intercepting the returned messages could work -- though I'm not sure if that approach is possible for async payments. But that leaves others users like LNDK needing to implement this message enqueueing functionality on their own and parameterize OnionMessenger accordingly. That seems to go against the purpose of the refactor. It also leaves the onus of retries on the user to implement.

Regardless of what owns the queues, it still leaves places in the OffersMessageHandler implementation that need to call back to ChannelManager (or something else). Namely, creating blinded paths just like before, signing invoices, and getting the highest seen timestamp. And a bunch of places for async payments as mentioned above. IIUC, we'd still need to parameterize OffersMessageFlow for these somehow.

Bleh, I'm really not a fan of exposing the low-level payment state management calls. Instead, can we keep the top-level send initiation functions in ChannelManager, which then insert the pending payment and delegate to the flow handler. From there, the flow handler can bubble results up to the ChannelManager via some kind of OfferEvents (basically just "invoice received").

It isn't clear to me how using an event approach is that much different. Either way, the user needs to have some code called. But the event approach forces them to orchestrate everything. Seems we could simply rename add_new_awaiting_invoice to invoice_request_enqueued to avoid exposing any payment state management.

For sending payments, in the alternative approach the event examiner would need to initiate the payment. But if that fails, they would need to create and enqueue an InvoiceError, which seems like should be the responsibility of OffersMessageFlow.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we need to pass in closures to the utility functions for creating blinded paths

I'm not entirely convinced? We'd need to pass the ChannelDetails list of the current ChannelManager (or maybe a closure to fetch it?), but actually calling the router to build the paths can sit on the OffersMessageFlow (since that's really an offers message handling step).

need to figure out what will own the message queues

I'm not convinced this is all that critical. Retry logic is pretty trivial, and LNDK having to reimplement it doesn't seem like the end of the world. Similarly, having queues be duplicated in the OffersMessageFlow and it handling retries also seems fine.

though I'm not sure if that approach is possible for async payments

Hmm, not sure I understand this.

signing invoices

heh, maybe we should remove the ability to sign invoices with the node pubkey, but, really, we can give the OffersMessageFlow a(n optional?) signer reference rather than making this a closure?

getting the highest seen timestamp

This could just be passed into the methods?

It isn't clear to me how using an event approach is that much different. Either way, the user needs to have some code called. But the event approach forces them to orchestrate everything. Seems we could simply rename add_new_awaiting_invoice to invoice_request_enqueued to avoid exposing any payment state management.

With events we would only have to handle "here's the invoice we got, please send", we wouldn't have to also handle "please add a new awaiting-invoice payment", which I think is pretty substantially different in terms of the public API.

For sending payments, in the alternative approach the event examiner would need to initiate the payment. But if that fails, they would need to create and enqueue an InvoiceError, which seems like should be the responsibility of OffersMessageFlow.

Hmm, do we really? If the Invoice is invalid obviously we should respond with an InvoiceError (which the OffersMessageFlow would), but if we fail to send a payment because, for example, we cannot find a path to the recipient, do we really need to send an InvoiceError? Its already the case today that we won't send an InvoiceError if we try and fail to make a payment because the blinded path is invalid/doesn't have enough liquidity, not doing other less common cases seems pretty secondary there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we need to pass in closures to the utility functions for creating blinded paths

I'm not entirely convinced? We'd need to pass the ChannelDetails list of the current ChannelManager (or maybe a closure to fetch it?), but actually calling the router to build the paths can sit on the OffersMessageFlow (since that's really an offers message handling step).

There aren't any ChannelDetails if the user isn't using a ChannelManager. That's why the type parameterization is responsible for creating the paths. It also needs to be used in OffersMessageFlow's implementation of OffersMessageHandler. How would it do that?

Similar problems arise for onion message paths when you don't have a ChannelManager.

need to figure out what will own the message queues

I'm not convinced this is all that critical. Retry logic is pretty trivial, and LNDK having to reimplement it doesn't seem like the end of the world. Similarly, having queues be duplicated in the OffersMessageFlow and it handling retries also seems fine.

Right, so that additional complexity is a cost of this approach. There's arguable more complexity with the event model. What's the perceivable benefits that outweigh them?

though I'm not sure if that approach is possible for async payments

Hmm, not sure I understand this.

Look at initiate_async_payment and its call site. Consider the DNSResolverMessageHandler implementation as well.

Seems like any approach that could do this by having ChannelManager implement OffersMessageHandler, delegating to OffersMessageFlow would be extremely convoluted, if even possible.

signing invoices

heh, maybe we should remove the ability to sign invoices with the node pubkey, but, really, we can give the OffersMessageFlow a(n optional?) signer reference rather than making this a closure?

Then other users would need to implement the entire signer interface just for one method, unless we add a new trait.

getting the highest seen timestamp

This could just be passed into the methods?

Passed into OfferMessageFlow's implementation of OffersMessageHandler? How? Even if we changed the trait method to take another parameter, OnionMessenger is still the originator of the call.

It isn't clear to me how using an event approach is that much different. Either way, the user needs to have some code called. But the event approach forces them to orchestrate everything. Seems we could simply rename add_new_awaiting_invoice to invoice_request_enqueued to avoid exposing any payment state management.

With events we would only have to handle "here's the invoice we got, please send", we wouldn't have to also handle "please add a new awaiting-invoice payment", which I think is pretty substantially different in terms of the public API.

Are you suggesting we store pending invoice requests in OffersMessageFlow? If so, that's an entirely different approach than I thought you had been imagining. So ChannelManager would no longer know about these through PendingOutboundPayments? How would we implement retries then?

Not withstanding it doesn't give the user a way to examine the invoice request, as is needed in #3525.

For sending payments, in the alternative approach the event examiner would need to initiate the payment. But if that fails, they would need to create and enqueue an InvoiceError, which seems like should be the responsibility of OffersMessageFlow.

Hmm, do we really? If the Invoice is invalid obviously we should respond with an InvoiceError (which the OffersMessageFlow would), but if we fail to send a payment because, for example, we cannot find a path to the recipient, do we really need to send an InvoiceError? Its already the case today that we won't send an InvoiceError if we try and fail to make a payment because the blinded path is invalid/doesn't have enough liquidity, not doing other less common cases seems pretty secondary there.

Fair point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current approach is really exposing the "guts" of ChannelManager as a public interface. That's not right.

Sure, some of the trait methods can be cleaned up a bit to avoid this as mentioned in some of my most recent comments. The intention was to never to expose anything to do with ChannelManager. It was rather to define an abstraction for a payment protocol. It seems reasonable that such an abstraction would call to the user's payment management code, as the trait parameterization allows.

The alternative breaks the abstraction and forces users to implement the protocol using a bunch of utility methods. IMO, ideally, the user shouldn't need to know how the protocol defines message passing. But now we are asking them to implement three message handlers, IIUC.

IMO we shouldn't expose anything unless we are comfortable with it being called, and I don't know that half the methods exposed here really make sense for users to call.

I don't follow what you mean here. Are you talking about the OffersMessageCommons trait? The methods there are called by OffersMessageFlow, not by the user. Or something else?

No, no. My thought was that the pay_for_offer method would still live in ChannelManager (which owns an OffersMessageFlow, disintermeidated by a trait). It would first call the OffersMessageFlow send-invreq method, and if that returns success add the pending payment, all in the same lock. Presumably the send-invreq method could return the invreq and the ChannelManager could send/store it for retries (or it could be stored in OffersMessageFlow which could retry a few times then delete it).

I'm not sure I understand the purpose of this refactor then. Or at least how you're envisioning the message handler implementations for each scenario. Could you give a more concrete description of what will implement these traits?

I believe the trait disintermediation accomplishes the "examine before processing" need.

How would this look like in practice? Someone would need to reimplement Flow? What about message handling? What if they aren't using ChannelManager? It isn't necessarily about examining before processing but rather during processing.

I agree that we might go down this path and find that it simply doesn't work, but I don't think that means the current path makes sense. Exposing tons of guts is not a workable API IMO.

Could you be specific about what you don't want exposed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, some of the trait methods can be cleaned up a bit to avoid this as mentioned in some of my most recent comments. The intention was to never to expose anything to do with ChannelManager. It was rather to define an abstraction for a payment protocol. It seems reasonable that such an abstraction would call to the user's payment management code, as the trait parameterization allows.

Right, so I think "an abstraction ... to the user's payment management code" is really not something we can expose. If we directly expose "hey, track this pending payment", it has to actually send the payment. We shouldn't make the pending payment tracking logic a public API - if some downstream code wants to track pending payments, it should do that, letting them directly insert it into ChannelManager's pending payment state means ChannelManager isn't actually sure what the real state of payments is (not to mention we're turning it into a really really bad payment store).

For example, adding add_new_awaiting_invoice needs probably ten paragraphs of documentation describing when it can/should be used, what the persistence semantics of it are (and how the state of a thing you inserted can seemingly-randomly shift on a restart), etc.

The alternative breaks the abstraction and forces users to implement the protocol using a bunch of utility methods. IMO, ideally, the user shouldn't need to know how the protocol defines message passing. But now we are asking them to implement three message handlers, IIUC.

I certainly agree its a bit more work to wire up, but they're using a BOLT12 flow handler, they're probably pretty aware its handling onion messages :).

More generally, I think the best we can do here with a sensible API is something that might well be something that looks more like a set of util methods (but with a cleaner overall wrapper), and I think that's okay?

I don't follow what you mean here. Are you talking about the OffersMessageCommons trait? The methods there are called by OffersMessageFlow, not by the user. Or something else?

Indeed, OffersMessageCommons. add_new_awaiting_invoice is obviously the biggest offender, but its super weird to expose get_hrn_resolver (this really strongly implies the resolver needs to move into the flow) and sign_bolt12_invoice (this really strongly implies the flow needs a reference to the signer somehow, why is it accessing it indirectly?), and the APIs for a number of things really need to change - abandon_payment_with_reason (probably we shouldn't let the user pick an arbitrary reason? should this be confined to something offers-specific? We really only need the "invalid invoice" case or something, right?), etc.

I'm not sure I understand the purpose of this refactor then. Or at least how you're envisioning the message handler implementations for each scenario. Could you give a more concrete description of what will implement these traits?

There's a gap in our API today between the low-level offers objects and high-level payment logic. High level payment logic is somewhat routine, consisting of some fairly straightforward, but error-prone (because you have to remember to check all the things we need to check) steps, followed by actual HTLC sending. We'd like to expose the middle ground where those steps are handled so that the high-level payment logic can be shared more between LNDK and LDK. At the same time, that middle ground API needs to provide just enough hooks that simple interception like async Invoice handling can be done for use-cases like Fedi.

IMO this middle ground should look more like "utilities" that ChannelManager can drive (disintermediated by a trait) rather than something that drives ChannelManager. IOW the top-level ChannelManager API doesn't change - the user would call pay_for_offer and ChannelManager would record the pending outbound payment. It would then delegate the actual handling of figuring out how to convert the Offer into PaymentParameters (really a Bolt12Invoice, but conceptually the payment parameters) to the OffersMessageFlow, but the flow is really just a utility that ChannelManager uses.

Its status as a "utility" means exporting a bit more of the logic to ChannelManager is fine - if its easier to handle InvoiceRequest retries in ChannelManager then the "utility" can just return it for ChannelManager to store and retry.

How would this look like in practice? Someone would need to reimplement Flow? What about message handling? What if they aren't using ChannelManager? It isn't necessarily about examining before processing but rather during processing.

Right, so what kind of hooks the user wants is going to determine exactly where this sits. If they want to check an Invoice before it gets paid, they can let OffersMessageFlow do its work and then hook before the event to send the payment goes back to the ChannelManager, letting it fly only when they're ready for it (I believe this would address the current async bolt 12 invoice handling usecase). If they want to intercept as a part of the process for currency conversion, IMO we should do currency conversion as a first-class thing in the flow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello,

So if I got it right, I think the biggest concern with the current approach (let’s call it the Commons approach) is that it exposes too much of ChannelManager's inner workings in a public API.

On the other hand, if we go with the alternative Events approach, we would be creating utility functions, but users would have to do a lot of work to implement handlers and other components if they choose not to use ChannelManager.

I was trying to find a middle ground to avoid exposing too much internal logic in the Commons approach while still keeping the approach feasible.

Potential Refinements:

  1. For basic functions like get_hrn_resolver, we can directly move the hrn_resolver to OffersMessageFlow.
  2. For abandon_payment_with_reason, we could update it to abandon_payment_with_*offer_specific*_reason.
  3. For sign_bolt12_invoice, we could parameterize the OffersMessageFlow using a trait that implements Bolt12Signer (instead of entire NodeSigner).

Now, regarding the functions that directly interact with pending_outbound_payments:

  • add_new_awaiting_offer
  • release_invoice_requests_awaiting_invoice
  • add_new_awaiting_invoice
  • amt_msats_for_payment_awaiting_offer
  • received_offer

What if we introduce a pub(crate) trait Bolt12PaymentStuff, which contains all these functions, implemented by OutboundPayment and parameterized on OffersMessageFlow? This would allow us to store and retrieve Bolt12 information directly within OutboundPayment while keeping these methods out of the Commons trait (thus avoiding exposure in the public API).

Potential Issue:

The main downside of this approach is that if someone wants to implement their own version of OffersMessageFlow with a custom implementation, they won't be able to re-implement the Bolt12PaymentStuff trait.

I’d love to hear your thoughts on this—do you think this is a feasible direction? Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we introduce a pub(crate) trait Bolt12PaymentStuff, which contains all these functions, implemented by OutboundPayment and parameterized on OffersMessageFlow? This would allow us to store and retrieve Bolt12 information directly within OutboundPayment while keeping these methods out of the Commons trait (thus avoiding exposure in the public API).

I'm not sure I understand how this would work for users who want to not use a ChannelManager. Even ignoring the question of exposing it or not, I don't understand how, for example, LNDK would be able to implement add_new_awaiting_offer - I am fairly confident LND doesn't have an RPC which allows you to add a pending outbound payment without actually sending the payment, so LNDK would be implementing their own payment tracking logic, which I don't think they actually want to/should be doing.

Rather, the API they (I assume) want, would be more like what I described - pass the OffersMessageFlow an Offer and after the OffersMessageFlow does some stuff (including sending/processing onion messages), it bubbles up (as an Event or callback) a blinded path they should send to (presumably in the form of a payable Bolt12Invoice).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... so LNDK would be implementing their own payment tracking logic, which I don't think they actually want to/should be doing.

That’s a fair point!

Rather, the API they (I assume) want, would be more like what I described - pass the OffersMessageFlow an Offer and after the OffersMessageFlow does some stuff (including sending/processing onion messages), it bubbles up (as an Event or callback) a blinded path they should send to (presumably in the form of a payable Bolt12Invoice).

This structure does makes sense to me on paper! Let me give it a shot trying to create a basic proof of concept for this approach to see how it will work in practise. Thanks a lot for the clarification!

Comment on lines +1285 to +1290
let expiration = StaleExpiration::TimerTicks(1);
let retryable_invoice_request = RetryableInvoiceRequest {
invoice_request: invoice_request.clone(),
nonce,
needs_retry: true,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this should live in ChannelManager::add_new_awaiting_invoice.

payer_note,
payment_id,
None,
|invoice_request, nonce| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably don't need to use a closure here. Instead, just call this code after pay_for_offer_intern returns and move the call to enqueue_invoice_request here. pay_for_offer_intern may just need a different return type.

Comment on lines +1347 to +1348
// Temporarily removing this to find the best way to integrate the guard
// let _persistence_guard = PersistenceNotifierGuard::notify_on_drop(self);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be done in ChannelManager::add_new_awaiting_invoice?

Comment on lines +104 to +105
/// Get the current time determined by highest seen timestamp
fn get_current_blocktime(&self) -> Duration;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we only need get_highest_seen_timestamp?

let payment_secret = inbound_payment::create_for_spontaneous_payment(
&self.inbound_payment_key, amount_msat, relative_expiry_secs, created_at.as_secs(), None
).map_err(|()| Bolt12SemanticError::InvalidAmount)?;
fn add_new_awaiting_invoice(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shaavan and I walked through the alternative approach this morning. He's going to take a shot at playing around with it. I'm not 100% convinced it is any better and is possibly worse, TBH. It seems we need to pass in closures to the utility functions for creating blinded paths (and what not) and then need to figure out what will own the message queues.

If ChannelManager owns the queues, then we could return the message to enqueue. But that leaves open how to do this for the OffersMessageHandler implementation. If both ChannelManager and OffersMessageFlow implement that trait (with the former delegating to the latter), then I suppose intercepting the returned messages could work -- though I'm not sure if that approach is possible for async payments. But that leaves others users like LNDK needing to implement this message enqueueing functionality on their own and parameterize OnionMessenger accordingly. That seems to go against the purpose of the refactor. It also leaves the onus of retries on the user to implement.

Regardless of what owns the queues, it still leaves places in the OffersMessageHandler implementation that need to call back to ChannelManager (or something else). Namely, creating blinded paths just like before, signing invoices, and getting the highest seen timestamp. And a bunch of places for async payments as mentioned above. IIUC, we'd still need to parameterize OffersMessageFlow for these somehow.

Bleh, I'm really not a fan of exposing the low-level payment state management calls. Instead, can we keep the top-level send initiation functions in ChannelManager, which then insert the pending payment and delegate to the flow handler. From there, the flow handler can bubble results up to the ChannelManager via some kind of OfferEvents (basically just "invoice received").

It isn't clear to me how using an event approach is that much different. Either way, the user needs to have some code called. But the event approach forces them to orchestrate everything. Seems we could simply rename add_new_awaiting_invoice to invoice_request_enqueued to avoid exposing any payment state management.

For sending payments, in the alternative approach the event examiner would need to initiate the payment. But if that fails, they would need to create and enqueue an InvoiceError, which seems like should be the responsibility of OffersMessageFlow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants