Scorer Next Steps #1170
Comments
While you address some questions that are not related to liquidity (e.g. temporary channel failure can mean peer offline), I think it is reasonable to look at the liquidity case first. The updates of the probabilities in the liquidity case are so stark that they should cover the peer-offline case reasonably well in most situations (unless we really run into sending all the sats that the network might support, which is a very hard edge case in which we would probably just open a new channel anyway).

Thus I think most of the questions and wishes that you are addressing here (in particular your second bullet point) are explained in the general case for an arbitrary probability function in the probabilistic payment delivery paper, and spelled out explicitly for the uniform case that we observed on the network in the optimally reliable and cheap payment flows paper. As papers like those are always written in a somewhat special and technical way, I will take the time to walk you through the main results that address your concerns.

Let's start with a bit of notation. For the sake of shorter text I will not be as formal as in those papers and refer to their sections for the details. Also, since this is not a paper but just a quick summary, I don't guarantee that everything I write here is free of typos, so if in doubt double check with the papers, which have been proofread by more than just my eyes.

Modelling the liquidity (channel balance) as a uniform distribution

Chapter 3 of the first paper states that we can model the uncertainty of the liquidity in a channel (aka the channel balance) as a probability distribution over a random variable X. As mentioned before, all mainnet experiments hint that we can assume a uniform distribution for the uncertainty of the liquidity / channel balance, in which case the failure probability for a channel of capacity c and an amount a is P(X < a) = a / (c + 1). Since a success happens exactly when there is no failure, we can write the success probability as P(X >= a) = (c + 1 - a) / (c + 1). In Section 2.2 of the second paper we introduce a term that we should already have used in the first paper, which is the Uncertainty Network.

Learning from failed and successful attempts (uniform case) on the Uncertainty Network

Let's ignore for a moment the fact that we should forget learned information over time and start by asking what we can learn from a failed or successful payment attempt. This is discussed in very general terms with entropy considerations and information gain (Kullback-Leibler divergence) in Section 4.3 of the first paper, and very specifically with respect to how to update the Uncertainty Network at the end of Chapter 3 in the second paper. Let's say a payment delivery algorithm selects a route and wants to add an HTLC with amount a to a channel of capacity c.
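Roughly, the uniform-case update rules look as follows; the notation here is mine and the exact ±1 discretization is spelled out in the papers. If the HTLC fails at that channel we learn X < a, and if it settles or is forwarded onward we learn X >= a, so later amounts are scored with the corresponding conditional probabilities:

```latex
% Uniform prior on the liquidity X of a channel with capacity c:
P(X \ge a) = \frac{c + 1 - a}{c + 1}

% After a failed attempt of amount a at this channel (we learned X < a),
% the success probability of a later amount b < a becomes
P(X \ge b \mid X < a) = \frac{a - b}{a}

% After learning that at least a can be routed (we learned X >= a),
% the success probability of a later amount b >= a becomes
P(X \ge b \mid X \ge a) = \frac{c + 1 - b}{c + 1 - a}
```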
Of course all those thoughts generalize more or less easily (at least in principle it is easy, just technical) to cases where more than one HTLC is in flight and/or has been returned, and so on. @stefanwouldgo and I have created a more or less stable implementation of this in our mainnet tests of optimally reliable payment flows, so I can tell you it is certainly possible.

Provenance scores

I think measuring a history of successful payments, similar to what mission control does, is reasonable in general. (I think that is what you addressed in your first bullet point.) I have a half-finished paper on how to count this properly lying in my drawer. The gist is to do the following two things.
My feeling (!) is that the provenance scores might be interesting for LSPs but not really for end-user wallets. Also I would assume the effect is negligible in comparison to the uncertainty arising from the liquidity.

Forgetting the learned information

This is something that we mention as future work in our papers. Basically we need a way to transform the updated probability distribution back to our fully uncertain distribution. This can be done by decaying the probed / tested / learned amounts exponentially over time towards zero (the speed of the decay would have to be measured empirically), and obviously one could also do this linearly instead of exponentially. I would have loved to do research on this as I think it is a necessary and easy apple to pick. In general I suggest looking at homotopy to learn how to transform the updated probabilities back to the original probabilities.

Conclusion and general thoughts

Needless to say, instead of maximizing probabilities, which are multiplicative along all channels of a (multi)path, we can switch to the negative logs of the probabilities and minimize those, which are additive across all involved channels. The main strength of this framework is that it avoids making arbitrary choices like "introduce a bias if a payment is smaller or larger than a previous attempt". Here we just update the uncertainty in the information-theoretically correct sense, and the formulas are actually wickedly easy for the uniform case. Also, all experiments done so far indicate that the uncertainty of liquidity is by far the most dominant feature, so I strongly suggest using this as a baseline before going into others. While fees are also an obvious feature and have been discussed in the c-lightning PR at ElementsProject/lightning#4771, I think the CLTV (or riskfactor) is actually not helpful (though c-lightning kept it, I guess for nostalgic reasons). I am currently investigating how to add latency (however, this has problems similar to the base fee, in the sense that there is a jump from not using a channel to using the channel, which makes computing min cost flows expensive). Yet, as you saw on Twitter, there is an indication that beeline distance might be a good proxy for latency, especially since the current network is very poor from a latency perspective, with many channels crossing oceans. Let me know if you have more questions.
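PS: to make the negative-log switch concrete, here is a minimal sketch in Rust (the function names are mine and not an existing LDK API), assuming the plain uniform prior from above:

```rust
// Sketch: per-channel success probability under a uniform liquidity prior and
// the corresponding additive cost used by shortest-path / min-cost-flow solvers.
fn success_probability(capacity_msat: u64, amount_msat: u64) -> f64 {
    if amount_msat > capacity_msat {
        return 0.0;
    }
    (capacity_msat + 1 - amount_msat) as f64 / (capacity_msat + 1) as f64
}

fn channel_cost(capacity_msat: u64, amount_msat: u64) -> f64 {
    // Probabilities multiply along a route; their negative logs add up.
    -success_probability(capacity_msat, amount_msat).ln()
}

fn route_cost(hops: &[(u64, u64)]) -> f64 {
    // (capacity, amount) per hop; the sum of per-channel costs equals the
    // negative log of the product of per-channel success probabilities.
    hops.iter().map(|&(cap, amt)| channel_cost(cap, amt)).sum()
}
```

A channel with success probability 0 gets an infinite cost, which conveniently excludes it from any finite-cost route.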
Say we successfully route …. Say we fail to route …. Now after either of those scenarios, say routing …. More generally, as more payments are successfully routed (or fail to route) through a channel, how does this affect the success probability calculation for the channel? IIUC, an implementation should keep track of the most liquidity …. Thus, ignoring the 0 and 1 success probability cases, the calculation would be ….
In our code, we have modelled these issues using two extra variables per edge: minBalance and inFlight: https://github.com/renepickhardt/mpp-splitter/blob/a90e134e4bd39bd8496d047851a993b1393a1a9d/code/scala/mincostflow/heuristic/MinCostFlow.scala#L159

minBalance is the minimum balance that we know can be routed through this channel, and inFlight is what we know is already used of the balance (typically by an already locked-in HTLC). From these we derive effective amount and effective capacity values, and then calculate the probability from them as in the linked code.
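For readers who don't want to dig through the Scala, here is a rough Rust sketch of what those effective values amount to; the names are illustrative and the exact ±1 handling may differ from the linked code:

```rust
// Sketch (not the authors' exact code): per-edge knowledge and the resulting
// success probability under a uniform prior, conditioned on what is known.
struct EdgeKnowledge {
    capacity_msat: u64,    // channel capacity (upper bound on liquidity)
    min_balance_msat: u64, // amount we know can be routed (e.g. from a past success)
    in_flight_msat: u64,   // liquidity already consumed by locked-in HTLCs
}

impl EdgeKnowledge {
    /// Probability that an additional `amount_msat` can be routed, assuming the
    /// unknown liquidity X is uniform and known to be at least `min_balance_msat`.
    fn success_probability(&self, amount_msat: u64) -> f64 {
        // Fold what we already know into "effective" values so the conditional
        // uniform formula can be applied.
        let effective_amount = amount_msat + self.in_flight_msat;
        let effective_capacity = self.capacity_msat;
        if effective_amount <= self.min_balance_msat {
            return 1.0; // we already know at least this much liquidity exists
        }
        if effective_amount > effective_capacity {
            return 0.0; // more than the capacity can never be routed
        }
        // Conditional uniform probability P(X >= amount + inFlight | X >= minBalance).
        (effective_capacity + 1 - effective_amount) as f64
            / (effective_capacity + 1 - self.min_balance_msat) as f64
    }
}
```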
Does … include resolved HTLCs?
What exactly does …?
FWIW, this is the formulation that I came up with:
…

Where …. This formula is the same as the one in case (3), where ….
In our code base it does not include resolved HTLCs, as we just demonstrated the effectiveness for a single payment attempt using min cost flows. If I were to use such techniques as an LSP, I would not forget the knowledge after one payment session and would also include resolved HTLCs to some degree for a certain amount of time.
Look at point 3 below, where I tried to address this.
I think this is the essence of what I propose, formulated in a different way. So the entire question boils down to one of accounting. Let me go one step back. In general we have a channel that looks like this: a channel of capacity c where the split of the liquidity between its two ends is unknown to us.

This corresponds to the uncertainty that the actual liquidity is anywhere between 0 and c. If we now know something about the channel, which can happen due to several circumstances (failed or successful HTLCs, or unresolved HTLCs in flight), our uncertainty gets reduced and we can adapt the probabilities. This is exactly what you do by setting the probability to 0 if the payment is above the assumed liquidity and to 1 if it is below the assumed existing liquidity. So given some prior knowledge of a lower and an upper bound for the liquidity, which corresponds to a (still uniform) distribution between those bounds, like in your formula we have three cases for a payment of amount a.
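Writing those three cases out explicitly (my notation: l_min and l_max are the known lower and upper bounds on the liquidity; the exact ±1 terms follow the uniform formula above and may differ slightly from your code):

```latex
P(X \ge a \mid \ell_{\min} \le X \le \ell_{\max}) =
\begin{cases}
1 & \text{if } a \le \ell_{\min} \\
\frac{\ell_{\max} + 1 - a}{\ell_{\max} + 1 - \ell_{\min}} & \text{if } \ell_{\min} < a \le \ell_{\max} \\
0 & \text{if } a > \ell_{\max}
\end{cases}
```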
As long as you do a single-path payment there is no strong necessity to account for in-flight HTLCs. Of course, if you are an active LSP you might be handling several concurrent payments and using knowledge from various payment sessions. But if you are in a setting like the one @stefanwouldgo and I were in, where we compute a flow, it happens quite often that we allocate several HTLCs to the same channel, and then we have to account for in-flight HTLCs. From the perspective of an LSP it also makes sense to maintain the uncertainty network for more than one payment session and to maintain those upper and lower bounds within which you are certain about the existing liquidity.
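As a small illustration of that accounting (a sketch only, with names of my choosing), each additional HTLC that a flow places on a channel is scored against the liquidity already reserved there and then reserves its own amount:

```rust
// Sketch: sequentially allocating several HTLCs of one flow onto the same channel.
fn allocate_htlc(
    capacity_msat: u64,
    min_balance_msat: u64,
    in_flight_msat: &mut u64, // running total already reserved on this channel
    amount_msat: u64,
) -> f64 {
    let needed = amount_msat + *in_flight_msat;
    let p = if needed <= min_balance_msat {
        1.0 // covered by liquidity we know exists
    } else if needed > capacity_msat {
        0.0 // exceeds the capacity
    } else {
        (capacity_msat + 1 - needed) as f64 / (capacity_msat + 1 - min_balance_msat) as f64
    };
    *in_flight_msat += amount_msat; // the next HTLC sees this amount as used
    p
}
```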
This is also my understanding, as I tried to lay out in my comments above.
I think our way of tracking ….
IIUC, your code base doesn't account for case (2) in addition to case (4). Ignoring in-flight HTLCs, the formula becomes:

…

vs

…

In other words, it is using a capacity ….
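In sketch form (my notation, and only up to the exact ±1 terms), the contrast amounts to keeping the full capacity as the upper bound versus also using the upper bound learned from failed attempts:

```latex
% Only a lower bound \ell_{\min} is tracked; the capacity c stays the upper bound:
P(X \ge a \mid X \ge \ell_{\min}) = \frac{c + 1 - a}{c + 1 - \ell_{\min}}

% versus additionally tracking the learned upper bound \ell_{\max}:
P(X \ge a \mid \ell_{\min} \le X \le \ell_{\max}) = \frac{\ell_{\max} + 1 - a}{\ell_{\max} + 1 - \ell_{\min}}
```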
I discussed this with @stefanwouldgo, and yes, the linked code base does not take all of the mentioned cases into consideration. The code that we shared was mainly meant to demonstrate the feasibility of the min cost flow solver and the probabilistic approach. We did not share all of the code from the integration of the uncertainty network in our mainnet test, as we feared it was very hacky and ignored many roadblocks that can happen in reality (e.g. other errors, HTLC limits, channel reserves, ...), and we did not want people to copy poorly written proof-of-concept code. So I would suggest following the formulas here and the theory to derive the full result / integration for the general case. Note two things:
Let me know if you need more information / help. I am really looking forward to seeing / testing your implementation at work!
@renepickhardt @stefanwouldgo I have this implemented in a draft PR (#1227) with an open question on the cost function: #1227 (comment).
In 0.0.103 we landed a pretty naive scorer to avoid re-routing over the same channel over and over again on retries. #1166 adds a bit better scoring, but we've still got a ways to go. We should probably, at least, … Scorers across instances, which may lead to wanting to use the same channel in opposite directions. Ideally we'd consider the reason before deciding whether to punish both sides of a channel or only one, but sadly "temporary_channel_failure" can mean both "peer offline" and "amount too large" and those are the two most common reasons. Not sure how to progress here but it may require some experimentation.