Revisit MPP splitting heuristic #1276
This is a very interesting issue and I will watch very closely what results you arrive at. Obviously the best split is computed via a min cost flow, but I assume you want to avoid this, as min cost flow computations are more expensive than candidate path generation via Dijkstra... That being said, I will leave you with a few high-level observations and links to the papers that might be helpful. According to formula (8) in our paper on probabilistic path finding you can see that the expected number of attempts
Given all that, I guess the two most promising approaches are the following.
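For reference, the probabilistic path finding paper models each channel's unknown balance with a uniform prior, which gives a per-channel success probability of (c + 1 − a) / (c + 1) for sending amount a over capacity c. A minimal sketch of that model (my own illustration, not code from the paper):

```python
def channel_success_probability(amount_sat: int, capacity_sat: int) -> float:
    """P(channel can forward `amount_sat`) under a uniform prior on the
    unknown channel balance: (c + 1 - a) / (c + 1)."""
    if amount_sat > capacity_sat:
        return 0.0
    return (capacity_sat + 1 - amount_sat) / (capacity_sat + 1)


def path_success_probability(hops: list[tuple[int, int]]) -> float:
    """Multiply per-hop probabilities for a path of (amount, capacity) pairs,
    assuming independent channel balances."""
    p = 1.0
    for amount_sat, capacity_sat in hops:
        p *= channel_success_probability(amount_sat, capacity_sat)
    return p
```

Note how the probability decays multiplicatively with path length, which is what drives the expected number of attempts up for large amounts over long paths.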
As said, I will look closely at what ideas you have. I was trying this for a couple of months until I arrived at a reformulation of the min cost flow problem which leads to the exact solution. I was not able to find any heuristics that I found satisfying, but I will be excited if you can find something that is easy enough and works well.
Cross post of what I wrote in the lnd issue: I have published code for a piecewise linearized formulation of the problem that runs in less than a second and produces a reasonable approximation to the optimal solution. You can find it and play around with it at: I think that approach - if all engineering details (proper handling of the uncertainty network, channel reserves, HTLC limits, pruning of the piecewise linearized problem, adding fees to the cost function) are included - should be very practical for implementations and, as discussed on the mailing list, allow for large payment amounts to be handled.
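As a rough illustration of what such a piecewise linearization could look like (the segment count and helper names here are my own assumptions, not the published code): the convex negative-log-probability cost of a channel is replaced by a handful of parallel arcs of equal capacity with increasing per-unit cost, which a standard linear min cost flow solver can then consume.

```python
import math


def uncertainty_cost(amount_sat: int, capacity_sat: int) -> float:
    """Convex cost: negative log of the uniform-prior success probability."""
    return -math.log((capacity_sat + 1 - amount_sat) / (capacity_sat + 1))


def linearize_channel(capacity_sat: int, pieces: int = 5) -> list[tuple[int, float]]:
    """Approximate a channel's convex uncertainty cost by `pieces` parallel
    arcs of equal capacity whose per-unit costs are the segment slopes.
    Because the cost is convex, the slopes increase, so a min cost flow
    solver naturally fills the cheap segments first."""
    segment = capacity_sat // pieces
    arcs = []
    prev_cost = 0.0
    for i in range(1, pieces + 1):
        cost_at_end = uncertainty_cost(i * segment, capacity_sat)
        arcs.append((segment, (cost_at_end - prev_cost) / segment))
        prev_cost = cost_at_end
    return arcs
```

With a small number of pieces per channel the resulting linear problem stays compact, which is presumably where the sub-second runtime comes from.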
Cool! Would love to see a Rust/optimized version. Obviously a second for routing is impractical for most of our users, but providing an option to use a different routing algorithm entirely for larger-value payments may be useful.
There is an implementation in Rust: https://docs.rs/mcmf/latest/mcmf/ though it seems to be based on a simplex algorithm. My preliminary experiments with piecewise linear approximation last summer did not work, as I used the networkx min cost flow solver in Python, which is also based on the simplex algorithm. At least the simplex-based Python implementation allowed for floating point values, which was comparably fast but messed up the quality of the results. Of course this might be different in the Rust implementation and you could just give it a try.

The Google lib seems to use a cost scaling algorithm. It should be easy enough to reimplement in Rust as the algorithm itself is pretty small: https://github.com/google/or-tools/blob/86d4c543f717fa8716a9d66c25143660668bf825/ortools/graph/min_cost_flow.cc Note that our original and slow implementation was a capacity scaling algorithm (while the cost scaling approach has a very similar theoretical runtime to the capacity scaling approach, it seems to be much faster in practical situations). There is a list of implementations of cost scaling algorithms on Stack Exchange, and a stand-alone version (in C) without dependencies seems to be here: https://github.com/iveney/cs2/blob/master/cs2.c (though the license is not open). I guess you have to consult a lawyer to find out whether one is allowed to just follow that implementation to create a Rust version.

With respect to your runtime concerns: As discussed on the mailing list, especially for LSPs one can batch several payments that occur within a second into a single min cost flow computation. Given enough load on the LSP, this approach should be cheaper than the cumulative CPU cycles spent invoking Dijkstra calls for every single payment request.
For a single stand-alone client that makes a payment once in a while, I believe sub-second computational time is still faster than onion roundtrips, thus the quality of the suggested paths should outperform a Dijkstra-based solution on wall-clock time (not on CPU cycles, though). Edit: Also, the tests from Carsten Otto implied that the Java bindings of the lib demonstrated a 230 ms runtime for solving the min cost flow. As noted by myself on the mailing list, we can probably easily reduce the size of the search space / network by a factor of 10, which reduces the runtime significantly again and should push it far below 100 ms. What runtime do you wish to achieve, and how expensive are the Dijkstra calls and splitting logic of this implementation now?
I'd be interested in having a look at this soon(-ish).
This doesn't improve latency, though.
I think you overestimate the number of round-trips required with a good scoring heuristic. Maybe this is reasonable as a "start running this in the background while the first attempt at paying is being tried". Until we can get things down into the sub-100ms range it doesn't make all that much sense, and doubly not as long as it still needs the base fee to be diminutive for the search to work right.
Until then, we should still look at the original topic of this issue, but getting into some crazy routing rewrite is not the topic of this issue :) |
I am still puzzled that you actually made that statement. The entire research for the optimally reliable and cheap payment flows paper was not centered around routing but around the question of how we should split, as we can clearly see from the original single-line README.md and the name of the repository, which is literally called mpp-splitter. and not Since, despite your (what I consider to be very, very rude and disrespectful) comments, others like @tnull might still be interested in the progress, I will note that @C-Otto has posted on Twitter (https://twitter.com/c_otto83/status/1506017622747488259) that he has implemented a basic version of our results on lnd at C-Otto/lnd-manageJ#6, and he seems to be hacking slightly faster than I am with my progress on a c-lightning integration. So far it is only the splitting logic and not the actual payment loop and sending of onions. He computes a split of 0.5 BTC from his node to mine into 18 onions, each of which has a probability higher than 50% of being delivered, meaning on the first attempt we can expect far fewer than half the onions to fail. His split can be seen at: http://web.archive.org/web/20220322080201/https://c-otto.de/rene.txt. As he indicated before, the runtime of the solver was some 230 ms, but I guess that is a factor of 2 off from your boundary.
Yes, it very much does (see below), since roundtrip times of onions are by now, from a time perspective, more expensive than the planning of onions to be sent. Also, I am very surprised and of course delighted that latency suddenly becomes important to you. Last time, when I suggested adding latency considerations to the cost function, you didn't seem too keen on that. At least the issue was closed without a comment from you, and the PR that supposedly resolved that issue did, to the best of my knowledge, not do that yet. (To be fair, the latency considerations for the cost function have been kind of off topic in that issue.)
No, I do not overestimate this at all! It's literally one of the things I do most: evaluating the accuracy of our results and how it differs. Let's assume some very simple numbers here. Median onion roundtrip time is
Just to make this more expressive, here is an output from my log file. First I optimize solely for reliability:
In contrast, when I optimize for fees and reliability:
Of course I also compared it with optimizing mainly for fees.
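To make the trade-off being compared here concrete, a combined per-arc objective can be sketched as a fee term plus a weighted reliability term (this is a hypothetical form for illustration; the weight and exact shape are my assumptions, not necessarily the paper's exact objective):

```python
import math


def arc_cost(amount_sat: int, capacity_sat: int, fee_rate_ppm: int,
             reliability_weight: float = 1.0) -> float:
    """Hypothetical combined cost: proportional routing fee plus a weighted
    negative-log success probability (uniform balance prior).
    `reliability_weight` trades fees against delivery probability."""
    if amount_sat > capacity_sat:
        return math.inf  # cannot possibly forward more than the capacity
    fee_sat = amount_sat * fee_rate_ppm / 1_000_000
    success_p = (capacity_sat + 1 - amount_sat) / (capacity_sat + 1)
    return fee_sat + reliability_weight * -math.log(success_p)
```

Setting `reliability_weight` high approximates "optimize solely for reliability", while setting it near zero approximates "optimize mainly for fees".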
Please don't over-read tone out of text; that's a recipe for constantly being insulted on the internet without anyone intending anything. To clarify what was meant here (apologies if it used some secondary meanings of casual English instead of strict meanings, which may have been misleading): by "the original topic" I meant a heuristic by which we split in a Dijkstra router, which is something we'd like to revisit in our router; by "crazy routing rewrite" I meant a rewrite of the router using a completely different algorithm, where "crazy" meant "large", not "bad".
That said, please stop claiming your solution here is "the" solution - it is not. It is a solution that performs well in certain dimensions and poorly in others.
That's awesome to hear! That's much more realistic than the one second you'd quoted above. We should open an issue to track potentially providing a non-Dijkstra router in the future.
Yep, but in many cases it's important to note that a second may exceed the time to send, if the path is short/through "good" nodes, which is why I highlighted one second being problematic in ~most payments, at least for well-connected nodes paying other well-connected nodes.
Please don't complain about feeling insulted and then turn around and make comments like this.
My point was that a common lightning use-case today is a well-connected node (e.g. a user behind an LSP or a large provider) paying a well-connected node (i.e. a large merchant processor), in which context round-trips are generally much rarer. Just looking at numbers from pleb-net-style nodes is a bit misleading/just one deployment model of lightning, and far from the only one (but of course one that's important and that may thus want a different router!).
We tend to be pretty conservative and avoid splitting payments quite a bit. That's fine, but we can probably be a bit smarter about when we split. Once we land #1227 we can maybe also adapt the scoring API to return more information, including things like "you shouldn't send more than N over this channel". One way to accomplish that may be to have a `RouterState` enum passed to the scorer that is first set to `RouterState::ConservativeFirstPass`, where the scorer can be very aggressive, refusing to route more than a small % of most channels' guessed capacity. If that pass finds a route we can still take the happy path and return it immediately; if not, we can do the later passes with a different `RouterState` and then do... something to combine the paths selected across the different conservatism scales.

One thing I don't want is the state a lot of other nodes appear to be in, where they just always split all the time, which feels counter-productive and can cause other issues.
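To make the multi-pass idea concrete, here is a rough sketch (written in Python for brevity; the real API would live in the Rust scorer, and the enum variants, the `max_sendable_sat` hook, and the percentage caps are all purely illustrative assumptions):

```python
from enum import Enum, auto


class RouterState(Enum):
    """Which pass of the route-finding loop the scorer is being asked about."""
    CONSERVATIVE_FIRST_PASS = auto()
    RELAXED_LATER_PASS = auto()


def max_sendable_sat(guessed_capacity_sat: int, state: RouterState) -> int:
    """Hypothetical scorer hook: cap how much the router may send over a
    channel depending on which pass it is in. Percentages are illustrative."""
    if state is RouterState.CONSERVATIVE_FIRST_PASS:
        # Be aggressive: refuse to route more than ~10% of guessed capacity,
        # so a route found here is very likely to succeed in one shot.
        return guessed_capacity_sat // 10
    # Later passes relax the cap so large payments can still find a route.
    return guessed_capacity_sat // 2
```

The open question noted above remains: how to merge the per-pass path sets into one payment without ending up splitting all the time.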
cc @jkczyz