Implement fast algorithm to multiply two arb_poly with log-convex coefficients
#2542
|
Interesting! Could you provide some benchmarks for some cases (both @vneiger and @fredrik-johansson write great benchmarks, for example see #2437 and #2524)? For instance, cases where this algorithm works well compared to the current routines, and perhaps cases where it doesn't work too well? In case this will be used in other routines, algorithm selection will be important. |
|
The problem is not so much performance as overestimation of the error bound. But in order to determine whether overestimation of the error happens, it's necessary to compute an (approximate) max-plus convolution of log coefficients, which is already like In practice, something like "if N ≥ threshold, run the new algorithm; if the new algorithm has large errors, run the old algorithm" should be fine: if the new algorithm is at least, say, 3× faster than the old algorithm, then the hybrid version above is at most about 33% slower than the old algorithm. |
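The 33% figure in the comment above follows from simple arithmetic, sketched here under two assumptions that are mine rather than the PR's: the new algorithm costs exactly `1/speedup` of the old one, and in the bad case the old algorithm is rerun in full after the new one has finished. The function name is hypothetical and the cost of checking the error bound is ignored.

```python
def hybrid_worst_case_ratio(speedup: float) -> float:
    """Worst-case time of "try new, fall back to old", relative to old alone.

    Assumes (for illustration only) that the new algorithm takes
    T_old / speedup and that the fallback reruns the old algorithm in full,
    so the worst case costs T_old / speedup + T_old.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 + 1.0 / speedup


if __name__ == "__main__":
    for s in (2, 3, 10):
        overhead = (hybrid_worst_case_ratio(s) - 1.0) * 100
        print(f"speedup {s}x: worst-case overhead {overhead:.1f}%")
```

With a speedup of 3 the ratio is 1 + 1/3, which is where "at most 33% slower" comes from; a larger speedup shrinks the worst-case penalty further.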
|
Just to let you know I'm aware of this PR. Superficially it looks good, but I don't know when I will have time to do a detailed review. One concern I have is that it seems to reinvent instead of improving the existing I would not be surprised to see the same kind of speedup with appropriate modifications to The default |
|
Thanks for the comment.
I tried to understand what It's true that the old code could be improved to use the Afterwards, it should be possible to reuse more code by refactoring
I think the main improvement is that this PR allows each block to have a different slope. The old algorithm requires all blocks to have the same slope. Another difference in the subdivision strategy is that in general, even if both polynomials are convex, each polynomial may still be divided into I have an argument (see the linked PDF) that this algorithm gives Of course, in practice, it may be the case that Would need a benchmark though.
Note that to ensure correctness, it would be necessary to add a small error term to each element of the output, taking linear time per discarded block. Or one can pre-add the error at the start or end (which is what I do here). |
|
Thinking about the comment above, with the current implementation of I can think of some ways
I wonder if it's possible to tighten the analysis to shave a log n factor. The original paper (Bringmann and Cassis, 2023) introduces a log n factor in the block decomposition (Lemma 24). When adapted to polynomial convolution, I think another log n factor is introduced because of the FFT. |
Currently, `arb_poly_mul` uses a blockwise decomposition algorithm, which behaves like $O(n^2)$ unless both polynomials have log-coefficients close to linear with the same (integral multiple of log(2)) slope.
This pull request implements an algorithm that performs reasonably efficiently for polynomials with convex (actually negative-convex) log-coefficients.
In particular, with $f = (x+1)^{10000}$ and $g = (x+2)^{10000}$, `arb_poly_mul(output, f, g, 128)` takes 1.5s, and the new algorithm is 10× faster. The speedup is larger for larger polynomial degree. Benchmark code.
Currently just a draft version (*). What do you (the maintainers) think of the idea (i.e. is there bandwidth to review the code + the algorithm)?
For more details and proof of correctness, refer to https://drive.proton.me/urls/RA07BXFPFW#aCldUtt21hLr .
Caveat: not (peer-)reviewed.
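To illustrate why polynomials such as $(x+1)^n$ and $(x+2)^n$ fit the "negative-convex log-coefficients" case while being far from a single linear slope, here is a minimal sketch in plain Python (not FLINT code). The helper names are hypothetical, and a small `n` is used so it runs instantly.

```python
import math


def log2_coeffs(n: int, c: int) -> list[float]:
    """log2 of the coefficients of (x + c)^n, lowest degree first.

    The coefficient of x^k is C(n, k) * c^(n - k); c is a positive integer.
    """
    return [math.log2(math.comb(n, k) * c ** (n - k)) for k in range(n + 1)]


def slopes(values: list[float]) -> list[float]:
    """First differences: the local slope of the log-coefficient sequence."""
    return [b - a for a, b in zip(values, values[1:])]


def is_concave(values: list[float], tol: float = 1e-9) -> bool:
    """True if the slopes are non-increasing (negative-convex sequence)."""
    s = slopes(values)
    return all(later <= earlier + tol for earlier, later in zip(s, s[1:]))


if __name__ == "__main__":
    n = 200
    for c in (1, 2):
        v = log2_coeffs(n, c)
        s = slopes(v)
        print(f"(x+{c})^{n}: concave={is_concave(v)}, "
              f"slope from {s[0]:.2f} down to {s[-1]:.2f}")
```

For $(x+1)^n$ the slope runs from $\log_2 n$ at the low end to $-\log_2 n$ at the high end, so no single slope describes the whole coefficient sequence, which is the situation the description says the existing blockwise strategy handles poorly.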
(*): Needed:
- `slong` for the exponent (and more importantly, the multiplication inside `convex_hull`) is okay, or should `fmpz` be used (`fmpz` is about 4% slower) [edit: I think this was debug build, actual overhead is less]
- `arb_poly_mul`. In particular, computing convex hull of log-coefficient takes linear time in degree and does not depend on `prec`, which is still relatively cheap.
- `nice_powers` trick is useful for `arb_poly_mullow`. It would be helpful with polynomials with linear log-coefficient with slope being a non-integral multiple of log(2), but this may be an extremely specific case.

Previously discussed in #2278
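The linear-time convex hull of log-coefficients mentioned in the checklist can be sketched as follows. This is a generic monotone-chain upper hull in Python, not the PR's `convex_hull` routine; it assumes every coefficient is nonzero and is represented by an integer binary exponent, and the function name is hypothetical.

```python
def upper_hull(exps: list[int]) -> list[int]:
    """Indices of the vertices of the upper convex hull of points (i, exps[i]).

    The points are already sorted by index, so one stack pass suffices:
    each index is pushed once and popped at most once, giving linear time.
    """
    hull: list[int] = []
    for i, e in enumerate(exps):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # Cross product of (b - a) and (i - a). A value >= 0 means point b
            # lies on or below the segment from a to i, so b is not a vertex.
            # Python integers cannot overflow here; with a fixed-width type
            # this product is presumably the multiplication the checklist
            # item above is asking about.
            cross = (b - a) * (e - exps[a]) - (exps[b] - exps[a]) * (i - a)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return hull


if __name__ == "__main__":
    print(upper_hull([0, 3, 4, 3, 0]))    # strictly concave input
    print(upper_hull([0, -5, 0, -5, 0]))  # interior points below the hull
```

The pass touches only the exponents, never the mantissas, which is consistent with the remark that the cost does not depend on `prec`.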