perf: exploration of better matmul algorithms #69
steven-murray wants to merge 18 commits into main
for more information, see https://pre-commit.ci
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@             Coverage Diff              @@
##              main      #69       +/-   ##
============================================
- Coverage   100.00%   65.07%   -34.93%
============================================
  Files            8        8
  Lines          567      567
  Branches        88       88
============================================
- Hits           567      369      -198
- Misses           0      193      +193
- Partials         0        5        +5
…cpu into matrix-multiply-profiling
If you're going through all the effort of writing a cuBLAS implementation, you might also investigate compiled extensions such as Cython or a Rust extension. Both have fairly straightforward interop with NumPy.
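As a point of comparison for the suggestion above, here is a minimal sketch of one CPU-side option that needs no custom extension. The product this PR profiles, V = Z.dot(Z.T.conj()), is Hermitian, so a BLAS Hermitian rank-k update (`zherk`, exposed through `scipy.linalg.blas`) only has to compute one triangle. The function names `outer_product_reference` and `outer_product_herk` and the array sizes are illustrative assumptions, not code from the notebook, and whether this is actually faster depends on the matrix shape and the BLAS build.

```python
import numpy as np
from scipy.linalg.blas import zherk


def outer_product_reference(Z):
    """Baseline: the expression from the PR description."""
    return Z.dot(Z.T.conj())


def outer_product_herk(Z):
    """Hypothetical alternative: compute V = Z Z^H with BLAS zherk.

    zherk fills only one triangle of the Hermitian result (the upper
    one when lower=0), so the other triangle is mirrored afterwards.
    """
    Z = np.asarray(Z, dtype=np.complex128)  # zherk works on complex128
    upper = np.triu(zherk(1.0, Z, lower=0))  # keep the computed triangle only
    # Mirror the strictly-upper part into the lower triangle.
    return upper + np.triu(upper, k=1).conj().T


# Small random test matrix (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
Z = rng.standard_normal((50, 8)) + 1j * rng.standard_normal((50, 8))

V_ref = outer_product_reference(Z)
V_herk = outer_product_herk(Z)
print("results agree:", np.allclose(V_ref, V_herk))
```

The mirroring step costs an extra pass over the output, so any gain from computing a single triangle should be measured on realistic shapes rather than assumed. cuBLAS provides an analogous Hermitian rank-k routine, so the same idea carries over to a GPU implementation.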
This adds a notebook that explores different ways to do the V = Z.dot(Z.T.conj()) calculation.

Ideas that I haven't yet explored: