[DOC]: Could you add a speed comparison with Triton/ThunderKittens? #9

@EricLina

Description

How would you describe the priority of this documentation request?

Critical (currently preventing usage)

Please provide a link or source to the relevant docs

https://docs.nvidia.com/cuda/cutile-python/performance.html

Describe the problems in the documentation

Thanks for your awesome work.

I expect this official repo to bring significant efficiency gains over a naive PyTorch implementation.
However, I recommend adding speed benchmarks against alternatives such as Triton and ThunderKittens. This would be very helpful for users.

(Optional) Propose a correction

No response

Contributing Guidelines

  • I agree to follow cuTile Python's contributing guidelines
  • I have searched the open documentation and have found no duplicates for this documentation request
