AICL-Lab
Popular repositories
- the-book-of-secret-knowledge-zh (Public, Python, 4 stars): 📚 The Book of Secret Knowledge, Chinese edition - a curated collection of tools, manuals, cheatsheets, and resources for SysAdmins, DevOps engineers, pentesters, and security researchers. Chinese translation of the-book-of-secret-knowledge.
- diy-flash-attention (Public, Python, 4 stars): Learn Triton by building FlashAttention from scratch - V2 kernels, persistent threads, a mask DSL, a profiling toolkit, and bilingual docs.
- heterogeneous-task-scheduler (Public, C++, 4 stars): C++17 DAG scheduler for heterogeneous CPU/GPU workloads - production-ready, with a CPU-only validation path.
- triton-fused-ops (Public, Python, 4 stars): Fused Triton kernels for Transformer inference (RMSNorm+RoPE, gated MLP, FP8 GEMM) - CPU-testable references, autotuning, and benchmarking.
Repositories
- fq-compressor-rust (Public): High-performance FASTQ compressor with a block-indexed archive format, random-access support, and multiple compression modes.
- triton-fused-ops (Public): Fused Triton kernels for Transformer inference (RMSNorm+RoPE, gated MLP, FP8 GEMM) - CPU-testable references, autotuning, and benchmarking.
- gpu-fft (Public): WebGPU FFT core for JavaScript/TypeScript with 1D/2D and real-input transforms, plus CPU utilities. Zero runtime dependencies.
- hetero-paged-infer (Public): High-performance LLM inference engine with PagedAttention and continuous batching, written in Rust.
- mini-inference-engine (Public): CUDA GEMM optimization tutorial and mini inference engine with progressive kernels, benchmarks, and OpenSpec docs.
- mini-image-pipe (Public): GPU-accelerated image-processing pipeline with DAG scheduling, CUDA operators, and multi-stream execution.
- cuflash-attn (Public): CUDA C++ FlashAttention reference implementation - O(N) memory, FP32/FP16, forward and backward passes.
People
This organization has no public members.