diff --git a/v2/orchestrators/resources/gpu-support.mdx b/v2/orchestrators/resources/gpu-support.mdx
index 2f9240416..178ff658d 100644
--- a/v2/orchestrators/resources/gpu-support.mdx
+++ b/v2/orchestrators/resources/gpu-support.mdx
@@ -13,6 +13,8 @@ keywords:
   - session limits
   - RTX
   - transcoding
+  - HEVC
+  - H.265
   - AI inference
 'og:image': /snippets/assets/site/og-image/en/orchestrators.png
 'og:image:alt': Livepeer Docs social preview image for Orchestrators
@@ -23,7 +25,7 @@ pageType: reference
 audience: orchestrator
 purpose: reference
 status: review
-lastVerified: 2026-03-13
+lastVerified: 2026-04-07
 ---
 {/* TODO: Terminology Validation:
@@ -64,6 +66,7 @@ go-livepeer requires NVIDIA GPUs with NVENC and NVDEC support. AMD and Intel GPU
 GPU Family
 Transcoding
+HEVC Encode
 AI Inference
 Notes
@@ -71,29 +74,34 @@
 **GeForce RTX 40xx** (Ada Lovelace)
 Yes
+Yes
 Yes
-Best consumer option. AV1 encode support.
+Best consumer option. AV1 and HEVC 10-bit encode support.
 **GeForce RTX 30xx** (Ampere)
 Yes
+Yes
 Yes
 Widely used by orchestrators. Good price-performance.
 **GeForce RTX 20xx** (Turing)
 Yes
+Yes
 Yes
-Supported but older.
+Supported but older. HEVC B-frames supported.
 **GeForce GTX 16xx** (Turing)
 Yes
+Yes
 Limited
-No Tensor cores — AI inference slower or unsupported for some pipelines.
+No Tensor cores - AI inference slower or unsupported for some pipelines.
 **GeForce GTX 10xx** (Pascal)
 Yes
+Yes
 Limited
 Legacy. NVENC Gen 6. No Tensor cores.
@@ -101,42 +109,49 @@ go-livepeer requires NVIDIA GPUs with NVENC and NVDEC support. AMD and Intel GPU
 **Tesla T4**
 Yes
+Yes
 Yes
 Data centre card. 16 GB VRAM. Common in cloud.
 **Tesla V100**
 Yes
+Yes
 Yes
 Data centre. 16/32 GB VRAM.
 **A100**
 Yes
+No
 Yes
 Data centre. 40/80 GB VRAM. Highest throughput.
 **A10 / A10G**
 Yes
+Yes
 Yes
 Cloud-optimised (AWS G5, etc.). 24 GB VRAM.
 **L4**
 Yes
+Yes
 Yes
 Ada Lovelace data centre. 24 GB VRAM. Good for AI.
 **L40 / L40S**
 Yes
+Yes
 Yes
 48 GB VRAM. High-end AI and transcoding.
 **H100**
 Transcoding works but overkill
+No
 Yes
 80 GB VRAM. Primarily for LLM and large model inference.
@@ -314,6 +329,96 @@ For detailed per-pipeline VRAM planning, see the [Model and Demand Reference](/v
+
+## Popular GPUs for AI Workloads
+
+The following GPUs are well-suited for AI inference on the Livepeer network. VRAM is the primary constraint - larger models require more VRAM, and running multiple warm models simultaneously multiplies the requirement.
+
+
+
+GPU Model
+VRAM
+HEVC Encode
+Best For
+Notes
+
+
+**RTX 4090**
+24 GB
+Yes
+Image/video AI, quantised LLMs
+Top consumer GPU. High throughput for diffusion models.
+
+
+**RTX 3090 / 3090 Ti**
+24 GB
+Yes
+Image/video AI, quantised LLMs
+Best value 24 GB option. Widely used in the Livepeer network.
+
+
+**RTX 4070 Ti Super**
+16 GB
+Yes
+Single warm AI model + transcoding
+Good balance of price, power, and VRAM.
+
+
+**RTX 4080 Super**
+16 GB
+Yes
+Single warm AI model + transcoding
+Higher CUDA core count than the 4070 Ti Super.
+
+
+**Tesla T4**
+16 GB
+Yes
+Cloud AI inference
+Efficient data centre card. Common in AWS/GCP/Azure.
+
+
+**A10G**
+24 GB
+Yes
+Cloud AI inference + transcoding
+AWS G5 instance GPU. Strong diffusion model performance.
+
+
+**L4**
+24 GB
+Yes
+Cloud AI inference + transcoding
+Ada Lovelace. Efficient power draw. Common in GCP.
+
+
+**L40S**
+48 GB
+Yes
+Multi-model AI, large LLMs
+48 GB VRAM allows multiple warm models simultaneously.
+
+
+**A100 SXM/PCIe**
+40/80 GB
+No
+Large LLMs, high-throughput inference
+Data centre. Highest AI throughput available at scale. No NVENC encoder.
+
+
+**H100 SXM/PCIe**
+80 GB
+No
+Very large LLMs, maximum throughput
+Data centre. ~3x faster than A100 for LLM inference. No NVENC encoder.
+
+
+
+For orchestrators running both AI pipelines and video transcoding, prioritise VRAM when selecting a GPU. A 24 GB card such as the RTX 3090, A10G, or L4 provides headroom for one or two warm AI models alongside active transcoding sessions.
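+
+As a rough planning aid, the sketch below compares each GPU's free VRAM, as reported by `nvidia-smi`, with a hypothetical warm-model budget. The model names and VRAM figures are illustrative assumptions only - replace them with measured numbers for your own pipelines.
+
+```python
+# Minimal sketch: does a planned set of warm models fit in each GPU's free VRAM?
+# Model names and MiB figures below are illustrative assumptions, not measurements.
+import subprocess
+
+WARM_MODELS_MIB = {
+    "example-diffusion-model": 10_000,  # hypothetical warm model
+    "example-upscaler": 4_000,          # hypothetical warm model
+}
+TRANSCODE_HEADROOM_MIB = 2_000  # rough allowance for concurrent NVENC sessions
+
+# nvidia-smi reports memory in MiB with these flags.
+out = subprocess.check_output(
+    ["nvidia-smi",
+     "--query-gpu=name,memory.total,memory.used",
+     "--format=csv,noheader,nounits"],
+    text=True,
+)
+
+budget = sum(WARM_MODELS_MIB.values()) + TRANSCODE_HEADROOM_MIB
+for line in out.strip().splitlines():
+    name, total, used = (field.strip() for field in line.split(","))
+    free = int(total) - int(used)
+    verdict = "fits" if free >= budget else "does NOT fit"
+    print(f"{name}: {free} MiB free - {budget} MiB warm set {verdict}")
+```
+
+`nvidia-smi` ships with the NVIDIA driver, so the sketch needs no extra dependencies; run it on the orchestrator host before pinning warm models.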
+
+
+
+
 ## See Also
diff --git a/v2/orchestrators/resources/reference/gpu-support.mdx b/v2/orchestrators/resources/reference/gpu-support.mdx
index 128169e79..e39c73448 100644
--- a/v2/orchestrators/resources/reference/gpu-support.mdx
+++ b/v2/orchestrators/resources/reference/gpu-support.mdx
@@ -13,6 +13,8 @@ keywords:
   - session limits
   - RTX
   - transcoding
+  - HEVC
+  - H.265
   - AI inference
 'og:image': /snippets/assets/media/og-images/en/orchestrators.png
 'og:image:alt': Livepeer Docs social preview image for Orchestrators
@@ -23,7 +25,7 @@ pageType: reference
 audience: orchestrator
 purpose: reference
 status: review
-lastVerified: 2026-03-13
+lastVerified: 2026-04-07
 ---
 {/* TODO: Terminology Validation:
@@ -64,6 +66,7 @@ go-livepeer requires NVIDIA GPUs with NVENC and NVDEC support. AMD and Intel GPU
 GPU Family
 Transcoding
+HEVC Encode
 AI Inference
 Notes
@@ -71,29 +74,34 @@
 **GeForce RTX 40xx** (Ada Lovelace)
 Yes
+Yes
 Yes
-Best consumer option. AV1 encode support.
+Best consumer option. AV1 and HEVC 10-bit encode support.
 **GeForce RTX 30xx** (Ampere)
 Yes
+Yes
 Yes
 Widely used by orchestrators. Good price-performance.
 **GeForce RTX 20xx** (Turing)
 Yes
+Yes
 Yes
-Supported but older.
+Supported but older. HEVC B-frames supported.
 **GeForce GTX 16xx** (Turing)
 Yes
+Yes
 Limited
-No Tensor cores — AI inference slower or unsupported for some pipelines.
+No Tensor cores - AI inference slower or unsupported for some pipelines.
 **GeForce GTX 10xx** (Pascal)
 Yes
+Yes
 Limited
 Legacy. NVENC Gen 6. No Tensor cores.
@@ -101,42 +109,49 @@ go-livepeer requires NVIDIA GPUs with NVENC and NVDEC support. AMD and Intel GPU
 **Tesla T4**
 Yes
+Yes
 Yes
 Data centre card. 16 GB VRAM. Common in cloud.
 **Tesla V100**
 Yes
+Yes
 Yes
 Data centre. 16/32 GB VRAM.
 **A100**
 Yes
+No
 Yes
 Data centre. 40/80 GB VRAM. Highest throughput.
 **A10 / A10G**
 Yes
+Yes
 Yes
 Cloud-optimised (AWS G5, etc.). 24 GB VRAM.
 **L4**
 Yes
+Yes
 Yes
 Ada Lovelace data centre. 24 GB VRAM. Good for AI.
 **L40 / L40S**
 Yes
+Yes
 Yes
 48 GB VRAM. High-end AI and transcoding.
 **H100**
 Transcoding works but overkill
+No
 Yes
 80 GB VRAM. Primarily for LLM and large model inference.
@@ -314,6 +329,96 @@ For detailed per-pipeline VRAM planning, see the [Model and Demand Reference](/v
+
+## Popular GPUs for AI Workloads
+
+The following GPUs are well-suited for AI inference on the Livepeer network. VRAM is the primary constraint - larger models require more VRAM, and running multiple warm models simultaneously multiplies the requirement.
+
+
+
+GPU Model
+VRAM
+HEVC Encode
+Best For
+Notes
+
+
+**RTX 4090**
+24 GB
+Yes
+Image/video AI, quantised LLMs
+Top consumer GPU. High throughput for diffusion models.
+
+
+**RTX 3090 / 3090 Ti**
+24 GB
+Yes
+Image/video AI, quantised LLMs
+Best value 24 GB option. Widely used in the Livepeer network.
+
+
+**RTX 4070 Ti Super**
+16 GB
+Yes
+Single warm AI model + transcoding
+Good balance of price, power, and VRAM.
+
+
+**RTX 4080 Super**
+16 GB
+Yes
+Single warm AI model + transcoding
+Higher CUDA core count than the 4070 Ti Super.
+
+
+**Tesla T4**
+16 GB
+Yes
+Cloud AI inference
+Efficient data centre card. Common in AWS/GCP/Azure.
+
+
+**A10G**
+24 GB
+Yes
+Cloud AI inference + transcoding
+AWS G5 instance GPU. Strong diffusion model performance.
+
+
+**L4**
+24 GB
+Yes
+Cloud AI inference + transcoding
+Ada Lovelace. Efficient power draw. Common in GCP.
+
+
+**L40S**
+48 GB
+Yes
+Multi-model AI, large LLMs
+48 GB VRAM allows multiple warm models simultaneously.
+
+
+**A100 SXM/PCIe**
+40/80 GB
+No
+Large LLMs, high-throughput inference
+Data centre. Highest AI throughput available at scale. No NVENC encoder.
+
+
+**H100 SXM/PCIe**
+80 GB
+No
+Very large LLMs, maximum throughput
+Data centre. ~3x faster than A100 for LLM inference. No NVENC encoder.
+
+
+
+For orchestrators running both AI pipelines and video transcoding, prioritise VRAM when selecting a GPU. A 24 GB card such as the RTX 3090, A10G, or L4 provides headroom for one or two warm AI models alongside active transcoding sessions.
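+
+As a rough planning aid, the sketch below compares each GPU's free VRAM, as reported by `nvidia-smi`, with a hypothetical warm-model budget. The model names and VRAM figures are illustrative assumptions only - replace them with measured numbers for your own pipelines.
+
+```python
+# Minimal sketch: does a planned set of warm models fit in each GPU's free VRAM?
+# Model names and MiB figures below are illustrative assumptions, not measurements.
+import subprocess
+
+WARM_MODELS_MIB = {
+    "example-diffusion-model": 10_000,  # hypothetical warm model
+    "example-upscaler": 4_000,          # hypothetical warm model
+}
+TRANSCODE_HEADROOM_MIB = 2_000  # rough allowance for concurrent NVENC sessions
+
+# nvidia-smi reports memory in MiB with these flags.
+out = subprocess.check_output(
+    ["nvidia-smi",
+     "--query-gpu=name,memory.total,memory.used",
+     "--format=csv,noheader,nounits"],
+    text=True,
+)
+
+budget = sum(WARM_MODELS_MIB.values()) + TRANSCODE_HEADROOM_MIB
+for line in out.strip().splitlines():
+    name, total, used = (field.strip() for field in line.split(","))
+    free = int(total) - int(used)
+    verdict = "fits" if free >= budget else "does NOT fit"
+    print(f"{name}: {free} MiB free - {budget} MiB warm set {verdict}")
+```
+
+`nvidia-smi` ships with the NVIDIA driver, so the sketch needs no extra dependencies; run it on the orchestrator host before pinning warm models.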
+
+
+
+
 ## See Also