[BUG] Incorrect tensor layout stride values in Blackwell tutorial 01_mma_sm100.cu comments #2920

@Johnsonms

Which component has the problem?

CuTe Tutorial

Bug Report

Describe the bug
The Blackwell MMA tutorial example 01_mma_sm100.cu contains incorrect tensor layout stride values in the documentation comments. The commented stride values do not match the actual tensor layouts produced when the code executes, which can mislead users learning from the tutorial.

Steps/Code to reproduce bug

  1. Navigate to the Blackwell tutorial directory:
    cd examples/cute/tutorial/blackwell

  2. Compile the tutorial with debug prints enabled:
    nvcc -std=c++17 -arch=sm_100a \
      -I../../../../include \
      -I../../../../tools/util/include \
      01_mma_sm100.cu -o 01_mma_sm100

  3. Run the executable:
    ./01_mma_sm100

  4. Compare the tensor layouts printed to the console with the values documented in the comments at lines 181-184 and 212-214.

Observed incorrect comment values:

  • Line 182: gB: (_256,_64,4):(_1,256,16384)
  • Line 183: gC: (_128,_256):(256,_1)
  • Line 184: gD: (_128,_256):(256,_1)
  • Line 212: tCgB: ((_256,_16),_1,_4,4):((_1,256),_0,4096,16384)
  • Line 213: tCgC: ((_128,_256),_1,_1):((256,_1),_0,_0)
  • Line 214: tCgD: ((_128,_256),_1,_1):((256,_1),_0,_0)

Expected behavior
The documentation comments should accurately reflect the actual tensor layout strides produced by the code. The correct values are:

  • Line 182: gB: (_256,_64,4):(256,_1,_64)
  • Line 183: gC: (_128,_256):(1024,_1)
  • Line 184: gD: (_128,_256):(1024,_1)
  • Line 212: tCgB: ((_256,_16),_1,_4,4):((256,_1),_0,_16,_64)
  • Line 213: tCgC: ((_128,_256),_1,_1):((1024,_1),_0,_0)
  • Line 214: tCgD: ((_128,_256),_1,_1):((1024,_1),_0,_0)
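
The corrected strides can be sanity-checked without CuTe by writing out the linear offset of each element and reading each stride as the offset change when one coordinate advances by one. The sketch below is plain C++17 and makes assumptions that are inferred from the corrected values above, not read from the tutorial source: B is stored with K contiguous and K = 256, C and D are stored with N contiguous and N = 1024, each gB tile covers 64 elements of K, and each MMA block covers 16. The function and constant names are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Assumed sizes, inferred from the corrected strides in this report.
constexpr std::int64_t kK     = 256;   // K extent of B (contiguous dimension)
constexpr std::int64_t kN     = 1024;  // N extent of C and D (contiguous dimension)
constexpr std::int64_t kTileK = 64;    // K extent of one gB tile
constexpr std::int64_t kMmaK  = 16;    // K extent of one MMA block inside a tile

// Offset of element (n, k) of K-tile `kt` in a K-contiguous B.
constexpr std::int64_t offset_gB(std::int64_t n, std::int64_t k, std::int64_t kt) {
    return n * kK + (kt * kTileK + k);
}

// Same memory, with the tile's K extent split into MMA blocks of kMmaK.
constexpr std::int64_t offset_tCgB(std::int64_t n, std::int64_t k,
                                   std::int64_t mma_k, std::int64_t kt) {
    return n * kK + (kt * kTileK + mma_k * kMmaK + k);
}

// Offset of element (m, n) in an N-contiguous C or D.
constexpr std::int64_t offset_gC(std::int64_t m, std::int64_t n) {
    return m * kN + n;
}

// Each stride is the offset difference for a unit step in one coordinate.
constexpr std::array<std::int64_t, 3> strides_gB() {
    return {offset_gB(1, 0, 0) - offset_gB(0, 0, 0),
            offset_gB(0, 1, 0) - offset_gB(0, 0, 0),
            offset_gB(0, 0, 1) - offset_gB(0, 0, 0)};
}

constexpr std::array<std::int64_t, 4> strides_tCgB() {
    return {offset_tCgB(1, 0, 0, 0) - offset_tCgB(0, 0, 0, 0),
            offset_tCgB(0, 1, 0, 0) - offset_tCgB(0, 0, 0, 0),
            offset_tCgB(0, 0, 1, 0) - offset_tCgB(0, 0, 0, 0),
            offset_tCgB(0, 0, 0, 1) - offset_tCgB(0, 0, 0, 0)};
}

constexpr std::array<std::int64_t, 2> strides_gC() {
    return {offset_gC(1, 0) - offset_gC(0, 0),
            offset_gC(0, 1) - offset_gC(0, 0)};
}

// Compile-time checks against the corrected comment values.
static_assert(strides_gB()[0] == 256 && strides_gB()[1] == 1 && strides_gB()[2] == 64,
              "gB strides should be (256,_1,_64)");
static_assert(strides_tCgB()[0] == 256 && strides_tCgB()[1] == 1 &&
              strides_tCgB()[2] == 16 && strides_tCgB()[3] == 64,
              "tCgB non-zero strides should be 256, 1, 16, 64");
static_assert(strides_gC()[0] == 1024 && strides_gC()[1] == 1,
              "gC and gD strides should be (1024,_1)");
```

Under these assumptions the file compiles only if the derived strides match the corrected values. The size-1 modes in tCgB, tCgC, and tCgD carry stride _0 and are not modeled here.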

Environment details (please complete the following information):

  • Environment location: Bare-metal / Docker (depends on user setup)
  • CUTLASS version: 4.3.4 (latest main branch)
  • CUDA version: Any version supporting sm_100a (Blackwell architecture)
  • GPU architecture: Blackwell (SM 100a)
  • File affected: examples/cute/tutorial/blackwell/01_mma_sm100.cu
