This example shows a kernel that computes a moving average over one-dimensional data. In a sequential program, the moving average of a given input array is found by processing the elements one by one. The average of the previous $n$ elements is the moving average at a given position.
- Define constants to control the problem size and the kernel launch parameters.
- Allocate and initialize the input array. This array is initialized as the sequentially increasing sequence $0, 1, 2, \ldots \bmod n$.
- Allocate the device array and copy the host array to it.
- Launch the kernel to compute the moving average.
- Copy the result back to the host and validate it. Each average is computed from $n$ consecutive values of the input array. Because the input repeats with period $n$, every such window contains the values $0, 1, 2, \ldots, n - 1$ in some order, the average of which is equal to $(n-1)/2$.
Device memory is allocated with `hipMalloc` and deallocated with `hipFree`. Copies to and from the device are made with `hipMemcpy`, using the options `hipMemcpyHostToDevice` and `hipMemcpyDeviceToHost`, respectively. A kernel is launched with the `myKernel<<<params>>>()` syntax. Shared memory is allocated in the kernel with the `__shared__` memory space specifier.
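Used in device code: `__shared__`, `__syncthreads`, `blockDim`, `blockIdx`, `threadIdx`.

Used for the kernel declaration and in host code: `__global__`, `hipFree`, `hipGetLastError`, `hipMalloc`, `hipMemcpy`, `hipMemcpyDeviceToHost`, `hipMemcpyHostToDevice`, `hipStreamDefault`.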