Broadcast lane #29

Proposed in #28 (originally #27). It differs from the existing splat in that it broadcasts a lane of the input vector rather than a scalar, and it takes an index selecting which lane to broadcast:

> Gets a single lane from vector and broadcast it to the entire vector.
> `idx` is interpreted modulo the cardinal of the vector.
> 
> - `vec.v8.splat_lane(v: vec.v8, idx: i32) -> vec.v8`
> - `vec.v16.splat_lane(v: vec.v16, idx: i32) -> vec.v16`
> - `vec.v32.splat_lane(v: vec.v32, idx: i32) -> vec.v32`
> - `vec.v64.splat_lane(v: vec.v64, idx: i32) -> vec.v64`
> - `vec.v128.splat_lane(v: vec.v128, idx: i32) -> vec.v128`
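
To make the quoted semantics concrete, here is a minimal scalar sketch for the 8-bit case. The function name `splat_lane_v8`, the array-plus-length representation of a vector, and treating `idx` as unsigned before the modulo are all assumptions made for illustration; the proposal text above does not specify them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vec.v8.splat_lane.
 * `src` and `dst` stand in for the vector's 8-bit lanes, `n` for its lane
 * count. Assumption: `idx` is treated as unsigned before the modulo. */
void splat_lane_v8(const uint8_t *src, uint8_t *dst, size_t n, uint32_t idx)
{
    assert(n > 0);                 /* an empty vector has no lane to select */

    /* Read the selected lane first so that dst may alias src. */
    uint8_t value = src[idx % n];

    for (size_t i = 0; i < n; ++i)
        dst[i] = value;
}
```

With a four-lane input `{10, 20, 30, 40}` and `idx = 6`, the index wraps to lane 2 and every output lane becomes 30.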

On x86, broadcast instructions first appear in AVX (for 32-bit floating-point elements; AVX2 adds the integer variants). However, the x86 variants don't take an index and only broadcast the first element of the source. On SSE, a general-purpose shuffle would be needed to emulate this, which is not great (definitely slower than a specialized instruction). Also, taking an index would turn this into a general-purpose shuffle on AVX and later as well.
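
As a rough illustration of the SSE concern, the sketch below emulates an indexed broadcast of 32-bit lanes with SSE2 shuffles. It assumes a fixed 128-bit vector (four lanes), which is a simplification chosen for this example, and the names `splat_lane_v32_sse2` and `splat_lane_v32x4` are hypothetical. Because `_mm_shuffle_epi32` needs a compile-time immediate, a runtime index has to be dispatched over all possible lanes; a scalar fallback is included for targets without SSE2.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* Runtime index on SSE2: the shuffle control must be an immediate,
 * so each possible lane gets its own shuffle instruction. */
static __m128i splat_lane_v32_sse2(__m128i v, uint32_t idx)
{
    switch (idx & 3) {             /* idx modulo the 4 lanes */
    case 0:  return _mm_shuffle_epi32(v, 0x00);  /* lane 0 to all lanes */
    case 1:  return _mm_shuffle_epi32(v, 0x55);  /* lane 1 to all lanes */
    case 2:  return _mm_shuffle_epi32(v, 0xAA);  /* lane 2 to all lanes */
    default: return _mm_shuffle_epi32(v, 0xFF);  /* lane 3 to all lanes */
    }
}
#endif

/* Hypothetical wrapper over plain arrays; assumes exactly four 32-bit lanes. */
void splat_lane_v32x4(const uint32_t src[4], uint32_t dst[4], uint32_t idx)
{
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)dst, splat_lane_v32_sse2(v, idx));
#else
    /* Portable fallback with the same result. */
    uint32_t value = src[idx & 3];
    for (int i = 0; i < 4; ++i)
        dst[i] = value;
#endif
}
```

When the index is a compile-time constant, the dispatch collapses to a single shuffle; the extra cost comes from supporting a runtime index.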