wgpu WGSL compute shader does not appear to be doing anything


I'm trying to follow the "hello compute" example from wgpu on Windows 10 (with some minor modifications, mainly gutting the shader so it does essentially no actual computing), but when I read the buffer back at the end, it's always zeroed out.

This is the shader I'm trying to run; it compiles fine and I believe it's correct:

[[block]]
struct Numbers
{
    data: [[stride(4)]] array<u32>;
};


[[group(0), binding(0)]]
var<storage, read_write> numbers: Numbers;

[[stage(compute), workgroup_size(1)]]
fn main()
{
    numbers.data[0] = numbers.data[0] + u32(1);
    numbers.data[1] = numbers.data[1] + u32(1);
    numbers.data[2] = numbers.data[2] + u32(1);
}

As for the wgpu code, it follows the tutorial quite closely:

I get the instance, adapter, device, and queue:

let instance = Instance::new(Backends::PRIMARY);

let adapter = block_on(instance
    .request_adapter(&RequestAdapterOptions
    {
        power_preference: PowerPreference::default(),
        compatible_surface: None,
    }))
    .unwrap();

let (device, queue) = block_on(adapter
    .request_device(&Default::default(), None))
    .unwrap();
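
For debugging, it can also help to print which adapter was actually selected, e.g. with adapter.get_info() (a quick diagnostic sketch; the printed fields come from wgpu's AdapterInfo):

// Diagnostic only: show which GPU and backend were picked.
let info = adapter.get_info();
println!("adapter: {} ({:?}, {:?})", info.name, info.device_type, info.backend);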

Compile the shader and make a pipeline:

let shader = device.create_shader_module(&ShaderModuleDescriptor
{
    label: Some("shader"),
    source: ShaderSource::Wgsl(shader_src.into()),
});

let pipeline = device.create_compute_pipeline(&ComputePipelineDescriptor
{
    label: None,
    layout: None,
    module: &shader,
    entry_point: "main",
});

Make the staging and storage buffers. The dbg!(size) prints 12, which should be correct for a 3-element array of 4-byte u32s.

let buffer = [1u32, 2, 3];
let size = std::mem::size_of_val(&buffer) as u64;
dbg!(size);

let staging_buffer = device.create_buffer(&BufferDescriptor
{
    label: None,
    size: size,
    usage: BufferUsages::MAP_READ | BufferUsages::COPY_DST,
    mapped_at_creation: false,
});

let storage_buffer = device.create_buffer_init(&BufferInitDescriptor
{
    label: Some("storage buffer"),
    contents: cast_slice(&buffer),
    usage: BufferUsages::STORAGE
        | BufferUsages::COPY_DST
        | BufferUsages::COPY_SRC,
});

Set up the bind group:

let bg_layout = pipeline.get_bind_group_layout(0);
let bind_group = device.create_bind_group(&BindGroupDescriptor
{
    label: None,
    layout: &bg_layout,
    entries: &[BindGroupEntry
    {
        binding: 0,
        resource: storage_buffer.as_entire_binding(),
    }]
});

Get the encoder and create the compute pass. The copy_buffer_to_buffer call should copy the storage buffer into the staging buffer so I can read it back at the end.

let mut encoder = device.create_command_encoder(&CommandEncoderDescriptor
{
    label: None,
});

{
    let mut cpass = encoder.begin_compute_pass(&ComputePassDescriptor
    {
        label: None
    });
    cpass.set_pipeline(&pipeline);
    cpass.set_bind_group(0, &bind_group, &[]);
    cpass.dispatch(1, 1, 1);
}

encoder.copy_buffer_to_buffer(
    &storage_buffer, 0,
    &staging_buffer, 0,
    size);

queue.submit(Some(encoder.finish()));
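
To rule out a silent validation failure, the submission can be wrapped in an error scope (a diagnostic sketch using wgpu's push_error_scope/pop_error_scope; it would stand in for the plain queue.submit above):

// Diagnostic sketch: surface validation errors from the submission
// instead of letting them pass silently.
device.push_error_scope(ErrorFilter::Validation);
queue.submit(Some(encoder.finish()));
if let Some(err) = block_on(device.pop_error_scope())
{
    eprintln!("validation error: {:?}", err);
}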

Then I map the staging buffer and block for the result:

let buf_slice = staging_buffer.slice(..);
let buf_future = buf_slice.map_async(MapMode::Read);

device.poll(Maintain::Wait);

if let Ok(()) = block_on(buf_future)
{
    let data = buf_slice.get_mapped_range();
    let result = cast_slice::<u8, u32>(&data).to_vec();

    drop(data);
    staging_buffer.unmap();

    println!("{:?}", result);
}
else
{
    println!("error");
}

The error case isn't reached and the program terminates without errors, but the printed result is always [0, 0, 0] when it should be [2, 3, 4].

What am I doing wrong?

Accepted answer:

The program works fine when I run it on my discrete graphics card, but wgpu is buggy on my integrated Intel HD Graphics 630, which is why the program appeared to do nothing.
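
A possible workaround (not part of the original answer) is to steer wgpu toward the discrete GPU when requesting the adapter: either ask for the high-performance adapter, or enumerate the adapters and pick a discrete one. A minimal sketch, assuming the same wgpu version and imports as the question:

// Option 1: prefer the high-performance (usually discrete) adapter.
let adapter = block_on(instance
    .request_adapter(&RequestAdapterOptions
    {
        power_preference: PowerPreference::HighPerformance,
        compatible_surface: None,
    }))
    .unwrap();

// Option 2: enumerate adapters and pick a discrete GPU explicitly.
let adapter = instance
    .enumerate_adapters(Backends::PRIMARY)
    .find(|a| a.get_info().device_type == DeviceType::DiscreteGpu)
    .expect("no discrete GPU found");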