WebGPU Particles

A million particles. Zero CPU involvement after init.

GPU-computed particle simulation using WebGPU compute shaders. Physics runs entirely in WGSL — the CPU never touches particle state after initialization. One million particles at 60fps.

// project.spec

Type: Experiment

Status: Live

Year: 2025

// stack

WebGPU, Compute Shaders, WGSL, TypeScript

One million particles. GPU-computed physics. Zero CPU involvement after initialization. This is the demo that made WebGPU feel real to me — not a spec document, not a blog post about the future, but an actual compute pipeline running in a browser tab at 60 frames per second.

The architecture

The demo has three stages: a physics compute pass, an optional grid-building compute pass, and a render pass.

Stage 1: Physics update. A compute shader reads the current position and velocity of every particle from a storage buffer, applies forces (gravity, wind, turbulence, mouse interaction), integrates with a semi-implicit Euler step (velocity first, then position), and writes the updated state to the output buffer. Every particle is processed in parallel; there is no particle-to-particle iteration in the shader.

Stage 2: Grid hashing (optional). For particle-particle interactions (collision, flocking), a spatial hash grid maps particles to cells. A separate compute pass rebuilds the grid from the current positions, and the physics pass then uses it to restrict neighbor lookups to nearby cells instead of scanning every particle. With the grid enabled, the demo supports up to 100k interacting particles at 60fps.
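The grid build can be sketched on the CPU as a counting sort: count particles per cell, prefix-sum the counts into start offsets, then scatter particle indices into place. This is an illustrative TypeScript reference, not the demo's shader code; the names `cellIndex` and `buildGrid`, the bounded cubic grid, and the packed `[x, y, z, ...]` layout are all assumptions.

```typescript
// Hypothetical CPU reference for a uniform-grid build (counting sort).
// Positions are packed as [x0, y0, z0, x1, y1, z1, ...].
interface Grid {
  cellStart: Uint32Array; // cell c owns sorted[cellStart[c] .. cellStart[c + 1])
  sorted: Uint32Array;    // particle indices grouped by cell
}

function cellIndex(x: number, y: number, z: number, cellSize: number, dim: number): number {
  // Clamp so particles outside the domain land in a border cell.
  const clampCell = (v: number) => Math.min(dim - 1, Math.max(0, Math.floor(v / cellSize)));
  return clampCell(x) + dim * (clampCell(y) + dim * clampCell(z));
}

function buildGrid(positions: Float32Array, cellSize: number, dim: number): Grid {
  const count = positions.length / 3;
  const cellCount = dim * dim * dim;
  const cellOf = new Uint32Array(count);
  const cellStart = new Uint32Array(cellCount + 1);

  // Pass 1: count particles per cell (one atomic increment per particle on a GPU).
  for (let i = 0; i < count; i++) {
    const c = cellIndex(positions[3 * i], positions[3 * i + 1], positions[3 * i + 2], cellSize, dim);
    cellOf[i] = c;
    cellStart[c + 1]++;
  }

  // Pass 2: running sum turns per-cell counts into start offsets.
  for (let c = 0; c < cellCount; c++) cellStart[c + 1] += cellStart[c];

  // Pass 3: scatter each particle index into its cell's slot range.
  const cursor = cellStart.slice(0, cellCount);
  const sorted = new Uint32Array(count);
  for (let i = 0; i < count; i++) sorted[cursor[cellOf[i]]++] = i;

  return { cellStart, sorted };
}
```

A neighbor query then only visits the 27 cells around a particle's own cell, reading each cell's index range from `cellStart`.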

Stage 3: Render. A vertex shader reads the position buffer and renders each particle as a point sprite. The fragment shader applies color based on velocity magnitude — slow particles are cool tones, fast particles are warm tones, creating a heatmap-like visualization of the velocity field.
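The velocity-to-color mapping can be approximated as a linear blend between a cool and a warm endpoint. The sketch below is a TypeScript stand-in for what the fragment shader might do; the endpoint colors, the `maxSpeed` parameter, and the function name `speedToColor` are assumptions, since the source does not give the actual ramp.

```typescript
type RGB = [number, number, number];

// Hypothetical endpoints: blue for slow particles, orange-red for fast ones.
const COOL: RGB = [0.1, 0.3, 1.0];
const WARM: RGB = [1.0, 0.35, 0.05];

function speedToColor(vx: number, vy: number, vz: number, maxSpeed: number): RGB {
  const speed = Math.hypot(vx, vy, vz);
  // Normalize to [0, 1]; speeds above maxSpeed saturate at the warm end.
  const t = Math.min(1, Math.max(0, speed / maxSpeed));
  // Linear blend, the same arithmetic as WGSL's mix(COOL, WARM, t).
  return [
    COOL[0] + (WARM[0] - COOL[0]) * t,
    COOL[1] + (WARM[1] - COOL[1]) * t,
    COOL[2] + (WARM[2] - COOL[2]) * t,
  ];
}
```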

What makes WebGPU essential here

You can do particle simulation in WebGL. I've done it. The technique is well-known: encode particle state as pixel data in floating-point textures, use fragment shaders to compute updates, ping-pong between two textures each frame. It works. It's also fragile, limited by texture format constraints, and essentially a hack — you're using the rendering pipeline to do computation.
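To make the texture trick concrete, here is a small TypeScript sketch of the bookkeeping it requires: sizing a texture to hold the particles, mapping a particle index to a texel, and alternating read and write targets. The helper names are hypothetical, and the one-texel-per-particle layout is an assumption; a real implementation would also need a second texture (or multiple render targets) for velocity, since an RGBA texel holds only four floats.

```typescript
// Smallest square texture that holds `count` particles, one texel each.
function textureSizeFor(count: number): number {
  return Math.ceil(Math.sqrt(count));
}

// Particle index -> texel coordinate in a width x width texture.
function texelOf(index: number, width: number): { x: number; y: number } {
  return { x: index % width, y: Math.floor(index / width) };
}

// Ping-pong: frame f samples texture (f % 2) and renders into the other one.
function pingPong(frame: number): { read: number; write: number } {
  const read = frame % 2;
  return { read, write: 1 - read };
}
```

For one million particles this gives a 1000 × 1000 texture, and every shader that touches particle state has to repeat the index-to-texel conversion.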

WebGPU removes the hack. Compute shaders read and write typed storage buffers. There's no texture encoding, no framebuffer binding, no pretending that particle positions are pixel colors. The code reads like GPU compute code because that's what it is:

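// Excerpt: the u_* uniforms, the particles_in / particles_out storage buffers,
// the Particle struct, and turbulence() are declared elsewhere in the shader.
@compute @workgroup_size(64)
fn physics_main(@builtin(global_invocation_id) id: vec3<u32>) {
    let idx = id.x;
    // The dispatch is rounded up to whole workgroups, so skip surplus invocations.
    if (idx >= u_particle_count) { return; }

    var pos = particles_in[idx].position;
    var vel = particles_in[idx].velocity;

    // Apply forces
    vel = vel + u_gravity * u_delta_time;
    vel = vel + turbulence(pos) * u_delta_time;
    vel = vel * u_damping;

    // Integrate: position advances with the already-updated velocity
    pos = pos + vel * u_delta_time;

    // Write to the output buffer; the input buffer stays untouched this frame
    particles_out[idx] = Particle(pos, vel);
}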
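On the host side, the number of workgroups to dispatch follows from the particle count and the shader's workgroup size. The TypeScript sketch below shows that arithmetic and why the shader's early return is needed; the helper names are hypothetical and the demo's actual host code is not shown in the source.

```typescript
const WORKGROUP_SIZE = 64; // must match @workgroup_size(64) in the shader

// Number of workgroups to dispatch so every particle gets an invocation.
function workgroupCount(particleCount: number, workgroupSize = WORKGROUP_SIZE): number {
  return Math.ceil(particleCount / workgroupSize);
}

// Invocations launched beyond the last particle; these hit the early return.
function idleInvocations(particleCount: number, workgroupSize = WORKGROUP_SIZE): number {
  return workgroupCount(particleCount, workgroupSize) * workgroupSize - particleCount;
}
```

The result is what would be passed to `dispatchWorkgroups` on the compute pass encoder. One million particles divide evenly into 15,625 workgroups, which is under WebGPU's default per-dimension limit of 65,535; 100,000 particles need 1,563 workgroups and leave 32 surplus invocations for the bounds check to reject.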

Performance characteristics

On a 2021 MacBook Pro (Apple M1 Pro, 16 GPU cores), the demo runs at:

1M particles (physics only): ~2.5ms per frame for physics → 60fps, GPU at ~30%
100k particles with spatial hash + neighbor forces: ~4ms per frame → 60fps, GPU at ~50%
1M particles with rendering: ~5ms total → 60fps, GPU at ~65%

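The dominant cost at 1M particles is memory traffic. The particle state is 1M × (position + velocity) = 1M × 24 bytes = 24 MB, and each frame reads it once and writes it once, so about 48 MB moves per frame, or roughly 2.9 GB/s at 60fps. (If the Particle struct stores both fields as vec3<f32>, WGSL's 16-byte alignment pads each particle to 32 bytes, which raises this to about 3.8 GB/s.) The M1 Pro has approximately 200 GB/s of memory bandwidth, so this is well within the envelope, but the memory access pattern still matters: coalesced reads and writes from the compute shader are what keep the pass close to the bandwidth the hardware can actually deliver.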
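The traffic estimate can be checked with a few lines of arithmetic. This TypeScript sketch counts the read and the write of the particle state separately, so it comes out at twice the size of the state per frame. It compares a tightly packed 24-byte particle with a 32-byte one; the padded case assumes a struct with two `vec3<f32>` members, which WGSL aligns to 16 bytes each. The function name is hypothetical.

```typescript
const FLOAT_BYTES = 4;

// Bytes moved per second when each frame reads and writes the full particle state.
function bufferTrafficPerSecond(particles: number, bytesPerParticle: number, fps: number): number {
  const readBytes = particles * bytesPerParticle;
  const writeBytes = particles * bytesPerParticle;
  return (readBytes + writeBytes) * fps;
}

const packed = 6 * FLOAT_BYTES; // 24 bytes: position + velocity as six f32 values
const padded = 2 * 16;          // 32 bytes: two vec3<f32> members, each aligned to 16

const packedGBs = bufferTrafficPerSecond(1_000_000, packed, 60) / 1e9; // about 2.88
const paddedGBs = bufferTrafficPerSecond(1_000_000, padded, 60) / 1e9; // about 3.84
```

Either figure is under 2% of the roughly 200 GB/s quoted for the M1 Pro.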

What I learned

The biggest surprise was how much the explicit memory model matters. In WebGL, you don't think about buffer synchronization because the API handles it for you, but the cost is unpredictable latency. In WebGPU, you lay out the passes, buffer usages, and command submissions yourself, so the points where one piece of work has to wait on another are visible in the code. This forces you to think about data dependencies in a way that WebGL let you ignore, and the result is faster, more predictable code.

The other lesson was that WGSL is a better language than GLSL for the same reason TypeScript is better than untyped JavaScript: the compiler catches errors before they become runtime mysteries. The trade-off is verbosity — WGSL requires explicit types and bounds checks that GLSL implicitly handles. For a demo with a few hundred lines of shader code, the verbosity is worth it. For a quick prototype, GLSL is still faster to write.

Try it

The demo is live at particles.anjana784.dev. It requires a browser with WebGPU support — Chrome 113+, Edge 113+, or Firefox Nightly with the dom.webgpu.enabled flag. Source is on GitHub.
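A page like this can check for support before starting the simulation. The sketch below is one minimal way to do it in TypeScript, not necessarily how the demo does; `hasWebGPU` and the two small interfaces are hypothetical, written so the check can run against any object shaped like `navigator`.

```typescript
// Minimal shape of the parts of `navigator` this check touches.
interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

async function hasWebGPU(nav: NavigatorLike): Promise<boolean> {
  // Browsers without WebGPU do not expose `navigator.gpu` at all.
  if (!nav.gpu) return false;
  // The API can exist while no suitable adapter is available; that resolves to null.
  const adapter = await nav.gpu.requestAdapter();
  return adapter !== null;
}
```

In a browser this would be called as `hasWebGPU(navigator)`, with a fallback message shown when it resolves to `false`.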
