Releases: gfx-rs/wgpu
v25.0.1
This release includes `wgpu-core`, `wgpu-hal`, and `naga` version 25.0.1. All other crates remain at their previous versions.
Bug Fixes
- Fix typos in various documentation. By @waywardmonkeys in #7510.
- Fix compile error when building with a `profiling/profile-with-*` feature enabled. By @waywardmonkeys in #7509.
- Use `once_cell::race::OnceBox` instead of `std::sync::LazyLock` to allow `naga::proc::Namer::default()` to be available without backend features being enabled. By @cwfitzgerald in #7517.
DX12
- Fix validation error when creating a non-mappable buffer using the committed allocation scheme. By @cwfitzgerald and @ErichDonGubler in #7519.
v25.0.0
Major Features
Hashmaps Removed from APIs
Both `PipelineCompilationOptions::constants` and `ShaderSource::Glsl::defines` now take slices of key-value pairs instead of `HashMap`s. This prepares for `no_std` support, keeps details such as the hash map hasher as implementation details, and makes it easier to create these structures inline.
By @cwfitzgerald in #7133.
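For illustration, a minimal sketch of the new slice-based form, assuming the pairs are `(&str, f64)` and a hypothetical shader override named `scale`:

```rust
let compilation_options = wgpu::PipelineCompilationOptions {
    // Key-value pairs are now passed as a slice rather than a `HashMap`.
    constants: &[("scale", 2.0)],
    ..Default::default()
};
```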
All Backends Now Have Features
Previously, the `vulkan` and `gles` backends were non-optional on Windows, Linux, and Android, and there was no way to disable them. We have now figured out how to make them properly disableable! Additionally, if you turn on the `webgl` feature, you will only get the GLES backend on WebAssembly; it won't leak into native builds like it previously might have.
Warning
If you use wgpu with `default-features = false` and you want to retain the `vulkan` and `gles` backends, you will need to add them to your feature list.
-wgpu = { version = "24", default-features = false, features = ["metal", "wgsl", "webgl"] }
+wgpu = { version = "25", default-features = false, features = ["metal", "wgsl", "webgl", "vulkan", "gles"] }
By @cwfitzgerald in #7076.
`device.poll` API Reworked
This release reworked the poll API significantly, allowing polling to return errors when it hits internal timeout limits. `Maintain` was renamed to `PollType`. Additionally, `poll` now returns a result containing information about what happened during the poll.
-pub fn wgpu::Device::poll(&self, maintain: wgpu::Maintain) -> wgpu::MaintainResult
+pub fn wgpu::Device::poll(&self, poll_type: wgpu::PollType) -> Result<wgpu::PollStatus, wgpu::PollError>
-device.poll(wgpu::Maintain::Poll);
+device.poll(wgpu::PollType::Poll).unwrap();
pub enum PollType<T> {
/// On wgpu-core based backends, block until the given submission has
/// completed execution, and any callbacks have been invoked.
///
/// On WebGPU, this has no effect. Callbacks are invoked from the
/// window event loop.
WaitForSubmissionIndex(T),
/// Same as WaitForSubmissionIndex but waits for the most recent submission.
Wait,
/// Check the device for a single time without blocking.
Poll,
}
pub enum PollStatus {
/// There are no active submissions in flight as of the beginning of the poll call.
/// Other submissions may have been queued on other threads during the call.
///
/// This implies that the given Wait was satisfied before the timeout.
QueueEmpty,
/// The requested Wait was satisfied before the timeout.
WaitSucceeded,
/// This was a poll.
Poll,
}
pub enum PollError {
/// The requested Wait timed out before the submission was completed.
Timeout,
}
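As an illustration, a minimal sketch of handling the new result, assuming an existing `device`:

```rust
match device.poll(wgpu::PollType::Wait) {
    Ok(wgpu::PollStatus::QueueEmpty | wgpu::PollStatus::WaitSucceeded) => {
        // The submissions that were in flight when the wait began have completed.
    }
    Ok(wgpu::PollStatus::Poll) => {
        // Only returned for `PollType::Poll`; nothing was waited on.
    }
    Err(wgpu::PollError::Timeout) => {
        // The wait hit wgpu's internal timeout before the work completed
        // (notably the default behavior on WebGL, see the warning below).
    }
}
```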
Warning
As part of this change, WebGL's default behavior has changed. Previously, `device.poll(Wait)` appeared to function correctly. This was a quirk caused by the bug that these PRs fixed. Now it will always return `Timeout` if the submission has not already completed. As many people rely on the old behavior on WebGL, there is a new option in `BackendOptions`. If you want the old behavior, set the following on instance creation:
instance_desc.backend_options.gl.fence_behavior = wgpu::GlFenceBehavior::AutoFinish;
You will lose the ability to know exactly when a submission has completed, but `device.poll(Wait)` will behave the same as it does on native.
By @cwfitzgerald in #6942 and #7030.
`wgpu::Device::start_capture` renamed, documented, and made unsafe
- device.start_capture();
+ unsafe { device.start_graphics_debugger_capture() }
// Your code here
- device.stop_capture();
+ unsafe { device.stop_graphics_debugger_capture() }
There is now documentation describing how this maps to the various debuggers' APIs.
By @cwfitzgerald in #7470
Ensure loops generated by SPIR-V and HLSL Naga backends are bounded
Make sure that all loops in shaders generated by these naga backends are bounded to avoid undefined behaviour due to infinite loops. Note that this may have a performance cost. As with the existing implementation for the MSL backend, this can be disabled by using `Device::create_shader_module_trusted()`.
By @jamienicol in #6929 and #7080.
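For a shader you trust, opting out looks roughly like the following sketch (the `"trusted.wgsl"` path is hypothetical; `ShaderRuntimeChecks::unchecked()` disables all runtime checks, including loop bounding):

```rust
let module = unsafe {
    device.create_shader_module_trusted(
        wgpu::include_wgsl!("trusted.wgsl"),
        // Skips loop bounding (and other runtime checks) for this module only.
        wgpu::ShaderRuntimeChecks::unchecked(),
    )
};
```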
Split up `Features` internally
Internally split up the `Features` struct and recombined it using a macro. There should be no breaking changes from this. This means there are now namespaces (in addition to the old `Features::*`) for all wgpu-specific features and WebGPU features (`FeaturesWGPU` and `FeaturesWebGPU`, respectively), as well as `Features::from_internal_flags`, which allows you to be explicit about whether the features you need are also available on the web.
WebGPU compliant dual source blending feature
Previously, dual source blending was implemented with a wgpu-native-only feature flag and used a custom syntax in wgpu.
Since then, dual source blending has been added to the WebGPU spec as an extension.
We're now following suit and implementing the official syntax.
Existing shaders using dual source blending need to be updated:
struct FragmentOutput{
- @location(0) source0: vec4<f32>,
- @location(0) @second_blend_source source1: vec4<f32>,
+ @location(0) @blend_src(0) source0: vec4<f32>,
+ @location(0) @blend_src(1) source1: vec4<f32>,
}
With that, `wgpu::Features::DUAL_SOURCE_BLENDING` is now available on WebGPU.
Furthermore, GLSL shaders now support dual source blending as well, via the `index` layout qualifier:
layout(location = 0, index = 0) out vec4 output0;
layout(location = 0, index = 1) out vec4 output1;
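On the pipeline side, a minimal sketch of a color target blend state that reads the second source; the specific blend factors chosen here are illustrative, not from the release notes:

```rust
// Requires `wgpu::Features::DUAL_SOURCE_BLENDING` on the device.
let blend = wgpu::BlendState {
    color: wgpu::BlendComponent {
        src_factor: wgpu::BlendFactor::One,
        // `Src1` refers to the `@blend_src(1)` output of the fragment shader.
        dst_factor: wgpu::BlendFactor::Src1,
        operation: wgpu::BlendOperation::Add,
    },
    alpha: wgpu::BlendComponent::REPLACE,
};
```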
Unify interface for SpirV shader passthrough
Replaces the device's `create_shader_module_spirv` function with a generic `create_shader_module_passthrough` function taking a `ShaderModuleDescriptorPassthrough` enum as a parameter.
Update your calls to `create_shader_module_spirv` and use `create_shader_module_passthrough` instead:
- device.create_shader_module_spirv(
- wgpu::ShaderModuleDescriptorSpirV {
- label: Some(&name),
- source: Cow::Borrowed(&source),
- }
- )
+ device.create_shader_module_passthrough(
+ wgpu::ShaderModuleDescriptorPassthrough::SpirV(
+ wgpu::ShaderModuleDescriptorSpirV {
+ label: Some(&name),
+ source: Cow::Borrowed(&source),
+ },
+ ),
+ )
Noop Backend
It is now possible to create a dummy `wgpu` device even when no GPU is available. This may be useful for testing code that manages graphics resources. Currently, it supports reading and writing buffers; other resource types can be created but do nothing.
To use it, enable the `noop` feature of `wgpu`, and either call `Device::noop()` or add `NoopBackendOptions { enable: true }` to the backend options of your `Instance` (this is an additional safeguard beyond the `Backends` bits).
By @kpreid in #7063 and #7342.
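A rough sketch of opting in via the instance; the `Backends::NOOP` bit and the `noop` field name on `BackendOptions` are assumptions here (the release notes only name `NoopBackendOptions { enable: true }`), so check the crate docs if those names differ:

```rust
let instance = wgpu::Instance::new(&wgpu::InstanceDescriptor {
    backends: wgpu::Backends::NOOP, // assumed name of the noop backend bit
    backend_options: wgpu::BackendOptions {
        // The additional opt-in safeguard described above.
        noop: wgpu::NoopBackendOptions { enable: true },
        ..Default::default()
    },
    ..Default::default()
});
```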
`SHADER_F16` feature is now available with naga shaders
Previously this feature only allowed you to use `f16` on SPIR-V passthrough shaders. Now you can use it on all shaders, including WGSL, SPIR-V, and GLSL!
enable f16;
fn hello_world(a: f16) -> f16 {
return a + 1.0h;
}
By @FL33TW00D, @ErichDonGubler, and @cwfitzgerald in #5701
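On the API side, the feature still has to be requested when creating the device; a minimal sketch of the descriptor, assuming the adapter reports support:

```rust
let descriptor = wgpu::DeviceDescriptor {
    // Opt in to `f16` support in shaders.
    required_features: wgpu::Features::SHADER_F16,
    ..Default::default()
};
```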
Bindless support improved and validation rules changed.
Metal support for bindless has significantly improved and the limits for binding arrays have been increased.
Previously, all resources inside binding arrays contributed towards the standard limit of their type (`texture_2d` arrays, for example, would contribute to `max_sampled_textures_per_shader_stage`). Now these resources only contribute towards binding-array-specific limits:
- `max_binding_array_elements_per_shader_stage` for all non-sampler resources
- `max_binding_array_sampler_elements_per_shader_stage` for sampler resources
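These limits can be queried like any others; a small sketch, assuming an existing `adapter` (the field names follow the limits listed above):

```rust
let limits = adapter.limits();
let max_elements = limits.max_binding_array_elements_per_shader_stage;
let max_samplers = limits.max_binding_array_sampler_elements_per_shader_stage;
println!("binding array limits: {max_elements} resources, {max_samplers} samplers");
```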
This change has allowed the Metal binding array limits to go from between 32 and 128 resources all the way to 500,000 sampled textures. Additionally, binding arrays are now bound more efficiently on Metal.
This change also enabled legacy Intel GPUs to support 1M bindless resources, instead of the previous 1800.
To facilitate this change, an additional validation rule was put in place: if there is a binding array in a bind group, you may not use dynamic offset buffers or uniform buffers in that bind group. This requirement comes from Vulkan rules on `UpdateAfterBind` descriptors.
By @cwfitzgerald in #6811, #6815, and #6952.
New Features
General
v24.0.4
v24.0.3
v24.0.2
This release bumps `wgpu-core` and `wgpu-hal` to 24.0.2. All other crates remain at their previous versions.
Bug Fixes
v24.0.1
This release contains `wgpu` v24.0.1. All other crates remain at v24.0.0.
Bug Fixes
v24.0.0
Major changes
Refactored Dispatch Between `wgpu-core` and `webgpu`
The crate `wgpu` has two different "backends": one which targets WebGPU in the browser, and one which targets `wgpu_core` on native platforms and WebGL. This was previously very difficult to traverse and add new features to. The entire system was refactored to make it simpler. Additionally, the new system has zero overhead if there is only one "backend" in use. You can see the new system in action by using go-to-definition on any wgpu function in your IDE.
By @cwfitzgerald in #6619.
Most objects in `wgpu` are now `Clone`
All types in the `wgpu` API are now `Clone`.
This is implemented with internal reference counting, so cloning, for instance, a `Buffer` copies only the "handle" of the GPU buffer, not the underlying resource.
Previously, libraries using `wgpu` objects like `Device`, `Buffer`, or `Texture` often had to manually wrap them in an `Arc` to allow passing them between libraries.
This caused a lot of friction: if one library wanted to take a `Buffer` by value, calling code had to give up ownership of the resource, which could interfere with other subsystems.
Note that this also mimics how the WebGPU JavaScript API works, where objects can be cloned and moved around freely.
By @cwfitzgerald in #6665.
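A minimal sketch of what this enables, assuming an existing `buffer` (the `assert_eq!` relies on the `PartialEq` impl described further below):

```rust
// Only an internal reference count is incremented; both handles refer to
// the same GPU buffer.
let same_buffer: wgpu::Buffer = buffer.clone();
assert_eq!(same_buffer, buffer);
```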
Render and Compute Passes Now Properly Enforce Their Lifetime
A regression introduced in 23.0.0 caused lifetimes of render and compute passes to be incorrectly enforced. While this was not a soundness issue, the intent is to move an error from runtime to compile time. This issue has been fixed, restoring the 22.0.0 behavior.
Bindless (`binding_array`) Grew More Capabilities
- DX12 now supports `PARTIALLY_BOUND_BINDING_ARRAY` on Resource Binding Tier 3 hardware, which is most D3D12 hardware; see the D3D12 Feature Table for more information on what hardware supports this feature. By @cwfitzgerald in #6734.
`Device::create_shader_module_unchecked` Renamed and Now Has Configuration Options
`create_shader_module_unchecked` became `create_shader_module_trusted`.
This allows you to customize which exact checks are omitted so that you can get the correct balance of performance and safety for your use case. Calling the function is still unsafe, but now can be used to skip certain checks only on certain builds.
This also allows users to disable the workarounds in the `msl-out` backend that prevent the compiler from optimizing infinite loops. This can have a big impact on performance, but is not recommended for untrusted shaders.
let desc: ShaderModuleDescriptor = include_wgsl!(...)
- let module = unsafe { device.create_shader_module_unchecked(desc) };
+ let module = unsafe { device.create_shader_module_trusted(desc, wgpu::ShaderRuntimeChecks::unchecked()) };
By @cwfitzgerald and @rudderbucky in #6662.
`wgpu::Instance::new` now takes `InstanceDescriptor` by reference
Previously, `wgpu::Instance::new` took `InstanceDescriptor` by value (which is overall fairly uncommon in wgpu).
Furthermore, `InstanceDescriptor` is now cloneable.
- let instance = wgpu::Instance::new(instance_desc);
+ let instance = wgpu::Instance::new(&instance_desc);
Environment Variable Handling Overhaul
Previously, how various bits of code handled reading settings from environment variables was inconsistent and unidiomatic. We have unified it to `Type::from_env()` (or `Type::from_env_or_default()`) and `Type::with_env()` for all types.
- wgpu::util::backend_bits_from_env()
+ wgpu::Backends::from_env()
- wgpu::util::power_preference_from_env()
+ wgpu::PowerPreference::from_env()
- wgpu::util::dx12_shader_compiler_from_env()
+ wgpu::Dx12Compiler::from_env()
- wgpu::util::gles_minor_version_from_env()
+ wgpu::Gles3MinorVersion::from_env()
- wgpu::util::instance_descriptor_from_env()
+ wgpu::InstanceDescriptor::from_env_or_default()
- wgpu::util::parse_backends_from_comma_list(&str)
+ wgpu::Backends::from_comma_list(&str)
By @cwfitzgerald in #6895
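As a quick sketch of the new pattern (the fallback to all backends is illustrative):

```rust
// Read the backend selection from the environment (e.g. `WGPU_BACKEND`),
// falling back to all backends when it is unset or unparsable.
let backends = wgpu::Backends::from_env().unwrap_or_else(wgpu::Backends::all);
```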
Backend-specific instance options are now in separate structs
In order to better facilitate growing more interesting backend options, we have put them into individual structs. This allows users to more easily understand what options can be defaulted and which they care about. All of these new structs implement `from_env()` and delegate to their respective `from_env()` methods.
- let instance = wgpu::Instance::new(&wgpu::InstanceDescriptor {
- backends: wgpu::Backends::all(),
- flags: wgpu::InstanceFlags::default(),
- dx12_shader_compiler: wgpu::Dx12Compiler::Dxc,
- gles_minor_version: wgpu::Gles3MinorVersion::Automatic,
- });
+ let instance = wgpu::Instance::new(&wgpu::InstanceDescriptor {
+ backends: wgpu::Backends::all(),
+ flags: wgpu::InstanceFlags::default(),
+ backend_options: wgpu::BackendOptions {
+ dx12: wgpu::Dx12BackendOptions {
+ shader_compiler: wgpu::Dx12ShaderCompiler::Dxc,
+ },
+ gl: wgpu::GlBackendOptions {
+ gles_minor_version: wgpu::Gles3MinorVersion::Automatic,
+ },
+ },
+ });
If you do not need any of these options, or only need one backend's options, use the `default()` impl to fill out the remaining fields.
By @cwfitzgerald in #6895
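A sketch combining this with the environment handling above: start from the environment and override a single backend option (field names follow the diff above; the GLES version choice is illustrative):

```rust
let mut desc = wgpu::InstanceDescriptor::from_env_or_default();
// Override just the GL backend option we care about.
desc.backend_options.gl.gles_minor_version = wgpu::Gles3MinorVersion::Version2;
let instance = wgpu::Instance::new(&desc);
```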
The `diagnostic(…);` directive is now supported in WGSL
Naga now parses `diagnostic(…);` directives according to the WGSL spec. This allows users to control certain lints, similar to Rust's `allow`, `warn`, and `deny` attributes. For example, in standard WGSL (but, notably, not Naga yet; see #4369) this snippet would emit a uniformity error:
@group(0) @binding(0) var s : sampler;
@group(0) @binding(2) var tex : texture_2d<f32>;
@group(1) @binding(0) var<storage, read> ro_buffer : array<f32, 4>;
@fragment
fn main(@builtin(position) p : vec4f) -> @location(0) vec4f {
if ro_buffer[0] == 0 {
// Emits a derivative uniformity error during validation.
return textureSample(tex, s, vec2(0.,0.));
}
return vec4f(0.);
}
…but we can now silence it with the `off` severity level, like so:
// Disable the diagnostic with this…
diagnostic(off, derivative_uniformity);
@group(0) @binding(0) var s : sampler;
@group(0) @binding(2) var tex : texture_2d<f32>;
@group(1) @binding(0) var<storage, read> ro_buffer : array<f32, 4>;
@fragment
fn main(@builtin(position) p : vec4f) -> @location(0) vec4f {
if ro_buffer[0] == 0 {
// Look ma, no error!
return textureSample(tex, s, vec2(0.,0.));
}
return vec4f(0.);
}
There are some limitations to keep in mind with this new functionality:
- We support `@diagnostic(…)` rules as `fn` attributes, but prioritization for rules in statement positions (i.e., `if (…) @diagnostic(…) { … }`) is unclear. If you are blocked by not being able to parse `diagnostic(…)` rules in statement positions, please let us know in #5320, so we can determine how to prioritize it!
- Standard WGSL specifies `error`, `warning`, `info`, and `off` severity levels. These are all technically usable now! A caveat, though: warning- and info-level are only emitted to stderr via the `log` façade, rather than being reported through a `Result::Err` in Naga or the `CompilationInfo` interface in `wgpu{,-core}`. This will require breaking changes in Naga to fix, and is being tracked by #6458.
- Not all lints can be controlled with `diagnostic(…)` rules. In fact, only the `derivative_uniformity` triggering rule exists in the WGSL standard. That said, Naga contributors are excited to see how this level of control unlocks a new ecosystem of configurable diagnostics.
- Finally, `diagnostic(…)` rules are not yet emitted in WGSL output. This means that `wgsl-in` → `wgsl-out` is currently a lossy process. We felt that it was important to unblock users who needed `diagnostic(…)` rules (i.e., #3135) before we took significant effort to fix this (tracked in #6496).
By @ErichDonGubler in #6456, #6148, #6533, #6353, #6537.
New Features
Naga
- Support atomic operations on fields of global structs in the SPIR-V frontend. By @schell in #6693.
- Clean up tests for atomic operations support in SPIR-V frontend. By @schell in #6692
- Fix an issue where the `naga` CLI would incorrectly skip the first positional argument when `--stdin-file-path` was specified. By @ErichDonGubler in #6480.
- Fix `textureNumLevels` in the GLSL backend. By @magcius in #6483.
- Support 64-bit hex literals and unary operations in constants #6616.
- Implement `quantizeToF16()` for the WGSL frontend, and the WGSL, SPIR-V, HLSL, MSL, and GLSL backends. By @jamienicol in #6519.
- Add support for GLSL `usampler*` and `isampler*`. By @DavidPeicho in [#6513](https://github.co...
v23.1.0 (2024-12-16)
23.0.1 (2024-11-25)
This release includes patches for `wgpu`, `wgpu-core`, and `wgpu-hal`. All other crates remain at 23.0.0.
The changes below were cherry-picked from the 24.0.0 development line.
Bug fixes
General
- Fix `Texture` view leak regression. By @xiaopengli89 in #6576
Metal
Vulkan
v23.0.0 (2024-10-25)
Themes of this release
This release's theme is one that is likely to repeat for a few releases: convergence with the WebGPU specification! WGPU's design and base functionality are actually determined by two specifications: one for WebGPU, and one for the WebGPU Shading Language.
This may not sound exciting, but let us convince you otherwise! All major web browsers have committed to offering WebGPU in their environment. Even JS runtimes like Node and Deno have communities that are very interested in providing WebGPU! WebGPU is slowly eating the world, as it were. 😀 It's really important, then, that WebGPU implementations behave in ways that one would expect across all platforms. For example, if Firefox's WebGPU implementation were to break when running scripts and shaders that worked just fine in Chrome, that would mean sad users for both application authors and browser authors.
WGPU also benefits from standard, portable behavior in the same way as web browsers. Because of this behavior, it's generally fairly easy to port over usage of WebGPU in JavaScript to WGPU. It is also what lets WGPU go full circle: WGPU can be an implementation of WebGPU on native targets, but also it can use other implementations of WebGPU as a backend in JavaScript when compiled to WASM. Therefore, the same dynamic applies: if WGPU's own behavior were significantly different, then WGPU and end users would be sad, sad humans as soon as they discover places where their nice apps are breaking, right?
The answer is: yes, we do have sad, sad humans that really want their WGPU code to work everywhere. As Firefox and others use WGPU to implement WebGPU, the above example of Firefox diverging from the standard is, unfortunately, today's reality. It mostly behaves the same as a standards-compliant WebGPU, but it still doesn't in many important ways. Of particular note is Naga, its implementation of the WebGPU Shading Language. Shaders are pretty much a black-and-white point of failure in GPU programming; if they don't compile, then you can't use the rest of the API! And yet, it's extremely easy to run into a case like that from #4400:
fn gimme_a_float() -> f32 {
return 42; // fails in Naga, but standard WGSL happily converts to `f32`
}
We intend to continue making visible strides in converging with specifications for WebGPU and WGSL, as this release has. This is, unfortunately, one of the major reasons that WGPU has no plans to work hard at keeping a SemVer-stable interface for the foreseeable future; we have an entire platform of GPU programming functionality we have to catch up with, and SemVer stability is unfortunately in tension with that. So, for now, you're going to keep seeing major releases and breaking changes. Where possible, we'll try to make that painless, but compromises to do so don't always make sense with our limited resources.
This is also the last planned major version release of 2024; the next milestone is set for January 1st, 2025, according to our regular 12-week cadence (offset from the originally planned date of 2024-10-09 for this release 😅). We'll see you next year!
Contributor spotlight: @sagudev
This release, we'd like to spotlight the work of @sagudev, who has made significant contributions to the WGPU ecosystem. Among other things, they contributed a particularly notable feature where runtime-known indices are finally allowed for use with `const` array values. For example, this WGSL shader previously wasn't allowed:
const arr: array<u32, 4> = array(1, 2, 3, 4);
fn what_number_should_i_use(idx: u32) -> u32 {
return arr[idx];
}
…but now it works! This is significant because this sort of shader rejection was one of the most impactful issues we are aware of for converging with the WGSL specification. There are more still to go—some of which we expect to even more drastically change how folks author shaders—but we suspect that many more will come in the next few releases, including with @sagudev's help.
We're excited for more of @sagudev's contributions via the Servo community. Oh, did we forget to mention that these contributions were motivated by their work on Servo? That's right, a third well-known JavaScript runtime is now using WGPU for its WebGPU implementation. We're excited to support Servo on its way to becoming another fully fledged browsing environment.
Major Changes
In addition to the above spotlight, we have the following particularly interesting items to call out for this release:
`wgpu-core` is no longer generic over `wgpu-hal` backends
Dynamic dispatch between different backends has been moved from the user-facing `wgpu` crate to a new dynamic dispatch mechanism inside the backend abstraction layer `wgpu-hal`.
Whenever targeting more than a single backend (the default on Windows & Linux), this leads to faster compile times and smaller binaries! This also solves a long-standing issue with `cargo doc` failing to run for `wgpu-core`.
Benchmarking indicated that compute pass recording is slower as a consequence, whereas on render passes speed improvements have been observed. However, this effort simplifies many of the internals of the wgpu family of crates which we're hoping to build performance improvements upon in the future.
By @Wumpf in #6069, #6099, #6100.
`wgpu`'s resources no longer have `.global_id()` getters
`wgpu-core`'s internals no longer use nor need IDs, and we are moving towards removing IDs completely. This is a step in that direction.
Current users of `.global_id()` are encouraged to make use of the `PartialEq`, `Eq`, `Hash`, `PartialOrd`, and `Ord` traits that have now been implemented for `wgpu` resources.
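For example, a small sketch of keying a collection directly by resource handles instead of by `.global_id()`, assuming an existing `buffer` handle:

```rust
use std::collections::HashSet;

// `Hash` and `Eq` are implemented on the handle itself, so resources can be
// used directly as map/set keys.
let mut seen: HashSet<wgpu::Buffer> = HashSet::new();
seen.insert(buffer.clone());
assert!(seen.contains(&buffer));
```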
`set_bind_group` now takes an `Option` for the bind group argument.
https://gpuweb.github.io/gpuweb/#programmable-passes-bind-groups specifies that `bindGroup` is nullable. This change is the start of implementing this part of the spec. Callers that specify a `Some()` value should have unchanged behavior. Handling of `None` values still needs to be implemented by backends.
For convenience, `set_bind_group` on compute/render passes & encoders takes `impl Into<Option<&BindGroup>>`, so most code should still work the same.
By @bradwerth in #6216.
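A brief sketch of both call forms, assuming an existing render pass `rpass` and bind group `bind_group` (the turbofish on `None` is just to satisfy type inference):

```rust
// Existing code keeps working thanks to `impl Into<Option<&BindGroup>>`.
rpass.set_bind_group(0, &bind_group, &[]);

// Explicitly unsetting a bind group is now expressible; note that backend
// handling of `None` is still being implemented, per the note above.
rpass.set_bind_group(0, None::<&wgpu::BindGroup>, &[]);
```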
`entry_point`s are now `Option`al
One of the changes in the WebGPU spec (from about this time last year 😅) was to allow optional entry points in `GPUProgrammableStage`. In `wgpu`, this corresponds to the `entry_point` member of `FragmentState`, `VertexState`, and `ComputeState`:
let render_pipeline = device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
    module,
    entry_point: None, // This is now `Option`al.
    // …
});
let compute_pipeline = device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
    module,
    entry_point: Some("cs_main"), // This is now `Option`al.
    // …
});
When set to `None`, it's assumed that the shader only has a single entry point associated with the pipeline stage (i.e., `@compute`, `@fragment`, or `@vertex`). If there is not one and only one candidate entry point, then a validation error is returned. To continue the example, we might have written the above API usage with the following shader module:
// We can't use `entry_point: None` for compute pipelines with this module,
// because there are two `@compute` entry points.
@compute
fn cs_main() { /* … */ }
@compute
fn other_cs_main() { /* … */ }
// The following entry points _can_ be inferred from `entry_point: None` in a
// render pipeline, because they're the only `@vertex` and `@fragment` entry
// points:
@vertex
fn vs_main() { /* … */ }
@fragment
fn fs_main() { /* … */ }
WGPU's DX12 backend is now based on the `windows` crate ecosystem, instead of the `d3d12` crate
WGPU has retired the `d3d12` crate (based on `winapi`), and now uses the `windows` crate for interfacing with Windows. For many, this may not be a change that affects day-to-day work. However, for users who need to vet their dependencies, or who may vendor in dependencies, this may be a nontrivial migration.
By @MarijnS95 in #6006.
New Features
Naga
- Support constant evaluation for `firstLeadingBit` and `firstTrailingBit` numeric built-ins in WGSL. Front-ends that translate to these built-ins also benefit from constant evaluation. By @ErichDonGubler in #5101.
- Add `first` and `either` sampling types for `@interpolate(flat, …)` in WGSL. By @ErichDonGubler in #6181.
- Support for more atomic ops in the SPIR-V frontend. By @schell in #5824.
- Support local `const` declarations in WGSL. By @sagudev in #6156.
- Implemented `const_assert` in WGSL. By @sagudev in #6198.
- Support polyfilling `inverse` in WGSL. By @chyyran in [#6385]...