Per maintainer feedback, distill the @attention block down to the
essence: it's a texture for CPU readback, and can't be a GPU resource
at the same time. Add a one-line pointer to examples/30-picking for
the rendered-content case rather than spelling out the blit-to-staging
sequence inline.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bgfx::renderFrame() sets s_renderFrameCalled = true on its first call
(line 1541) and never clears it. bgfx::shutdown() resets s_threadIndex
to 0 (line 3958) but leaves s_renderFrameCalled sticky, which means
that on a 2nd renderFrame after shutdown the gate at line 1534 still
runs BGFX_CHECK_RENDER_THREAD(). That assertion checks
(s_ctx != NULL && single-threaded) || (~BGFX_API_THREAD_MAGIC == s_threadIndex);
both branches fail post-shutdown (s_ctx is NULL, s_threadIndex is 0),
and the assertion fires in debug builds — blocking re-init of the
bgfx context within the same process.
Adding the symmetric s_renderFrameCalled reset to shutdown() makes
shutdown leave bgfx in the same state as a fresh process, so a
subsequent renderFrame()+init() pair behaves like the first one.
Verified locally: a downstream test that constructs an SDL+bgfx
context, runs a frame, destroys it, then constructs another from the
same thread now passes in debug builds with this patch (it asserts at
line 1536 without it).
* Default-initialize SortKey members to eliminate UBSan bool hit
SortKey data members had no initializers; the only initialization path
was an explicit reset() call. Reading m_hasAlphaRef (bool) before
reset() ran surfaced under BabylonNative's -fno-sanitize-recover=all
UBSan as 'load of value 190, which is not a valid value for type bool'.
Integer siblings had the same hazard, but UBSan only flags bools.
Add in-class default initializers matching reset()'s values. Zero
runtime cost where reset() immediately follows. No API/ABI change.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Avoid shift-by-width UB in BitMaskToIndexIteratorT
UBSan on macOS flags the shifts at bgfx_p.h:854 and :862. bx::countTrailingZeros returns the type width (e.g. 32 for uint32_t) when its argument is zero, and that value was fed directly into the shift operator, producing shift-by-width UB.
Guard each shift against a zero mask so the width-valued ntz is never used as a shift exponent. Behavior is preserved for every input that previously avoided UB, and callers see isDone() == true in exactly the same iterations as before.
* Address review feedback: tighter BitMask ternary + SortKey reset() in EncoderImpl ctor
Per review on bkaradzic/bgfx#3688:
- BitMask: use branchless ternary (0 == _mask ? 0 : _mask >> ntz) in place of early-return guard.
- SortKey: drop in-class default initializers; initialize via m_key.reset() in EncoderImpl() ctor, which is the narrower location where the UB read originates.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Toggling BGFX_RESET_VSYNC via bgfx::reset() only updated the main/init swap
chain. Secondary swap chains created through createFrameBuffer(nwh, ...) kept
their initial vsync state, leaving framerate capped at refresh rate. D3D12
already handled this correctly.
D3D11: forward present flags to FrameBufferD3D11::present so it can pass
DXGI_PRESENT_ALLOW_TEARING alongside syncInterval=0. Compute presentFlags once
in flip() and reuse for both secondary framebuffers and the main swap chain.
Vulkan: in updateResolution, iterate m_windows and call FrameBufferVK::update
on each valid secondary framebuffer so SwapChainVK::update sees the new
BGFX_RESET_VSYNC and recreates the swapchain with the correct present mode.
OpenGL (WGL/EGL): wglSwapIntervalEXT is per-context, eglSwapInterval is
per-surface. Cache the desired interval in GlContext::m_swapInterval and
re-apply it in makeCurrent() whenever a different context/surface becomes
current, so secondary SwapChainGL instances pick up the current value. Also
honor the initial BGFX_RESET_VSYNC flag in create() instead of hard-coding 0.
The fragment shader CBV address was computed as gpuAddress + vsh->m_size,
but D3D12 requires constant buffer view addresses to be 256-byte aligned
(D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT). When the vertex shader
constant buffer exceeds 256 bytes without being a multiple of 256, the
fragment shader reads from a misaligned address, producing corrupt uniform
values.
This causes rendering artifacts or completely black output for any shader
program where the VS constant buffer size is not a multiple of 256 bytes,
since the fragment shader receives incorrect uniform data (e.g. a non-zero
alpha reference threshold that discards all pixels).
The fix aligns the VS constant buffer size to 256 bytes before placing the
FS constants, and uses the aligned offset when computing the FS CBV address
in both the main render loop and the blitter path.
* DX12: fix invalid state when creating a BLIT_DST texture without mem block
* Only set state when it's not external texture.
---------
Co-authored-by: Branimir Karadžić <branimirkaradzic@gmail.com>
The _external parameter is written to the command buffer as uint64_t (8 bytes)
but read back as uintptr_t (4 bytes on 32-bit). This causes a 4-byte
deserialization misalignment on x86, corrupting all subsequent command
buffer reads and leading to EXCEPTION_ACCESS_VIOLATION (0xC0000005) during
D3D11 initialization.
Co-authored-by: Matteo Valdina <mavaldin@microsoft.com>