Compare commits

95 Commits

Author SHA1 Message Date
Benjamin Doherty
031cd302dd Capture command callstacks for debugging 2023-10-30 14:42:24 -07:00
Benjamin Doherty
87351097ad Adjust NEW_RELEASE_NOTES to reflect cherry-picks 2023-08-30 16:30:10 -07:00
Ben Doherty
694766682f Update FrameCompletedCallback using directive (#7128) 2023-08-30 16:26:15 -07:00
Ben Doherty
c946ebd1e6 Make destroyFence asynchronous (#7127) 2023-08-30 16:22:37 -07:00
Romain Guy
25a8291101 Don't force masked blending for transmission/volume materials (#7126)
* Don't force masked blending for transmission/volume materials

glTF lets you choose your own alpha mode when using the transmission
and volume material extensions. We were forcing the masked mode which
was incorrect, except to pass the standard tests.

* Update release notes
2023-08-30 13:34:49 -07:00
Mathias Agopian
60c689688d attempt to fix remote ui (#7120)
fixes #7116
2023-08-30 08:46:53 -07:00
Romain Guy
763bc1f34a Fix possible NPE when updating fog options (#7123) 2023-08-30 08:44:39 -07:00
Romain Guy
ef07638eef Properly apply emissive to masked materials (#7122)
* Properly apply emissive to masked materials

The emissive property should not be multiplied by the color alpha
in masked materials. The alpha is treated as a coverage value in
that case, not an opacity value.

* Update release notes
2023-08-30 08:43:36 -07:00
Ben Doherty
0aa0efe159 Transition setFrameCompletedCallback to take a CallbackHandler (#7103) 2023-08-28 10:27:38 -07:00
Powei Feng
ef7bcd1e19 vulkan: refactor resource garbage collection (#7110) 2023-08-28 10:20:37 -07:00
Powei Feng
702ceda82a vulkan: fix debug marker pop (#7112) 2023-08-25 21:27:30 -07:00
Mathias Agopian
66b78074de Revert "workaround another PowerVR compiler bug "
This reverts commit 58f96be2c4.

This caused material files to increase in size significantly. It turns
out that glslang has to generate a copy for each parameter that is
passed to a function as a non-const parameter.


This revert will break IMG devices again, but that should be the case
only on debug builds. Release builds lose the const qualifier by 
virtue of going through spirv. We'll try to address this some other 
way later.
2023-08-25 15:31:00 -07:00
Mathias Agopian
8d440cea17 Update/Improve ViewerGUI
- separate out the settings for bloom, ssao and ssr
- update webgl binaries

- change default bloom resolution to 384 from 360 to have up to 7 
mipmap levels vertically
2023-08-25 09:53:52 -07:00
Mathias Agopian
3ab8e4d725 fix lens flare effect
we were accessing an uninitialized LOD.
2023-08-25 09:53:33 -07:00
Mathias Agopian
97f20afdd7 remove the anonymous union in SamplerParams
- don't rely on it being 32-bits
- update the jni code to store SamplerParams in a long (64 bits)
  instead of a int. This gives us some future-proofing of the java side.
2023-08-24 21:36:42 -07:00
Mathias Agopian
42989e76d7 fix possible NPE crasher in timerquery
fixes #7106
2023-08-24 21:36:21 -07:00
Powei Feng
ad45cc9092 Release Filament 1.42.0 2023-08-22 13:36:35 -07:00
Ben Doherty
04669f6ab9 Add Engine query for stereoscopic support (#7086) 2023-08-22 12:41:56 -07:00
Powei Feng
c3c0dde82f vulkan: fix crashing Pixel 4xl adreno (#7087)
Adreno doesn't seem to like defining the size of arrays using a
`const int`.
2023-08-22 12:36:45 -07:00
Powei Feng
ecd5b681d0 Update MaterialEnums.h (#7098) 2023-08-21 10:49:44 -07:00
Romain Guy
66081e6cc1 Add fields used by JNI to proguard rules (#7096) 2023-08-21 10:39:34 -07:00
Jacob Su
aa6e94a128 Fix Mat cofactor UT error on Mac M2 chip machine. 2023-08-18 10:28:51 -07:00
Mathias Agopian
098be2e115 rework how we initialize the gl context (#7085)
* rework how we initialize the gl context

- early initialization is now implemented with static methods so that
  it's very clear which state they need.

- the version number is no longer used outside of initialization,
  instead we use the feature level.

- ES3.0 Adreno devices are downgraded to feature level 0

* Update filament/backend/src/opengl/OpenGLContext.cpp

Co-authored-by: Powei Feng <powei@google.com>

---------

Co-authored-by: Powei Feng <powei@google.com>
2023-08-18 10:25:10 -07:00
Mathias Agopian
17caf6cae9 improvements to CompilerThreadPool and OpenGLPlatform
CompilerThreadPool:
- it now supports a thread cleanup function
- some initialization is moved to the setup function

OpenGLPlatform:
- now cleans-up the thread pool threads upon exit
2023-08-17 20:11:57 -07:00
Mathias Agopian
26952631a3 only attempt to compile shaders in parallel if supported
It can be extremely counter productive to attempt to do this if not
supported.
2023-08-17 20:10:57 -07:00
Mathias Agopian
fc7b6447b7 make sure to not assert when matdbg is enabled 2023-08-17 20:09:55 -07:00
Powei Feng
6c0db37919 vulkan: fix readPixels selectMemory (#7084)
readPixels requests staging memory to be host-visible/coherent/cached.
But "cached" is not supported on Mali (Pixel 6pro).  We make it a
preferable but optional bit.
2023-08-17 15:19:43 -07:00
Mathias Agopian
69f78dbcbe better fix for calls to eglMakeCurrent
turns out that KHR_surfaceless_context is implied for ES3.0 when 
KHR_create_context is present. However, Adreno 306 fails even if
it advertises it. So, we now reset the value of KHR_surfaceless_context
based on actually calling eglMakeCurrent(EGL_NO_SURFACE).
2023-08-16 22:46:36 -07:00
Mathias Agopian
c0389ac54c rework ShaderCompilerService to improve performance
- remove support for parallel compilation with non-shared contexts.
  this wasn't used. we can always revive it later if we need to.

- rework how callbacks work so that we don't have to use a work list
  executed at each tick() in the shared context case (common case).
  this improves performance significantly on low-end devices, by
  not having to go through the list to check if all programs are
  compiled, multiple times per frame.

The new CallbackManager handles scheduling the callbacks after all
previous programs are compiled.
2023-08-16 22:46:13 -07:00
Mathias Agopian
c0db909c13 don't use eglMakeCurrent with EGL_NO_SURFACE unless we're allowed
EGL_KHR_surfaceless_context is needed to be able to use eglMakeCurrent
without an EGLSurface.
2023-08-16 21:35:29 -07:00
Ben Doherty
46e4e966b9 Fix assert with matdbg enabled (#7079) 2023-08-16 14:23:24 -07:00
Powei Feng
288b59a348 Fix missing createFence (#7076)
Continuing from #7072
2023-08-16 12:18:15 -07:00
Mathias Agopian
1c7293db8d fix fuzzyEqual
- the return value was inverted
- fuzzyEqual could generate alignment faults
- move it out of mat4 and mat2 because it was
  only used in one place.
2023-08-16 10:40:13 -07:00
Benjamin Doherty
6006b47c44 Release Filament 1.41.0 2023-08-15 17:11:38 -07:00
Ben Doherty
6bb29f6e01 Implement preliminary support for instanced stereo (#6967) 2023-08-15 17:08:11 -07:00
Mathias Agopian
f1b160db04 remove backend wait(timeout) API
The only use of this API was with a timeout 0 to check the fence
status. Timeouts other than zero could be very dangerous and since we're
not using that feature for now, we just get rid of it.


wait() is replaced with getFenceStatus(). It is currently only used by
the FrameSkipper.

This is not a public API.
2023-08-15 12:17:02 -07:00
Mathias Agopian
3bb52f083b Remove (unused) support for hardware fences
This code hasn't been used for a while and we should not resurrect it.
2023-08-15 12:17:02 -07:00
Mathias Agopian
88337ab358 use whenGpuCommandComplete to emulate platform fences
This is more appropriate (and simpler) than runEveryNowAndThen because
the latter doesn't manage a fence, and therefore is more of a superset.
This will allow us to use a shared context implementation in the future.
2023-08-15 12:17:02 -07:00
Mathias Agopian
0935fe3fe3 cleanup the timer query implementations
We don't use runEveryNowAndThen for implementations that don't need it
(e.g. the EGL fence version, or the fallback version).
2023-08-15 12:17:02 -07:00
Mathias Agopian
96ed19549e reduce the number of shader compiler threads to two from four
more threads also use (much) more memory which can be a problem for
lower end devices
2023-08-14 10:20:01 -07:00
Mathias Agopian
945e9a2cb5 don't pin the GL thread on PowerVR 2023-08-14 10:20:01 -07:00
Ben Doherty
4d703e3807 Refactor CompilerThreadPool out of OpenGLDriver (#7067) 2023-08-11 17:03:36 -07:00
Mathias Agopian
f537f62adf Use 4 background threads for shader compiler on PowerVR
Since powervr supports parallel shader compilation well, we use 
4 background threads for shader compilation.
2023-08-11 15:22:19 -07:00
Mathias Agopian
7840404132 Enable read-only feedback loop for PowerVR
This is technically forbidden by the GLES 3.x specification but many
GPUs support it, which saves us a depth buffer copy.
Note that this is supported in GL desktop.
2023-08-11 15:22:19 -07:00
Mathias Agopian
afad361cac Workaround a PowerVR performance issue with destroying FBOs
Destroying the FBO target of a blit operation causes a stall similar
to calling glFinish().
We work around this by delaying all FBO destructions to after the
GPU is finished with the current frame.
2023-08-11 15:22:19 -07:00
Mathias Agopian
58b23c290c Work around a PowerVR bug where gl_InstanceID is not initialized 2023-08-11 15:22:19 -07:00
Mathias Agopian
12a73137d7 workaround another PowerVR compiler bug
In some situations, functions with const parameters cause the shader
compilation to fail without an error message.

We remove all the `const` qualifiers on functions, assuming this
shouldn't impact code generation a lot.
2023-08-11 15:22:19 -07:00
Powei Feng
7a136eec5d vulkan: clean-up includes and refactor handle allocator (#7056) 2023-08-10 16:19:24 -07:00
Mathias Agopian
e9bd9ab3a6 Fix matdbg for 32 bits architectures 2023-08-09 19:19:05 -07:00
Mathias Agopian
083bff62e3 better handle "urgent" shader compilation
instead of moving the "urgent" compilation to the head of the queue,
we simply remove it from the queue and process it immediately. This
has the benefit that on drivers that truly support parallel compilation,
the latency will be reduced as we don't need to wait for the current
compile to finish.
2023-08-09 15:01:55 -07:00
Mathias Agopian
f713316541 Enable parallel shader compilation on more devices 2023-08-09 15:01:32 -07:00
Powei Feng
bcfdf2f70d Release Filament 1.40.5 2023-08-09 10:26:17 -07:00
Mathias Agopian
018d6f877f Workaround for some PowerVR devices
The PowerVR compiler systematically crashes on some devices when
`gl_Position` is written twice in the vertex shader.


fixes #5118, b/190221124
2023-08-08 08:59:59 -07:00
Mathias Agopian
1e4172b820 RenderTarget needs not to have a color attachment
This was a somewhat arbitrary requirement, some RenderTarget could be
depth only for instance.
2023-08-08 08:59:32 -07:00
Mathias Agopian
6b6827b70d add a GLES compiler unit test (#7050)
* add a GLES compiler unit test

* Update filament/test/compiler_test.cpp

Co-authored-by: Ben Doherty <bendoherty@google.com>

---------

Co-authored-by: Ben Doherty <bendoherty@google.com>
2023-08-08 08:59:06 -07:00
Romain Guy
96cccc83c6 Update BUILDING.md 2023-08-06 08:44:55 -07:00
Mathias Agopian
2468a3a854 fix a typo causing EXT_color_buffer_float to be enabled on all ES3 devices
b/287126679
2023-08-04 16:51:04 -07:00
Powei Feng
f68825f2ed vulkan: fix fence initialization (#7038)
Previously, we had a VulkanSync with a default constructor that
allowed us to have sync objects that return an error when
actual fences are not yet present.  We need to replicate that
with VulkanFence since sync objects have been removed from the
API.

Fixes #7034
2023-08-04 11:13:57 -07:00
Mathias Agopian
2a12f71f96 fix a crash when shutting the engine down
We were breaking the promise of pending shader compilation jobs by 
destroying the corresponding std::promise embedded in the job queue. 
In practice there was no danger of a deadlock by construction, but
std::promise throws an exception in that case. On builds without
exceptions enabled, this will be turned into an abort().

We fix this by using our own mechanism for signaling instead of
std::promise. This ends up being more lightweight anyway.


Fix: #6933
2023-08-04 10:00:37 -07:00
Romain Guy
74b64a5451 Update README.md (#7037) 2023-08-03 15:03:03 -07:00
조다니엘(Daniel Cho)
7d01e0349b Fix rendering issue when using DoF 2023-08-03 13:49:57 -07:00
Mathias Agopian
80014bf2b1 fix a possible overflow when picking
We don't need to convert the object id to float,
instead we can just "reinterpret_cast" it.

With the current possible values of Entities, there was a risk of 
overflow once the age gets to 128 (very rare).
2023-08-03 13:49:14 -07:00
Mathias Agopian
adf3421f4a Workaround Adreno bug causing picking to fail
Adreno drivers don't support precision qualifiers in structs.

fixes #6997
2023-08-03 13:49:14 -07:00
Mathias Agopian
51c65ccfdc add picking to gltf_viewer for debugging 2023-08-02 12:40:16 -07:00
Powei Feng
5d37d08cf8 vulkan: fix TSAN in readpixels (#7023) 2023-08-01 16:29:27 -07:00
Benjamin Doherty
eb18d75b2e Release Filament 1.40.4 2023-08-01 15:37:51 -07:00
mackong
549c582287 engine: support setDepthFunc for MaterialInstance (#7004)
Co-authored-by: Mathias Agopian <mathias@google.com>
2023-07-31 15:49:48 -07:00
Mathias Agopian
6c05029a9f workaround a crash with some adreno drivers
The crashes are triggered by spirv-opt's MergeReturnPass, so we 
just disable it. This pass also caused issues with AMD drivers on macOS.


fixes b/291140208
2023-07-31 15:48:44 -07:00
Mathias Agopian
b2278986dd Update bug_report.md 2023-07-31 11:47:32 -07:00
Mathias Agopian
98a2b8f159 Improve FrameSkipper performance on GLES/Android
We get rid of the backend's HwSync object because on all platforms
but GL it was implemented just like a HwFence. We now use HwFence 
instead.

On GL platforms though, HwFence doesn't exist natively; it is instead
provided by the Platform. In that case, we emulate it as with GLSync
objects -- the emulation incurs some latency that can cause frames
to be skipped.

On Android and platforms that provide the Fence functionality, there is
no such issue.

This change significantly improves frame pacing on Android.
2023-07-31 11:29:40 -07:00
Mathias Agopian
0ba891fb14 Fix off-by-one in FrameSkipper
The frame latency specified was off-by-one, i.e. a value of 1 meant a
latency of 2. The default was 2, which meant 3. Also it wasn't possible
to specify the max latency of 4, which would OOB.
2023-07-31 11:29:40 -07:00
Mathias Agopian
042cd670aa Improve fence-based timer query emulation
It now uses the fence for the start time and end time, leading to much
more accurate timings. We also use a single atomic variable instead of
two.
2023-07-31 11:29:40 -07:00
mackong
95c7e4d02b sample: fix typo 2023-07-27 10:03:11 -07:00
Mathias Agopian
d302525674 Add a way to query the validity of filament objects
Engine::isValid() can be used to check the validity of most filament
objects.
2023-07-27 10:02:31 -07:00
Mathias Agopian
3dbb7298f8 fix a cleanup of material parallel compilation
When the engine is shut down, it's possible for some parallel
compilation jobs (and callbacks) to be queued. We need to make sure
to clear the queues and call the callbacks before destroying the
parallel compilation service.


Fixes b/290388359
2023-07-27 10:02:31 -07:00
Mathias Agopian
0ed71ab53b fix an issue causing callbacks to be called too late
We were waiting for programs from both queues to be compiled before 
calling the callback associated with one queue. In practice this caused
the callback associated with high priority programs to be called only 
after low priority programs were ready.

Also clean up "token" so that it doesn't store the priority.

Update the documentation and sample to better reflect what the 
implementation does.
2023-07-27 10:02:31 -07:00
Mathias Agopian
b1491ae5b1 Disable timer queries on all Mali GPUs
fixes b/233754398
2023-07-27 09:57:41 -07:00
Mathias Agopian
03b8dc8027 The "back" key will now terminate the gltf_viewer activity
This is useful for testing our shutdown code.
2023-07-27 09:57:14 -07:00
Mirsfang
e3568cd89f fix macos opengl compile process_ARB_shading_language_packing type conversion error (#6994) 2023-07-27 09:34:41 -07:00
Powei Feng
b35e24daa7 gltfio: exclude unsupported platforms from test (#7000) 2023-07-26 19:11:50 -07:00
Powei Feng
f506b27a31 gltfio: simple test for asset loading (#6990) 2023-07-26 16:50:06 -07:00
Benjamin Doherty
9452d5be1d Release Filament 1.40.3 2023-07-26 12:51:50 -07:00
Romain Guy
ea3f449a08 Update dependencies (#6992) 2023-07-26 11:05:09 -07:00
Ben Doherty
dc9510fe25 Support EXT_clip_cull_distance for future use (#6965)
This PR sets up the ability for shaders to use `gl_ClipDistance`, which will be needed in the future. Desktop GL supports this natively. OpenGL ES requires the EXT_clip_cull_distance extension.

Unfortunately glslang does not support this extension, so we have to employ a workaround for mobile when going through glslang. We instead write to `filament_gl_ClipDistance`, and then modify the SPIR-V to decorate this as `gl_ClipDistance`. See the comment in SpirvFixup.h.

Note this PR does not actually use `gl_ClipDistance` yet, so there should be no change to shaders.
2023-07-26 10:01:28 -07:00
Powei Feng
731dd761d9 vulkan: Implement async readPixels (#6695)
- Carry out readPixels without blocking and wait for the read to
   complete on a separate thread.
 - Add mContext.commands->wait() in finish()
 - Wait for readPixels to complete in finish()
 - Remove unused commandBuffer in Context
2023-07-25 14:07:56 -07:00
Powei Feng
6726ccb2fb vulkan: fix subpass validation (#6980)
- Before, we supposed that the maximum number of input attachment
   should match the maximum number of color attachments. But in
   reality, we've only used one input attachment for the second
   subpass.
 - The problem with the above supposition is that the descriptor
   set layout for the input attachment descriptor set must have the
   exact number of input attachment specified in the shader. If the
   *layout* has more input attachment slots than specified in the
   shader, then we'd run into a validation error.
 - In this patch, we fix the maximum number of input attachments in
   the descriptor set layout to 1, since we only ever make use of
   one.

Fixes #6513
2023-07-25 11:24:14 -07:00
Powei Feng
5fce0f9ecf filamentapp: fix vulkan dependency (#6987)
Fixes #6983
2023-07-24 16:50:51 -07:00
Mathias Agopian
ad03fc4118 fix typo that prevented the shader blob cache from working (#6979)
fixes b/290670707
2023-07-24 09:59:19 -07:00
Y-way
3f77ff8815 Build error on msvc 2022 2023-07-21 10:00:51 -07:00
Mathias Agopian
26b5fa1e38 fix a crash when using Material::compile with a callback
The work queue is sorted by priority but when we insert a notification
job we didn't have a priority to use for insertion, in addition the
priority was taken from the token, but for the notification job we don't
have a token.

The fix consists in passing the priority around so we have it when needed.
2023-07-21 10:00:21 -07:00
mackong
24286e6016 gltfio: fix crash when compute morph target without material 2023-07-21 09:08:22 -07:00
Ben Doherty
626577fe3d Fix ineffective FOG variant filter (#6968) 2023-07-20 11:54:18 -07:00
Romain Guy
176915b59c Add missing setParameter variants to MaterialInstance (#6973)
Adds support for mat3/mat4 parameter types
2023-07-20 11:19:35 -07:00
mackong
0b60933c2a web: remove const qualifier for getMaterialInstances (#6952) 2023-07-20 10:43:33 -07:00
Pawan Vimukthi
0e31d6936a Update android/Windows.md (#6964)
Updated the flag `filament-skip-samples` in the code snippet to align with the documentation.
2023-07-19 13:23:08 -07:00
205 changed files with 5962 additions and 2707 deletions

View File

@@ -4,6 +4,8 @@ about: Create a report to help us improve
---
⚠️ **Issues not using this template will be systematically closed.**
**Describe the bug**
A clear and concise description of what the bug is.
@@ -18,8 +20,8 @@ A clear and concise description of what you expected to happen.
If applicable, add screenshots to help explain your problem.
**Logs**
-If applicable, copy logs from your console here. Please *do not*
-use screenshots of logs, copy them as text.
+If applicable, copy **full** logs from your console here. Please *do not*
+use screenshots of logs, copy them as text, use gist or attach an *uncompressed* file.
**Desktop (please complete the following information):**
- OS: [e.g. iOS]

View File

@@ -5,7 +5,7 @@
To build Filament, you must first install the following tools:
- CMake 3.19 (or more recent)
-- clang 7.0 (or more recent)
+- clang 14.0 (or more recent)
- [ninja 1.10](https://github.com/ninja-build/ninja/wiki/Pre-built-Ninja-packages) (or more recent)
Additional dependencies may be required for your operating system. Please refer to the appropriate
@@ -87,10 +87,10 @@ Options can also be set with the CMake GUI.
Make sure you've installed the following dependencies:
-- `clang-7` or higher
+- `clang-14` or higher
- `libglu1-mesa-dev`
-- `libc++-7-dev` (`libcxx-devel` and `libcxx-static` on Fedora) or higher
-- `libc++abi-7-dev` (`libcxxabi-static` on Fedora) or higher
+- `libc++-14-dev` (`libcxx-devel` and `libcxx-static` on Fedora) or higher
+- `libc++abi-14-dev` (`libcxxabi-static` on Fedora) or higher
- `ninja-build`
- `libxi-dev`
- `libxcomposite-dev` (`libXcomposite-devel` on Fedora)
@@ -114,7 +114,7 @@ Your Linux distribution might default to `gcc` instead of `clang`, if that's the
```
$ mkdir out/cmake-release
$ cd out/cmake-release
-# Or use a specific version of clang, for instance /usr/bin/clang-7
+# Or use a specific version of clang, for instance /usr/bin/clang-14
$ CC=/usr/bin/clang CXX=/usr/bin/clang++ CXXFLAGS=-stdlib=libc++ \
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../release/filament ../..
```
@@ -124,8 +124,8 @@ solution is to use `update-alternatives` to both change the default compiler, an
specific version of clang:
```
-$ update-alternatives --install /usr/bin/clang clang /usr/bin/clang-7 100
-$ update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-7 100
+$ update-alternatives --install /usr/bin/clang clang /usr/bin/clang-14 100
+$ update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-14 100
$ update-alternatives --install /usr/bin/cc cc /usr/bin/clang 100
$ update-alternatives --install /usr/bin/c++ c++ /usr/bin/clang++ 100
```

View File

@@ -7,3 +7,9 @@ for next branch cut* header.
appropriate header in [RELEASE_NOTES.md](./RELEASE_NOTES.md).
## Release notes for next branch cut
- Fix possible NPE when updating fog options from Java/Kotlin
- The `emissive` property was not applied properly to `MASKED` materials, and could cause
dark fringes to appear (recompile materials)
- Allow glTF materials with transmission/volume extensions to choose their alpha mode
instead of forcing `MASKED`

View File

@@ -31,7 +31,7 @@ repositories {
}
dependencies {
-implementation 'com.google.android.filament:filament-android:1.40.3'
+implementation 'com.google.android.filament:filament-android:1.42.0'
}
```
@@ -40,6 +40,7 @@ Here are all the libraries available in the group `com.google.android.filament`:
| Artifact | Description |
| ------------- | ------------- |
| [![filament-android](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-android/badge.svg?subject=filament-android)](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-android) | The Filament rendering engine itself. |
| [![filament-android-debug](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-android-debug/badge.svg?subject=filament-android-debug)](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-android-debug) | Debug version of `filament-android`. |
| [![gltfio-android](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/gltfio-android/badge.svg?subject=gltfio-android)](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/gltfio-android) | A glTF 2.0 loader for Filament, depends on `filament-android`. |
| [![filament-utils-android](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-utils-android/badge.svg?subject=filament-utils-android)](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-utils-android) | KTX loading, Kotlin math, and camera utilities, depends on `gltfio-android`. |
| [![filamat-android](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filamat-android/badge.svg?subject=filamat-android)](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filamat-android) | A runtime material builder/compiler. This library is large but contains a full shader compiler/validator/optimizer and supports both OpenGL and Vulkan. |
@@ -50,7 +51,7 @@ Here are all the libraries available in the group `com.google.android.filament`:
iOS projects can use CocoaPods to install the latest release:
```
-pod 'Filament', '~> 1.40.3'
+pod 'Filament', '~> 1.42.0'
```
### Snapshots

View File

@@ -7,6 +7,37 @@ A new header is inserted each time a *tag* is created.
Instead, if you are authoring a PR for the main branch, add your release note to
[NEW_RELEASE_NOTES.md](./NEW_RELEASE_NOTES.md).
## v1.42.1
## v1.42.0
- engine: add preliminary support for instanced stereoscopic rendering [⚠️ **Recompile materials**]
## v1.41.0
- backend: fix #6997 : picking can fail on Adreno [⚠️ **New Material Version**]
- backend: A partial workaround for PowerVR devices (#5118, b/190221124) [⚠️ **Recompile Materials**]
## v1.40.5
- backend: Disable timer queries on all Mali GPUs (fixes b/233754398)
- engine: Add a way to query the validity of most filament objects (see `Engine::isValid`)
- opengl: fix b/290388359 : possible crash when shutting down the engine
- engine: Improve precision of frame time measurement when using emulated TimerQueries
- backend: Improve frame pacing on Android and Vulkan.
- backend: workaround b/291140208 (gltf_viewer crashes on Nexus 6P)
- engine: support `setDepthFunc` for `MaterialInstance`
- web: Added setDepthFunc()/getDepthFunc() to MaterialInstance
- android: Added setDepthFunc()/getDepthFunc() to MaterialInstance
## v1.40.4
- gltfio: fix crash when compute morph target without material
- matc: fix buggy `variant-filter` flag
- web: Added missing setMat3Parameter()/setMat4Parameter() to MaterialInstance
- opengl: fix b/290670707 : crash when using the blob cache
- engine: fix a crash with `Material::compile()` when a callback is specified
## v1.40.3
## v1.40.2

View File

@@ -135,7 +135,7 @@ gradlew -Pcom.google.android.filament.dist-dir=..\out\android-release\filament a
If you're only interested in building SDK, you may skip samples build by passing a `com.google.android.filament.skip-samples` flag:
```
-gradlew -Pcom.google.android.filament.dist-dir=..\out\android-release\filament assembleRelease -Pfilament_skip_samples
+gradlew -Pcom.google.android.filament.dist-dir=..\out\android-release\filament assembleRelease -Pcom.google.android.filament.skip-samples
```

View File

@@ -83,12 +83,12 @@ buildscript {
'minSdk': 19,
'targetSdk': 33,
'compileSdk': 33,
-'kotlin': '1.8.20',
-'kotlin_coroutines': '1.7.1',
-'buildTools': '33.0.2',
+'kotlin': '1.9.0',
+'kotlin_coroutines': '1.7.2',
+'buildTools': '34.0.0',
'ndk': '25.1.8937393',
-'androidx_core': '1.10.0',
-'androidx_annotations': '1.3.0'
+'androidx_core': '1.10.1',
+'androidx_annotations': '1.6.0'
]
ext.deps = [
@@ -104,7 +104,7 @@ buildscript {
]
dependencies {
-classpath 'com.android.tools.build:gradle:8.0.2'
+classpath 'com.android.tools.build:gradle:8.1.0'
classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:${versions.kotlin}"
}
@@ -152,7 +152,7 @@ buildscript {
}
plugins {
-id "io.github.gradle-nexus.publish-plugin" version "1.1.0"
+id "io.github.gradle-nexus.publish-plugin" version "1.3.0"
}
// See https://github.com/gradle-nexus/publish-plugin

View File

@@ -278,6 +278,112 @@ Java_com_google_android_filament_Engine_nDestroyEntity(JNIEnv*, jclass,
engine->destroy(entity);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidRenderer(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeRenderer) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((Renderer*)nativeRenderer);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidView(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeView) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((View*)nativeView);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidScene(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeScene) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((Scene*)nativeScene);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidFence(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeFence) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((Fence*)nativeFence);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidStream(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeStream) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((Stream*)nativeStream);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidIndexBuffer(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeIndexBuffer) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((IndexBuffer*)nativeIndexBuffer);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidVertexBuffer(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeVertexBuffer) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((VertexBuffer*)nativeVertexBuffer);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidSkinningBuffer(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeSkinningBuffer) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((SkinningBuffer*)nativeSkinningBuffer);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidIndirectLight(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeIndirectLight) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((IndirectLight*)nativeIndirectLight);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidMaterial(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeMaterial) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((Material*)nativeMaterial);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidSkybox(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeSkybox) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((Skybox*)nativeSkybox);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidColorGrading(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeColorGrading) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((ColorGrading*)nativeColorGrading);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidTexture(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeTexture) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((Texture*)nativeTexture);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidRenderTarget(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeTarget) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((RenderTarget*)nativeTarget);
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_google_android_filament_Engine_nIsValidSwapChain(JNIEnv*, jclass,
jlong nativeEngine, jlong nativeSwapChain) {
Engine* engine = (Engine *)nativeEngine;
return (jboolean)engine->isValid((SwapChain*)nativeSwapChain);
}
extern "C" JNIEXPORT void JNICALL
Java_com_google_android_filament_Engine_nFlushAndWait(JNIEnv*, jclass,
jlong nativeEngine) {

View File

@@ -246,17 +246,21 @@ Java_com_google_android_filament_MaterialInstance_nSetFloatParameterArray(JNIEnv
env->ReleaseStringUTFChars(name_, name);
}
// defined in TextureSampler.cpp
namespace filament::JniUtils {
TextureSampler from_long(jlong params) noexcept;
} // namespace filament::JniUtils
extern "C"
JNIEXPORT void JNICALL
Java_com_google_android_filament_MaterialInstance_nSetParameterTexture(
JNIEnv *env, jclass, jlong nativeMaterialInstance, jstring name_,
jlong nativeTexture, jint sampler_) {
jlong nativeTexture, jlong sampler_) {
MaterialInstance* instance = (MaterialInstance*) nativeMaterialInstance;
Texture* texture = (Texture*) nativeTexture;
TextureSampler& sampler = reinterpret_cast<TextureSampler&>(sampler_);
const char *name = env->GetStringUTFChars(name_, 0);
instance->setParameter(name, texture, sampler);
instance->setParameter(name, texture, JniUtils::from_long(sampler_));
env->ReleaseStringUTFChars(name_, name);
}
@@ -357,6 +361,14 @@ Java_com_google_android_filament_MaterialInstance_nSetDepthCulling(JNIEnv*,
instance->setDepthCulling(enable);
}
extern "C"
JNIEXPORT void JNICALL
Java_com_google_android_filament_MaterialInstance_nSetDepthFunc(JNIEnv*,
jclass, jlong nativeMaterialInstance, jlong function) {
MaterialInstance* instance = (MaterialInstance*) nativeMaterialInstance;
instance->setDepthFunc(static_cast<MaterialInstance::DepthFunc>(function));
}
extern "C"
JNIEXPORT void JNICALL
Java_com_google_android_filament_MaterialInstance_nSetStencilCompareFunction(JNIEnv*, jclass,
@@ -524,3 +536,11 @@ Java_com_google_android_filament_MaterialInstance_nIsDepthCullingEnabled(JNIEnv*
MaterialInstance* instance = (MaterialInstance*)nativeMaterialInstance;
return instance->isDepthCullingEnabled();
}
extern "C"
JNIEXPORT jint JNICALL
Java_com_google_android_filament_MaterialInstance_nGetDepthFunc(JNIEnv* env, jclass clazz,
jlong nativeMaterialInstance) {
MaterialInstance* instance = (MaterialInstance*)nativeMaterialInstance;
return (jint)instance->getDepthFunc();
}

View File

@@ -27,11 +27,10 @@ extern "C" JNIEXPORT void JNICALL
Java_com_google_android_filament_SwapChain_nSetFrameCompletedCallback(JNIEnv* env, jclass,
jlong nativeSwapChain, jobject handler, jobject runnable) {
SwapChain* swapChain = (SwapChain*) nativeSwapChain;
auto *callback = JniCallback::make(env, handler, runnable);
swapChain->setFrameCompletedCallback([](void* user) {
JniCallback* callback = (JniCallback*)user;
auto* callback = JniCallback::make(env, handler, runnable);
swapChain->setFrameCompletedCallback(nullptr, [callback](SwapChain* swapChain) {
JniCallback::postToJavaAndDestroy(callback);
}, callback);
});
}
extern "C" JNIEXPORT jboolean JNICALL
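
The hunk above moves from a `void* user` callback to a capturing lambda that receives the swap chain. A minimal sketch of that shape, assuming a `std::function`-based setter: `SwapChainSketch`, `PayloadSketch`, `installCallback` and `simulateFrameCompleted` are hypothetical names made up for illustration and are not Filament's API, and the handler argument is simply ignored here.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for a swap chain that stores one frame-completed callback.
class SwapChainSketch {
public:
    using FrameCompletedCallback = std::function<void(SwapChainSketch*)>;

    // Assumption: the first argument plays the role of a callback handler.
    // This sketch ignores it and always invokes the callback inline.
    void setFrameCompletedCallback(void* handler, FrameCompletedCallback callback) {
        (void) handler;
        mCallback = std::move(callback);
    }

    // Simulates the backend reporting that a frame has finished.
    void simulateFrameCompleted() {
        if (mCallback) {
            mCallback(this);
        }
    }

private:
    FrameCompletedCallback mCallback;
};

// Hypothetical payload standing in for the JniCallback object in the hunk above.
struct PayloadSketch {
    int timesPosted = 0;
};

// The payload travels in the lambda capture instead of a separate void* argument.
void installCallback(SwapChainSketch& swapChain, PayloadSketch* payload) {
    swapChain.setFrameCompletedCallback(nullptr, [payload](SwapChainSketch*) {
        payload->timesPosted++;
    });
}
```

Capturing the payload removes the cast from `void*` inside the callback body, which is the visible difference between the old and new lines of the hunk.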

View File

@@ -18,142 +18,139 @@
#include <filament/TextureSampler.h>
#include <utils/algorithm.h>
using namespace filament;
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nCreateSampler(JNIEnv *env, jclass type, jint min,
namespace filament::JniUtils {
jlong to_long(TextureSampler const& sampler) noexcept {
return jlong(utils::bit_cast<uint32_t>(sampler.getSamplerParams()));
}
TextureSampler from_long(jlong params) noexcept {
return TextureSampler{
utils::bit_cast<backend::SamplerParams>(
static_cast<uint32_t>(params))};
}
} // namespace filament::JniUtils
using namespace JniUtils;
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nCreateSampler(JNIEnv *, jclass, jint min,
jint max, jint s, jint t, jint r) {
return TextureSampler(static_cast<TextureSampler::MinFilter>(min),
static_cast<TextureSampler::MagFilter>(max), static_cast<TextureSampler::WrapMode>(s),
static_cast<TextureSampler::WrapMode>(t),
static_cast<TextureSampler::WrapMode>(r)).getSamplerParams().u;
TextureSampler sampler(static_cast<TextureSampler::MinFilter>(min),
static_cast<TextureSampler::MagFilter>(max),
static_cast<TextureSampler::WrapMode>(s),
static_cast<TextureSampler::WrapMode>(t),
static_cast<TextureSampler::WrapMode>(r));
return to_long(sampler);
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nCreateCompareSampler(JNIEnv *env, jclass type,
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nCreateCompareSampler(JNIEnv *, jclass,
jint mode, jint function) {
return TextureSampler(static_cast<TextureSampler::CompareMode>(mode),
static_cast<TextureSampler::CompareFunc>(function)).getSamplerParams().u;
TextureSampler sampler(static_cast<TextureSampler::CompareMode>(mode),
static_cast<TextureSampler::CompareFunc>(function));
return to_long(sampler);
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nGetMinFilter(JNIEnv *env, jclass type,
jint sampler_) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
return static_cast<jint>(sampler.getMinFilter());
Java_com_google_android_filament_TextureSampler_nGetMinFilter(JNIEnv *, jclass, jlong sampler) {
return static_cast<jint>(from_long(sampler).getMinFilter());
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nSetMinFilter(JNIEnv *env, jclass type,
jint sampler_, jint filter) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nSetMinFilter(JNIEnv *, jclass, jlong sampler_, jint filter) {
TextureSampler sampler{from_long(sampler_)};
sampler.setMinFilter(static_cast<TextureSampler::MinFilter>(filter));
return sampler.getSamplerParams().u;
return to_long(sampler);
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nGetMagFilter(JNIEnv *env, jclass type,
jint sampler_) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
return static_cast<jint>(sampler.getMagFilter());
Java_com_google_android_filament_TextureSampler_nGetMagFilter(JNIEnv *, jclass, jlong sampler) {
return static_cast<jint>(from_long(sampler).getMagFilter());
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nSetMagFilter(JNIEnv *env, jclass type,
jint sampler_, jint filter) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nSetMagFilter(JNIEnv *, jclass, jlong sampler_, jint filter) {
TextureSampler sampler{from_long(sampler_)};
sampler.setMagFilter(static_cast<TextureSampler::MagFilter>(filter));
return sampler.getSamplerParams().u;
return to_long(sampler);
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nGetWrapModeS(JNIEnv *env, jclass type,
jint sampler_) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
return static_cast<jint>(sampler.getWrapModeS());
Java_com_google_android_filament_TextureSampler_nGetWrapModeS(JNIEnv *, jclass, jlong sampler) {
return static_cast<jint>(from_long(sampler).getWrapModeS());
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nSetWrapModeS(JNIEnv *env, jclass type,
jint sampler_, jint mode) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nSetWrapModeS(JNIEnv *, jclass, jlong sampler_, jint mode) {
TextureSampler sampler{from_long(sampler_)};
sampler.setWrapModeS(static_cast<TextureSampler::WrapMode>(mode));
return sampler.getSamplerParams().u;
return to_long(sampler);
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nGetWrapModeT(JNIEnv *env, jclass type,
jint sampler_) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
return static_cast<jint>(sampler.getWrapModeT());
Java_com_google_android_filament_TextureSampler_nGetWrapModeT(JNIEnv *, jclass, jlong sampler) {
return static_cast<jint>(from_long(sampler).getWrapModeT());
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nSetWrapModeT(JNIEnv *env, jclass type,
jint sampler_, jint mode) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nSetWrapModeT(JNIEnv *, jclass, jlong sampler_, jint mode) {
TextureSampler sampler{from_long(sampler_)};
sampler.setWrapModeT(static_cast<TextureSampler::WrapMode>(mode));
return sampler.getSamplerParams().u;
return to_long(sampler);
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nGetWrapModeR(JNIEnv *env, jclass type,
jint sampler_) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
return static_cast<jint>(sampler.getWrapModeR());
Java_com_google_android_filament_TextureSampler_nGetWrapModeR(JNIEnv *, jclass, jlong sampler) {
return static_cast<jint>(from_long(sampler).getWrapModeR());
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nSetWrapModeR(JNIEnv *env, jclass type,
jint sampler_, jint mode) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nSetWrapModeR(JNIEnv *, jclass, jlong sampler_, jint mode) {
TextureSampler sampler{from_long(sampler_)};
sampler.setWrapModeR(static_cast<TextureSampler::WrapMode>(mode));
return sampler.getSamplerParams().u;
return to_long(sampler);
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nGetCompareMode(JNIEnv *env, jclass type,
jint sampler_) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
return static_cast<jint>(sampler.getCompareMode());
Java_com_google_android_filament_TextureSampler_nGetCompareMode(JNIEnv *, jclass, jlong sampler) {
return static_cast<jint>(from_long(sampler).getCompareMode());
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nSetCompareMode(JNIEnv *env, jclass type,
jint sampler_, jint mode) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nSetCompareMode(JNIEnv *, jclass, jlong sampler_, jint mode) {
TextureSampler sampler{from_long(sampler_)};
sampler.setCompareMode(static_cast<TextureSampler::CompareMode>(mode),
sampler.getCompareFunc());
return sampler.getSamplerParams().u;
return to_long(sampler);
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nGetCompareFunction(JNIEnv *env, jclass type,
jint sampler_) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
return static_cast<jint>(sampler.getCompareFunc());
Java_com_google_android_filament_TextureSampler_nGetCompareFunction(JNIEnv *, jclass, jlong sampler) {
return static_cast<jint>(from_long(sampler).getCompareFunc());
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nSetCompareFunction(JNIEnv *env, jclass type,
jint sampler_, jint function) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nSetCompareFunction(JNIEnv *, jclass, jlong sampler_, jint function) {
TextureSampler sampler{from_long(sampler_)};
sampler.setCompareMode(sampler.getCompareMode(),
static_cast<TextureSampler::CompareFunc>(function));
return sampler.getSamplerParams().u;
return to_long(sampler);
}
extern "C" JNIEXPORT jfloat JNICALL
Java_com_google_android_filament_TextureSampler_nGetAnisotropy(JNIEnv *env, jclass type,
jint sampler_) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
return sampler.getAnisotropy();
Java_com_google_android_filament_TextureSampler_nGetAnisotropy(JNIEnv *, jclass, jlong sampler) {
return from_long(sampler).getAnisotropy();
}
extern "C" JNIEXPORT jint JNICALL
Java_com_google_android_filament_TextureSampler_nSetAnisotropy(JNIEnv *env, jclass type,
jint sampler_, jfloat anisotropy) {
TextureSampler &sampler = reinterpret_cast<TextureSampler &>(sampler_);
extern "C" JNIEXPORT jlong JNICALL
Java_com_google_android_filament_TextureSampler_nSetAnisotropy(JNIEnv *, jclass, jlong sampler_, jfloat anisotropy) {
TextureSampler sampler{from_long(sampler_)};
sampler.setAnisotropy(anisotropy);
return sampler.getSamplerParams().u;
return to_long(sampler);
}
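
The `to_long` / `from_long` helpers above replace `reinterpret_cast` on a `jint` with a value round trip through a 32-bit integer. A minimal sketch of that round trip follows, under stated assumptions: `SamplerParamsSketch` is a hypothetical 4-byte struct (not `backend::SamplerParams`), `jlong_sketch` stands in for `jlong`, and `bit_cast_sketch` is a `memcpy`-based stand-in for `utils::bit_cast`, whose real implementation is not shown in this diff.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical 32-bit parameter block; the real sampler layout is not shown here.
struct SamplerParamsSketch {
    uint8_t minFilter;
    uint8_t magFilter;
    uint8_t wrapS;
    uint8_t wrapT;
};
static_assert(sizeof(SamplerParamsSketch) == sizeof(uint32_t),
        "sketch assumes the parameters fit exactly in 32 bits");

// memcpy-based bit cast: copies the object representation without aliasing tricks.
template <typename To, typename From>
To bit_cast_sketch(From const& from) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

using jlong_sketch = int64_t; // stand-in for JNI's jlong

// Widens the 32-bit pattern to 64 bits; the upper 32 bits are always zero.
jlong_sketch to_long(SamplerParamsSketch const& params) noexcept {
    return jlong_sketch(bit_cast_sketch<uint32_t>(params));
}

// Truncates back to 32 bits and reinterprets the pattern as the parameter block.
SamplerParamsSketch from_long(jlong_sketch value) noexcept {
    return bit_cast_sketch<SamplerParamsSketch>(static_cast<uint32_t>(value));
}
```

Passing the value through `uint32_t` before widening keeps the result non-negative, so the Java side sees the same bit pattern regardless of the sign of the original 32-bit value.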

View File

@@ -449,6 +449,141 @@ public class Engine {
swapChain.clearNativeObject();
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidRenderer(@NonNull Renderer object) {
return nIsValidRenderer(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidView(@NonNull View object) {
return nIsValidView(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidScene(@NonNull Scene object) {
return nIsValidScene(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidFence(@NonNull Fence object) {
return nIsValidFence(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidStream(@NonNull Stream object) {
return nIsValidStream(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidIndexBuffer(@NonNull IndexBuffer object) {
return nIsValidIndexBuffer(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidVertexBuffer(@NonNull VertexBuffer object) {
return nIsValidVertexBuffer(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidSkinningBuffer(@NonNull SkinningBuffer object) {
return nIsValidSkinningBuffer(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidIndirectLight(@NonNull IndirectLight object) {
return nIsValidIndirectLight(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidMaterial(@NonNull Material object) {
return nIsValidMaterial(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidSkybox(@NonNull Skybox object) {
return nIsValidSkybox(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidColorGrading(@NonNull ColorGrading object) {
return nIsValidColorGrading(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidTexture(@NonNull Texture object) {
return nIsValidTexture(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidRenderTarget(@NonNull RenderTarget object) {
return nIsValidRenderTarget(getNativeObject(), object.getNativeObject());
}
/**
* Returns whether the object is valid.
* @param object Object to check for validity
* @return returns true if the specified object is valid.
*/
public boolean isValidSwapChain(@NonNull SwapChain object) {
return nIsValidSwapChain(getNativeObject(), object.getNativeObject());
}
// View
/**
@@ -785,17 +920,17 @@ public class Engine {
private static native long nCreateSwapChain(long nativeEngine, Object nativeWindow, long flags);
private static native long nCreateSwapChainHeadless(long nativeEngine, int width, int height, long flags);
private static native long nCreateSwapChainFromRawPointer(long nativeEngine, long pointer, long flags);
private static native boolean nDestroySwapChain(long nativeEngine, long nativeSwapChain);
private static native long nCreateView(long nativeEngine);
private static native boolean nDestroyView(long nativeEngine, long nativeView);
private static native long nCreateRenderer(long nativeEngine);
private static native boolean nDestroyRenderer(long nativeEngine, long nativeRenderer);
private static native long nCreateCamera(long nativeEngine, int entity);
private static native long nGetCameraComponent(long nativeEngine, int entity);
private static native void nDestroyCameraComponent(long nativeEngine, int entity);
private static native long nCreateScene(long nativeEngine);
private static native boolean nDestroyScene(long nativeEngine, long nativeScene);
private static native long nCreateFence(long nativeEngine);
private static native boolean nDestroyRenderer(long nativeEngine, long nativeRenderer);
private static native boolean nDestroyView(long nativeEngine, long nativeView);
private static native boolean nDestroyScene(long nativeEngine, long nativeScene);
private static native boolean nDestroyFence(long nativeEngine, long nativeFence);
private static native boolean nDestroyStream(long nativeEngine, long nativeStream);
private static native boolean nDestroyIndexBuffer(long nativeEngine, long nativeIndexBuffer);
@@ -808,6 +943,22 @@ public class Engine {
private static native boolean nDestroyColorGrading(long nativeEngine, long nativeColorGrading);
private static native boolean nDestroyTexture(long nativeEngine, long nativeTexture);
private static native boolean nDestroyRenderTarget(long nativeEngine, long nativeTarget);
private static native boolean nDestroySwapChain(long nativeEngine, long nativeSwapChain);
private static native boolean nIsValidRenderer(long nativeEngine, long nativeRenderer);
private static native boolean nIsValidView(long nativeEngine, long nativeView);
private static native boolean nIsValidScene(long nativeEngine, long nativeScene);
private static native boolean nIsValidFence(long nativeEngine, long nativeFence);
private static native boolean nIsValidStream(long nativeEngine, long nativeStream);
private static native boolean nIsValidIndexBuffer(long nativeEngine, long nativeIndexBuffer);
private static native boolean nIsValidVertexBuffer(long nativeEngine, long nativeVertexBuffer);
private static native boolean nIsValidSkinningBuffer(long nativeEngine, long nativeSkinningBuffer);
private static native boolean nIsValidIndirectLight(long nativeEngine, long nativeIndirectLight);
private static native boolean nIsValidMaterial(long nativeEngine, long nativeMaterial);
private static native boolean nIsValidSkybox(long nativeEngine, long nativeSkybox);
private static native boolean nIsValidColorGrading(long nativeEngine, long nativeColorGrading);
private static native boolean nIsValidTexture(long nativeEngine, long nativeTexture);
private static native boolean nIsValidRenderTarget(long nativeEngine, long nativeTarget);
private static native boolean nIsValidSwapChain(long nativeEngine, long nativeSwapChain);
private static native void nDestroyEntity(long nativeEngine, int entity);
private static native void nFlushAndWait(long nativeEngine);
private static native long nGetTransformManager(long nativeEngine);
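
The `isValid*` bindings above forward a native pointer to `Engine::isValid`. How the engine answers that question is not part of this diff; one plausible approach, sketched here purely as an assumption, is to keep a set of live objects and test membership. `EngineSketch` and `TextureSketch` are hypothetical names.

```cpp
#include <unordered_set>

// Hypothetical resource type; it carries no state in this sketch.
struct TextureSketch {};

// Hypothetical engine that tracks every object it created and has not yet destroyed.
class EngineSketch {
public:
    EngineSketch() = default;
    EngineSketch(EngineSketch const&) = delete;
    EngineSketch& operator=(EngineSketch const&) = delete;

    ~EngineSketch() {
        for (TextureSketch* texture : mTextures) {
            delete texture;
        }
    }

    TextureSketch* createTexture() {
        TextureSketch* texture = new TextureSketch();
        mTextures.insert(texture);
        return texture;
    }

    // Returns false if the pointer is not tracked by this engine (for example, already destroyed).
    bool destroy(TextureSketch* texture) {
        if (mTextures.erase(texture) == 0) {
            return false;
        }
        delete texture;
        return true;
    }

    // Only looks the pointer up in the set; the pointer itself is never dereferenced.
    bool isValid(TextureSketch* texture) const {
        return mTextures.count(texture) != 0;
    }

private:
    std::unordered_set<TextureSketch*> mTextures;
};
```

With this design a second `destroy` on the same pointer reports failure instead of freeing twice, which mirrors the `boolean` return type of the `nDestroy*` natives listed above.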

View File

@@ -625,6 +625,15 @@ public class MaterialInstance {
nSetDepthCulling(getNativeObject(), enable);
}
/**
* Sets the depth comparison function (default is {@link TextureSampler.CompareFunction#GE}).
*
* @param func the depth comparison function
*/
public void setDepthFunc(TextureSampler.CompareFunction func) {
nSetDepthFunc(getNativeObject(), func.ordinal());
}
/**
* Returns whether depth culling is enabled.
*/
@@ -632,6 +641,13 @@ public class MaterialInstance {
return nIsDepthCullingEnabled(getNativeObject());
}
/**
* Returns the depth comparison function.
*/
public TextureSampler.CompareFunction getDepthFunc() {
return TextureSampler.EnumCache.sCompareFunctionValues[nGetDepthFunc(getNativeObject())];
}
/**
* Sets the stencil comparison function (default is {@link TextureSampler.CompareFunction#ALWAYS}).
*
@@ -884,7 +900,7 @@ public class MaterialInstance {
@IntRange(from = 0) int offset, @IntRange(from = 1) int count);
private static native void nSetParameterTexture(long nativeMaterialInstance,
@NonNull String name, long nativeTexture, int sampler);
@NonNull String name, long nativeTexture, long sampler);
private static native void nSetScissor(long nativeMaterialInstance,
@IntRange(from = 0) int left, @IntRange(from = 0) int bottom,
@@ -908,6 +924,7 @@ public class MaterialInstance {
private static native void nSetDepthWrite(long nativeMaterialInstance, boolean enable);
private static native void nSetStencilWrite(long nativeMaterialInstance, boolean enable);
private static native void nSetDepthCulling(long nativeMaterialInstance, boolean enable);
private static native void nSetDepthFunc(long nativeMaterialInstance, long function);
private static native void nSetStencilCompareFunction(long nativeMaterialInstance,
long function, long face);
@@ -939,4 +956,5 @@ public class MaterialInstance {
private static native boolean nIsDepthWriteEnabled(long nativeMaterialInstance);
private static native boolean nIsStencilWriteEnabled(long nativeMaterialInstance);
private static native boolean nIsDepthCullingEnabled(long nativeMaterialInstance);
private static native int nGetDepthFunc(long nativeMaterialInstance);
}
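
`setDepthFunc` passes `func.ordinal()` to native code, and `getDepthFunc` indexes a cached values array with the returned integer, so both sides must agree on the enum order. A minimal sketch of the native half of that round trip follows. The enum `DepthFuncSketch`, its ordering, and the range check are assumptions added for illustration; the binding in this diff uses a plain `static_cast` with no check.

```cpp
#include <cstdint>

// Hypothetical comparison enum; the order is an assumption made for this sketch.
enum class DepthFuncSketch : uint8_t { LE = 0, GE, L, G, E, NE, A, N };

constexpr int kDepthFuncCount = 8;

// Converts an ordinal received from the managed side.
// Out-of-range values fall back to a default instead of producing an invalid enum value.
DepthFuncSketch depthFuncFromOrdinal(long long ordinal,
        DepthFuncSketch fallback = DepthFuncSketch::GE) noexcept {
    if (ordinal < 0 || ordinal >= kDepthFuncCount) {
        return fallback;
    }
    return static_cast<DepthFuncSketch>(ordinal);
}

// Converts the enum back to the integer that the managed side uses as an array index.
int depthFuncToOrdinal(DepthFuncSketch func) noexcept {
    return static_cast<int>(func);
}
```

The range check matters mostly on the way back to Java, where an unexpected integer would index past the end of the cached values array.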

View File

@@ -81,8 +81,6 @@ public class RenderTarget {
/**
* Sets a texture to a given attachment point.
*
* <p>All RenderTargets must have a non-null <code>COLOR</code> attachment.</p>
*
* @param attachment The attachment point of the texture.
* @param texture The associated texture object.
* @return A reference to this Builder for chaining calls.

View File

@@ -137,10 +137,6 @@ public class SwapChain {
* </p>
*
* <p>
* The FrameCompletedCallback is guaranteed to be called on the main Filament thread.
* </p>
*
* <p>
* Warning: Only Filament's Metal backend supports frame callbacks. Other backends ignore the
* callback (which will never be called) and proceed normally.
* </p>

View File

@@ -126,7 +126,7 @@ public class TextureSampler {
NEVER
}
int mSampler = 0; // bit field used by native
long mSampler = 0; // bit field used by native
/**
* Initializes the <code>TextureSampler</code> with default values.
@@ -342,26 +342,26 @@ public class TextureSampler {
}
}
private static native int nCreateSampler(int min, int max, int s, int t, int r);
private static native int nCreateCompareSampler(int mode, int function);
private static native long nCreateSampler(int min, int max, int s, int t, int r);
private static native long nCreateCompareSampler(int mode, int function);
private static native int nGetMinFilter(int sampler);
private static native int nSetMinFilter(int sampler, int filter);
private static native int nGetMagFilter(int sampler);
private static native int nSetMagFilter(int sampler, int filter);
private static native int nGetMinFilter(long sampler);
private static native long nSetMinFilter(long sampler, int filter);
private static native int nGetMagFilter(long sampler);
private static native long nSetMagFilter(long sampler, int filter);
private static native int nGetWrapModeS(int sampler);
private static native int nSetWrapModeS(int sampler, int mode);
private static native int nGetWrapModeT(int sampler);
private static native int nSetWrapModeT(int sampler, int mode);
private static native int nGetWrapModeR(int sampler);
private static native int nSetWrapModeR(int sampler, int mode);
private static native int nGetWrapModeS(long sampler);
private static native long nSetWrapModeS(long sampler, int mode);
private static native int nGetWrapModeT(long sampler);
private static native long nSetWrapModeT(long sampler, int mode);
private static native int nGetWrapModeR(long sampler);
private static native long nSetWrapModeR(long sampler, int mode);
private static native int nGetCompareMode(int sampler);
private static native int nSetCompareMode(int sampler, int mode);
private static native int nGetCompareFunction(int sampler);
private static native int nSetCompareFunction(int sampler, int function);
private static native int nGetCompareMode(long sampler);
private static native long nSetCompareMode(long sampler, int mode);
private static native int nGetCompareFunction(long sampler);
private static native long nSetCompareFunction(long sampler, int function);
private static native float nGetAnisotropy(int sampler);
private static native int nSetAnisotropy(int sampler, float anisotropy);
private static native float nGetAnisotropy(long sampler);
private static native long nSetAnisotropy(long sampler, float anisotropy);
}

View File

@@ -27,6 +27,8 @@ import static com.google.android.filament.Asserts.assertFloat3In;
import static com.google.android.filament.Asserts.assertFloat4In;
import static com.google.android.filament.Colors.LinearColor;
import com.google.android.filament.proguard.UsedByNative;
/**
* Encompasses all the state needed for rendering a {@link Scene}.
*
@@ -965,7 +967,8 @@ public class View {
options.heightFalloff, options.cutOffDistance,
options.color[0], options.color[1], options.color[2],
options.density, options.inScatteringStart, options.inScatteringSize,
options.fogColorFromIbl, options.skyColor.getNativeObject(),
options.fogColorFromIbl,
options.skyColor == null ? 0 : options.skyColor.getNativeObject(),
options.enabled);
}
@@ -1095,10 +1098,29 @@ public class View {
nPick(getNativeObject(), x, y, handler, internalCallback);
}
@UsedByNative("View.cpp")
private static class InternalOnPickCallback implements Runnable {
private final OnPickCallback mUserCallback;
private final PickingQueryResult mPickingQueryResult = new PickingQueryResult();
@UsedByNative("View.cpp")
@Entity
int mRenderable;
@UsedByNative("View.cpp")
float mDepth;
@UsedByNative("View.cpp")
float mFragCoordsX;
@UsedByNative("View.cpp")
float mFragCoordsY;
@UsedByNative("View.cpp")
float mFragCoordsZ;
public InternalOnPickCallback(OnPickCallback mUserCallback) {
this.mUserCallback = mUserCallback;
}
@Override
public void run() {
mPickingQueryResult.renderable = mRenderable;
@@ -1108,13 +1130,6 @@ public class View {
mPickingQueryResult.fragCoords[2] = mFragCoordsZ;
mUserCallback.onPick(mPickingQueryResult);
}
private final OnPickCallback mUserCallback;
private final PickingQueryResult mPickingQueryResult = new PickingQueryResult();
@Entity int mRenderable;
float mDepth;
float mFragCoordsX;
float mFragCoordsY;
float mFragCoordsZ;
}
/**
@@ -1377,13 +1392,13 @@ public class View {
/**
* resolution of vertical axis (2^levels to 2048)
*/
public int resolution = 360;
public int resolution = 384;
/**
* bloom x/y aspect-ratio (1/32 to 32)
*/
public float anamorphism = 1.0f;
/**
* number of blur levels (3 to 11)
* number of blur levels (1 to 11)
*/
public int levels = 6;
/**
@@ -1971,4 +1986,11 @@ public class View {
*/
public float penumbraRatioScale = 1.0f;
}
/**
* Options for stereoscopic (multi-eye) rendering.
*/
public static class StereoscopicOptions {
public boolean enabled = false;
}
}

View File

@@ -1,5 +1,5 @@
GROUP=com.google.android.filament
VERSION_NAME=1.40.3
VERSION_NAME=1.42.0
POM_DESCRIPTION=Real-time physically based rendering engine for Android.

View File

@@ -28,6 +28,7 @@ import com.google.android.filament.Fence
import com.google.android.filament.IndirectLight
import com.google.android.filament.Skybox
import com.google.android.filament.View
import com.google.android.filament.View.OnPickCallback
import com.google.android.filament.utils.*
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
@@ -56,7 +57,9 @@ class MainActivity : Activity() {
private lateinit var modelViewer: ModelViewer
private lateinit var titlebarHint: TextView
private val doubleTapListener = DoubleTapListener()
private val singleTapListener = SingleTapListener()
private lateinit var doubleTapDetector: GestureDetector
private lateinit var singleTapDetector: GestureDetector
private var remoteServer: RemoteServer? = null
private var statusToast: Toast? = null
private var statusText: String? = null
@@ -77,6 +80,7 @@ class MainActivity : Activity() {
choreographer = Choreographer.getInstance()
doubleTapDetector = GestureDetector(applicationContext, doubleTapListener)
singleTapDetector = GestureDetector(applicationContext, singleTapListener)
modelViewer = ModelViewer(surfaceView)
viewerContent.view = modelViewer.view
@@ -88,6 +92,7 @@ class MainActivity : Activity() {
surfaceView.setOnTouchListener { _, event ->
modelViewer.onTouchEvent(event)
doubleTapDetector.onTouchEvent(event)
singleTapDetector.onTouchEvent(event)
true
}
@@ -229,6 +234,7 @@ class MainActivity : Activity() {
modelViewer.scene.skybox = sky
modelViewer.scene.indirectLight = ibl
viewerContent.indirectLight = ibl
}
}
}
@@ -337,6 +343,11 @@ class MainActivity : Activity() {
remoteServer?.close()
}
override fun onBackPressed() {
super.onBackPressed()
finish()
}
fun loadModelData(message: RemoteServer.ReceivedMessage) {
Log.i(TAG, "Downloaded model ${message.label} (${message.buffer.capacity()} bytes)")
clearStatusText()
@@ -425,4 +436,19 @@ class MainActivity : Activity() {
return super.onDoubleTap(e)
}
}
// Just for testing purposes
inner class SingleTapListener : GestureDetector.SimpleOnGestureListener() {
override fun onSingleTapUp(event: MotionEvent): Boolean {
modelViewer.view.pick(
event.x.toInt(),
surfaceView.height - event.y.toInt(),
surfaceView.handler, {
val name = modelViewer.asset!!.getName(it.renderable)
Log.v("Filament", "Picked ${it.renderable}: $name")
},
)
return super.onSingleTapUp(event)
}
}
}

View File

@@ -118,9 +118,10 @@ class MainActivity : Activity() {
}
private fun setupView() {
val ssaoOptions = view.ambientOcclusionOptions
ssaoOptions.enabled = true
view.ambientOcclusionOptions = ssaoOptions
// ambient occlusion is the cheapest effect that adds a lot of quality
view.ambientOcclusionOptions = view.ambientOcclusionOptions.apply {
enabled = true
}
// NOTE: Try to disable post-processing (tone-mapping, etc.) to see the difference
// view.isPostProcessingEnabled = false

View File

@@ -1139,7 +1139,8 @@ Type
: array of `string`
Value
: Each entry must be any of `dynamicLighting`, `directionalLighting`, `shadowReceiver`,`skinning` or `ssr`.
: Each entry must be any of `dynamicLighting`, `directionalLighting`, `shadowReceiver`,
`skinning`, `ssr`, or `stereo`.
Description
: Used to specify a list of shader variants that the application guarantees will never be
@@ -1158,6 +1159,7 @@ Description of the variants:
- `fog`, used when global fog is applied to the scene
- `vsm`, used when VSM shadows are enabled and the object is a shadow receiver
- `ssr`, used when screen-space reflections are enabled in the View
- `stereo`, used when stereoscopic rendering is enabled in the View
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ JSON
material {

File diff suppressed because one or more lines are too long

Binary file not shown.

View File

@@ -27,10 +27,10 @@ set(SRCS
src/BackendUtils.cpp
src/BlobCacheKey.cpp
src/Callable.cpp
src/CallbackHandler.cpp
src/CircularBuffer.cpp
src/CommandBufferQueue.cpp
src/CommandStream.cpp
src/CompilerThreadPool.cpp
src/Driver.cpp
src/Handle.cpp
src/HandleAllocator.cpp
@@ -55,6 +55,7 @@ set(PRIVATE_HDRS
include/private/backend/PlatformFactory.h
include/private/backend/SamplerGroup.h
src/CommandStreamDispatcher.h
src/CompilerThreadPool.h
src/DataReshaper.h
src/DriverBase.h
)
@@ -66,6 +67,8 @@ set(PRIVATE_HDRS
if (FILAMENT_SUPPORTS_OPENGL AND NOT FILAMENT_USE_EXTERNAL_GLES3 AND NOT FILAMENT_USE_SWIFTSHADER)
list(APPEND SRCS
include/backend/platforms/OpenGLPlatform.h
src/opengl/CallbackManager.h
src/opengl/CallbackManager.cpp
src/opengl/gl_headers.cpp
src/opengl/gl_headers.h
src/opengl/GLUtils.cpp
@@ -174,8 +177,6 @@ if (FILAMENT_SUPPORTS_VULKAN)
src/vulkan/VulkanConstants.h
src/vulkan/VulkanContext.cpp
src/vulkan/VulkanContext.h
src/vulkan/VulkanDisposer.cpp
src/vulkan/VulkanDisposer.h
src/vulkan/VulkanDriver.cpp
src/vulkan/VulkanDriver.h
src/vulkan/VulkanDriverFactory.h
@@ -195,6 +196,11 @@ if (FILAMENT_SUPPORTS_VULKAN)
src/vulkan/VulkanStagePool.h
src/vulkan/VulkanSwapChain.cpp
src/vulkan/VulkanSwapChain.h
src/vulkan/VulkanReadPixels.cpp
src/vulkan/VulkanReadPixels.h
src/vulkan/VulkanResourceAllocator.h
src/vulkan/VulkanResources.cpp
src/vulkan/VulkanResources.h
src/vulkan/VulkanTexture.cpp
src/vulkan/VulkanTexture.h
src/vulkan/VulkanUtility.cpp

View File

@@ -66,7 +66,7 @@ public:
virtual void post(void* user, Callback callback) = 0;
protected:
virtual ~CallbackHandler();
virtual ~CallbackHandler() = default;
};
} // namespace filament::backend

View File

@@ -796,32 +796,53 @@ enum class SamplerCompareFunc : uint8_t {
//! Sampler parameters
struct SamplerParams { // NOLINT
union {
struct {
SamplerMagFilter filterMag : 1; //!< magnification filter (NEAREST)
SamplerMinFilter filterMin : 3; //!< minification filter (NEAREST)
SamplerWrapMode wrapS : 2; //!< s-coordinate wrap mode (CLAMP_TO_EDGE)
SamplerWrapMode wrapT : 2; //!< t-coordinate wrap mode (CLAMP_TO_EDGE)
SamplerMagFilter filterMag : 1; //!< magnification filter (NEAREST)
SamplerMinFilter filterMin : 3; //!< minification filter (NEAREST)
SamplerWrapMode wrapS : 2; //!< s-coordinate wrap mode (CLAMP_TO_EDGE)
SamplerWrapMode wrapT : 2; //!< t-coordinate wrap mode (CLAMP_TO_EDGE)
SamplerWrapMode wrapR : 2; //!< r-coordinate wrap mode (CLAMP_TO_EDGE)
uint8_t anisotropyLog2 : 3; //!< anisotropy level (0)
SamplerCompareMode compareMode : 1; //!< sampler compare mode (NONE)
uint8_t padding0 : 2; //!< reserved. must be 0.
SamplerWrapMode wrapR : 2; //!< r-coordinate wrap mode (CLAMP_TO_EDGE)
uint8_t anisotropyLog2 : 3; //!< anisotropy level (0)
SamplerCompareMode compareMode : 1; //!< sampler compare mode (NONE)
uint8_t padding0 : 2; //!< reserved. must be 0.
SamplerCompareFunc compareFunc : 3; //!< sampler comparison function (LE)
uint8_t padding1 : 5; //!< reserved. must be 0.
SamplerCompareFunc compareFunc : 3; //!< sampler comparison function (LE)
uint8_t padding1 : 5; //!< reserved. must be 0.
uint8_t padding2 : 8; //!< reserved. must be 0.
uint8_t padding2 : 8; //!< reserved. must be 0.
};
uint32_t u;
struct Hasher {
size_t operator()(SamplerParams p) const noexcept {
// we don't use std::hash<> here, so we don't have to include <functional>
return *reinterpret_cast<uint64_t const*>(reinterpret_cast<char const*>(&p));
}
};
struct EqualTo {
bool operator()(SamplerParams lhs, SamplerParams rhs) const noexcept {
auto* pLhs = reinterpret_cast<uint64_t const*>(reinterpret_cast<char const*>(&lhs));
auto* pRhs = reinterpret_cast<uint64_t const*>(reinterpret_cast<char const*>(&rhs));
return *pLhs == *pRhs;
}
};
struct LessThan {
bool operator()(SamplerParams lhs, SamplerParams rhs) const noexcept {
auto* pLhs = reinterpret_cast<uint64_t const*>(reinterpret_cast<char const*>(&lhs));
auto* pRhs = reinterpret_cast<uint64_t const*>(reinterpret_cast<char const*>(&rhs));
return *pLhs < *pRhs;
}
};
private:
friend inline bool operator < (SamplerParams lhs, SamplerParams rhs) {
return lhs.u < rhs.u;
friend inline bool operator < (SamplerParams lhs, SamplerParams rhs) noexcept {
return SamplerParams::LessThan{}(lhs, rhs);
}
};
static_assert(sizeof(SamplerParams) == sizeof(uint32_t), "SamplerParams must be 32 bits");
// The limitation to 64-bits max comes from how we store a SamplerParams in our JNI code
// see android/.../TextureSampler.cpp
static_assert(sizeof(SamplerParams) <= sizeof(uint64_t),
"SamplerParams must be no more than 64 bits");
//! blending equation function
enum class BlendEquation : uint8_t {
@@ -1126,8 +1147,6 @@ static_assert(sizeof(StencilState) == 12u,
using FrameScheduledCallback = void(*)(PresentCallable callable, void* user);
using FrameCompletedCallback = void(*)(void* user);
enum class Workaround : uint16_t {
// The EASU pass must split because shader compiler flattens early-exit branch
SPLIT_EASU,
@@ -1141,6 +1160,11 @@ enum class Workaround : uint16_t {
A8X_STATIC_TEXTURE_TARGET_ERROR,
// Adreno drivers sometimes aren't able to blit into a layer of a texture array.
DISABLE_BLIT_INTO_TEXTURE_ARRAY,
// Multiple workarounds needed for PowerVR GPUs
POWER_VR_SHADER_WORKAROUNDS,
// The driver pins some of its threads, and we can't easily know to which cores. Setting our
// own affinity can hurt performance further if we end up pinned to the same core.
DISABLE_THREAD_AFFINITY
};
} // namespace filament::backend
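The `Hasher`, `EqualTo` and `LessThan` functors above all treat the packed bit-field as a single integer. A minimal sketch of that idea follows, assuming a 32-bit layout and using `std::memcpy` rather than `reinterpret_cast`, so that no more than `sizeof` the struct is ever read. `PackedParams`, `toBits`, `PackedEqualTo` and `PackedLessThan` are hypothetical names for illustration; they are not part of Filament.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a packed sampler description (not Filament's SamplerParams).
struct PackedParams {
    uint8_t filterMag : 1;
    uint8_t filterMin : 3;
    uint8_t wrapS     : 2;
    uint8_t wrapT     : 2;
    uint8_t padding0;   // reserved, must be 0
    uint8_t padding1;   // reserved, must be 0
    uint8_t padding2;   // reserved, must be 0
};
static_assert(sizeof(PackedParams) == sizeof(uint32_t), "PackedParams must be 32 bits");

// Copies the object representation into an integer of the same size.
inline uint32_t toBits(PackedParams const& p) noexcept {
    uint32_t bits;
    std::memcpy(&bits, &p, sizeof(bits));
    return bits;
}

struct PackedEqualTo {
    bool operator()(PackedParams lhs, PackedParams rhs) const noexcept {
        return toBits(lhs) == toBits(rhs);
    }
};

struct PackedLessThan {
    // Must be a strict weak ordering, so this uses '<' rather than '=='.
    bool operator()(PackedParams lhs, PackedParams rhs) const noexcept {
        return toBits(lhs) < toBits(rhs);
    }
};
```

The ordering has no meaning beyond being consistent, which is all an ordered container such as `std::map` needs from its comparator.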

View File

@@ -39,7 +39,6 @@ struct HwRenderTarget;
struct HwSamplerGroup;
struct HwStream;
struct HwSwapChain;
struct HwSync;
struct HwTexture;
struct HwTimerQuery;
struct HwVertexBuffer;
@@ -126,7 +125,6 @@ using RenderTargetHandle = Handle<HwRenderTarget>;
using SamplerGroupHandle = Handle<HwSamplerGroup>;
using StreamHandle = Handle<HwStream>;
using SwapChainHandle = Handle<HwSwapChain>;
using SyncHandle = Handle<HwSync>;
using TextureHandle = Handle<HwTexture>;
using TimerQueryHandle = Handle<HwTimerQuery>;
using VertexBufferHandle = Handle<HwVertexBuffer>;

View File

@@ -288,6 +288,12 @@ public:
* @see terminate()
*/
virtual void createContext(bool shared);
/**
* Detaches and destroys the current context, if any, and releases all resources associated
* with this thread.
*/
virtual void releaseContext() noexcept;
};
} // namespace filament

View File

@@ -40,6 +40,7 @@ public:
PlatformEGL() noexcept;
bool isExtraContextSupported() const noexcept override;
void createContext(bool shared) override;
void releaseContext() noexcept override;
protected:
@@ -139,6 +140,7 @@ protected:
bool KHR_create_context = false;
bool KHR_gl_colorspace = false;
bool KHR_no_config_context = false;
bool KHR_surfaceless_context = false;
} egl;
} ext;

View File

@@ -73,14 +73,32 @@ public:
// a cost here (writing and reading the stack at each iteration), in the end it's
// probably better to pay the cost at just one location.
intptr_t next;
driver.mCurrentExecutingCommand = this;
mExecute(driver, this, &next);
return reinterpret_cast<CommandBase*>(reinterpret_cast<intptr_t>(this) + next);
}
inline void captureCallstack() noexcept {
auto c = utils::CallStack::unwind(4);
size_t i = 0;
for (; i < c.getFrameCount() && i < 16; i++) {
mCallstack[i] = c[i];
}
for (; i < 16; i++) {
mCallstack[i] = 0;
}
}
void printCallstack() noexcept {
auto c = utils::CallStack(mCallstack);
utils::slog.d << c << utils::io::endl;
}
inline ~CommandBase() noexcept = default;
private:
Execute mExecute;
std::array<intptr_t, 16> mCallstack = {0};
};
// ------------------------------------------------------------------------------------------------
@@ -218,6 +236,7 @@ public:
using Cmd = COMMAND_TYPE(methodName); \
void* const p = allocateCommand(CommandBase::align(sizeof(Cmd))); \
new(p) Cmd(mDispatcher.methodName##_, APPLY(std::move, params)); \
((Cmd*)p)->captureCallstack(); \
DEBUG_COMMAND_END(methodName, false); \
}
@@ -237,6 +256,7 @@ public:
using Cmd = COMMAND_TYPE(methodName##R); \
void* const p = allocateCommand(CommandBase::align(sizeof(Cmd))); \
new(p) Cmd(mDispatcher.methodName##_, RetType(result), APPLY(std::move, params)); \
((Cmd*)p)->captureCallstack(); \
DEBUG_COMMAND_END(methodName, false); \
return result; \
}
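`captureCallstack()` above stores at most 16 frames in a fixed-size array and zero-fills whatever is left. The copy-and-pad step on its own can be sketched as below, assuming the frames arrive as a plain vector of addresses. `packFrames` and `kMaxFrames` are hypothetical names, and the sketch does not use `utils::CallStack`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kMaxFrames = 16;

// Hypothetical helper: copies at most kMaxFrames addresses and leaves the rest at zero.
inline std::array<intptr_t, kMaxFrames> packFrames(
        std::vector<intptr_t> const& frames) noexcept {
    std::array<intptr_t, kMaxFrames> out{};   // value-initialized, so every entry starts at 0
    std::size_t const n = frames.size() < kMaxFrames ? frames.size() : kMaxFrames;
    for (std::size_t i = 0; i < n; i++) {
        out[i] = frames[i];
    }
    return out;
}
```

A fixed-size array keeps each command trivially sized, at the cost of truncating deeper stacks.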

View File

@@ -53,6 +53,7 @@ template<typename T>
class ConcreteDispatcher;
class Dispatcher;
class CommandStream;
class CommandBase;
class Driver {
public:
@@ -83,6 +84,8 @@ public:
virtual void debugCommandEnd(CommandStream* cmds,
bool synchronous, const char* methodName) noexcept = 0;
CommandBase* mCurrentExecutingCommand = nullptr;
/*
* Asynchronous calls here only to provide a type to CommandStream. They must be non-virtual
* so that calling the concrete implementation won't go through a vtable.

View File

@@ -142,7 +142,8 @@ DECL_DRIVER_API_N(setFrameScheduledCallback,
DECL_DRIVER_API_N(setFrameCompletedCallback,
backend::SwapChainHandle, sch,
backend::FrameCompletedCallback, callback,
backend::CallbackHandler*, handler,
backend::CallbackHandler::Callback, callback,
void*, user)
DECL_DRIVER_API_N(setPresentationTime,
@@ -245,8 +246,6 @@ DECL_DRIVER_API_R_N(backend::RenderTargetHandle, createRenderTarget,
DECL_DRIVER_API_R_0(backend::FenceHandle, createFence)
DECL_DRIVER_API_R_0(backend::SyncHandle, createSync)
DECL_DRIVER_API_R_N(backend::SwapChainHandle, createSwapChain,
void*, nativeWindow,
uint64_t, flags)
@@ -275,7 +274,7 @@ DECL_DRIVER_API_N(destroyRenderTarget, backend::RenderTargetHandle, rth)
DECL_DRIVER_API_N(destroySwapChain, backend::SwapChainHandle, sch)
DECL_DRIVER_API_N(destroyStream, backend::StreamHandle, sh)
DECL_DRIVER_API_N(destroyTimerQuery, backend::TimerQueryHandle, sh)
DECL_DRIVER_API_N(destroySync, backend::SyncHandle, sh)
DECL_DRIVER_API_N(destroyFence, backend::FenceHandle, fh)
/*
* Synchronous APIs
@@ -289,8 +288,7 @@ DECL_DRIVER_API_SYNCHRONOUS_N(void, setAcquiredImage, backend::StreamHandle, str
DECL_DRIVER_API_SYNCHRONOUS_N(void, setStreamDimensions, backend::StreamHandle, stream, uint32_t, width, uint32_t, height)
DECL_DRIVER_API_SYNCHRONOUS_N(int64_t, getStreamTimestamp, backend::StreamHandle, stream)
DECL_DRIVER_API_SYNCHRONOUS_N(void, updateStreams, backend::DriverApi*, driver)
DECL_DRIVER_API_SYNCHRONOUS_N(void, destroyFence, backend::FenceHandle, fh)
DECL_DRIVER_API_SYNCHRONOUS_N(backend::FenceStatus, wait, backend::FenceHandle, fh, uint64_t, timeout)
DECL_DRIVER_API_SYNCHRONOUS_N(backend::FenceStatus, getFenceStatus, backend::FenceHandle, fh)
DECL_DRIVER_API_SYNCHRONOUS_N(bool, isTextureFormatSupported, backend::TextureFormat, format)
DECL_DRIVER_API_SYNCHRONOUS_0(bool, isTextureSwizzleSupported)
DECL_DRIVER_API_SYNCHRONOUS_N(bool, isTextureFormatMipmappable, backend::TextureFormat, format)
@@ -300,13 +298,14 @@ DECL_DRIVER_API_SYNCHRONOUS_0(bool, isFrameBufferFetchMultiSampleSupported)
DECL_DRIVER_API_SYNCHRONOUS_0(bool, isFrameTimeSupported)
DECL_DRIVER_API_SYNCHRONOUS_0(bool, isAutoDepthResolveSupported)
DECL_DRIVER_API_SYNCHRONOUS_0(bool, isSRGBSwapChainSupported)
DECL_DRIVER_API_SYNCHRONOUS_0(bool, isStereoSupported)
DECL_DRIVER_API_SYNCHRONOUS_0(bool, isParallelShaderCompileSupported)
DECL_DRIVER_API_SYNCHRONOUS_0(uint8_t, getMaxDrawBuffers)
DECL_DRIVER_API_SYNCHRONOUS_0(size_t, getMaxUniformBufferSize)
DECL_DRIVER_API_SYNCHRONOUS_0(math::float2, getClipSpaceParams)
DECL_DRIVER_API_SYNCHRONOUS_0(bool, canGenerateMipmaps)
DECL_DRIVER_API_SYNCHRONOUS_N(void, setupExternalImage, void*, image)
DECL_DRIVER_API_SYNCHRONOUS_N(bool, getTimerQueryValue, backend::TimerQueryHandle, query, uint64_t*, elapsedTime)
DECL_DRIVER_API_SYNCHRONOUS_N(backend::SyncStatus, getSyncStatus, backend::SyncHandle, sh)
DECL_DRIVER_API_SYNCHRONOUS_N(bool, isWorkaroundNeeded, backend::Workaround, workaround)
DECL_DRIVER_API_SYNCHRONOUS_0(backend::FeatureLevel, getFeatureLevel)
@@ -389,6 +388,7 @@ DECL_DRIVER_API_N(endTimerQuery,
backend::TimerQueryHandle, query)
DECL_DRIVER_API_N(compilePrograms,
backend::CompilerPriorityQueue, priority,
backend::CallbackHandler*, handler,
backend::CallbackHandler::Callback, callback,
void*, user)

View File

@@ -1,23 +0,0 @@
/*
* Copyright (C) 2021 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <backend/CallbackHandler.h>
namespace filament::backend {
CallbackHandler::~CallbackHandler() = default;
} // namespace filament::backend

View File

@@ -0,0 +1,137 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "CompilerThreadPool.h"
#include <utils/Systrace.h>
#include <memory>
namespace filament::backend {
using namespace utils;
ProgramToken::~ProgramToken() = default;
CompilerThreadPool::CompilerThreadPool() noexcept = default;
CompilerThreadPool::~CompilerThreadPool() noexcept {
assert_invariant(mCompilerThreads.empty());
assert_invariant(mQueues[0].empty());
assert_invariant(mQueues[1].empty());
}
void CompilerThreadPool::init(uint32_t threadCount,
ThreadSetup&& threadSetup, ThreadCleanup&& threadCleanup) noexcept {
auto setup = std::make_shared<ThreadSetup>(std::move(threadSetup));
auto cleanup = std::make_shared<ThreadCleanup>(std::move(threadCleanup));
for (size_t i = 0; i < threadCount; i++) {
mCompilerThreads.emplace_back([this, setup, cleanup]() {
SYSTRACE_CONTEXT();
(*setup)();
// process jobs from the queue until we're asked to exit
while (!mExitRequested) {
std::unique_lock lock(mQueueLock);
mQueueCondition.wait(lock, [this]() {
return mExitRequested ||
(!std::all_of( std::begin(mQueues), std::end(mQueues),
[](auto&& q) { return q.empty(); }));
});
SYSTRACE_VALUE32("CompilerThreadPool Jobs",
mQueues[0].size() + mQueues[1].size());
if (UTILS_LIKELY(!mExitRequested)) {
Job job;
// use the first queue that's not empty
auto& queue = [this]() -> auto& {
for (auto& q: mQueues) {
if (!q.empty()) {
return q;
}
}
return mQueues[0]; // we should never end up here.
}();
assert_invariant(!queue.empty());
std::swap(job, queue.front().second);
queue.pop_front();
// execute the job without holding any locks
lock.unlock();
job();
}
}
(*cleanup)();
});
}
}
auto CompilerThreadPool::find(program_token_t const& token) -> std::pair<Queue&, Queue::iterator> {
for (auto&& q: mQueues) {
auto pos = std::find_if(q.begin(), q.end(), [&token](auto&& item) {
return item.first == token;
});
if (pos != q.end()) {
return { q, pos };
}
}
// this can happen if the program is being processed right now
return { mQueues[0], mQueues[0].end() };
}
auto CompilerThreadPool::dequeue(program_token_t const& token) -> Job {
std::unique_lock const lock(mQueueLock);
Job job;
auto&& [q, pos] = find(token);
if (pos != q.end()) {
std::swap(job, pos->second);
q.erase(pos);
}
return job;
}
void CompilerThreadPool::queue(CompilerPriorityQueue priorityQueue,
program_token_t const& token, Job&& job) {
std::unique_lock const lock(mQueueLock);
mQueues[size_t(priorityQueue)].emplace_back(token, std::move(job));
mQueueCondition.notify_one();
}
void CompilerThreadPool::terminate() noexcept {
std::unique_lock lock(mQueueLock);
mExitRequested = true;
mQueueCondition.notify_all();
lock.unlock();
for (auto& thread: mCompilerThreads) {
if (thread.joinable()) {
thread.join();
}
}
mCompilerThreads.clear();
// Clear all the queues, dropping the remaining jobs. This relies on the jobs being cancelable.
for (auto&& q : mQueues) {
q.clear();
}
}
} // namespace filament::backend

View File

@@ -0,0 +1,70 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef TNT_FILAMENT_BACKEND_COMPILERTHREADPOOL_H
#define TNT_FILAMENT_BACKEND_COMPILERTHREADPOOL_H
#include <backend/DriverEnums.h>
#include <utils/Invocable.h>
#include <utils/Mutex.h>
#include <utils/Condition.h>
#include <array>
#include <deque>
#include <memory>
#include <thread>
#include <utility>
#include <vector>
namespace filament::backend {
struct ProgramToken {
virtual ~ProgramToken();
};
using program_token_t = std::shared_ptr<ProgramToken>;
class Platform;
class CompilerThreadPool {
public:
CompilerThreadPool() noexcept;
~CompilerThreadPool() noexcept;
using Job = utils::Invocable<void()>;
using ThreadSetup = utils::Invocable<void()>;
using ThreadCleanup = utils::Invocable<void()>;
void init(uint32_t threadCount,
ThreadSetup&& threadSetup, ThreadCleanup&& threadCleanup) noexcept;
void terminate() noexcept;
void queue(CompilerPriorityQueue priorityQueue, program_token_t const& token, Job&& job);
Job dequeue(program_token_t const& token);
private:
using Queue = std::deque<std::pair<program_token_t, Job>>;
std::vector<std::thread> mCompilerThreads;
bool mExitRequested{ false };
utils::Mutex mQueueLock;
utils::Condition mQueueCondition;
std::array<Queue, 2> mQueues;
// lock must be held for methods below
std::pair<Queue&, Queue::iterator> find(program_token_t const& token);
};
} // namespace filament::backend
#endif // TNT_FILAMENT_BACKEND_COMPILERTHREADPOOL_H
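The pool above serves jobs from the first non-empty queue and runs each job without holding the lock. A reduced sketch of that scheduling loop is shown here, assuming standard-library primitives in place of `utils::Mutex`, `utils::Condition` and `utils::Invocable`, and leaving out program tokens and `dequeue()`. `MiniPool` is a hypothetical name, not a Filament class.

```cpp
#include <array>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified pool: two FIFO queues, queue 0 is always served first.
class MiniPool {
public:
    using Job = std::function<void()>;

    explicit MiniPool(std::size_t threadCount) {
        for (std::size_t i = 0; i < threadCount; i++) {
            mThreads.emplace_back([this] { loop(); });
        }
    }

    ~MiniPool() { terminate(); }

    void queue(std::size_t priority, Job job) {
        std::lock_guard<std::mutex> lock(mLock);
        mQueues[priority].push_back(std::move(job));
        mCondition.notify_one();
    }

    // Stops the workers. Jobs that are still queued are dropped, not executed.
    void terminate() {
        {
            std::lock_guard<std::mutex> lock(mLock);
            mExit = true;
        }
        mCondition.notify_all();
        for (auto& t : mThreads) {
            if (t.joinable()) t.join();
        }
        mThreads.clear();
    }

private:
    void loop() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mLock);
            mCondition.wait(lock, [this] {
                return mExit || !mQueues[0].empty() || !mQueues[1].empty();
            });
            if (mExit) return;
            auto& q = !mQueues[0].empty() ? mQueues[0] : mQueues[1];
            Job job = std::move(q.front());
            q.pop_front();
            lock.unlock();   // execute the job without holding the lock
            job();
        }
    }

    std::mutex mLock;
    std::condition_variable mCondition;
    bool mExit = false;
    std::array<std::deque<Job>, 2> mQueues;
    std::vector<std::thread> mThreads;
};
```

Because `terminate()` discards pending work, callers of a pool like this need jobs that are safe to cancel, which is the same caveat the comment in `CompilerThreadPool::terminate()` states.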

View File

@@ -63,6 +63,8 @@ DriverBase::DriverBase() noexcept {
}
DriverBase::~DriverBase() noexcept {
assert_invariant(mCallbacks.empty());
assert_invariant(mServiceThreadCallbackQueue.empty());
if constexpr (UTILS_HAS_THREADING) {
// quit our service thread
std::unique_lock<std::mutex> lock(mServiceThreadLock);

View File

@@ -135,9 +135,6 @@ struct HwFence : public HwBase {
Platform::Fence* fence = nullptr;
};
struct HwSync : public HwBase {
};
struct HwSwapChain : public HwBase {
Platform::SwapChain* swapChain = nullptr;
};
@@ -168,13 +165,6 @@ public:
void purge() noexcept final;
// --------------------------------------------------------------------------------------------
// Privates
// --------------------------------------------------------------------------------------------
protected:
class CallbackDataDetails;
// Helpers...
struct CallbackData {
CallbackData(CallbackData const &) = delete;
@@ -205,6 +195,13 @@ protected:
void scheduleCallback(CallbackHandler* handler, void* user, CallbackHandler::Callback callback);
// --------------------------------------------------------------------------------------------
// Privates
// --------------------------------------------------------------------------------------------
protected:
class CallbackDataDetails;
inline void scheduleDestroy(BufferDescriptor&& buffer) noexcept {
if (buffer.hasCallback()) {
scheduleDestroySlow(std::move(buffer));

View File

@@ -67,7 +67,6 @@ template io::ostream& operator<<(io::ostream& out, const Handle<HwFence>& h) noe
template io::ostream& operator<<(io::ostream& out, const Handle<HwSwapChain>& h) noexcept;
template io::ostream& operator<<(io::ostream& out, const Handle<HwStream>& h) noexcept;
template io::ostream& operator<<(io::ostream& out, const Handle<HwTimerQuery>& h) noexcept;
template io::ostream& operator<<(io::ostream& out, const Handle<HwSync>& h) noexcept;
template io::ostream& operator<<(io::ostream& out, const Handle<HwBufferObject>& h) noexcept;
#endif

View File

@@ -176,9 +176,9 @@ void MetalDriver::setFrameScheduledCallback(Handle<HwSwapChain> sch,
}
void MetalDriver::setFrameCompletedCallback(Handle<HwSwapChain> sch,
FrameCompletedCallback callback, void* user) {
CallbackHandler* handler, CallbackHandler::Callback callback, void* user) {
auto* swapChain = handle_cast<MetalSwapChain>(sch);
swapChain->setFrameCompletedCallback(callback, user);
swapChain->setFrameCompletedCallback(handler, callback, user);
}
void MetalDriver::execute(std::function<void(void)> const& fn) noexcept {
@@ -380,11 +380,6 @@ void MetalDriver::createFenceR(Handle<HwFence> fh, int dummy) {
fence->encode();
}
void MetalDriver::createSyncR(Handle<HwSync> sh, int) {
auto* fence = handle_cast<MetalFence>(sh);
fence->encode();
}
void MetalDriver::createSwapChainR(Handle<HwSwapChain> sch, void* nativeWindow, uint64_t flags) {
if (UTILS_UNLIKELY(flags & SWAP_CHAIN_CONFIG_APPLE_CVPIXELBUFFER)) {
CVPixelBufferRef pixelBuffer = (CVPixelBufferRef) nativeWindow;
@@ -454,12 +449,6 @@ Handle<HwFence> MetalDriver::createFenceS() noexcept {
return alloc_and_construct_handle<MetalFence, HwFence>(*mContext);
}
Handle<HwSync> MetalDriver::createSyncS() noexcept {
// The handle must be constructed here, as a synchronous call to getSyncStatus might happen
// before createSyncR is executed.
return alloc_and_construct_handle<MetalFence, HwSync>(*mContext);
}
Handle<HwSwapChain> MetalDriver::createSwapChainS() noexcept {
return alloc_handle<MetalSwapChain>();
}
@@ -567,13 +556,6 @@ void MetalDriver::destroyTimerQuery(Handle<HwTimerQuery> tqh) {
}
}
void MetalDriver::destroySync(Handle<HwSync> sh) {
if (sh) {
destruct_handle<MetalFence>(sh);
}
}
void MetalDriver::terminate() {
// finish() will flush the pending command buffer and will ensure all GPU work has finished.
// This must be done before calling bufferPool->reset() to ensure no buffers are in flight.
@@ -625,12 +607,12 @@ void MetalDriver::destroyFence(Handle<HwFence> fh) {
}
}
FenceStatus MetalDriver::wait(Handle<HwFence> fh, uint64_t timeout) {
FenceStatus MetalDriver::getFenceStatus(Handle<HwFence> fh) {
auto* fence = handle_cast<MetalFence>(fh);
if (!fence) {
return FenceStatus::ERROR;
}
return fence->wait(timeout);
return fence->wait(0);
}
bool MetalDriver::isTextureFormatSupported(TextureFormat format) {
@@ -714,6 +696,14 @@ bool MetalDriver::isSRGBSwapChainSupported() {
return false;
}
bool MetalDriver::isStereoSupported() {
return true;
}
bool MetalDriver::isParallelShaderCompileSupported() {
return false;
}
bool MetalDriver::isWorkaroundNeeded(Workaround workaround) {
switch (workaround) {
case Workaround::SPLIT_EASU:
@@ -726,6 +716,8 @@ bool MetalDriver::isWorkaroundNeeded(Workaround workaround) {
return mContext->bugs.a8xStaticTextureTargetError;
case Workaround::DISABLE_BLIT_INTO_TEXTURE_ARRAY:
return false;
default:
return false;
}
return false;
}
@@ -841,17 +833,6 @@ bool MetalDriver::getTimerQueryValue(Handle<HwTimerQuery> tqh, uint64_t* elapsed
return mContext->timerQueryImpl->getQueryResult(tq, elapsedTime);
}
SyncStatus MetalDriver::getSyncStatus(Handle<HwSync> sh) {
auto* fence = handle_cast<MetalFence>(sh);
FenceStatus status = fence->wait(0);
if (status == FenceStatus::TIMEOUT_EXPIRED) {
return SyncStatus::NOT_SIGNALED;
} else if (status == FenceStatus::CONDITION_SATISFIED) {
return SyncStatus::SIGNALED;
}
return SyncStatus::ERROR;
}
void MetalDriver::generateMipmaps(Handle<HwTexture> th) {
ASSERT_PRECONDITION(!isInRenderPass(mContext),
"generateMipmaps must be called outside of a render pass.");
@@ -975,8 +956,8 @@ void MetalDriver::updateSamplerGroup(Handle<HwSamplerGroup> sbh, BufferDescripto
scheduleDestroy(std::move(data));
}
void MetalDriver::compilePrograms(CallbackHandler* handler,
CallbackHandler::Callback callback, void* user) {
void MetalDriver::compilePrograms(CompilerPriorityQueue priority,
CallbackHandler* handler, CallbackHandler::Callback callback, void* user) {
if (callback) {
scheduleCallback(handler, user, callback);
}

View File

@@ -70,7 +70,8 @@ public:
void releaseDrawable();
void setFrameScheduledCallback(FrameScheduledCallback callback, void* user);
void setFrameCompletedCallback(FrameCompletedCallback callback, void* user);
void setFrameCompletedCallback(CallbackHandler* handler,
CallbackHandler::Callback callback, void* user);
// For CAMetalLayer-backed SwapChains, presents the drawable or schedules a
// FrameScheduledCallback.
@@ -112,8 +113,11 @@ private:
FrameScheduledCallback frameScheduledCallback = nullptr;
void* frameScheduledUserData = nullptr;
FrameCompletedCallback frameCompletedCallback = nullptr;
void* frameCompletedUserData = nullptr;
struct {
CallbackHandler* handler = nullptr;
CallbackHandler::Callback callback = {};
void* user = nullptr;
} frameCompleted;
};
class MetalBufferObject : public HwBufferObject {
@@ -446,9 +450,7 @@ private:
};
// MetalFence is used to implement both Fences and Syncs.
// There's no diamond problem, because HwBase (superclass of HwFence and HwSync) is empty.
static_assert(std::is_empty_v<HwBase>);
class MetalFence : public HwFence, public HwSync {
class MetalFence : public HwFence {
public:
// MetalFence is special, as it gets constructed on the Filament thread. We must delay inserting

View File

@@ -194,13 +194,15 @@ void MetalSwapChain::setFrameScheduledCallback(FrameScheduledCallback callback,
frameScheduledUserData = user;
}
void MetalSwapChain::setFrameCompletedCallback(FrameCompletedCallback callback, void* user) {
frameCompletedCallback = callback;
frameCompletedUserData = user;
void MetalSwapChain::setFrameCompletedCallback(CallbackHandler* handler,
CallbackHandler::Callback callback, void* user) {
frameCompleted.handler = handler;
frameCompleted.callback = callback;
frameCompleted.user = user;
}
void MetalSwapChain::present() {
if (frameCompletedCallback) {
if (frameCompleted.callback) {
scheduleFrameCompletedCallback();
}
if (drawable) {
@@ -244,30 +246,17 @@ void MetalSwapChain::scheduleFrameScheduledCallback() {
}
void MetalSwapChain::scheduleFrameCompletedCallback() {
if (!frameCompletedCallback) {
if (!frameCompleted.callback) {
return;
}
FrameCompletedCallback callback = frameCompletedCallback;
void* userData = frameCompletedUserData;
[getPendingCommandBuffer(&context) addCompletedHandler:^(id<MTLCommandBuffer> cb) {
struct CallbackData {
void* userData;
FrameCompletedCallback callback;
};
CallbackData* data = new CallbackData();
data->userData = userData;
data->callback = callback;
CallbackHandler* handler = frameCompleted.handler;
void* user = frameCompleted.user;
CallbackHandler::Callback callback = frameCompleted.callback;
// Instantiate a BufferDescriptor with a callback for the sole purpose of passing it to
// scheduleDestroy. This forces the BufferDescriptor callback (and thus the
// FrameCompletedCallback) to be called on the user thread.
BufferDescriptor b(nullptr, 0u, [](void* buffer, size_t size, void* user) {
CallbackData* data = (CallbackData*) user;
data->callback(data->userData);
free(data);
}, data);
context.driver->scheduleDestroy(std::move(b));
MetalDriver* driver = context.driver;
[getPendingCommandBuffer(&context) addCompletedHandler:^(id<MTLCommandBuffer> cb) {
driver->scheduleCallback(handler, user, callback);
}];
}

View File

@@ -34,7 +34,7 @@ namespace filament {
namespace backend {
inline bool operator==(const SamplerParams& lhs, const SamplerParams& rhs) {
return lhs.u == rhs.u;
return SamplerParams::EqualTo{}(lhs, rhs);
}
// Rasterization Bindings

View File

@@ -58,7 +58,7 @@ void NoopDriver::setFrameScheduledCallback(Handle<HwSwapChain> sch,
}
void NoopDriver::setFrameCompletedCallback(Handle<HwSwapChain> sch,
FrameCompletedCallback callback, void* user) {
CallbackHandler* handler, CallbackHandler::Callback callback, void* user) {
}
@@ -107,9 +107,6 @@ void NoopDriver::destroyStream(Handle<HwStream> sh) {
void NoopDriver::destroyTimerQuery(Handle<HwTimerQuery> tqh) {
}
void NoopDriver::destroySync(Handle<HwSync> fh) {
}
Handle<HwStream> NoopDriver::createStreamNative(void* nativeStream) {
return {};
}
@@ -135,7 +132,7 @@ void NoopDriver::updateStreams(CommandStream* driver) {
void NoopDriver::destroyFence(Handle<HwFence> fh) {
}
FenceStatus NoopDriver::wait(Handle<HwFence> fh, uint64_t timeout) {
FenceStatus NoopDriver::getFenceStatus(Handle<HwFence> fh) {
return FenceStatus::CONDITION_SATISFIED;
}
@@ -177,6 +174,14 @@ bool NoopDriver::isSRGBSwapChainSupported() {
return false;
}
bool NoopDriver::isStereoSupported() {
return false;
}
bool NoopDriver::isParallelShaderCompileSupported() {
return false;
}
bool NoopDriver::isWorkaroundNeeded(Workaround) {
return false;
}
@@ -236,10 +241,6 @@ bool NoopDriver::getTimerQueryValue(Handle<HwTimerQuery> tqh, uint64_t* elapsedT
return false;
}
SyncStatus NoopDriver::getSyncStatus(Handle<HwSync> sh) {
return SyncStatus::SIGNALED;
}
void NoopDriver::setExternalImage(Handle<HwTexture> th, void* image) {
}
@@ -260,8 +261,8 @@ void NoopDriver::updateSamplerGroup(Handle<HwSamplerGroup> sbh,
scheduleDestroy(std::move(data));
}
void NoopDriver::compilePrograms(CallbackHandler* handler,
CallbackHandler::Callback callback, void* user) {
void NoopDriver::compilePrograms(CompilerPriorityQueue priority,
CallbackHandler* handler, CallbackHandler::Callback callback, void* user) {
if (callback) {
scheduleCallback(handler, user, callback);
}

View File

@@ -0,0 +1,69 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "CallbackManager.h"
#include "DriverBase.h"
namespace filament::backend {
CallbackManager::CallbackManager(DriverBase& driver) noexcept
: mDriver(driver), mCallbacks(1) {
}
CallbackManager::~CallbackManager() noexcept = default;
void CallbackManager::terminate() noexcept {
for (auto&& item: mCallbacks) {
if (item.func) {
mDriver.scheduleCallback(
item.handler, item.user, item.func);
}
}
}
CallbackManager::Handle CallbackManager::get() const noexcept {
Container::const_iterator const curr = getCurrent();
curr->count.fetch_add(1);
return curr;
}
void CallbackManager::put(Handle& curr) noexcept {
if (curr->count.fetch_sub(1) == 1) {
if (curr->func) {
mDriver.scheduleCallback(
curr->handler, curr->user, curr->func);
destroySlot(curr);
}
}
curr = {};
}
void CallbackManager::setCallback(
CallbackHandler* handler, CallbackHandler::Callback func, void* user) {
assert_invariant(func);
Container::iterator const curr = allocateNewSlot();
curr->handler = handler;
curr->func = func;
curr->user = user;
if (curr->count == 0) {
mDriver.scheduleCallback(
curr->handler, curr->user, curr->func);
destroySlot(curr);
}
}
} // namespace filament::backend
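The `get` / `put` / `setCallback` protocol above can be illustrated with a reduced model: each slot counts its outstanding conditions, `setCallback` closes the current slot and opens a new one, and the callback runs once the closed slot's count reaches zero. The sketch below is an assumption-laden simplification: it guards everything with one mutex instead of atomics, and it invokes the callback directly instead of going through `DriverBase::scheduleCallback`. `MiniCallbackManager` is a hypothetical name.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <list>
#include <mutex>
#include <utility>

// Hypothetical, simplified model of the condition-counting scheme.
class MiniCallbackManager {
    struct Slot {
        int count = 0;                 // conditions created but not yet met
        std::function<void()> func;    // empty until setCallback() closes the slot
    };
    using Container = std::list<Slot>;

public:
    using Handle = Container::iterator;

    MiniCallbackManager() : mSlots(1) {}

    // Creates a condition on the newest (still open) slot.
    Handle get() {
        std::lock_guard<std::mutex> lock(mLock);
        Handle curr = std::prev(mSlots.end());
        curr->count++;
        return curr;
    }

    // Marks a condition as met; runs the slot's callback if this was the last one.
    void put(Handle h) {
        std::function<void()> ready;
        {
            std::lock_guard<std::mutex> lock(mLock);
            if (--h->count == 0 && h->func) {
                ready = std::move(h->func);
                mSlots.erase(h);
            }
        }
        if (ready) ready();
    }

    // Closes the current slot; later get() calls land in a fresh slot.
    void setCallback(std::function<void()> func) {
        std::function<void()> ready;
        {
            std::lock_guard<std::mutex> lock(mLock);
            Handle curr = std::prev(mSlots.end());
            mSlots.emplace_back();
            if (curr->count == 0) {
                ready = std::move(func);   // nothing pending: run right away
                mSlots.erase(curr);
            } else {
                curr->func = std::move(func);
            }
        }
        if (ready) ready();
    }

private:
    std::mutex mLock;
    Container mSlots;
};
```

A `std::list` is used because its iterators stay valid when other elements are inserted or erased, which is what allows a handle to be held across calls.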

View File

@@ -0,0 +1,98 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef TNT_FILAMENT_BACKEND_OPENGL_CALLBACKMANAGER_H
#define TNT_FILAMENT_BACKEND_OPENGL_CALLBACKMANAGER_H
#include <backend/CallbackHandler.h>
#include <utils/Mutex.h>
#include <atomic>
#include <mutex>
#include <list>
namespace filament::backend {
class DriverBase;
class CallbackHandler;
/*
* CallbackManager schedules user callbacks once all previous conditions are met.
* A "Condition" is created by calling "get" and is met by calling "put". These
* are typically called from different threads.
* The callback is specified with "setCallback", which atomically creates a new set of
* conditions to be met.
*/
class CallbackManager {
struct Callback {
mutable std::atomic_int count{};
CallbackHandler* handler = nullptr;
CallbackHandler::Callback func = {};
void* user = nullptr;
};
using Container = std::list<Callback>;
public:
using Handle = Container::const_iterator;
explicit CallbackManager(DriverBase& driver) noexcept;
~CallbackManager() noexcept;
// Calls all the pending callbacks regardless of remaining conditions to be met. This is to
// avoid leaking resources for instance. It also doesn't matter if the conditions are met
// because we're shutting down.
void terminate() noexcept;
// Creates a condition and returns a handle for it.
Handle get() const noexcept;
// Announces the specified condition is met. If a callback was specified and all conditions
// prior to setting the callback are met, the callback is scheduled.
void put(Handle& curr) noexcept;
// Sets a callback to be called when all previously created (get) conditions are met (put).
// If there were no conditions created, or they're all already met, the callback is scheduled
// immediately.
void setCallback(CallbackHandler* handler, CallbackHandler::Callback func, void* user);
private:
Container::const_iterator getCurrent() const noexcept {
std::lock_guard const lock(mLock);
return --mCallbacks.end();
}
Container::iterator allocateNewSlot() noexcept {
std::lock_guard const lock(mLock);
auto curr = --mCallbacks.end();
mCallbacks.emplace_back();
return curr;
}
void destroySlot(Container::const_iterator curr) noexcept {
std::lock_guard const lock(mLock);
mCallbacks.erase(curr);
}
DriverBase& mDriver;
mutable utils::Mutex mLock;
Container mCallbacks;
};
} // namespace filament::backend
#endif // TNT_FILAMENT_BACKEND_OPENGL_CALLBACKMANAGER_H
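
The get/put/setCallback contract described in the header comment can be illustrated with a minimal sketch. This is not the Filament implementation: `MiniCallbackManager`, `Slot` and `fireAndDestroy` are hypothetical names, the callback is a `std::function` that runs inline rather than being passed to `DriverBase::scheduleCallback`, and the usage shown is single-threaded.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <iterator>
#include <list>
#include <mutex>
#include <utility>

// Hypothetical stand-in for CallbackManager. Callbacks run inline here,
// whereas the real class hands them to the driver for scheduling.
class MiniCallbackManager {
    struct Slot {
        mutable std::atomic_int count{0};   // conditions not yet met
        std::function<void()> func;         // empty until setCallback()
    };
    using Container = std::list<Slot>;

public:
    using Handle = Container::const_iterator;

    // Start with one open slot, like mCallbacks(1) in the original.
    MiniCallbackManager() : mSlots(1) {}

    // Creates a condition on the current (last) slot.
    Handle get() const {
        std::lock_guard<std::mutex> const lock(mLock);
        Handle const curr = std::prev(mSlots.end());
        curr->count.fetch_add(1);
        return curr;
    }

    // Marks a condition as met. If it was the last one for its slot and a
    // callback has been set, the callback runs and the slot is destroyed.
    void put(Handle& curr) {
        if (curr->count.fetch_sub(1) == 1 && curr->func) {
            fireAndDestroy(curr);
        }
        curr = {};
    }

    // Attaches a callback to the current slot and opens a fresh slot for
    // later conditions. Fires immediately if nothing is pending.
    void setCallback(std::function<void()> func) {
        assert(func != nullptr);
        Container::iterator curr;
        {
            std::lock_guard<std::mutex> const lock(mLock);
            curr = std::prev(mSlots.end());
            mSlots.emplace_back();
        }
        curr->func = std::move(func);
        if (curr->count == 0) {
            fireAndDestroy(curr);
        }
    }

private:
    void fireAndDestroy(Handle curr) {
        // Copy the callback out before erasing the slot that owns it.
        std::function<void()> const func = curr->func;
        {
            std::lock_guard<std::mutex> const lock(mLock);
            mSlots.erase(curr);
        }
        func();
    }

    mutable std::mutex mLock;
    Container mSlots;
};
```

A `std::list` is used in the sketch for the same reason it appears in the header: inserting or erasing other slots does not invalidate the iterators that outstanding handles hold.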


@@ -49,6 +49,7 @@ bool OpenGLContext::queryOpenGLVersion(GLint* major, GLint* minor) noexcept {
}
OpenGLContext::OpenGLContext() noexcept {
state.vao.p = &mDefaultVAO;
// These queries work with all GL/GLES versions!
@@ -61,264 +62,74 @@ OpenGLContext::OpenGLContext() noexcept {
"[" << state.version << "], [" << state.shader << "]" << io::endl;
/*
* Figure out GL / GLES version and available features
* Figure out GL / GLES version, extensions and capabilities we need to
* determine the feature level
*/
queryOpenGLVersion(&state.major, &state.minor);
glGetIntegerv(GL_MAX_RENDERBUFFER_SIZE, &gets.max_renderbuffer_size);
glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS, &gets.max_texture_image_units);
glGetIntegerv(GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS, &gets.max_combined_texture_image_units);
OpenGLContext::initExtensions(&ext, state.major, state.minor);
if (state.major > 2) { // this check works for both GL and GLES, but is intended for GLES
OpenGLContext::initProcs(&procs, ext, state.major, state.minor);
OpenGLContext::initBugs(&bugs, ext, state.major, state.minor,
state.vendor, state.renderer, state.version, state.shader);
glGetIntegerv(GL_MAX_RENDERBUFFER_SIZE, &gets.max_renderbuffer_size);
glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS, &gets.max_texture_image_units);
glGetIntegerv(GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS, &gets.max_combined_texture_image_units);
mFeatureLevel = OpenGLContext::resolveFeatureLevel(state.major, state.minor, ext, gets, bugs);
#ifdef BACKEND_OPENGL_VERSION_GLES
mShaderModel = ShaderModel::MOBILE;
#else
mShaderModel = ShaderModel::DESKTOP;
#endif
#ifdef BACKEND_OPENGL_VERSION_GLES
if (mFeatureLevel >= FeatureLevel::FEATURE_LEVEL_2) {
features.multisample_texture = true;
}
#else
if (mFeatureLevel >= FeatureLevel::FEATURE_LEVEL_1) {
features.multisample_texture = true;
}
#endif
if (mFeatureLevel >= FeatureLevel::FEATURE_LEVEL_1) {
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, &gets.max_uniform_block_size);
glGetIntegerv(GL_MAX_UNIFORM_BUFFER_BINDINGS, &gets.max_uniform_buffer_bindings);
glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, &gets.uniform_buffer_offset_alignment);
glGetIntegerv(GL_MAX_SAMPLES, &gets.max_samples);
glGetIntegerv(GL_MAX_DRAW_BUFFERS, &gets.max_draw_buffers);
glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE,
&gets.max_uniform_block_size);
glGetIntegerv(GL_MAX_UNIFORM_BUFFER_BINDINGS,
&gets.max_uniform_buffer_bindings);
glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT,
&gets.uniform_buffer_offset_alignment);
glGetIntegerv(GL_MAX_SAMPLES,
&gets.max_samples);
glGetIntegerv(GL_MAX_DRAW_BUFFERS,
&gets.max_draw_buffers);
glGetIntegerv(GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS,
&gets.max_transform_feedback_separate_attribs);
#ifdef GL_EXT_texture_filter_anisotropic
if (ext.EXT_texture_filter_anisotropic) {
glGetFloatv(GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, &gets.max_anisotropy);
}
#endif
} else {
#endif
}
#ifdef BACKEND_OPENGL_VERSION_GLES
else {
gets.max_uniform_block_size = 0;
gets.max_uniform_buffer_bindings = 0;
gets.uniform_buffer_offset_alignment = 0;
gets.max_samples = 1;
gets.max_draw_buffers = 1;
gets.max_transform_feedback_separate_attribs = 0;
}
constexpr auto const caps3 = FEATURE_LEVEL_CAPS[+FeatureLevel::FEATURE_LEVEL_3];
constexpr GLint MAX_VERTEX_SAMPLER_COUNT = caps3.MAX_VERTEX_SAMPLER_COUNT;
constexpr GLint MAX_FRAGMENT_SAMPLER_COUNT = caps3.MAX_FRAGMENT_SAMPLER_COUNT;
// default procs that can be overridden based on runtime version
#ifdef BACKEND_OPENGL_LEVEL_GLES30
procs.genVertexArrays = glGenVertexArrays;
procs.bindVertexArray = glBindVertexArray;
procs.deleteVertexArrays = glDeleteVertexArrays;
// these are core in GL and GLES 3.x
procs.genQueries = glGenQueries;
procs.deleteQueries = glDeleteQueries;
procs.beginQuery = glBeginQuery;
procs.endQuery = glEndQuery;
procs.getQueryObjectuiv = glGetQueryObjectuiv;
# ifdef BACKEND_OPENGL_VERSION_GL
procs.getQueryObjectui64v = glGetQueryObjectui64v; // only core in GL
# elif defined(GL_EXT_disjoint_timer_query)
procs.getQueryObjectui64v = glGetQueryObjectui64vEXT;
# endif // BACKEND_OPENGL_VERSION_GL
// core in ES 3.0 and GL 4.3
procs.invalidateFramebuffer = glInvalidateFramebuffer;
#endif // BACKEND_OPENGL_LEVEL_GLES30
// no-op if not supported
procs.maxShaderCompilerThreadsKHR = +[](GLuint) {};
#ifdef BACKEND_OPENGL_VERSION_GLES
initExtensionsGLES();
if (state.major == 3) {
// Runtime OpenGL version is ES 3.x
assert_invariant(gets.max_texture_image_units >= 16);
assert_invariant(gets.max_combined_texture_image_units >= 32);
if (state.minor >= 1) {
features.multisample_texture = true;
// figure out our feature level
if (ext.EXT_texture_cube_map_array) {
mFeatureLevel = FeatureLevel::FEATURE_LEVEL_2;
if (gets.max_texture_image_units >= MAX_FRAGMENT_SAMPLER_COUNT &&
gets.max_combined_texture_image_units >=
(MAX_FRAGMENT_SAMPLER_COUNT + MAX_VERTEX_SAMPLER_COUNT)) {
mFeatureLevel = FeatureLevel::FEATURE_LEVEL_3;
}
}
}
}
#ifndef IOS // IOS is guaranteed to have ES3.x
else if (UTILS_UNLIKELY(state.major == 2)) {
// Runtime OpenGL version is ES 2.x
#if defined(BACKEND_OPENGL_LEVEL_GLES30)
// mandatory extensions (all supported by Mali-400 and Adreno 304)
assert_invariant(ext.OES_depth_texture);
assert_invariant(ext.OES_depth24);
assert_invariant(ext.OES_packed_depth_stencil);
assert_invariant(ext.OES_rgb8_rgba8);
assert_invariant(ext.OES_standard_derivatives);
assert_invariant(ext.OES_texture_npot);
#endif
if (UTILS_LIKELY(ext.OES_vertex_array_object)) {
procs.genVertexArrays = glGenVertexArraysOES;
procs.bindVertexArray = glBindVertexArrayOES;
procs.deleteVertexArrays = glDeleteVertexArraysOES;
} else {
// if we don't have OES_vertex_array_object, just don't do anything with real VAOs,
// we'll just rebind everything each time. Most Mali-400 support this extension, but
// a few don't.
procs.genVertexArrays = +[](GLsizei, GLuint*) {};
procs.bindVertexArray = +[](GLuint) {};
procs.deleteVertexArrays = +[](GLsizei, GLuint const*) {};
// we activate this workaround path, which does the reset of array buffer
bugs.vao_doesnt_store_element_array_buffer_binding = true;
}
// EXT_disjoint_timer_query is optional -- pointers will be null if not available
procs.genQueries = glGenQueriesEXT;
procs.deleteQueries = glDeleteQueriesEXT;
procs.beginQuery = glBeginQueryEXT;
procs.endQuery = glEndQueryEXT;
procs.getQueryObjectuiv = glGetQueryObjectuivEXT;
procs.getQueryObjectui64v = glGetQueryObjectui64vEXT;
procs.invalidateFramebuffer = glDiscardFramebufferEXT;
procs.maxShaderCompilerThreadsKHR = glMaxShaderCompilerThreadsKHR;
mFeatureLevel = FeatureLevel::FEATURE_LEVEL_0;
}
#endif // IOS
#else
initExtensionsGL();
if (state.major == 4) {
assert_invariant(state.minor >= 1);
mShaderModel = ShaderModel::DESKTOP;
if (state.minor >= 3) {
// cubemap arrays are available as of OpenGL 4.0
mFeatureLevel = FeatureLevel::FEATURE_LEVEL_2;
// figure out our feature level
if (gets.max_texture_image_units >= MAX_FRAGMENT_SAMPLER_COUNT &&
gets.max_combined_texture_image_units >=
(MAX_FRAGMENT_SAMPLER_COUNT + MAX_VERTEX_SAMPLER_COUNT)) {
mFeatureLevel = FeatureLevel::FEATURE_LEVEL_3;
}
}
features.multisample_texture = true;
}
// feedback loops are allowed on GL desktop as long as writes are disabled
bugs.allow_read_only_ancillary_feedback_loop = true;
assert_invariant(gets.max_texture_image_units >= 16);
assert_invariant(gets.max_combined_texture_image_units >= 32);
procs.maxShaderCompilerThreadsKHR = glMaxShaderCompilerThreadsARB;
#endif
#ifdef GL_EXT_texture_filter_anisotropic
if (ext.EXT_texture_filter_anisotropic) {
glGetFloatv(GL_MAX_TEXTURE_MAX_ANISOTROPY_EXT, &gets.max_anisotropy);
gets.max_anisotropy = 1;
}
#endif
/*
* Figure out which driver bugs we need to work around
*/
const bool isAngle = strstr(state.renderer, "ANGLE");
if (!isAngle) {
if (strstr(state.renderer, "Adreno")) {
// Qualcomm GPU
bugs.invalidate_end_only_if_invalidate_start = true;
// On Adreno (as of 3/20) timer queries seem to return the CPU time, not the GPU time.
bugs.dont_use_timer_query = true;
// Blits to texture arrays are failing
// This bug continues to reproduce, though at times we've seen it appear to "go away".
// The standalone sample app that was written to show this problem still reproduces.
// The working hypothesis is that some other state affects this behavior.
bugs.disable_blit_into_texture_array = true;
// early exit condition is flattened in EASU code
bugs.split_easu = true;
// initialize the non-used uniform array for Adreno drivers.
bugs.enable_initialize_non_used_uniform_array = true;
int maj, min, driverMajor, driverMinor;
int const c = sscanf(state.version, "OpenGL ES %d.%d V@%d.%d", // NOLINT(cert-err34-c)
&maj, &min, &driverMajor, &driverMinor);
if (c == 4) {
// Workarounds based on version here.
// notes:
// bugs.invalidate_end_only_if_invalidate_start
// - appeared at least in
// "OpenGL ES 3.2 V@0490.0 (GIT@85da404, I46ff5fc46f, 1606794520) (Date:11/30/20)"
// - wasn't present in
// "OpenGL ES 3.2 V@0490.0 (GIT@0905e9f, Ia11ce2d146, 1599072951) (Date:09/02/20)"
// - has been confirmed fixed in V@570.1 by Qualcomm
if (driverMajor < 490 || driverMajor > 570 ||
(driverMajor == 570 && driverMinor >= 1)) {
bugs.invalidate_end_only_if_invalidate_start = false;
}
}
// qualcomm seems to have no problem with this (which is good for us)
bugs.allow_read_only_ancillary_feedback_loop = true;
} else if (strstr(state.renderer, "Mali")) {
// ARM GPU
bugs.vao_doesnt_store_element_array_buffer_binding = true;
if (strstr(state.renderer, "Mali-T")) {
bugs.disable_glFlush = true;
bugs.disable_shared_context_draws = true;
bugs.texture_external_needs_rebind = true;
// We have not verified that timer queries work on Mali-T, so we disable to be safe.
bugs.dont_use_timer_query = true;
}
if (strstr(state.renderer, "Mali-G")) {
// assume we don't have working timer queries
bugs.dont_use_timer_query = true;
int maj, min, driverVersion, driverRevision, driverPatch;
int const c = sscanf(state.version, "OpenGL ES %d.%d v%d.r%dp%d", // NOLINT(cert-err34-c)
&maj, &min, &driverVersion, &driverRevision, &driverPatch);
if (c == 5) {
// Workarounds based on version here.
// notes:
// bugs.dont_use_timer_query : on some Mali-Gxx drivers timer query seems
// to cause memory corruptions in some cases on some devices (see b/233754398).
// - appeared at least in
// "OpenGL ES 3.2 v1.r26p0-01eac0"
// - wasn't present in
// "OpenGL ES 3.2 v1.r32p1-00pxl1"
if (driverVersion >= 2 || (driverVersion == 1 && driverRevision >= 32)) {
bugs.dont_use_timer_query = false;
}
}
}
// Mali seems to have no problem with this (which is good for us)
bugs.allow_read_only_ancillary_feedback_loop = true;
} else if (strstr(state.renderer, "Intel")) {
// Intel GPU
bugs.vao_doesnt_store_element_array_buffer_binding = true;
} else if (strstr(state.renderer, "PowerVR")) {
// PowerVR GPU
} else if (strstr(state.renderer, "Apple")) {
// Apple GPU
} else if (strstr(state.renderer, "Tegra") ||
strstr(state.renderer, "GeForce") ||
strstr(state.renderer, "NV")) {
// NVIDIA GPU
} else if (strstr(state.renderer, "Vivante")) {
// Vivante GPU
} else if (strstr(state.renderer, "AMD") ||
strstr(state.renderer, "ATI")) {
// AMD/ATI GPU
} else if (strstr(state.renderer, "Mozilla")) {
bugs.disable_invalidate_framebuffer = true;
}
} else {
// When running under ANGLE, we need a different set of workarounds.
if (strstr(state.renderer, "Adreno")) {
// Qualcomm GPU
// early exit condition is flattened in EASU code
// (that should be regardless of ANGLE, but we should double-check)
bugs.split_easu = true;
}
// TODO: see if we could use `bugs.allow_read_only_ancillary_feedback_loop = true`
}
slog.v << "Feature level: " << +mFeatureLevel << '\n';
slog.v << "Active workarounds: " << '\n';
@@ -344,14 +155,14 @@ OpenGLContext::OpenGLContext() noexcept {
#endif
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
assert_invariant(state.major <= 2 || gets.max_draw_buffers >= 4); // minspec
assert_invariant(mFeatureLevel == FeatureLevel::FEATURE_LEVEL_0 || gets.max_draw_buffers >= 4); // minspec
#endif
setDefaultState();
#ifdef GL_EXT_texture_filter_anisotropic
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
if (state.major > 2 && ext.EXT_texture_filter_anisotropic) {
if (mFeatureLevel >= FeatureLevel::FEATURE_LEVEL_1 && ext.EXT_texture_filter_anisotropic) {
// make sure we don't have any error flag
while (glGetError() != GL_NO_ERROR) { }
@@ -451,11 +262,293 @@ void OpenGLContext::setDefaultState() noexcept {
glClipControlEXT(GL_LOWER_LEFT_EXT, GL_ZERO_TO_ONE_EXT);
#endif
}
if (ext.EXT_clip_cull_distance) {
glEnable(GL_CLIP_DISTANCE0);
}
}
void OpenGLContext::initProcs(Procs* procs,
Extensions const& ext, GLint major, GLint) noexcept {
(void)ext;
(void)major;
// default procs that can be overridden based on runtime version
#ifdef BACKEND_OPENGL_LEVEL_GLES30
procs->genVertexArrays = glGenVertexArrays;
procs->bindVertexArray = glBindVertexArray;
procs->deleteVertexArrays = glDeleteVertexArrays;
// these are core in GL and GLES 3.x
procs->genQueries = glGenQueries;
procs->deleteQueries = glDeleteQueries;
procs->beginQuery = glBeginQuery;
procs->endQuery = glEndQuery;
procs->getQueryObjectuiv = glGetQueryObjectuiv;
# ifdef BACKEND_OPENGL_VERSION_GL
procs->getQueryObjectui64v = glGetQueryObjectui64v; // only core in GL
# elif defined(GL_EXT_disjoint_timer_query)
procs->getQueryObjectui64v = glGetQueryObjectui64vEXT;
# endif // BACKEND_OPENGL_VERSION_GL
// core in ES 3.0 and GL 4.3
procs->invalidateFramebuffer = glInvalidateFramebuffer;
#endif // BACKEND_OPENGL_LEVEL_GLES30
// no-op if not supported
procs->maxShaderCompilerThreadsKHR = +[](GLuint) {};
#ifdef BACKEND_OPENGL_VERSION_GLES
# ifndef IOS // IOS is guaranteed to have ES3.x
if (UTILS_UNLIKELY(major == 2)) {
// Runtime OpenGL version is ES 2.x
if (UTILS_LIKELY(ext.OES_vertex_array_object)) {
procs->genVertexArrays = glGenVertexArraysOES;
procs->bindVertexArray = glBindVertexArrayOES;
procs->deleteVertexArrays = glDeleteVertexArraysOES;
} else {
// if we don't have OES_vertex_array_object, just don't do anything with real VAOs,
// we'll just rebind everything each time. Most Mali-400 support this extension, but
// a few don't.
procs->genVertexArrays = +[](GLsizei, GLuint*) {};
procs->bindVertexArray = +[](GLuint) {};
procs->deleteVertexArrays = +[](GLsizei, GLuint const*) {};
}
// EXT_disjoint_timer_query is optional -- pointers will be null if not available
procs->genQueries = glGenQueriesEXT;
procs->deleteQueries = glDeleteQueriesEXT;
procs->beginQuery = glBeginQueryEXT;
procs->endQuery = glEndQueryEXT;
procs->getQueryObjectuiv = glGetQueryObjectuivEXT;
procs->getQueryObjectui64v = glGetQueryObjectui64vEXT;
procs->invalidateFramebuffer = glDiscardFramebufferEXT;
procs->maxShaderCompilerThreadsKHR = glMaxShaderCompilerThreadsKHR;
}
# endif // IOS
#else
procs->maxShaderCompilerThreadsKHR = glMaxShaderCompilerThreadsARB;
#endif
}
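
The fallback used in `initProcs` for a missing `OES_vertex_array_object` can be sketched in isolation: each proc-table entry is a plain function pointer, and a captureless lambda prefixed with unary `+` converts to such a pointer, which gives a no-op implementation. The `Procs` struct, `realBindVertexArray` and the call counter below are hypothetical stand-ins, not the real GL entry points.

```cpp
#include <cassert>

// Hypothetical proc table with a single entry; the real table stores GL
// entry points such as glBindVertexArray.
struct Procs {
    void (*bindVertexArray)(unsigned) = nullptr;
};

// Stand-in for a real driver entry point; it only counts its calls.
static int gBindCalls = 0;
static void realBindVertexArray(unsigned) { ++gBindCalls; }

// Picks the real entry point when the extension is present, otherwise a
// no-op. Unary plus forces the captureless lambda to decay to a function
// pointer, so both branches assign the same type.
void initProcs(Procs* procs, bool hasVertexArrayObject) {
    if (hasVertexArrayObject) {
        procs->bindVertexArray = realBindVertexArray;
    } else {
        procs->bindVertexArray = +[](unsigned) {};
    }
}
```

With this shape, callers can invoke `procs.bindVertexArray(...)` unconditionally and never need a null check at the call site.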
void OpenGLContext::initBugs(Bugs* bugs, Extensions const& exts,
GLint major, GLint minor,
char const* vendor,
char const* renderer,
char const* version,
char const* shader) {
(void)major;
(void)minor;
(void)vendor;
(void)renderer;
(void)version;
(void)shader;
const bool isAngle = strstr(renderer, "ANGLE");
if (!isAngle) {
if (strstr(renderer, "Adreno")) {
// Qualcomm GPU
bugs->invalidate_end_only_if_invalidate_start = true;
// On Adreno (as of 3/20) timer queries seem to return the CPU time, not the GPU time.
bugs->dont_use_timer_query = true;
// Blits to texture arrays are failing
// This bug continues to reproduce, though at times we've seen it appear to "go away".
// The standalone sample app that was written to show this problem still reproduces.
// The working hypothesis is that some other state affects this behavior.
bugs->disable_blit_into_texture_array = true;
// early exit condition is flattened in EASU code
bugs->split_easu = true;
// initialize the non-used uniform array for Adreno drivers.
bugs->enable_initialize_non_used_uniform_array = true;
int maj, min, driverMajor, driverMinor;
int const c = sscanf(version, "OpenGL ES %d.%d V@%d.%d", // NOLINT(cert-err34-c)
&maj, &min, &driverMajor, &driverMinor);
if (c == 4) {
// Workarounds based on version here.
// Notes:
// bugs.invalidate_end_only_if_invalidate_start
// - appeared at least in
// "OpenGL ES 3.2 V@0490.0 (GIT@85da404, I46ff5fc46f, 1606794520) (Date:11/30/20)"
// - wasn't present in
// "OpenGL ES 3.2 V@0490.0 (GIT@0905e9f, Ia11ce2d146, 1599072951) (Date:09/02/20)"
// - has been confirmed fixed in V@570.1 by Qualcomm
if (driverMajor < 490 || driverMajor > 570 ||
(driverMajor == 570 && driverMinor >= 1)) {
bugs->invalidate_end_only_if_invalidate_start = false;
}
}
// qualcomm seems to have no problem with this (which is good for us)
bugs->allow_read_only_ancillary_feedback_loop = true;
// Older Adreno devices that support ES3.0 only tend to be extremely buggy, so we
// fall back to ES2.0.
if (major == 3 && minor == 0) {
bugs->force_feature_level0 = true;
}
} else if (strstr(renderer, "Mali")) {
// ARM GPU
bugs->vao_doesnt_store_element_array_buffer_binding = true;
if (strstr(renderer, "Mali-T")) {
bugs->disable_glFlush = true;
bugs->disable_shared_context_draws = true;
bugs->texture_external_needs_rebind = true;
// We have not verified that timer queries work on Mali-T, so we disable to be safe.
bugs->dont_use_timer_query = true;
}
if (strstr(renderer, "Mali-G")) {
// We have run into several problems with timer queries on Mali-Gxx:
// - timer queries seem to cause memory corruptions in some cases on some devices
// (see b/233754398)
// - appeared at least in: "OpenGL ES 3.2 v1.r26p0-01eac0"
// - wasn't present in: "OpenGL ES 3.2 v1.r32p1-00pxl1"
// - timer queries sometimes crash with an NPE (see b/273759031)
bugs->dont_use_timer_query = true;
}
// Mali seems to have no problem with this (which is good for us)
bugs->allow_read_only_ancillary_feedback_loop = true;
} else if (strstr(renderer, "Intel")) {
// Intel GPU
bugs->vao_doesnt_store_element_array_buffer_binding = true;
} else if (strstr(renderer, "PowerVR")) {
// PowerVR GPU
// On PowerVR (Rogue GE8320) glFlush doesn't seem to do anything, in particular,
// it doesn't kick the GPU earlier, so don't issue these calls as they seem to slow
// things down.
bugs->disable_glFlush = true;
// On PowerVR (Rogue GE8320) using gl_InstanceID too early in the shader doesn't work.
bugs->powervr_shader_workarounds = true;
// On PowerVR (Rogue GE8320) destroying a fbo after glBlitFramebuffer is effectively
// equivalent to glFinish.
bugs->delay_fbo_destruction = true;
// PowerVR seems to have no problem with this (which is good for us)
bugs->allow_read_only_ancillary_feedback_loop = true;
// PowerVR has a shader compiler thread pinned on the last core
bugs->disable_thread_affinity = true;
} else if (strstr(renderer, "Apple")) {
// Apple GPU
} else if (strstr(renderer, "Tegra") ||
strstr(renderer, "GeForce") ||
strstr(renderer, "NV")) {
// NVIDIA GPU
} else if (strstr(renderer, "Vivante")) {
// Vivante GPU
} else if (strstr(renderer, "AMD") ||
strstr(renderer, "ATI")) {
// AMD/ATI GPU
} else if (strstr(renderer, "Mozilla")) {
bugs->disable_invalidate_framebuffer = true;
}
} else {
// When running under ANGLE, we need a different set of workarounds.
if (strstr(renderer, "Adreno")) {
// Qualcomm GPU
// early exit condition is flattened in EASU code
// (that should be regardless of ANGLE, but we should double-check)
bugs->split_easu = true;
}
// TODO: see if we could use `bugs.allow_read_only_ancillary_feedback_loop = true`
}
#ifdef BACKEND_OPENGL_VERSION_GLES
# ifndef IOS // IOS is guaranteed to have ES3.x
if (UTILS_UNLIKELY(major == 2)) {
if (UTILS_UNLIKELY(!exts.OES_vertex_array_object)) {
// we activate this workaround path, which does the reset of array buffer
bugs->vao_doesnt_store_element_array_buffer_binding = true;
}
}
# endif // IOS
#else
// feedback loops are allowed on GL desktop as long as writes are disabled
bugs->allow_read_only_ancillary_feedback_loop = true;
#endif
}
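
The Adreno version gate in `initBugs` can be read as a small standalone function. The sketch below reuses the `sscanf` format string and the version bounds from the code above; the function name `needsInvalidateWorkaround` is hypothetical, and the workaround stays enabled whenever the version string does not parse, which mirrors the default set before the parse.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: returns true when the
// invalidate_end_only_if_invalidate_start workaround should stay enabled
// for the given GL_VERSION string of an Adreno driver.
bool needsInvalidateWorkaround(char const* version) {
    int maj = 0, min = 0, driverMajor = 0, driverMinor = 0;
    int const c = std::sscanf(version, "OpenGL ES %d.%d V@%d.%d",
            &maj, &min, &driverMajor, &driverMinor);
    if (c != 4) {
        // Unknown format: keep the conservative default.
        return true;
    }
    // Affected range taken from the notes above: drivers from V@490 up to,
    // but not including, V@570.1.
    bool const outsideAffectedRange =
            driverMajor < 490 || driverMajor > 570 ||
            (driverMajor == 570 && driverMinor >= 1);
    return !outsideAffectedRange;
}
```

Note that `%d` reads `0490` as decimal 490; `%i` would treat the leading zero as an octal prefix and stop at the `9`.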
FeatureLevel OpenGLContext::resolveFeatureLevel(GLint major, GLint minor,
Extensions const& exts,
Gets const& gets,
Bugs const& bugs) noexcept {
constexpr auto const caps3 = FEATURE_LEVEL_CAPS[+FeatureLevel::FEATURE_LEVEL_3];
constexpr GLint MAX_VERTEX_SAMPLER_COUNT = caps3.MAX_VERTEX_SAMPLER_COUNT;
constexpr GLint MAX_FRAGMENT_SAMPLER_COUNT = caps3.MAX_FRAGMENT_SAMPLER_COUNT;
(void)exts;
(void)gets;
(void)bugs;
FeatureLevel featureLevel = FeatureLevel::FEATURE_LEVEL_1;
#ifdef BACKEND_OPENGL_VERSION_GLES
if (major == 3) {
// Runtime OpenGL version is ES 3.x
assert_invariant(gets.max_texture_image_units >= 16);
assert_invariant(gets.max_combined_texture_image_units >= 32);
if (minor >= 1) {
// figure out our feature level
if (exts.EXT_texture_cube_map_array) {
featureLevel = FeatureLevel::FEATURE_LEVEL_2;
if (gets.max_texture_image_units >= MAX_FRAGMENT_SAMPLER_COUNT &&
gets.max_combined_texture_image_units >=
(MAX_FRAGMENT_SAMPLER_COUNT + MAX_VERTEX_SAMPLER_COUNT)) {
featureLevel = FeatureLevel::FEATURE_LEVEL_3;
}
}
}
}
# ifndef IOS // IOS is guaranteed to have ES3.x
else if (UTILS_UNLIKELY(major == 2)) {
// Runtime OpenGL version is ES 2.x
# if defined(BACKEND_OPENGL_LEVEL_GLES30)
// mandatory extensions (all supported by Mali-400 and Adreno 304)
assert_invariant(exts.OES_depth_texture);
assert_invariant(exts.OES_depth24);
assert_invariant(exts.OES_packed_depth_stencil);
assert_invariant(exts.OES_rgb8_rgba8);
assert_invariant(exts.OES_standard_derivatives);
assert_invariant(exts.OES_texture_npot);
# endif
featureLevel = FeatureLevel::FEATURE_LEVEL_0;
}
# endif // IOS
#else
assert_invariant(gets.max_texture_image_units >= 16);
assert_invariant(gets.max_combined_texture_image_units >= 32);
if (major == 4) {
assert_invariant(minor >= 1);
if (minor >= 3) {
// cubemap arrays are available as of OpenGL 4.0
featureLevel = FeatureLevel::FEATURE_LEVEL_2;
// figure out our feature level
if (gets.max_texture_image_units >= MAX_FRAGMENT_SAMPLER_COUNT &&
gets.max_combined_texture_image_units >=
(MAX_FRAGMENT_SAMPLER_COUNT + MAX_VERTEX_SAMPLER_COUNT)) {
featureLevel = FeatureLevel::FEATURE_LEVEL_3;
}
}
}
#endif
if (bugs.force_feature_level0) {
featureLevel = FeatureLevel::FEATURE_LEVEL_0;
}
return featureLevel;
}
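
The GLES branch of `resolveFeatureLevel` reduces to a pure function of the version, a few limits and the `force_feature_level0` flag. A minimal sketch follows; the `Limits` struct and the enum are simplified stand-ins, and the two sampler-count constants are placeholders chosen for illustration, not values taken from `FEATURE_LEVEL_CAPS`.

```cpp
#include <cassert>

enum class FeatureLevel { LEVEL_0, LEVEL_1, LEVEL_2, LEVEL_3 };

// Hypothetical bundle of the inputs the real function reads from
// Extensions, Gets and Bugs.
struct Limits {
    int maxTextureImageUnits;
    int maxCombinedTextureImageUnits;
    bool hasCubeMapArray;
    bool forceFeatureLevel0;
};

// Placeholder sampler counts (assumed for this sketch only).
constexpr int kMaxVertexSamplers = 16;
constexpr int kMaxFragmentSamplers = 31;

FeatureLevel resolveGlesFeatureLevel(int major, int minor, Limits const& l) {
    FeatureLevel level = FeatureLevel::LEVEL_1;
    if (major == 3) {
        // Level 2 needs ES 3.1+ and cube map arrays; level 3 additionally
        // needs enough sampler units.
        if (minor >= 1 && l.hasCubeMapArray) {
            level = FeatureLevel::LEVEL_2;
            if (l.maxTextureImageUnits >= kMaxFragmentSamplers &&
                l.maxCombinedTextureImageUnits >=
                        kMaxFragmentSamplers + kMaxVertexSamplers) {
                level = FeatureLevel::LEVEL_3;
            }
        }
    } else if (major == 2) {
        level = FeatureLevel::LEVEL_0;
    }
    // A driver workaround can override whatever the version implies.
    if (l.forceFeatureLevel0) {
        level = FeatureLevel::LEVEL_0;
    }
    return level;
}
```

Keeping the override as the last step means a buggy ES 3.0 driver is downgraded regardless of which capabilities it advertises.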
#ifdef BACKEND_OPENGL_VERSION_GLES
void OpenGLContext::initExtensionsGLES() noexcept {
void OpenGLContext::initExtensionsGLES(Extensions* ext, GLint major, GLint minor) noexcept {
const char * const extensions = (const char*)glGetString(GL_EXTENSIONS);
GLUtils::unordered_string_set const exts = GLUtils::split(extensions);
if constexpr (DEBUG_PRINT_EXTENSIONS) {
@@ -467,50 +560,50 @@ void OpenGLContext::initExtensionsGLES() noexcept {
// figure out and initialize the extensions we need
using namespace std::literals;
ext.APPLE_color_buffer_packed_float = exts.has("GL_APPLE_color_buffer_packed_float"sv);
ext.EXT_clip_control = exts.has("GL_EXT_clip_control"sv);
ext.EXT_color_buffer_float = exts.has("GL_EXT_color_buffer_float"sv);
ext.EXT_color_buffer_half_float = exts.has("GL_EXT_color_buffer_half_float"sv);
ext.EXT_debug_marker = exts.has("GL_EXT_debug_marker"sv);
ext.EXT_discard_framebuffer = exts.has("GL_EXT_discard_framebuffer"sv);
ext.EXT_disjoint_timer_query = exts.has("GL_EXT_disjoint_timer_query"sv);
ext.EXT_multisampled_render_to_texture = exts.has("GL_EXT_multisampled_render_to_texture"sv);
ext.EXT_multisampled_render_to_texture2 = exts.has("GL_EXT_multisampled_render_to_texture2"sv);
ext.EXT_shader_framebuffer_fetch = exts.has("GL_EXT_shader_framebuffer_fetch"sv);
ext->APPLE_color_buffer_packed_float = exts.has("GL_APPLE_color_buffer_packed_float"sv);
ext->EXT_clip_control = exts.has("GL_EXT_clip_control"sv);
ext->EXT_clip_cull_distance = exts.has("GL_EXT_clip_cull_distance"sv);
ext->EXT_color_buffer_float = exts.has("GL_EXT_color_buffer_float"sv);
ext->EXT_color_buffer_half_float = exts.has("GL_EXT_color_buffer_half_float"sv);
ext->EXT_debug_marker = exts.has("GL_EXT_debug_marker"sv);
ext->EXT_discard_framebuffer = exts.has("GL_EXT_discard_framebuffer"sv);
ext->EXT_disjoint_timer_query = exts.has("GL_EXT_disjoint_timer_query"sv);
ext->EXT_multisampled_render_to_texture = exts.has("GL_EXT_multisampled_render_to_texture"sv);
ext->EXT_multisampled_render_to_texture2 = exts.has("GL_EXT_multisampled_render_to_texture2"sv);
ext->EXT_shader_framebuffer_fetch = exts.has("GL_EXT_shader_framebuffer_fetch"sv);
#if !defined(__EMSCRIPTEN__)
ext.EXT_texture_compression_etc2 = true;
ext->EXT_texture_compression_etc2 = true;
#endif
ext.EXT_texture_compression_s3tc = exts.has("GL_EXT_texture_compression_s3tc"sv);
ext.EXT_texture_compression_s3tc_srgb = exts.has("GL_EXT_texture_compression_s3tc_srgb"sv);
ext.EXT_texture_compression_rgtc = exts.has("GL_EXT_texture_compression_rgtc"sv);
ext.EXT_texture_compression_bptc = exts.has("GL_EXT_texture_compression_bptc"sv);
ext.EXT_texture_cube_map_array = exts.has("GL_EXT_texture_cube_map_array"sv) || exts.has("GL_OES_texture_cube_map_array"sv);
ext.GOOGLE_cpp_style_line_directive = exts.has("GL_GOOGLE_cpp_style_line_directive"sv);
ext.KHR_debug = exts.has("GL_KHR_debug"sv);
ext.KHR_parallel_shader_compile = exts.has("GL_KHR_parallel_shader_compile"sv);
ext.KHR_texture_compression_astc_hdr = exts.has("GL_KHR_texture_compression_astc_hdr"sv);
ext.KHR_texture_compression_astc_ldr = exts.has("GL_KHR_texture_compression_astc_ldr"sv);
ext.OES_depth_texture = exts.has("GL_OES_depth_texture"sv);
ext.OES_depth24 = exts.has("GL_OES_depth24"sv);
ext.OES_packed_depth_stencil = exts.has("GL_OES_packed_depth_stencil"sv);
ext.OES_EGL_image_external_essl3 = exts.has("GL_OES_EGL_image_external_essl3"sv);
ext.OES_rgb8_rgba8 = exts.has("GL_OES_rgb8_rgba8"sv);
ext.OES_standard_derivatives = exts.has("GL_OES_standard_derivatives"sv);
ext.OES_texture_npot = exts.has("GL_OES_texture_npot"sv);
ext.OES_vertex_array_object = exts.has("GL_OES_vertex_array_object"sv);
ext.WEBGL_compressed_texture_etc = exts.has("WEBGL_compressed_texture_etc"sv);
ext.WEBGL_compressed_texture_s3tc = exts.has("WEBGL_compressed_texture_s3tc"sv);
ext.WEBGL_compressed_texture_s3tc_srgb = exts.has("WEBGL_compressed_texture_s3tc_srgb"sv);
ext->EXT_texture_compression_s3tc = exts.has("GL_EXT_texture_compression_s3tc"sv);
ext->EXT_texture_compression_s3tc_srgb = exts.has("GL_EXT_texture_compression_s3tc_srgb"sv);
ext->EXT_texture_compression_rgtc = exts.has("GL_EXT_texture_compression_rgtc"sv);
ext->EXT_texture_compression_bptc = exts.has("GL_EXT_texture_compression_bptc"sv);
ext->EXT_texture_cube_map_array = exts.has("GL_EXT_texture_cube_map_array"sv) || exts.has("GL_OES_texture_cube_map_array"sv);
ext->GOOGLE_cpp_style_line_directive = exts.has("GL_GOOGLE_cpp_style_line_directive"sv);
ext->KHR_debug = exts.has("GL_KHR_debug"sv);
ext->KHR_parallel_shader_compile = exts.has("GL_KHR_parallel_shader_compile"sv);
ext->KHR_texture_compression_astc_hdr = exts.has("GL_KHR_texture_compression_astc_hdr"sv);
ext->KHR_texture_compression_astc_ldr = exts.has("GL_KHR_texture_compression_astc_ldr"sv);
ext->OES_depth_texture = exts.has("GL_OES_depth_texture"sv);
ext->OES_depth24 = exts.has("GL_OES_depth24"sv);
ext->OES_packed_depth_stencil = exts.has("GL_OES_packed_depth_stencil"sv);
ext->OES_EGL_image_external_essl3 = exts.has("GL_OES_EGL_image_external_essl3"sv);
ext->OES_rgb8_rgba8 = exts.has("GL_OES_rgb8_rgba8"sv);
ext->OES_standard_derivatives = exts.has("GL_OES_standard_derivatives"sv);
ext->OES_texture_npot = exts.has("GL_OES_texture_npot"sv);
ext->OES_vertex_array_object = exts.has("GL_OES_vertex_array_object"sv);
ext->WEBGL_compressed_texture_etc = exts.has("WEBGL_compressed_texture_etc"sv);
ext->WEBGL_compressed_texture_s3tc = exts.has("WEBGL_compressed_texture_s3tc"sv);
ext->WEBGL_compressed_texture_s3tc_srgb = exts.has("WEBGL_compressed_texture_s3tc_srgb"sv);
// ES 3.2 implies EXT_color_buffer_float
if (state.major > 3 || (state.major == 3 && state.minor >= 2)) {
ext.EXT_color_buffer_float = true;
if (major > 3 || (major == 3 && minor >= 2)) {
ext->EXT_color_buffer_float = true;
}
// ES 3.x implies EXT_discard_framebuffer and OES_vertex_array_object
if (state.major >= 3) {
ext.EXT_color_buffer_float = true;
ext.OES_vertex_array_object = true;
if (major >= 3) {
ext->EXT_discard_framebuffer = true;
ext->OES_vertex_array_object = true;
}
}
@@ -518,7 +611,7 @@ void OpenGLContext::initExtensionsGLES() noexcept {
#ifdef BACKEND_OPENGL_VERSION_GL
void OpenGLContext::initExtensionsGL() noexcept {
void OpenGLContext::initExtensionsGL(Extensions* ext, GLint major, GLint minor) noexcept {
GLUtils::unordered_string_set exts;
GLint n = 0;
glGetIntegerv(GL_NUM_EXTENSIONS, &n);
@@ -533,54 +626,52 @@ void OpenGLContext::initExtensionsGL() noexcept {
}
using namespace std::literals;
ext.APPLE_color_buffer_packed_float = true; // Assumes core profile.
ext.ARB_shading_language_packing = exts.has("GL_ARB_shading_language_packing"sv);
ext.EXT_color_buffer_float = true; // Assumes core profile.
ext.EXT_color_buffer_half_float = true; // Assumes core profile.
ext.EXT_debug_marker = exts.has("GL_EXT_debug_marker"sv);
ext.EXT_discard_framebuffer = false;
ext.EXT_disjoint_timer_query = true;
ext.EXT_multisampled_render_to_texture = false;
ext.EXT_multisampled_render_to_texture2 = false;
ext.EXT_shader_framebuffer_fetch = exts.has("GL_EXT_shader_framebuffer_fetch"sv);
ext.EXT_texture_compression_bptc = exts.has("GL_EXT_texture_compression_bptc"sv);
ext.EXT_texture_compression_etc2 = exts.has("GL_ARB_ES3_compatibility"sv);
ext.EXT_texture_compression_rgtc = exts.has("GL_EXT_texture_compression_rgtc"sv);
ext.EXT_texture_compression_s3tc = exts.has("GL_EXT_texture_compression_s3tc"sv);
ext.EXT_texture_compression_s3tc_srgb = exts.has("GL_EXT_texture_compression_s3tc_srgb"sv);
ext.EXT_texture_cube_map_array = true;
ext.EXT_texture_filter_anisotropic = exts.has("GL_EXT_texture_filter_anisotropic"sv);
ext.EXT_texture_sRGB = exts.has("GL_EXT_texture_sRGB"sv);
ext.GOOGLE_cpp_style_line_directive = exts.has("GL_GOOGLE_cpp_style_line_directive"sv);
ext.KHR_parallel_shader_compile = exts.has("GL_KHR_parallel_shader_compile"sv);
ext.KHR_texture_compression_astc_hdr = exts.has("GL_KHR_texture_compression_astc_hdr"sv);
ext.KHR_texture_compression_astc_ldr = exts.has("GL_KHR_texture_compression_astc_ldr"sv);
ext.OES_depth_texture = true;
ext.OES_depth24 = true;
ext.OES_EGL_image_external_essl3 = false;
ext.OES_rgb8_rgba8 = true;
ext.OES_standard_derivatives = true;
ext.OES_texture_npot = true;
ext.OES_vertex_array_object = true;
ext.WEBGL_compressed_texture_etc = false;
ext.WEBGL_compressed_texture_s3tc = false;
ext.WEBGL_compressed_texture_s3tc_srgb = false;
auto const major = state.major;
auto const minor = state.minor;
ext->APPLE_color_buffer_packed_float = true; // Assumes core profile.
ext->ARB_shading_language_packing = exts.has("GL_ARB_shading_language_packing"sv);
ext->EXT_color_buffer_float = true; // Assumes core profile.
ext->EXT_color_buffer_half_float = true; // Assumes core profile.
ext->EXT_clip_cull_distance = true;
ext->EXT_debug_marker = exts.has("GL_EXT_debug_marker"sv);
ext->EXT_discard_framebuffer = false;
ext->EXT_disjoint_timer_query = true;
ext->EXT_multisampled_render_to_texture = false;
ext->EXT_multisampled_render_to_texture2 = false;
ext->EXT_shader_framebuffer_fetch = exts.has("GL_EXT_shader_framebuffer_fetch"sv);
ext->EXT_texture_compression_bptc = exts.has("GL_EXT_texture_compression_bptc"sv);
ext->EXT_texture_compression_etc2 = exts.has("GL_ARB_ES3_compatibility"sv);
ext->EXT_texture_compression_rgtc = exts.has("GL_EXT_texture_compression_rgtc"sv);
ext->EXT_texture_compression_s3tc = exts.has("GL_EXT_texture_compression_s3tc"sv);
ext->EXT_texture_compression_s3tc_srgb = exts.has("GL_EXT_texture_compression_s3tc_srgb"sv);
ext->EXT_texture_cube_map_array = true;
ext->EXT_texture_filter_anisotropic = exts.has("GL_EXT_texture_filter_anisotropic"sv);
ext->EXT_texture_sRGB = exts.has("GL_EXT_texture_sRGB"sv);
ext->GOOGLE_cpp_style_line_directive = exts.has("GL_GOOGLE_cpp_style_line_directive"sv);
ext->KHR_parallel_shader_compile = exts.has("GL_KHR_parallel_shader_compile"sv);
ext->KHR_texture_compression_astc_hdr = exts.has("GL_KHR_texture_compression_astc_hdr"sv);
ext->KHR_texture_compression_astc_ldr = exts.has("GL_KHR_texture_compression_astc_ldr"sv);
ext->OES_depth_texture = true;
ext->OES_depth24 = true;
ext->OES_EGL_image_external_essl3 = false;
ext->OES_rgb8_rgba8 = true;
ext->OES_standard_derivatives = true;
ext->OES_texture_npot = true;
ext->OES_vertex_array_object = true;
ext->WEBGL_compressed_texture_etc = false;
ext->WEBGL_compressed_texture_s3tc = false;
ext->WEBGL_compressed_texture_s3tc_srgb = false;
// OpenGL 4.2 implies ARB_shading_language_packing
if (major > 4 || (major == 4 && minor >= 2)) {
ext.ARB_shading_language_packing = true;
ext->ARB_shading_language_packing = true;
}
// OpenGL 4.3 implies EXT_discard_framebuffer
if (major > 4 || (major == 4 && minor >= 3)) {
ext.EXT_discard_framebuffer = true;
ext.KHR_debug = true;
ext->EXT_discard_framebuffer = true;
ext->KHR_debug = true;
}
// OpenGL 4.5 implies EXT_clip_control
if (major > 4 || (major == 4 && minor >= 5)) {
ext.EXT_clip_control = true;
ext->EXT_clip_control = true;
}
}
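The version checks at the end of `initExtensionsGL()` all follow the same "at least major.minor" pattern. A minimal sketch of that pattern as a helper is shown here; `isAtLeast`, `CoreImpliedExtensions` and `coreImplied` are hypothetical names used for illustration only and do not appear in this change.

```cpp
#include <cassert>

// Hypothetical helper (not part of the diff): true when the context version
// major.minor is at least reqMajor.reqMinor.
constexpr bool isAtLeast(int major, int minor, int reqMajor, int reqMinor) noexcept {
    return major > reqMajor || (major == reqMajor && minor >= reqMinor);
}

// Hypothetical, trimmed-down stand-in for the Extensions struct, holding only
// the flags that the version checks above turn on.
struct CoreImpliedExtensions {
    bool ARB_shading_language_packing = false;
    bool EXT_discard_framebuffer = false;
    bool KHR_debug = false;
    bool EXT_clip_control = false;
};

// Mirrors the three version checks: 4.2, 4.3 and 4.5.
inline CoreImpliedExtensions coreImplied(int major, int minor) noexcept {
    CoreImpliedExtensions ext;
    ext.ARB_shading_language_packing = isAtLeast(major, minor, 4, 2);
    ext.EXT_discard_framebuffer = isAtLeast(major, minor, 4, 3);
    ext.KHR_debug = isAtLeast(major, minor, 4, 3);
    ext.EXT_clip_control = isAtLeast(major, minor, 4, 5);
    return ext;
}
```

Unlike the real function, this sketch only ever sets flags from the version; the diff first queries the extension string and then lets the version override the result to `true`.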
@@ -676,7 +767,7 @@ void OpenGLContext::deleteBuffers(GLsizei n, const GLuint* buffers, GLenum targe
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
assert_invariant(state.major > 2 ||
assert_invariant(mFeatureLevel >= FeatureLevel::FEATURE_LEVEL_1 ||
(target != GL_UNIFORM_BUFFER && target != GL_TRANSFORM_FEEDBACK_BUFFER));
if (target == GL_UNIFORM_BUFFER || target == GL_TRANSFORM_FEEDBACK_BUFFER) {
@@ -888,63 +979,4 @@ void OpenGLContext::resetState() noexcept {
}
OpenGLContext::FenceSync OpenGLContext::createFenceSync(
OpenGLPlatform& platform) noexcept {
if (UTILS_UNLIKELY(isES2())) {
assert_invariant(platform.canCreateFence());
return { .fence = platform.createFence() };
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
auto sync = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
CHECK_GL_ERROR(utils::slog.e)
return { .sync = sync };
#else
return {};
#endif
}
void OpenGLContext::destroyFenceSync(
OpenGLPlatform& platform, FenceSync sync) noexcept {
if (UTILS_UNLIKELY(isES2())) {
platform.destroyFence(static_cast<Platform::Fence*>(sync.fence));
return;
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
glDeleteSync(sync.sync);
CHECK_GL_ERROR(utils::slog.e)
#endif
}
OpenGLContext::FenceSync::Status OpenGLContext::clientWaitSync(
OpenGLPlatform& platform, FenceSync sync) const noexcept {
if (UTILS_UNLIKELY(isES2())) {
using Status = OpenGLContext::FenceSync::Status;
auto const status = platform.waitFence(static_cast<Platform::Fence*>(sync.fence), 0u);
switch (status) {
case FenceStatus::ERROR: return Status::FAILURE;
case FenceStatus::CONDITION_SATISFIED: return Status::CONDITION_SATISFIED;
case FenceStatus::TIMEOUT_EXPIRED: return Status::TIMEOUT_EXPIRED;
}
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
GLenum const status = glClientWaitSync(sync.sync, 0, 0u);
CHECK_GL_ERROR(utils::slog.e)
using Status = OpenGLContext::FenceSync::Status;
switch (status) {
case GL_ALREADY_SIGNALED: return Status::ALREADY_SIGNALED;
case GL_TIMEOUT_EXPIRED: return Status::TIMEOUT_EXPIRED;
case GL_CONDITION_SATISFIED: return Status::CONDITION_SATISFIED;
default: return Status::FAILURE;
}
#else
return FenceSync::Status::FAILURE;
#endif
}
} // namespace filament


@@ -92,7 +92,7 @@ public:
# ifndef BACKEND_OPENGL_LEVEL_GLES30
return true;
# else
return state.major == 2;
return mFeatureLevel == FeatureLevel::FEATURE_LEVEL_0;
# endif
#else
return false;
@@ -150,28 +150,8 @@ public:
void deleteBuffers(GLsizei n, const GLuint* buffers, GLenum target) noexcept;
void deleteVertexArrays(GLsizei n, const GLuint* arrays) noexcept;
// we abstract GL's sync because it's not available in ES2, but we can use EGL's sync
// instead, if available.
struct FenceSync {
enum class Status {
ALREADY_SIGNALED,
TIMEOUT_EXPIRED,
CONDITION_SATISFIED,
FAILURE
};
union {
void* fence;
GLsync sync;
};
};
FenceSync createFenceSync(OpenGLPlatform& platform) noexcept;
void destroyFenceSync(OpenGLPlatform& platform, FenceSync sync) noexcept;
FenceSync::Status clientWaitSync(OpenGLPlatform& platform, FenceSync sync) const noexcept;
// glGet*() values
struct {
struct Gets {
GLfloat max_anisotropy;
GLint max_draw_buffers;
GLint max_renderbuffer_size;
@@ -190,10 +170,11 @@ public:
} features = {};
// supported extensions detected at runtime
struct {
struct Extensions {
bool APPLE_color_buffer_packed_float;
bool ARB_shading_language_packing;
bool EXT_clip_control;
bool EXT_clip_cull_distance;
bool EXT_color_buffer_float;
bool EXT_color_buffer_half_float;
bool EXT_debug_marker;
@@ -228,7 +209,7 @@ public:
bool WEBGL_compressed_texture_s3tc_srgb;
} ext = {};
struct {
struct Bugs {
// Some drivers have issues with UBOs in the fragment shader when
// glFlush() is called between draw calls.
bool disable_glFlush;
@@ -280,6 +261,24 @@ public:
// Some Adreno drivers crash in glDrawXXX() when there's an uninitialized uniform block,
// even when the shader doesn't access it.
bool enable_initialize_non_used_uniform_array;
// Workarounds specific to PowerVR GPUs affecting shaders (currently, we lump them all
// under one specialization constant).
// - gl_InstanceID is invalid when used first in the vertex shader
bool powervr_shader_workarounds;
// On PowerVR destroying the destination of a glBlitFramebuffer operation is equivalent to
// a glFinish. So we must delay the destruction until we know the GPU is finished.
bool delay_fbo_destruction;
// The driver pins some of its threads and we can't easily know to which core; performance
// can suffer more if we end up pinned to the same one.
bool disable_thread_affinity;
// Force feature level 0. Typically used for low end ES3 devices with significant driver
// bugs or performance issues.
bool force_feature_level0;
} bugs = {};
// state getters -- as needed.
@@ -402,7 +401,7 @@ public:
} window;
} state;
struct {
struct Procs {
void (* bindVertexArray)(GLuint array);
void (* deleteVertexArrays)(GLsizei n, const GLuint* arrays);
void (* genVertexArrays)(GLsizei n, GLuint* arrays);
@@ -463,18 +462,55 @@ private:
{ bugs.enable_initialize_non_used_uniform_array,
"enable_initialize_non_used_uniform_array",
""},
{ bugs.powervr_shader_workarounds,
"powervr_shader_workarounds",
""},
{ bugs.delay_fbo_destruction,
"delay_fbo_destruction",
""},
{ bugs.disable_thread_affinity,
"disable_thread_affinity",
""},
{ bugs.force_feature_level0,
"force_feature_level0",
""},
}};
RenderPrimitive mDefaultVAO;
// this is chosen to minimize code size
#if defined(BACKEND_OPENGL_VERSION_GLES)
void initExtensionsGLES() noexcept;
static void initExtensionsGLES(Extensions* ext, GLint major, GLint minor) noexcept;
#endif
#if defined(BACKEND_OPENGL_VERSION_GL)
void initExtensionsGL() noexcept;
static void initExtensionsGL(Extensions* ext, GLint major, GLint minor) noexcept;
#endif
static void initExtensions(Extensions* ext, GLint major, GLint minor) noexcept {
#if defined(BACKEND_OPENGL_VERSION_GLES)
initExtensionsGLES(ext, major, minor);
#endif
#if defined(BACKEND_OPENGL_VERSION_GL)
initExtensionsGL(ext, major, minor);
#endif
}
static void initBugs(Bugs* bugs, Extensions const& exts,
GLint major, GLint minor,
char const* vendor,
char const* renderer,
char const* version,
char const* shader
);
static void initProcs(Procs* procs,
Extensions const& exts, GLint major, GLint minor) noexcept;
static FeatureLevel resolveFeatureLevel(GLint major, GLint minor,
Extensions const& exts,
Gets const& gets,
Bugs const& bugs) noexcept;
template <typename T, typename F>
static inline void update_state(T& state, T const& expected, F functor, bool force = false) noexcept {
if (UTILS_UNLIKELY(force || state != expected)) {
@@ -567,7 +603,7 @@ void OpenGLContext::activeTexture(GLuint unit) noexcept {
void OpenGLContext::bindSampler(GLuint unit, GLuint sampler) noexcept {
assert_invariant(unit < MAX_TEXTURE_UNIT_COUNT);
assert_invariant(state.major > 2);
assert_invariant(mFeatureLevel >= FeatureLevel::FEATURE_LEVEL_1);
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
update_state(state.textures.units[unit].sampler, sampler, [&]() {
glBindSampler(unit, sampler);
@@ -613,7 +649,7 @@ void OpenGLContext::bindVertexArray(RenderPrimitive const* p) noexcept {
void OpenGLContext::bindBufferRange(GLenum target, GLuint index, GLuint buffer,
GLintptr offset, GLsizeiptr size) noexcept {
assert_invariant(state.major > 2);
assert_invariant(mFeatureLevel >= FeatureLevel::FEATURE_LEVEL_1);
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
# ifdef BACKEND_OPENGL_LEVEL_GLES31

View File

@@ -191,27 +191,7 @@ OpenGLDriver::OpenGLDriver(OpenGLPlatform* platform, const Platform::DriverConfi
assert_invariant(mContext.ext.EXT_disjoint_timer_query);
#endif
#if defined(BACKEND_OPENGL_VERSION_GL) || defined(GL_EXT_disjoint_timer_query)
if (mContext.ext.EXT_disjoint_timer_query) {
// timer queries are available
if (mContext.bugs.dont_use_timer_query && mPlatform.canCreateFence()) {
// however, they don't work well, revert to using fences if we can.
mTimerQueryImpl = new OpenGLTimerQueryFence(mPlatform);
} else {
mTimerQueryImpl = new TimerQueryNative(mContext);
}
mFrameTimeSupported = true;
} else
#endif
if (mPlatform.canCreateFence()) {
// no timer queries, but we can use fences
mTimerQueryImpl = new OpenGLTimerQueryFence(mPlatform);
mFrameTimeSupported = true;
} else {
// no queries, no fences -- that's a problem
mTimerQueryImpl = new TimerQueryFallback();
mFrameTimeSupported = false;
}
mTimerQueryImpl = OpenGLTimerQueryFactory::init(mPlatform, *this);
mShaderCompilerService.init();
}
@@ -231,13 +211,23 @@ void OpenGLDriver::terminate() {
// wait for the GPU to finish executing all commands
glFinish();
mShaderCompilerService.terminate();
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
// and make sure to execute all the GpuCommandCompleteOps callbacks
executeGpuCommandsCompleteOps();
// as well as the FrameCompleteOps callbacks
if (UTILS_UNLIKELY(!mFrameCompleteOps.empty())) {
for (auto&& op: mFrameCompleteOps) {
op();
}
mFrameCompleteOps.clear();
}
// because we called glFinish(), all callbacks should have been executed
assert_invariant(mGpuCommandCompleteOps.empty());
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
if (!getContext().isES2()) {
for (auto& item: mSamplerMap) {
mContext.unbindSampler(item.second);
@@ -249,8 +239,6 @@ void OpenGLDriver::terminate() {
delete mTimerQueryImpl;
mShaderCompilerService.terminate();
mPlatform.terminate();
}
@@ -436,11 +424,7 @@ Handle<HwRenderTarget> OpenGLDriver::createRenderTargetS() noexcept {
}
Handle<HwFence> OpenGLDriver::createFenceS() noexcept {
return initHandle<HwFence>();
}
Handle<HwSync> OpenGLDriver::createSyncS() noexcept {
return initHandle<GLSync>();
return initHandle<GLFence>();
}
Handle<HwSwapChain> OpenGLDriver::createSwapChainS() noexcept {
@@ -1352,28 +1336,25 @@ void OpenGLDriver::createRenderTargetR(Handle<HwRenderTarget> rth,
void OpenGLDriver::createFenceR(Handle<HwFence> fh, int) {
DEBUG_MARKER()
HwFence* f = handle_cast<HwFence*>(fh);
f->fence = mPlatform.createFence();
}
GLFence* f = handle_cast<GLFence*>(fh);
void OpenGLDriver::createSyncR(Handle<HwSync> fh, int) {
DEBUG_MARKER()
GLSync* f = handle_cast<GLSync *>(fh);
f->handle = mContext.createFenceSync(mPlatform);
// check the status of the sync once a frame, since we must do this from our thread
std::weak_ptr<GLSync::State> const weak = f->result;
runEveryNowAndThen(
[&platform = mPlatform, context = mContext, handle = f->handle, weak]() -> bool {
auto result = weak.lock();
if (result) {
auto const status = context.clientWaitSync(platform, handle);
result->status.store(status, std::memory_order_relaxed);
return (status != OpenGLContext::FenceSync::Status::TIMEOUT_EXPIRED);
}
return true;
});
if (mPlatform.canCreateFence() || mContext.isES2()) {
assert_invariant(mPlatform.canCreateFence());
f->fence = mPlatform.createFence();
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
else {
std::weak_ptr<GLFence::State> const weak = f->state;
whenGpuCommandsComplete([weak](){
auto state = weak.lock();
if (state) {
std::lock_guard const lock(state->lock);
state->status = FenceStatus::CONDITION_SATISFIED;
state->cond.notify_all();
}
});
}
#endif
}
void OpenGLDriver::createSwapChainR(Handle<HwSwapChain> sch, void* nativeWindow, uint64_t flags) {
@@ -1405,10 +1386,8 @@ void OpenGLDriver::createSwapChainHeadlessR(Handle<HwSwapChain> sch,
void OpenGLDriver::createTimerQueryR(Handle<HwTimerQuery> tqh, int) {
DEBUG_MARKER()
GLTimerQuery* tq = handle_cast<GLTimerQuery*>(tqh);
mContext.procs.genQueries(1u, &tq->gl.query);
CHECK_GL_ERROR(utils::slog.e)
mTimerQueryImpl->createTimerQuery(tq);
}
// ------------------------------------------------------------------------------------------------
@@ -1513,12 +1492,33 @@ void OpenGLDriver::destroyRenderTarget(Handle<HwRenderTarget> rth) {
if (rt->gl.fbo) {
// first unbind this framebuffer if needed
gl.bindFramebuffer(GL_FRAMEBUFFER, 0);
glDeleteFramebuffers(1, &rt->gl.fbo);
}
if (rt->gl.fbo_read) {
// first unbind this framebuffer if needed
gl.bindFramebuffer(GL_FRAMEBUFFER, 0);
glDeleteFramebuffers(1, &rt->gl.fbo_read);
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
if (UTILS_UNLIKELY(gl.bugs.delay_fbo_destruction)) {
if (rt->gl.fbo) {
whenFrameComplete([fbo = rt->gl.fbo]() {
glDeleteFramebuffers(1, &fbo);
});
}
if (rt->gl.fbo_read) {
whenFrameComplete([fbo_read = rt->gl.fbo_read]() {
glDeleteFramebuffers(1, &fbo_read);
});
}
} else
#endif
{
if (rt->gl.fbo) {
glDeleteFramebuffers(1, &rt->gl.fbo);
}
if (rt->gl.fbo_read) {
glDeleteFramebuffers(1, &rt->gl.fbo_read);
}
}
destruct(rth, rt);
}
@@ -1567,20 +1567,11 @@ void OpenGLDriver::destroyTimerQuery(Handle<HwTimerQuery> tqh) {
if (tqh) {
GLTimerQuery* tq = handle_cast<GLTimerQuery*>(tqh);
getContext().procs.deleteQueries(1u, &tq->gl.query);
mTimerQueryImpl->destroyTimerQuery(tq);
destruct(tqh, tq);
}
}
void OpenGLDriver::destroySync(Handle<HwSync> sh) {
DEBUG_MARKER()
if (sh) {
GLSync* s = handle_cast<GLSync*>(sh);
mContext.destroyFenceSync(mPlatform, s->handle);
destruct(sh, s);
}
}
// ------------------------------------------------------------------------------------------------
// Synchronous APIs
// These are called on the application's thread
@@ -1683,24 +1674,39 @@ int64_t OpenGLDriver::getStreamTimestamp(Handle<HwStream> sh) {
void OpenGLDriver::destroyFence(Handle<HwFence> fh) {
if (fh) {
HwFence* f = handle_cast<HwFence*>(fh);
mPlatform.destroyFence(f->fence);
GLFence* f = handle_cast<GLFence*>(fh);
if (mPlatform.canCreateFence() || mContext.isES2()) {
mPlatform.destroyFence(f->fence);
}
destruct(fh, f);
}
}
FenceStatus OpenGLDriver::wait(Handle<HwFence> fh, uint64_t timeout) {
FenceStatus OpenGLDriver::getFenceStatus(Handle<HwFence> fh) {
if (fh) {
HwFence* f = handle_cast<HwFence*>(fh);
if (f->fence == nullptr) {
// we can end-up here if:
// - the platform doesn't support h/w fences
// - wait() was called before the fence was asynchronously created.
// This case is not handled in OpenGLDriver but is handled by FFence.
// TODO: move FFence logic into the backend.
return FenceStatus::ERROR;
GLFence* f = handle_cast<GLFence*>(fh);
if (mPlatform.canCreateFence() || mContext.isES2()) {
if (f->fence == nullptr) {
// we can end-up here if:
// - the platform doesn't support h/w fences
if (UTILS_UNLIKELY(!mPlatform.canCreateFence())) {
return FenceStatus::ERROR;
}
// - wait() was called before the fence was asynchronously created.
return FenceStatus::TIMEOUT_EXPIRED;
}
return mPlatform.waitFence(f->fence, 0);
}
return mPlatform.waitFence(f->fence, timeout);
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
else {
assert_invariant(f->state);
std::unique_lock lock(f->state->lock);
f->state->cond.wait_for(lock, std::chrono::nanoseconds(0), [&state = f->state]() {
return state->status != FenceStatus::TIMEOUT_EXPIRED;
});
return f->state->status;
}
#endif
}
return FenceStatus::ERROR;
}
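The non-platform path of `createFenceR()` and `getFenceStatus()` shares a small state object between the driver thread (which signals it) and the caller (which polls it). The following is a GL-free sketch of that handshake, assuming the same three-field state as `GLFence::State`; the `sketch` namespace and the `signal`/`poll` functions are hypothetical names for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>

namespace sketch {

enum class FenceStatus { ERROR, CONDITION_SATISFIED, TIMEOUT_EXPIRED };

// Same shape as GLFence::State in the diff.
struct FenceState {
    std::mutex lock;
    std::condition_variable cond;
    FenceStatus status{ FenceStatus::TIMEOUT_EXPIRED };
};

// Producer side: what the callback queued by createFenceR() does once the GPU
// has finished. The weak_ptr means a fence destroyed early is simply skipped.
inline void signal(std::weak_ptr<FenceState> const& weak) {
    if (auto state = weak.lock()) {
        std::lock_guard<std::mutex> const guard(state->lock);
        state->status = FenceStatus::CONDITION_SATISFIED;
        state->cond.notify_all();
    }
}

// Consumer side: a zero-timeout wait, so this is effectively a poll that
// returns the current status without blocking.
inline FenceStatus poll(std::shared_ptr<FenceState> const& state) {
    std::unique_lock<std::mutex> guard(state->lock);
    state->cond.wait_for(guard, std::chrono::nanoseconds(0), [&state]() {
        return state->status != FenceStatus::TIMEOUT_EXPIRED;
    });
    return state->status;
}

} // namespace sketch
```

A fence polled before the callback runs reports `TIMEOUT_EXPIRED`; after `signal()` it reports `CONDITION_SATISFIED`.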
@@ -1848,7 +1854,7 @@ bool OpenGLDriver::isFrameBufferFetchMultiSampleSupported() {
}
bool OpenGLDriver::isFrameTimeSupported() {
return mFrameTimeSupported;
return OpenGLTimerQueryFactory::isGpuTimeSupported();
}
bool OpenGLDriver::isAutoDepthResolveSupported() {
@@ -1866,6 +1872,18 @@ bool OpenGLDriver::isSRGBSwapChainSupported() {
return mPlatform.isSRGBSwapChainSupported();
}
bool OpenGLDriver::isStereoSupported() {
// Stereo requires instancing and EXT_clip_cull_distance.
if (UTILS_UNLIKELY(mContext.isES2())) {
return false;
}
return mContext.ext.EXT_clip_cull_distance;
}
bool OpenGLDriver::isParallelShaderCompileSupported() {
return mShaderCompilerService.isParallelShaderCompileSupported();
}
bool OpenGLDriver::isWorkaroundNeeded(Workaround workaround) {
switch (workaround) {
case Workaround::SPLIT_EASU:
@@ -1876,6 +1894,10 @@ bool OpenGLDriver::isWorkaroundNeeded(Workaround workaround) {
return mContext.bugs.enable_initialize_non_used_uniform_array;
case Workaround::DISABLE_BLIT_INTO_TEXTURE_ARRAY:
return mContext.bugs.disable_blit_into_texture_array;
case Workaround::POWER_VR_SHADER_WORKAROUNDS:
return mContext.bugs.powervr_shader_workarounds;
case Workaround::DISABLE_THREAD_AFFINITY:
return mContext.bugs.disable_thread_affinity;
default:
return false;
}
@@ -1912,6 +1934,16 @@ void OpenGLDriver::commit(Handle<HwSwapChain> sch) {
GLSwapChain* sc = handle_cast<GLSwapChain*>(sch);
mPlatform.commit(sc->swapChain);
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
if (UTILS_UNLIKELY(!mFrameCompleteOps.empty())) {
whenGpuCommandsComplete([ops = std::move(mFrameCompleteOps)]() {
for (auto&& op: ops) {
op();
}
});
}
#endif
}
void OpenGLDriver::makeCurrent(Handle<HwSwapChain> schDraw, Handle<HwSwapChain> schRead) {
@@ -2556,8 +2588,6 @@ void OpenGLDriver::replaceStream(GLTexture* texture, GLStream* newStream) noexce
void OpenGLDriver::beginTimerQuery(Handle<HwTimerQuery> tqh) {
DEBUG_MARKER()
GLTimerQuery* tq = handle_cast<GLTimerQuery*>(tqh);
// reset the state of the result availability
tq->elapsed.store(0, std::memory_order_relaxed);
mTimerQueryImpl->beginTimeElapsedQuery(tq);
}
@@ -2565,50 +2595,15 @@ void OpenGLDriver::endTimerQuery(Handle<HwTimerQuery> tqh) {
DEBUG_MARKER()
GLTimerQuery* tq = handle_cast<GLTimerQuery*>(tqh);
mTimerQueryImpl->endTimeElapsedQuery(tq);
runEveryNowAndThen([this, tq]() -> bool {
if (!mTimerQueryImpl->queryResultAvailable(tq)) {
// we need to try this one again later
return false;
}
tq->elapsed.store(mTimerQueryImpl->queryResult(tq), std::memory_order_relaxed);
return true;
});
}
bool OpenGLDriver::getTimerQueryValue(Handle<HwTimerQuery> tqh, uint64_t* elapsedTime) {
GLTimerQuery* tq = handle_cast<GLTimerQuery*>(tqh);
uint64_t const d = tq->elapsed.load(std::memory_order_relaxed);
if (!d) {
return false;
}
if (elapsedTime) {
*elapsedTime = d;
}
return true;
return OpenGLTimerQueryInterface::getTimerQueryValue(tq, elapsedTime);
}
SyncStatus OpenGLDriver::getSyncStatus(Handle<HwSync> sh) {
GLSync* s = handle_cast<GLSync*>(sh);
if (!s->result) {
return SyncStatus::NOT_SIGNALED;
}
auto status = s->result->status.load(std::memory_order_relaxed);
using Status = OpenGLContext::FenceSync::Status;
switch (status) {
case Status::CONDITION_SATISFIED:
case Status::ALREADY_SIGNALED:
return SyncStatus::SIGNALED;
case Status::TIMEOUT_EXPIRED:
return SyncStatus::NOT_SIGNALED;
case Status::FAILURE:
default:
return SyncStatus::ERROR;
}
}
void OpenGLDriver::compilePrograms(CallbackHandler* handler,
CallbackHandler::Callback callback, void* user) {
void OpenGLDriver::compilePrograms(CompilerPriorityQueue priority,
CallbackHandler* handler, CallbackHandler::Callback callback, void* user) {
if (callback) {
getShaderCompilerService().notifyWhenAllProgramsAreReady(handler, callback, user);
}
@@ -2922,7 +2917,7 @@ void OpenGLDriver::bindSamplers(uint32_t index, Handle<HwSamplerGroup> sbh) {
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
GLuint OpenGLDriver::getSamplerSlow(SamplerParams params) const noexcept {
assert_invariant(mSamplerMap.find(params.u) == mSamplerMap.end());
assert_invariant(mSamplerMap.find(params) == mSamplerMap.end());
GLuint s;
glGenSamplers(1, &s);
@@ -2944,7 +2939,7 @@ GLuint OpenGLDriver::getSamplerSlow(SamplerParams params) const noexcept {
}
#endif
CHECK_GL_ERROR(utils::slog.e)
mSamplerMap[params.u] = s;
mSamplerMap[params] = s;
return s;
}
#endif
@@ -3180,37 +3175,11 @@ void OpenGLDriver::readBufferSubData(backend::BufferObjectHandle boh,
#endif
}
void OpenGLDriver::whenGpuCommandsComplete(std::function<void()> fn) noexcept {
OpenGLContext::FenceSync sync = mContext.createFenceSync(mPlatform);
mGpuCommandCompleteOps.emplace_back(sync, std::move(fn));
CHECK_GL_ERROR(utils::slog.e)
}
void OpenGLDriver::runEveryNowAndThen(std::function<bool()> fn) noexcept {
mEveryNowAndThenOps.push_back(std::move(fn));
}
void OpenGLDriver::executeGpuCommandsCompleteOps() noexcept {
auto& v = mGpuCommandCompleteOps;
auto it = v.begin();
while (it != v.end()) {
using Status = OpenGLContext::FenceSync::Status;
auto const status = mContext.clientWaitSync(mPlatform, it->first);
if (status == Status::ALREADY_SIGNALED || status == Status::CONDITION_SATISFIED) {
it->second();
mContext.destroyFenceSync(mPlatform, it->first);
it = v.erase(it);
} else if (UTILS_UNLIKELY(status == Status::FAILURE)) {
// This should never happen, but is very problematic if it does, as we might leak
// some data depending on what the callback does. However, we clean up our own state.
mContext.destroyFenceSync(mPlatform, it->first);
it = v.erase(it);
} else {
++it;
}
}
}
void OpenGLDriver::executeEveryNowAndThenOps() noexcept {
auto& v = mEveryNowAndThenOps;
auto it = v.begin();
@@ -3223,6 +3192,46 @@ void OpenGLDriver::executeEveryNowAndThenOps() noexcept {
}
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
void OpenGLDriver::whenFrameComplete(const std::function<void()>& fn) noexcept {
mFrameCompleteOps.push_back(fn);
}
void OpenGLDriver::whenGpuCommandsComplete(const std::function<void()>& fn) noexcept {
GLsync sync = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
mGpuCommandCompleteOps.emplace_back(sync, fn);
CHECK_GL_ERROR(utils::slog.e)
}
void OpenGLDriver::executeGpuCommandsCompleteOps() noexcept {
auto& v = mGpuCommandCompleteOps;
auto it = v.begin();
while (it != v.end()) {
auto const& [sync, fn] = *it;
GLenum const syncStatus = glClientWaitSync(sync, 0, 0u);
switch (syncStatus) {
case GL_TIMEOUT_EXPIRED:
// not ready
++it;
break;
case GL_ALREADY_SIGNALED:
case GL_CONDITION_SATISFIED:
// ready
it->second();
glDeleteSync(sync);
it = v.erase(it);
break;
default:
// This should never happen, but is very problematic if it does, as we might leak
// some data depending on what the callback does. However, we clean up our own state.
glDeleteSync(sync);
it = v.erase(it);
break;
}
}
}
#endif
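The queue behind `whenGpuCommandsComplete()` and `executeGpuCommandsCompleteOps()` can be illustrated without a GL context. In this sketch `FakeSync` is a hypothetical stand-in for a `GLsync` whose status the test sets by hand, and `DeferredOps` is an illustrative name, not a class from this change.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

namespace sketch {

enum class SyncStatus { NotReady, Signaled, Failed };

// Hypothetical stand-in for a GLsync object polled with glClientWaitSync().
struct FakeSync {
    SyncStatus status = SyncStatus::NotReady;
};

class DeferredOps {
public:
    // Queue a callback to run once `sync` is signaled.
    void whenComplete(FakeSync const* sync, std::function<void()> fn) {
        mOps.emplace_back(sync, std::move(fn));
    }

    // Poll every pending entry once: run and drop the signaled ones, drop the
    // failed ones without running them, keep the rest for the next call.
    void execute() {
        auto it = mOps.begin();
        while (it != mOps.end()) {
            switch (it->first->status) {
                case SyncStatus::NotReady:
                    ++it;
                    break;
                case SyncStatus::Signaled:
                    it->second();
                    it = mOps.erase(it);
                    break;
                case SyncStatus::Failed:
                    it = mOps.erase(it);
                    break;
            }
        }
    }

    bool empty() const noexcept { return mOps.empty(); }

private:
    std::vector<std::pair<FakeSync const*, std::function<void()>>> mOps;
};

} // namespace sketch
```

As in the diff, a callback that queues new work while `execute()` is iterating would invalidate the iterator, so callbacks are assumed not to re-enter the queue.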
// ------------------------------------------------------------------------------------------------
// Rendering ops
// ------------------------------------------------------------------------------------------------
@@ -3261,7 +3270,7 @@ void OpenGLDriver::setFrameScheduledCallback(Handle<HwSwapChain> sch,
}
void OpenGLDriver::setFrameCompletedCallback(Handle<HwSwapChain> sch,
FrameCompletedCallback callback, void* user) {
CallbackHandler* handler, CallbackHandler::Callback callback, void* user) {
DEBUG_MARKER()
}
@@ -3296,19 +3305,19 @@ void OpenGLDriver::flush(int) {
if (!gl.bugs.disable_glFlush) {
glFlush();
}
mTimerQueryImpl->flush();
}
void OpenGLDriver::finish(int) {
DEBUG_MARKER()
glFinish();
mTimerQueryImpl->flush();
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
executeGpuCommandsCompleteOps();
assert_invariant(mGpuCommandCompleteOps.empty());
#endif
executeEveryNowAndThenOps();
// Note: since we executed a glFinish(), all pending tasks should be done
assert_invariant(mGpuCommandCompleteOps.empty());
// however, some tasks rely on a separated thread to publish their result (e.g.
// However, some tasks rely on a separated thread to publish their result (e.g.
// endTimerQuery), so the result could very well not be ready, and the task will
// linger a bit longer, this is only true for mEveryNowAndThenOps tasks.
// The fallout of this is that we can't assert that mEveryNowAndThenOps is empty.

View File

@@ -151,15 +151,12 @@ public:
struct GLTimerQuery : public HwTimerQuery {
struct State {
std::atomic<uint64_t> elapsed{};
std::atomic_bool available{};
struct {
GLuint query;
} gl;
std::atomic<int64_t> elapsed{};
};
struct {
GLuint query = 0;
std::shared_ptr<State> emulation;
} gl;
// 0 means not available, otherwise query result in ns.
std::atomic<uint64_t> elapsed{};
std::shared_ptr<State> state;
};
struct GLStream : public HwStream {
@@ -196,14 +193,14 @@ public:
TargetBufferFlags targets = {};
};
struct GLSync : public HwSync {
using HwSync::HwSync;
struct GLFence : public HwFence {
using HwFence::HwFence;
struct State {
std::atomic<OpenGLContext::FenceSync::Status> status{
OpenGLContext::FenceSync::Status::TIMEOUT_EXPIRED };
std::mutex lock;
std::condition_variable cond;
FenceStatus status{ FenceStatus::TIMEOUT_EXPIRED };
};
OpenGLContext::FenceSync handle{};
std::shared_ptr<State> result{ std::make_shared<GLSync::State>() };
std::shared_ptr<State> state{ std::make_shared<GLFence::State>() };
};
OpenGLDriver(OpenGLDriver const&) = delete;
@@ -214,6 +211,8 @@ private:
OpenGLContext mContext;
ShaderCompilerService mShaderCompilerService;
friend class OpenGLTimerQueryFactory;
friend class TimerQueryNative;
OpenGLContext& getContext() noexcept { return mContext; }
ShaderCompilerService& getShaderCompilerService() noexcept {
@@ -335,7 +334,7 @@ private:
assert_invariant(!sp.padding1);
assert_invariant(!sp.padding2);
auto& samplerMap = mSamplerMap;
auto pos = samplerMap.find(sp.u);
auto pos = samplerMap.find(sp);
if (UTILS_UNLIKELY(pos == samplerMap.end())) {
return getSamplerSlow(sp);
}
@@ -369,7 +368,8 @@ private:
// sampler buffer binding points (nullptr if not used)
std::array<GLSamplerGroup*, Program::SAMPLER_BINDING_COUNT> mSamplerBindings = {}; // 4 pointers
mutable tsl::robin_map<uint32_t, GLuint> mSamplerMap;
mutable tsl::robin_map<SamplerParams, GLuint,
SamplerParams::Hasher, SamplerParams::EqualTo> mSamplerMap;
// this must be accessed from the driver thread only
std::vector<GLTexture*> mTexturesWithStreamsAttached;
@@ -383,10 +383,15 @@ private:
void updateTextureLodRange(GLTexture* texture, int8_t targetLevel) noexcept;
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
// tasks executed on the main thread after the fence signaled
void whenGpuCommandsComplete(std::function<void()> fn) noexcept;
void whenGpuCommandsComplete(const std::function<void()>& fn) noexcept;
void executeGpuCommandsCompleteOps() noexcept;
std::vector<std::pair<OpenGLContext::FenceSync, std::function<void()>>> mGpuCommandCompleteOps;
std::vector<std::pair<GLsync, std::function<void()>>> mGpuCommandCompleteOps;
void whenFrameComplete(const std::function<void()>& fn) noexcept;
std::vector<std::function<void()>> mFrameCompleteOps;
#endif
// tasks regularly executed on the main thread until they return true
void runEveryNowAndThen(std::function<bool()> fn) noexcept;
@@ -395,7 +400,6 @@ private:
// timer query implementation
OpenGLTimerQueryInterface* mTimerQueryImpl = nullptr;
bool mFrameTimeSupported = false;
// for ES2 sRGB support
GLSwapChain* mCurrentDrawSwapChain = nullptr;

View File

@@ -116,4 +116,7 @@ bool OpenGLPlatform::isExtraContextSupported() const noexcept {
void OpenGLPlatform::createContext(bool) {
}
void OpenGLPlatform::releaseContext() noexcept {
}
} // namespace filament::backend

View File

@@ -19,6 +19,7 @@
#include <backend/platforms/OpenGLPlatform.h>
#include <utils/compiler.h>
#include <utils/JobSystem.h>
#include <utils/Log.h>
#include <utils/Systrace.h>
#include <utils/debug.h>
@@ -30,43 +31,111 @@ using namespace GLUtils;
// ------------------------------------------------------------------------------------------------
bool OpenGLTimerQueryFactory::mGpuTimeSupported = false;
OpenGLTimerQueryInterface* OpenGLTimerQueryFactory::init(
OpenGLPlatform& platform, OpenGLDriver& driver) noexcept {
(void)driver;
OpenGLTimerQueryInterface* impl;
#if defined(BACKEND_OPENGL_VERSION_GL) || defined(GL_EXT_disjoint_timer_query)
auto& context = driver.getContext();
if (context.ext.EXT_disjoint_timer_query) {
// timer queries are available
if (context.bugs.dont_use_timer_query && platform.canCreateFence()) {
// however, they don't work well, revert to using fences if we can.
impl = new(std::nothrow) OpenGLTimerQueryFence(platform);
} else {
impl = new(std::nothrow) TimerQueryNative(driver);
}
mGpuTimeSupported = true;
} else
#endif
if (platform.canCreateFence()) {
// no timer queries, but we can use fences
impl = new(std::nothrow) OpenGLTimerQueryFence(platform);
mGpuTimeSupported = true;
} else {
// no queries, no fences -- that's a problem
impl = new(std::nothrow) TimerQueryFallback();
mGpuTimeSupported = false;
}
return impl;
}
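The selection logic in `OpenGLTimerQueryFactory::init()` reduces to a decision over three booleans. The sketch below restates it as a pure function so the cases are easy to check; `TimerQueryBackend`, `Selection` and `selectTimerQueryBackend` are hypothetical names, and the compile-time `#if` guard around the native path is deliberately left out.

```cpp
#include <cassert>

namespace sketch {

enum class TimerQueryBackend { Native, Fence, Fallback };

struct Selection {
    TimerQueryBackend backend;
    bool gpuTimeSupported;
};

// hasTimerQueryExt  ~ context.ext.EXT_disjoint_timer_query
// dontUseTimerQuery ~ context.bugs.dont_use_timer_query
// canCreateFence    ~ platform.canCreateFence()
inline Selection selectTimerQueryBackend(
        bool hasTimerQueryExt, bool dontUseTimerQuery, bool canCreateFence) noexcept {
    if (hasTimerQueryExt) {
        if (dontUseTimerQuery && canCreateFence) {
            // Timer queries exist but are unreliable; prefer fences.
            return { TimerQueryBackend::Fence, true };
        }
        return { TimerQueryBackend::Native, true };
    }
    if (canCreateFence) {
        // No timer queries, but fences are available.
        return { TimerQueryBackend::Fence, true };
    }
    // Neither queries nor fences.
    return { TimerQueryBackend::Fallback, false };
}

} // namespace sketch
```

Note that a driver flagged with `dont_use_timer_query` still ends up on the native path when the platform cannot create fences, which matches the branch order in the diff.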
// ------------------------------------------------------------------------------------------------
OpenGLTimerQueryInterface::~OpenGLTimerQueryInterface() = default;
// This is a backend synchronous call
bool OpenGLTimerQueryInterface::getTimerQueryValue(GLTimerQuery* tq, uint64_t* elapsedTime) noexcept {
if (UTILS_LIKELY(tq->state)) {
int64_t const elapsed = tq->state->elapsed.load(std::memory_order_relaxed);
bool const available = elapsed > 0;
if (available) {
*elapsedTime = elapsed;
}
return available;
}
return false;
}
// ------------------------------------------------------------------------------------------------
#if defined(BACKEND_OPENGL_VERSION_GL) || defined(GL_EXT_disjoint_timer_query)
TimerQueryNative::TimerQueryNative(OpenGLContext& context) : mContext(context) {
TimerQueryNative::TimerQueryNative(OpenGLDriver& driver)
: mDriver(driver) {
}
TimerQueryNative::~TimerQueryNative() = default;
void TimerQueryNative::flush() {
}
void TimerQueryNative::beginTimeElapsedQuery(GLTimerQuery* query) {
mContext.procs.beginQuery(GL_TIME_ELAPSED, query->gl.query);
void TimerQueryNative::createTimerQuery(GLTimerQuery* tq) {
if (UTILS_UNLIKELY(!tq->state)) {
tq->state = std::make_shared<GLTimerQuery::State>();
}
mDriver.getContext().procs.genQueries(1u, &tq->state->gl.query);
CHECK_GL_ERROR(utils::slog.e)
}
void TimerQueryNative::endTimeElapsedQuery(GLTimerQuery*) {
mContext.procs.endQuery(GL_TIME_ELAPSED);
void TimerQueryNative::destroyTimerQuery(GLTimerQuery* tq) {
assert_invariant(tq->state);
mDriver.getContext().procs.deleteQueries(1u, &tq->state->gl.query);
CHECK_GL_ERROR(utils::slog.e)
}
bool TimerQueryNative::queryResultAvailable(GLTimerQuery* query) {
GLuint available = 0;
mContext.procs.getQueryObjectuiv(query->gl.query, GL_QUERY_RESULT_AVAILABLE, &available);
void TimerQueryNative::beginTimeElapsedQuery(GLTimerQuery* tq) {
assert_invariant(tq->state);
tq->state->elapsed.store(0);
mDriver.getContext().procs.beginQuery(GL_TIME_ELAPSED, tq->state->gl.query);
CHECK_GL_ERROR(utils::slog.e)
return available != 0;
}
uint64_t TimerQueryNative::queryResult(GLTimerQuery* query) {
GLuint64 elapsedTime = 0;
// we won't end-up here if we're on ES and don't have GL_EXT_disjoint_timer_query
mContext.procs.getQueryObjectui64v(query->gl.query, GL_QUERY_RESULT, &elapsedTime);
void TimerQueryNative::endTimeElapsedQuery(GLTimerQuery* tq) {
assert_invariant(tq->state);
mDriver.getContext().procs.endQuery(GL_TIME_ELAPSED);
CHECK_GL_ERROR(utils::slog.e)
return elapsedTime;
std::weak_ptr<GLTimerQuery::State> const weak = tq->state;
mDriver.runEveryNowAndThen([context = mDriver.getContext(), weak]() -> bool {
auto state = weak.lock();
if (state) {
GLuint available = 0;
context.procs.getQueryObjectuiv(state->gl.query, GL_QUERY_RESULT_AVAILABLE, &available);
CHECK_GL_ERROR(utils::slog.e)
if (!available) {
// we need to try this one again later
return false;
}
GLuint64 elapsedTime = 0;
// we won't end up here if we're on ES and don't have GL_EXT_disjoint_timer_query
context.procs.getQueryObjectui64v(state->gl.query, GL_QUERY_RESULT, &elapsedTime);
state->elapsed.store((int64_t)elapsedTime, std::memory_order_relaxed);
}
return true;
});
}
#endif
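
The new `TimerQueryNative::endTimeElapsedQuery` hands the driver a task that returns `false` to be retried later and `true` when it is finished, and it captures only a `std::weak_ptr` to the query state so that a destroyed query simply drops out. A minimal sketch of that pattern follows. `Poller`, `State`, `gpuDone` and `watch` are hypothetical stand-ins invented for illustration; how the real `runEveryNowAndThen` schedules its tasks is not shown in this diff and is assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the driver's runEveryNowAndThen() facility.
class Poller {
public:
    using Task = std::function<bool()>;

    void runEveryNowAndThen(Task task) { mTasks.push_back(std::move(task)); }

    // Run every pending task once; keep only the ones that returned false.
    void tick() {
        std::vector<Task> pending;
        pending.swap(mTasks);
        for (auto& task : pending) {
            if (!task()) {
                mTasks.push_back(std::move(task));
            }
        }
    }

    std::size_t pendingCount() const { return mTasks.size(); }

private:
    std::vector<Task> mTasks;
};

// Hypothetical query state; gpuDone models GL_QUERY_RESULT_AVAILABLE.
struct State {
    bool gpuDone = false;
    long result = 0;
};

inline void watch(Poller& poller, std::shared_ptr<State> const& state) {
    std::weak_ptr<State> const weak = state;
    poller.runEveryNowAndThen([weak]() -> bool {
        auto s = weak.lock();
        if (s) {
            if (!s->gpuDone) {
                return false; // result not available yet, try again later
            }
            s->result = 123; // placeholder for reading GL_QUERY_RESULT
        }
        return true; // finished, or the owner destroyed the state
    });
}
```

Capturing a weak pointer means the pending task never extends the lifetime of the query state: once the owner releases it, the next tick removes the task without touching the state.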
@@ -77,6 +146,8 @@ OpenGLTimerQueryFence::OpenGLTimerQueryFence(OpenGLPlatform& platform)
: mPlatform(platform) {
mQueue.reserve(2);
mThread = std::thread([this]() {
utils::JobSystem::setThreadName("OpenGLTimerQueryFence");
utils::JobSystem::setThreadPriority(utils::JobSystem::Priority::URGENT_DISPLAY);
auto& queue = mQueue;
bool exitRequested;
do {
@@ -101,7 +172,9 @@ OpenGLTimerQueryFence::~OpenGLTimerQueryFence() {
mExitRequested = true;
mCondition.notify_one();
lock.unlock();
mThread.join();
if (mThread.joinable()) {
mThread.join();
}
}
}
@@ -111,59 +184,60 @@ void OpenGLTimerQueryFence::enqueue(OpenGLTimerQueryFence::Job&& job) {
mCondition.notify_one();
}
void OpenGLTimerQueryFence::flush() {
// Use calls to flush() as a proxy for when the GPU work started.
GLTimerQuery* query = mActiveQuery;
if (query) {
uint64_t const elapsed = query->gl.emulation->elapsed.load(std::memory_order_relaxed);
if (!elapsed) {
uint64_t const now = clock::now().time_since_epoch().count();
query->gl.emulation->elapsed.store(now, std::memory_order_relaxed);
//SYSTRACE_CONTEXT();
//SYSTRACE_ASYNC_BEGIN("gpu", query->gl.query);
}
void OpenGLTimerQueryFence::createTimerQuery(GLTimerQuery* tq) {
if (UTILS_UNLIKELY(!tq->state)) {
tq->state = std::make_shared<GLTimerQuery::State>();
}
}
void OpenGLTimerQueryFence::beginTimeElapsedQuery(GLTimerQuery* query) {
assert_invariant(!mActiveQuery);
// We can't use a fence to figure out when a GPU operation starts (only when it finishes)
// so instead, we use when glFlush() was issued as a proxy.
if (UTILS_UNLIKELY(!query->gl.emulation)) {
query->gl.emulation = std::make_shared<GLTimerQuery::State>();
}
query->gl.emulation->elapsed.store(0, std::memory_order_relaxed);
query->gl.emulation->available.store(false);
mActiveQuery = query;
void OpenGLTimerQueryFence::destroyTimerQuery(GLTimerQuery* tq) {
assert_invariant(tq->state);
}
void OpenGLTimerQueryFence::endTimeElapsedQuery(GLTimerQuery* query) {
assert_invariant(mActiveQuery);
void OpenGLTimerQueryFence::beginTimeElapsedQuery(GLTimerQuery* tq) {
assert_invariant(tq->state);
tq->state->elapsed.store(0);
Platform::Fence* fence = mPlatform.createFence();
std::weak_ptr<GLTimerQuery::State> const weak = query->gl.emulation;
mActiveQuery = nullptr;
//uint32_t cookie = cookie = query->gl.query;
std::weak_ptr<GLTimerQuery::State> const weak = tq->state;
// FIXME: this implementation of beginTimeElapsedQuery is usually wrong; it ends up
// measuring the current CPU time because the fence signals immediately (usually there is
// no work on the GPU at this point). We could work around this by sending a small glClear
// on a dummy target for instance, or somehow latch the begin time at the next renderpass
// start.
push([&platform = mPlatform, fence, weak]() {
auto emulation = weak.lock();
if (emulation) {
auto state = weak.lock();
if (state) {
platform.waitFence(fence, FENCE_WAIT_FOR_EVER);
auto now = clock::now().time_since_epoch().count();
auto then = emulation->elapsed.load(std::memory_order_relaxed);
emulation->elapsed.store(now - then, std::memory_order_relaxed);
emulation->available.store(true);
//SYSTRACE_CONTEXT();
//SYSTRACE_ASYNC_END("gpu", cookie);
int64_t const then = clock::now().time_since_epoch().count();
state->elapsed.store(-then, std::memory_order_relaxed);
SYSTRACE_CONTEXT();
SYSTRACE_ASYNC_BEGIN("OpenGLTimerQueryFence", intptr_t(state.get()));
}
platform.destroyFence(fence);
});
}
bool OpenGLTimerQueryFence::queryResultAvailable(GLTimerQuery* query) {
return query->gl.emulation->available.load();
}
void OpenGLTimerQueryFence::endTimeElapsedQuery(GLTimerQuery* tq) {
assert_invariant(tq->state);
Platform::Fence* fence = mPlatform.createFence();
std::weak_ptr<GLTimerQuery::State> const weak = tq->state;
uint64_t OpenGLTimerQueryFence::queryResult(GLTimerQuery* query) {
return query->gl.emulation->elapsed;
push([&platform = mPlatform, fence, weak]() {
auto state = weak.lock();
if (state) {
platform.waitFence(fence, FENCE_WAIT_FOR_EVER);
int64_t const now = clock::now().time_since_epoch().count();
int64_t const then = state->elapsed.load(std::memory_order_relaxed);
assert_invariant(then < 0);
state->elapsed.store(now + then, std::memory_order_relaxed);
SYSTRACE_CONTEXT();
SYSTRACE_ASYNC_END("OpenGLTimerQueryFence", intptr_t(state.get()));
}
platform.destroyFence(fence);
});
}
// ------------------------------------------------------------------------------------------------
@@ -172,30 +246,30 @@ TimerQueryFallback::TimerQueryFallback() = default;
TimerQueryFallback::~TimerQueryFallback() = default;
void TimerQueryFallback::flush() {
}
void TimerQueryFallback::beginTimeElapsedQuery(OpenGLTimerQueryInterface::GLTimerQuery* query) {
if (!query->gl.emulation) {
query->gl.emulation = std::make_shared<GLTimerQuery::State>();
void TimerQueryFallback::createTimerQuery(GLTimerQuery* tq) {
if (UTILS_UNLIKELY(!tq->state)) {
tq->state = std::make_shared<GLTimerQuery::State>();
}
// this implementation clearly doesn't work at all, but we have no h/w support
query->gl.emulation->available.store(false, std::memory_order_relaxed);
query->gl.emulation->elapsed = clock::now().time_since_epoch().count();
}
void TimerQueryFallback::endTimeElapsedQuery(OpenGLTimerQueryInterface::GLTimerQuery* query) {
// this implementation clearly doesn't work at all, but we have no h/w support
query->gl.emulation->elapsed = clock::now().time_since_epoch().count() - query->gl.emulation->elapsed;
query->gl.emulation->available.store(true, std::memory_order_relaxed);
void TimerQueryFallback::destroyTimerQuery(GLTimerQuery* tq) {
assert_invariant(tq->state);
}
bool TimerQueryFallback::queryResultAvailable(OpenGLTimerQueryInterface::GLTimerQuery* query) {
return query->gl.emulation->available.load(std::memory_order_relaxed);
void TimerQueryFallback::beginTimeElapsedQuery(OpenGLTimerQueryInterface::GLTimerQuery* tq) {
assert_invariant(tq->state);
// this implementation measures the CPU time, but we have no h/w support
int64_t const then = clock::now().time_since_epoch().count();
tq->state->elapsed.store(-then, std::memory_order_relaxed);
}
uint64_t TimerQueryFallback::queryResult(OpenGLTimerQueryInterface::GLTimerQuery* query) {
return query->gl.emulation->elapsed;
void TimerQueryFallback::endTimeElapsedQuery(OpenGLTimerQueryInterface::GLTimerQuery* tq) {
assert_invariant(tq->state);
// this implementation measures the CPU time, but we have no h/w support
int64_t const now = clock::now().time_since_epoch().count();
int64_t const then = tq->state->elapsed.load(std::memory_order_relaxed);
assert_invariant(then < 0);
tq->state->elapsed.store(now + then, std::memory_order_relaxed);
}
} // namespace filament::backend
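
The rewritten implementations above drop the separate `available` flag and encode the query's progress in the sign of a single atomic `elapsed` value: zero after a reset, the negated start time while the measurement is in flight, and a positive duration once it has ended. `getTimerQueryValue` then treats `elapsed > 0` as "available". The sketch below isolates that encoding. `TimerState`, `beginQuery`, `endQuery` and `getValue` are hypothetical names, and timestamps are passed in explicitly rather than read from a clock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for GLTimerQuery::State; only `elapsed` is modeled.
struct TimerState {
    std::atomic<int64_t> elapsed{ 0 };
};

// Begin: store the negated start timestamp. A negative value means "in flight".
inline void beginQuery(TimerState& s, int64_t now) {
    assert(now > 0);
    s.elapsed.store(-now, std::memory_order_relaxed);
}

// End: now + (-then) is the elapsed duration.
inline void endQuery(TimerState& s, int64_t now) {
    int64_t const then = s.elapsed.load(std::memory_order_relaxed);
    assert(then < 0);
    s.elapsed.store(now + then, std::memory_order_relaxed);
}

// Reader: the result is available only once the stored value is positive.
inline bool getValue(TimerState const& s, uint64_t* out) {
    int64_t const elapsed = s.elapsed.load(std::memory_order_relaxed);
    bool const available = elapsed > 0;
    if (available) {
        *out = uint64_t(elapsed);
    }
    return available;
}
```

One consequence of this encoding, as far as the sketch goes, is that a measurement of exactly zero cannot be told apart from a query that was reset and never started, so it reads as unavailable.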

@@ -20,20 +20,36 @@
#include "OpenGLDriver.h"
#include <utils/Condition.h>
#include <utils/Mutex.h>
#include <thread>
#include <vector>
namespace filament::backend {
class OpenGLPlatform;
class OpenGLTimerQueryInterface;
/*
* we need two implementation of timer queries (only elapsed time), because
* We need two implementations of timer queries (only elapsed time), because
* on some GPUs disjoint_timer_query/arb_timer_query is much less accurate than
* using fences.
*
* These classes implement the various strategies...
*/
class OpenGLTimerQueryFactory {
static bool mGpuTimeSupported;
public:
static OpenGLTimerQueryInterface* init(
OpenGLPlatform& platform, OpenGLDriver& driver) noexcept;
static bool isGpuTimeSupported() noexcept {
return mGpuTimeSupported;
}
};
class OpenGLTimerQueryInterface {
protected:
using GLTimerQuery = OpenGLDriver::GLTimerQuery;
@@ -41,26 +57,26 @@ protected:
public:
virtual ~OpenGLTimerQueryInterface();
virtual void flush() = 0;
virtual void createTimerQuery(GLTimerQuery* query) = 0;
virtual void destroyTimerQuery(GLTimerQuery* query) = 0;
virtual void beginTimeElapsedQuery(GLTimerQuery* query) = 0;
virtual void endTimeElapsedQuery(GLTimerQuery* query) = 0;
virtual bool queryResultAvailable(GLTimerQuery* query) = 0;
virtual uint64_t queryResult(GLTimerQuery* query) = 0;
static bool getTimerQueryValue(GLTimerQuery* tq, uint64_t* elapsedTime) noexcept;
};
#if defined(BACKEND_OPENGL_VERSION_GL) || defined(GL_EXT_disjoint_timer_query)
class TimerQueryNative : public OpenGLTimerQueryInterface {
public:
explicit TimerQueryNative(OpenGLContext& context);
explicit TimerQueryNative(OpenGLDriver& driver);
~TimerQueryNative() override;
private:
void flush() override;
void createTimerQuery(GLTimerQuery* query) override;
void destroyTimerQuery(GLTimerQuery* query) override;
void beginTimeElapsedQuery(GLTimerQuery* query) override;
void endTimeElapsedQuery(GLTimerQuery* query) override;
bool queryResultAvailable(GLTimerQuery* query) override;
uint64_t queryResult(GLTimerQuery* query) override;
OpenGLContext& mContext;
OpenGLDriver& mDriver;
};
#endif
@@ -71,13 +87,12 @@ public:
~OpenGLTimerQueryFence() override;
private:
using Job = std::function<void()>;
void flush() override;
void beginTimeElapsedQuery(GLTimerQuery* query) override;
void endTimeElapsedQuery(GLTimerQuery* query) override;
bool queryResultAvailable(GLTimerQuery* query) override;
uint64_t queryResult(GLTimerQuery* query) override;
void enqueue(Job&& job);
void createTimerQuery(GLTimerQuery* query) override;
void destroyTimerQuery(GLTimerQuery* query) override;
void beginTimeElapsedQuery(GLTimerQuery* tq) override;
void endTimeElapsedQuery(GLTimerQuery* tq) override;
void enqueue(Job&& job);
template<typename CALLABLE, typename ... ARGS>
void push(CALLABLE&& func, ARGS&& ... args) {
enqueue(Job(std::bind(std::forward<CALLABLE>(func), std::forward<ARGS>(args)...)));
@@ -89,7 +104,6 @@ private:
mutable utils::Condition mCondition;
std::vector<Job> mQueue;
bool mExitRequested = false;
GLTimerQuery* mActiveQuery = nullptr;
};
class TimerQueryFallback : public OpenGLTimerQueryInterface {
@@ -97,11 +111,10 @@ public:
explicit TimerQueryFallback();
~TimerQueryFallback() override;
private:
void flush() override;
void createTimerQuery(GLTimerQuery* query) override;
void destroyTimerQuery(GLTimerQuery* query) override;
void beginTimeElapsedQuery(GLTimerQuery* query) override;
void endTimeElapsedQuery(GLTimerQuery* query) override;
bool queryResultAvailable(GLTimerQuery* query) override;
uint64_t queryResult(GLTimerQuery* query) override;
};
} // namespace filament::backend

@@ -32,7 +32,6 @@
#include <utils/Systrace.h>
#include <chrono>
#include <future>
#include <string>
#include <string_view>
#include <variant>
@@ -64,17 +63,18 @@ static inline std::string to_string(float f) noexcept {
// ------------------------------------------------------------------------------------------------
struct ShaderCompilerService::ProgramToken {
struct ProgramBinary {
GLenum format{};
struct ShaderCompilerService::OpenGLProgramToken : ProgramToken {
struct ProgramData {
GLuint program{};
std::array<GLuint, Program::SHADER_TYPE_COUNT> shaders{};
std::vector<char> blob;
};
ProgramToken(ShaderCompilerService& compiler, utils::CString const& name) noexcept
~OpenGLProgramToken() override;
OpenGLProgramToken(ShaderCompilerService& compiler, utils::CString const& name) noexcept
: compiler(compiler), name(name) {
}
ShaderCompilerService& compiler;
utils::CString const& name;
utils::FixedCapacityVector<std::pair<utils::CString, uint8_t>> attributes;
@@ -85,12 +85,44 @@ struct ShaderCompilerService::ProgramToken {
GLuint program = 0;
} gl; // 12 bytes
// Sets the programData, typically from the compiler thread, and signals the main thread.
// This is similar to std::promise::set_value.
void set(ProgramData const& data) noexcept {
std::unique_lock const l(lock);
programData = data;
signaled = true;
cond.notify_one();
}
// Gets the programData, waiting if necessary.
// This is similar to std::future::get
ProgramData const& get() const noexcept {
std::unique_lock l(lock);
cond.wait(l, [this](){ return signaled; });
return programData;
}
// Checks if the programData is ready.
// This is similar to std::future::wait_for(0s)
bool isReady() const noexcept {
std::unique_lock l(lock);
using namespace std::chrono_literals;
return cond.wait_for(l, 0s, [this](){ return signaled; });
}
CallbackManager::Handle handle{};
BlobCacheKey key;
std::future<ProgramBinary> binary;
CompilerPriorityQueue priorityQueue = CompilerPriorityQueue::HIGH;
bool canceled = false;
mutable utils::Mutex lock;
mutable utils::Condition cond;
ProgramData programData;
bool signaled = false;
bool canceled = false; // not part of the signaling
};
ShaderCompilerService::OpenGLProgramToken::~OpenGLProgramToken() = default;
void ShaderCompilerService::setUserData(const program_token_t& token, void* user) noexcept {
token->user = user;
}
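
The `set()`, `get()` and `isReady()` members added to `OpenGLProgramToken` above replace the `std::promise`/`std::future` pair with a mutex, a condition variable and a `signaled` flag. A minimal, generic sketch of the same one-shot pattern is shown here. It assumes `std::mutex` and `std::condition_variable` in place of `utils::Mutex` and `utils::Condition`, and `OneShot` is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical one-shot value slot: one producer calls set(), consumers
// call get() (blocking) or isReady() (non-blocking).
template<typename T>
class OneShot {
public:
    // Publish the value and wake a waiter; comparable to std::promise::set_value.
    void set(T const& value) {
        std::lock_guard<std::mutex> const l(mLock);
        mValue = value;
        mSignaled = true;
        mCond.notify_one();
    }

    // Block until the value was published; comparable to std::future::get.
    T const& get() const {
        std::unique_lock<std::mutex> l(mLock);
        mCond.wait(l, [this] { return mSignaled; });
        return mValue;
    }

    // Poll without blocking; comparable to std::future::wait_for(0s).
    bool isReady() const {
        std::unique_lock<std::mutex> l(mLock);
        return mCond.wait_for(l, std::chrono::seconds(0),
                [this] { return mSignaled; });
    }

private:
    mutable std::mutex mLock;
    mutable std::condition_variable mCond;
    T mValue{};
    bool mSignaled = false;
};
```

Unlike `std::future::get`, this `get()` can be called any number of times and returns a reference to the stored value, which may be one reason to prefer it when a token is queried from several places.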
@@ -101,261 +133,181 @@ void* ShaderCompilerService::getUserData(const program_token_t& token) noexcept
// ------------------------------------------------------------------------------------------------
void ShaderCompilerService::CompilerThreadPool::init(
bool useSharedContexts, uint32_t threadCount, OpenGLPlatform& platform) noexcept {
for (size_t i = 0; i < threadCount; i++) {
mCompilerThreads.emplace_back([this, useSharedContexts, &platform]() {
// give the thread a name
JobSystem::setThreadName("CompilerThreadPool");
// create a gl context current to this thread
platform.createContext(useSharedContexts);
// process jobs from the queue until we're asked to exit
while (!mExitRequested) {
std::unique_lock lock(mQueueLock);
mQueueCondition.wait(lock, [this]() {
return mExitRequested ||
mUrgentJob ||
(!std::all_of( std::begin(mQueues), std::end(mQueues),
[](auto&& q) { return q.empty(); }));
});
if (!mExitRequested) {
Job job{ std::move(mUrgentJob) };
if (!job) {
// use the first queue that's not empty
auto& queue = [this]() -> auto& {
for (auto& q: mQueues) {
if (!q.empty()) {
return q;
}
}
return mQueues[0]; // we should never end-up here.
}();
assert_invariant(!queue.empty());
std::swap(job, queue.front().second);
queue.pop_front();
}
// execute the job without holding any locks
lock.unlock();
job();
}
}
});
}
}
auto ShaderCompilerService::CompilerThreadPool::dequeue(program_token_t const& token) -> Job {
auto& q = mQueues[size_t(token->priorityQueue)];
auto pos = std::find_if(q.begin(), q.end(), [&token](auto&& item) {
return item.first == token;
});
Job job;
if (pos != q.end()) {
std::swap(job, pos->second);
q.erase(pos);
}
return job;
}
void ShaderCompilerService::CompilerThreadPool::makeUrgent(program_token_t const& token) {
std::unique_lock const lock(mQueueLock);
assert_invariant(!mUrgentJob);
Job job{ dequeue(token) };
std::swap(job, mUrgentJob);
mQueueCondition.notify_one();
}
void ShaderCompilerService::CompilerThreadPool::queue(program_token_t const& token, Job&& job) {
std::unique_lock const lock(mQueueLock);
mQueues[size_t(token->priorityQueue)].emplace_back(token, std::move(job));
mQueueCondition.notify_one();
}
void ShaderCompilerService::CompilerThreadPool::exit() noexcept {
std::unique_lock lock(mQueueLock);
mExitRequested = true;
mQueueCondition.notify_all();
lock.unlock();
for (auto& thread: mCompilerThreads) {
if (thread.joinable()) {
thread.join();
}
}
}
// ------------------------------------------------------------------------------------------------
ShaderCompilerService::ShaderCompilerService(OpenGLDriver& driver)
: mDriver(driver),
mCallbackManager(driver),
KHR_parallel_shader_compile(driver.getContext().ext.KHR_parallel_shader_compile) {
}
ShaderCompilerService::~ShaderCompilerService() noexcept = default;
bool ShaderCompilerService::isParallelShaderCompileSupported() const noexcept {
return KHR_parallel_shader_compile || mShaderCompilerThreadCount;
}
void ShaderCompilerService::init() noexcept {
// If we have KHR_parallel_shader_compile, we always use it, it should be more resource
// friendly.
if (!KHR_parallel_shader_compile) {
// - on Adreno there is a single compiler object. We can't use a pool > 1
// also glProgramBinary blocks if other threads are compiling.
// - on Mali shader compilation can be multithreaded, but program linking happens on
// - on Mali shader compilation can be multi-threaded, but program linking happens on
// a single service thread, so we don't bother using more than one thread either.
// - on desktop we could use more threads, tbd.
// - on PowerVR shader compilation and linking can be multi-threaded.
// How many threads should we use?
// - on macOS (M1 MacBook Pro/Ventura) there is a global lock around all GL APIs when using
// a shared context, so parallel shader compilation yields no benefit.
// - on windows/linux we could use more threads, tbd.
if (mDriver.mPlatform.isExtraContextSupported()) {
mShaderCompilerThreadCount = 1;
mCompilerThreadPool.init(mUseSharedContext,
mShaderCompilerThreadCount, mDriver.mPlatform);
// By default, we use one thread at the same priority as the gl thread. This is the
// safest choice that avoids priority inversions.
uint32_t poolSize = 1;
JobSystem::Priority priority = JobSystem::Priority::DISPLAY;
auto const& renderer = mDriver.getContext().state.renderer;
if (UTILS_UNLIKELY(strstr(renderer, "PowerVR"))) {
// The PowerVR driver supports parallel shader compilation well, so we use 2
// threads. We can use lower priority threads here because urgent compilations
// will most likely happen on the main gl thread. Using too many threads can
// increase memory pressure significantly.
poolSize = 2;
priority = JobSystem::Priority::BACKGROUND;
}
mShaderCompilerThreadCount = poolSize;
mCompilerThreadPool.init(mShaderCompilerThreadCount,
[&platform = mDriver.mPlatform, priority]() {
// give the thread a name
JobSystem::setThreadName("CompilerThreadPool");
// run at the priority chosen above (DISPLAY by default, BACKGROUND on PowerVR)
JobSystem::setThreadPriority(priority);
// create a gl context current to this thread
platform.createContext(true);
},
[&platform = mDriver.mPlatform]() {
// release context and thread state
platform.releaseContext();
});
}
}
}
void ShaderCompilerService::terminate() noexcept {
// FIXME: could we have some user callbacks pending here?
mCompilerThreadPool.exit();
// Finally stop the thread pool immediately. Pending jobs will be discarded. We guarantee by
// construction that nobody is waiting on a token (because waiting is only done on the main
// backend thread, and if we're here, we're on the backend main thread).
mCompilerThreadPool.terminate();
mRunAtNextTickOps.clear();
// We could have some pending callbacks here, we need to execute them.
// This is equivalent to calling cancelTickOp() on all active tokens.
mCallbackManager.terminate();
}
ShaderCompilerService::program_token_t ShaderCompilerService::createProgram(
utils::CString const& name, Program&& program) {
auto& gl = mDriver.getContext();
auto token = std::make_shared<ProgramToken>(*this, name);
auto token = std::make_shared<OpenGLProgramToken>(*this, name);
if (UTILS_UNLIKELY(gl.isES2())) {
token->attributes = std::move(program.getAttributes());
}
token->gl.program = OpenGLBlobCache::retrieve(&token->key, mDriver.mPlatform, program);
if (!token->gl.program) {
if (mShaderCompilerThreadCount) {
// set the future in the token and pass the promise to the worker thread
std::promise<ProgramToken::ProgramBinary> promise;
token->binary = promise.get_future();
token->priorityQueue = program.getPriorityQueue();
// queue a compile job
mCompilerThreadPool.queue(token,
[this, &gl, promise = std::move(promise),
program = std::move(program), token]() mutable {
if (token->gl.program) {
return token;
}
// compile the shaders
std::array<GLuint, Program::SHADER_TYPE_COUNT> shaders{};
std::array<utils::CString, Program::SHADER_TYPE_COUNT> shaderSourceCode;
compileShaders(gl,
std::move(program.getShadersSource()),
program.getSpecializationConstants(),
shaders,
shaderSourceCode);
token->handle = mCallbackManager.get();
// link the program
GLuint const glProgram = linkProgram(gl, shaders, token->attributes);
CompilerPriorityQueue const priorityQueue = program.getPriorityQueue();
if (mShaderCompilerThreadCount) {
// queue a compile job
mCompilerThreadPool.queue(priorityQueue, token,
[this, &gl, program = std::move(program), token]() mutable {
// compile the shaders
std::array<GLuint, Program::SHADER_TYPE_COUNT> shaders{};
std::array<utils::CString, Program::SHADER_TYPE_COUNT> shaderSourceCode;
compileShaders(gl,
std::move(program.getShadersSource()),
program.getSpecializationConstants(),
shaders,
shaderSourceCode);
ProgramToken::ProgramBinary binary;
binary.shaders = shaders;
// link the program
GLuint const glProgram = linkProgram(gl, shaders, token->attributes);
if (UTILS_LIKELY(mUseSharedContext)) {
// We need to query the link status here to guarantee that the
// program is compiled and linked now (we don't want this to be
// deferred to later). We don't care about the result at this point.
GLint status;
glGetProgramiv(glProgram, GL_LINK_STATUS, &status);
binary.program = glProgram;
if (token->key) {
// Attempt to cache. This calls glGetProgramBinary.
OpenGLBlobCache::insert(mDriver.mPlatform,
token->key, token->gl.program);
}
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
else {
// retrieve the program binary
GLsizei programBinarySize = 0;
glGetProgramiv(glProgram, GL_PROGRAM_BINARY_LENGTH, &programBinarySize);
assert_invariant(programBinarySize);
if (programBinarySize) {
binary.blob.resize(programBinarySize);
glGetProgramBinary(glProgram, programBinarySize,
&programBinarySize, &binary.format, binary.blob.data());
}
// and we can destroy the program
glDeleteProgram(glProgram);
if (token->key) {
// attempt to cache
OpenGLBlobCache::insert(mDriver.mPlatform, token->key,
binary.format,
binary.blob.data(), GLsizei(binary.blob.size()));
}
}
#endif
// we don't need to check for success here, it'll be done on the
// main thread side.
promise.set_value(binary);
});
} else
{
// this cannot fail because we check compilation status after linking the program
// shaders[] is filled with id of shader stages present.
compileShaders(gl,
std::move(program.getShadersSource()),
program.getSpecializationConstants(),
token->gl.shaders,
token->shaderSourceCode);
OpenGLProgramToken::ProgramData programData;
programData.shaders = shaders;
}
runAtNextTick(token, [this, token]() {
if (mShaderCompilerThreadCount) {
if (!token->gl.program) {
// TODO: see if we could completely eliminate this callback here
// and instead just rely on token->gl.program being atomically
// set by the compiler thread.
assert_invariant(token->binary.valid());
// we're using the compiler thread, check if the program is ready, no-op if not.
using namespace std::chrono_literals;
if (token->binary.wait_for(0s) != std::future_status::ready) {
return false;
}
// program binary is ready, retrieve it without blocking
ShaderCompilerService::getProgramFromCompilerPool(
const_cast<program_token_t&>(token));
}
} else {
if (KHR_parallel_shader_compile) {
// don't attempt to link this program if all shaders are not done compiling
// We need to query the link status here to guarantee that the
// program is compiled and linked now (we don't want this to be
// deferred to later). We don't care about the result at this point.
GLint status;
if (token->gl.program) {
glGetProgramiv(token->gl.program, GL_COMPLETION_STATUS, &status);
if (status == GL_FALSE) {
return false;
}
} else {
for (auto shader: token->gl.shaders) {
if (shader) {
glGetShaderiv(shader, GL_COMPLETION_STATUS, &status);
if (status == GL_FALSE) {
return false;
}
glGetProgramiv(glProgram, GL_LINK_STATUS, &status);
programData.program = glProgram;
token->gl.program = programData.program;
// we don't need to check for success here, it'll be done on the
// main thread side.
token->set(programData);
mCallbackManager.put(token->handle);
// caching must be the last thing we do
if (token->key) {
// Attempt to cache. This calls glGetProgramBinary.
OpenGLBlobCache::insert(mDriver.mPlatform, token->key, glProgram);
}
});
} else {
// this cannot fail because we check compilation status after linking the program
// shaders[] is filled with id of shader stages present.
compileShaders(gl,
std::move(program.getShadersSource()),
program.getSpecializationConstants(),
token->gl.shaders,
token->shaderSourceCode);
runAtNextTick(priorityQueue, token, [this, token](Job const&) {
if (KHR_parallel_shader_compile) {
// don't attempt to link this program if all shaders are not done compiling
GLint status;
if (token->gl.program) {
glGetProgramiv(token->gl.program, GL_COMPLETION_STATUS, &status);
if (status == GL_FALSE) {
return false;
}
} else {
for (auto shader: token->gl.shaders) {
if (shader) {
glGetShaderiv(shader, GL_COMPLETION_STATUS, &status);
if (status == GL_FALSE) {
return false;
}
}
}
}
}
if (!token->gl.program) {
// link the program, this also cannot fail because status is checked later.
token->gl.program = linkProgram(mDriver.getContext(),
token->gl.shaders, token->attributes);
if (KHR_parallel_shader_compile) {
// wait until the link finishes...
return false;
}
if (!token->gl.program) {
// link the program, this also cannot fail because status is checked later.
token->gl.program = linkProgram(mDriver.getContext(),
token->gl.shaders, token->attributes);
if (KHR_parallel_shader_compile) {
// wait until the link finishes...
return false;
}
}
assert_invariant(token->gl.program);
if (token->key && !mShaderCompilerThreadCount) {
mCallbackManager.put(token->handle);
if (token->key) {
// TODO: technically we don't have to cache right now. Is it advantageous to
// do this later, maybe depending on CPU usage?
// attempt to cache if we don't have a thread pool (otherwise it's done
@@ -370,31 +322,12 @@ ShaderCompilerService::program_token_t ShaderCompilerService::createProgram(
return token;
}
bool ShaderCompilerService::isProgramReady(
const ShaderCompilerService::program_token_t& token) const noexcept {
assert_invariant(token);
if (!token->gl.program) {
return false;
}
if (KHR_parallel_shader_compile) {
GLint status = GL_FALSE;
glGetProgramiv(token->gl.program, GL_COMPLETION_STATUS, &status);
return (bool)status;
}
// If gl.program is set, this means the program was linked. Some drivers may defer the link
// in which case we might block in getProgram() when we check the program status.
// Unfortunately, this is nothing we can do about that.
return bool(token->gl.program);
}
GLuint ShaderCompilerService::getProgram(ShaderCompilerService::program_token_t& token) {
GLuint const program = initialize(token);
assert_invariant(token == nullptr);
#ifndef FILAMENT_ENABLE_MATDBG
assert_invariant(program);
#endif
return program;
}
@@ -418,74 +351,30 @@ GLuint ShaderCompilerService::getProgram(ShaderCompilerService::program_token_t&
glDeleteProgram(token->gl.program);
}
token = nullptr;
token.reset();
}
void ShaderCompilerService::tick() {
executeTickOps();
// we don't need to run executeTickOps() if we're using the thread-pool
if (UTILS_UNLIKELY(!mShaderCompilerThreadCount)) {
executeTickOps();
}
}
void ShaderCompilerService::notifyWhenAllProgramsAreReady(CallbackHandler* handler,
CallbackHandler::Callback callback, void* user) {
if (KHR_parallel_shader_compile || mShaderCompilerThreadCount) {
// list all programs up to this point
utils::FixedCapacityVector<program_token_t, std::allocator<program_token_t>, false> tokens;
tokens.reserve(mRunAtNextTickOps.size());
for (auto& [token, _] : mRunAtNextTickOps) {
if (token) {
tokens.push_back(token);
}
}
runAtNextTick(nullptr, [this, tokens = std::move(tokens), handler, user, callback]() {
for (auto const& token : tokens) {
assert_invariant(token);
if (!isProgramReady(token)) {
// one of the program is not ready, try next time
return false;
}
}
if (callback) {
// all programs are ready, we can call the callbacks
mDriver.scheduleCallback(handler, user, callback);
}
// and we're done
return true;
});
return;
void ShaderCompilerService::notifyWhenAllProgramsAreReady(
CallbackHandler* handler, CallbackHandler::Callback callback, void* user) {
if (callback) {
mCallbackManager.setCallback(handler, callback, user);
}
// we don't have KHR_parallel_shader_compile
runAtNextTick(nullptr, [this, handler, user, callback]() {
mDriver.scheduleCallback(handler, user, callback);
return true;
});
// TODO: we could spread the compiles over several frames, in which case the tick() below
// would not be needed. We keep it for now so as not to change the current behavior too much.
// this will block until all programs are linked
tick();
}
// ------------------------------------------------------------------------------------------------
void ShaderCompilerService::getProgramFromCompilerPool(program_token_t& token) noexcept {
ProgramToken::ProgramBinary const binary{ token->binary.get() };
OpenGLProgramToken::ProgramData const& programData{ token->get() };
if (!token->canceled) {
token->gl.shaders = binary.shaders;
if (UTILS_LIKELY(mUseSharedContext)) {
token->gl.program = binary.program;
}
#ifndef FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2
else {
token->gl.program = glCreateProgram();
glProgramBinary(token->gl.program, binary.format,
binary.blob.data(), GLsizei(binary.blob.size()));
}
#endif
token->gl.shaders = programData.shaders;
token->gl.program = programData.program;
}
}
@@ -493,24 +382,36 @@ GLuint ShaderCompilerService::initialize(program_token_t& token) noexcept {
SYSTRACE_CALL();
if (!token->gl.program) {
if (mShaderCompilerThreadCount) {
// Block until the program is ready. This could take a very long time.
assert_invariant(token->binary.valid());
// we need this program right now, so move it to the head of the queue.
mCompilerThreadPool.makeUrgent(token);
// we need this program right now, remove it from the queue
auto job = mCompilerThreadPool.dequeue(token);
if (job) {
// if we were able to remove it, we execute the job now, otherwise it means
// it's being executed right now.
job();
}
if (!token->canceled) {
token->compiler.cancelTickOp(token);
}
// block until we get the program from the pool
// Block until we get the program from the pool. Generally this wouldn't block
// because we just compiled the program above, when executing job.
ShaderCompilerService::getProgramFromCompilerPool(token);
} else if (KHR_parallel_shader_compile) {
// we force the program link -- which might stall, either here or below in
// checkProgramStatus(), but we don't have a choice, we need to use the program now.
token->compiler.cancelTickOp(token);
token->gl.program = linkProgram(mDriver.getContext(),
token->gl.shaders, token->attributes);
assert_invariant(token->gl.program);
mCallbackManager.put(token->handle);
if (token->key) {
OpenGLBlobCache::insert(mDriver.mPlatform, token->key, token->gl.program);
}
} else {
// if we don't have a program yet, block until we get it.
tick();
@@ -661,8 +562,8 @@ float u16tofp32(highp uint v) {
v <<= 16u;
highp uint s = v & 0x80000000u;
highp uint n = v & 0x7FFFFFFFu;
highp uint nz = n == 0u ? 0u : 0xFFFFFFFF;
return uintBitsToFloat(s | ((((n >> 3u) + (0x70u << 23))) & nz));
highp uint nz = (n == 0u) ? 0u : 0xFFFFFFFFu;
return uintBitsToFloat(s | ((((n >> 3u) + (0x70u << 23u))) & nz));
}
vec2 unpackHalf2x16(highp uint v) {
return vec2(u16tofp32(v&0xFFFFu), u16tofp32(v>>16u));
@@ -670,11 +571,11 @@ vec2 unpackHalf2x16(highp uint v) {
uint fp32tou16(float val) {
uint f32 = floatBitsToUint(val);
uint f16 = 0u;
uint sign = (f32 >> 16) & 0x8000u;
int exponent = int((f32 >> 23) & 0xFFu) - 127;
uint sign = (f32 >> 16u) & 0x8000u;
int exponent = int((f32 >> 23u) & 0xFFu) - 127;
uint mantissa = f32 & 0x007FFFFFu;
if (exponent > 15) {
f16 = sign | (0x1Fu << 10);
f16 = sign | (0x1Fu << 10u);
} else if (exponent > -15) {
exponent += 15;
mantissa >>= 13;
@@ -687,7 +588,7 @@ uint fp32tou16(float val) {
highp uint packHalf2x16(vec2 v) {
highp uint x = fp32tou16(v.x);
highp uint y = fp32tou16(v.y);
return (y << 16) | x;
return (y << 16u) | x;
}
)"sv;
}
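For readers who want to check the bit manipulation in `u16tofp32` on the CPU, here is a minimal C++ port of the GLSL shown in this hunk. The name `halfBitsToFloat` is made up for this sketch. Like the shader version, it only handles zero and normal half values (no denormals, infinities or NaNs).

```cpp
#include <cstdint>
#include <cstring>

// CPU port of the GLSL u16tofp32() above; halfBitsToFloat is a hypothetical name.
// Handles zero and normal half-floats only, like the shader version.
inline float halfBitsToFloat(uint32_t v) {
    v <<= 16u;                                         // move the half into the top 16 bits
    uint32_t const s = v & 0x80000000u;                // sign bit
    uint32_t const n = v & 0x7FFFFFFFu;                // exponent and mantissa
    uint32_t const nz = (n == 0u) ? 0u : 0xFFFFFFFFu;  // mask that keeps +/-0 as zero
    // shift into the fp32 layout, then rebias the exponent (127 - 15 = 112 = 0x70)
    uint32_t const bits = s | (((n >> 3u) + (0x70u << 23u)) & nz);
    float f;
    std::memcpy(&f, &bits, sizeof(f));                 // bit cast without aliasing issues
    return f;
}
```

For example, the half bit pattern `0x3C00` decodes to `1.0f` and `0xC000` decodes to `-2.0f`.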
@@ -747,17 +648,15 @@ GLuint ShaderCompilerService::linkProgram(OpenGLContext& context,
// ------------------------------------------------------------------------------------------------
void ShaderCompilerService::runAtNextTick(
const program_token_t& token, std::function<bool()> fn) noexcept {
void ShaderCompilerService::runAtNextTick(CompilerPriorityQueue priority,
const program_token_t& token, Job job) noexcept {
// insert items in order of priority and at the end of the range
auto& ops = mRunAtNextTickOps;
using ContainerType = std::pair<program_token_t, std::function<bool()>>;
auto const pos = std::lower_bound(ops.begin(), ops.end(),
token->priorityQueue,
auto const pos = std::lower_bound(ops.begin(), ops.end(), priority,
[](ContainerType const& lhs, CompilerPriorityQueue priorityQueue) {
return lhs.first->priorityQueue < priorityQueue;
return std::get<0>(lhs) < priorityQueue;
});
ops.emplace(pos, token, std::move(fn));
ops.emplace(pos, priority, token, std::move(job));
SYSTRACE_CONTEXT();
SYSTRACE_VALUE32("ShaderCompilerService Jobs", mRunAtNextTickOps.size());
@@ -766,9 +665,8 @@ void ShaderCompilerService::runAtNextTick(
void ShaderCompilerService::cancelTickOp(program_token_t token) noexcept {
// We do a linear search here, but this is rare, and we know the list is pretty small.
auto& ops = mRunAtNextTickOps;
auto pos = std::find_if(ops.begin(), ops.end(),
[&](const auto& item) {
return item.first == token;
auto pos = std::find_if(ops.begin(), ops.end(), [&](const auto& item) {
return std::get<1>(item) == token;
});
if (pos != ops.end()) {
ops.erase(pos);
@@ -781,7 +679,8 @@ void ShaderCompilerService::executeTickOps() noexcept {
auto& ops = mRunAtNextTickOps;
auto it = ops.begin();
while (it != ops.end()) {
bool const remove = it->second();
Job const& job = std::get<2>(*it);
bool const remove = job.fn(job);
if (remove) {
it = ops.erase(it);
} else {
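The priority-ordered insertion in `runAtNextTick` and the erase-while-iterating loop in `executeTickOps` can be sketched in isolation as follows. This is a simplified illustration, not the code from this change: `Priority`, `Op`, `insertByPriority` and `executeOps` are hypothetical names. It uses `std::upper_bound` so that a new item lands after existing items of the same priority, whereas the diff uses `std::lower_bound`, which returns the first position that is not ordered before the key.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for CompilerPriorityQueue: a lower value runs first.
enum class Priority : int { HIGH = 0, LOW = 1 };

// (priority, name, job); the job returns true when it is done and can be removed.
using Op = std::tuple<Priority, std::string, std::function<bool()>>;

// Inserts the op after every existing entry of the same or higher priority,
// so entries of equal priority keep their insertion order.
inline void insertByPriority(std::vector<Op>& ops, Priority priority,
        std::string name, std::function<bool()> fn) {
    auto const pos = std::upper_bound(ops.begin(), ops.end(), priority,
            [](Priority p, Op const& rhs) { return p < std::get<0>(rhs); });
    ops.emplace(pos, priority, std::move(name), std::move(fn));
}

// Runs every op once, in order; ops that return true are erased.
inline void executeOps(std::vector<Op>& ops) {
    auto it = ops.begin();
    while (it != ops.end()) {
        if (std::get<2>(*it)()) {
            it = ops.erase(it);
        } else {
            ++it;
        }
    }
}
```

Keeping the vector sorted at insertion time means the tick loop can run entries front to back without sorting on every tick.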

View File

@@ -19,12 +19,16 @@
#include "gl_headers.h"
#include "CallbackManager.h"
#include "CompilerThreadPool.h"
#include <backend/CallbackHandler.h>
#include <backend/Program.h>
#include <utils/CString.h>
#include <utils/Invocable.h>
#include <utils/FixedCapacityVector.h>
#include <utils/Invocable.h>
#include <utils/JobSystem.h>
#include <atomic>
#include <condition_variable>
@@ -33,6 +37,7 @@
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>
namespace filament::backend {
@@ -47,10 +52,10 @@ class CallbackHandler;
* A class handling shader compilation that supports asynchronous compilation.
*/
class ShaderCompilerService {
struct ProgramToken;
struct OpenGLProgramToken;
public:
using program_token_t = std::shared_ptr<ProgramToken>;
using program_token_t = std::shared_ptr<OpenGLProgramToken>;
explicit ShaderCompilerService(OpenGLDriver& driver);
@@ -61,16 +66,14 @@ public:
~ShaderCompilerService() noexcept;
bool isParallelShaderCompileSupported() const noexcept;
void init() noexcept;
void terminate() noexcept;
// creates a program (compile + link) asynchronously if supported
program_token_t createProgram(utils::CString const& name, Program&& program);
// Returns true if the program is linked (successfully or not). Guarantees that
// getProgram() won't block. Does not block.
bool isProgramReady(const program_token_t& token) const noexcept;
// Return the GL program, blocks if necessary. The Token is destroyed and becomes invalid.
GLuint getProgram(program_token_t& token);
@@ -87,38 +90,17 @@ public:
static void* getUserData(const program_token_t& token) noexcept;
// call the callback when all active programs are ready
void notifyWhenAllProgramsAreReady(CallbackHandler* handler,
CallbackHandler::Callback callback, void* user);
void notifyWhenAllProgramsAreReady(
CallbackHandler* handler, CallbackHandler::Callback callback, void* user);
private:
class CompilerThreadPool {
public:
using Job = utils::Invocable<void()>;
void init(bool useSharedContexts, uint32_t threadCount, OpenGLPlatform& platform) noexcept;
void exit() noexcept;
void queue(program_token_t const& token, Job&& job);
void makeUrgent(program_token_t const& token);
private:
std::vector<std::thread> mCompilerThreads;
std::atomic_bool mExitRequested{ false };
std::mutex mQueueLock;
std::condition_variable mQueueCondition;
std::array<std::deque<std::pair<program_token_t, Job>>, 2> mQueues;
Job mUrgentJob;
Job dequeue(program_token_t const& token); // lock must be held
};
OpenGLDriver& mDriver;
CallbackManager mCallbackManager;
CompilerThreadPool mCompilerThreadPool;
const bool KHR_parallel_shader_compile;
uint32_t mShaderCompilerThreadCount = 0u;
// For now, we assume shared contexts are supported everywhere. If they are not,
// we don't use the shader compiler pool. However, the code supports it.
static constexpr bool mUseSharedContext = true;
GLuint initialize(ShaderCompilerService::program_token_t& token) noexcept;
static void getProgramFromCompilerPool(program_token_t& token) noexcept;
@@ -143,11 +125,27 @@ private:
static bool checkProgramStatus(program_token_t const& token) noexcept;
void runAtNextTick(const program_token_t& token, std::function<bool()> fn) noexcept;
struct Job {
template<typename FUNC>
Job(FUNC&& fn) : fn(std::forward<FUNC>(fn)) {}
Job(std::function<bool(Job const& job)> fn,
CallbackHandler* handler, void* user, CallbackHandler::Callback callback)
: fn(std::move(fn)), handler(handler), user(user), callback(callback) {
}
std::function<bool(Job const& job)> fn;
CallbackHandler* handler = nullptr;
void* user = nullptr;
CallbackHandler::Callback callback{};
};
void runAtNextTick(CompilerPriorityQueue priority,
const program_token_t& token, Job job) noexcept;
void executeTickOps() noexcept;
void cancelTickOp(program_token_t token) noexcept;
// order of insertion is important
std::vector<std::pair<program_token_t, std::function<bool()>>> mRunAtNextTickOps;
using ContainerType = std::tuple<CompilerPriorityQueue, program_token_t, Job>;
std::vector<ContainerType> mRunAtNextTickOps;
};
} // namespace filament::backend
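The .cpp hunk earlier in this diff replaces `makeUrgent()` with `dequeue()` followed by running the job inline. A minimal sketch of that "take the pending job back and run it yourself" pattern is shown below. `PendingJobs` and its `int` token are hypothetical simplifications; the real pool is keyed by `program_token_t` and is not shown in this diff.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical, simplified job queue keyed by an integer token.
class PendingJobs {
public:
    using Job = std::function<void()>;

    // Adds a job to the back of the queue.
    void queue(int token, Job job) {
        std::lock_guard<std::mutex> lock(mLock);
        mQueue.emplace_back(token, std::move(job));
    }

    // Removes and returns the job for `token` if it is still pending.
    // Returns an empty Job when the token is not in the queue, for instance
    // because a worker has already picked it up.
    Job dequeue(int token) {
        std::lock_guard<std::mutex> lock(mLock);
        for (auto it = mQueue.begin(); it != mQueue.end(); ++it) {
            if (it->first == token) {
                Job job = std::move(it->second);
                mQueue.erase(it);
                return job;
            }
        }
        return {};
    }

private:
    std::mutex mLock;
    std::deque<std::pair<int, Job>> mQueue;
};
```

A caller that needs the result immediately would write `if (auto job = jobs.dequeue(token)) { job(); }` and only fall back to waiting when `dequeue` returns an empty job.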

View File

@@ -188,6 +188,12 @@ using namespace glext;
# define GL_TEXTURE_CUBE_MAP_ARRAY 0x9009
#endif
#if defined(GL_EXT_clip_cull_distance)
# define GL_CLIP_DISTANCE0 GL_CLIP_DISTANCE0_EXT
#else
# define GL_CLIP_DISTANCE0 0x3000
#endif
#if defined(GL_KHR_debug)
# define GL_DEBUG_OUTPUT GL_DEBUG_OUTPUT_KHR
# define GL_DEBUG_OUTPUT_SYNCHRONOUS GL_DEBUG_OUTPUT_SYNCHRONOUS_KHR

View File

@@ -173,6 +173,9 @@ Driver* PlatformCocoaGL::createDriver(void* sharedContext, const Platform::Drive
}
bool PlatformCocoaGL::isExtraContextSupported() const noexcept {
// macOS supports shared contexts; however, it looks like the implementation uses a global
// lock around all GL APIs. This is a problem for API calls that take a long time to execute,
// such as glCompileProgram.
return true;
}

View File

@@ -115,9 +115,14 @@ Driver* PlatformEGL::createDriver(void* sharedContext, const Platform::DriverCon
auto extensions = GLUtils::split(eglQueryString(mEGLDisplay, EGL_EXTENSIONS));
ext.egl.ANDROID_recordable = extensions.has("EGL_ANDROID_recordable");
ext.egl.KHR_create_context = extensions.has("EGL_KHR_create_context");
ext.egl.KHR_gl_colorspace = extensions.has("EGL_KHR_gl_colorspace");
ext.egl.KHR_create_context = extensions.has("EGL_KHR_create_context");
ext.egl.KHR_no_config_context = extensions.has("EGL_KHR_no_config_context");
ext.egl.KHR_surfaceless_context = extensions.has("EGL_KHR_surfaceless_context");
if (ext.egl.KHR_create_context) {
// KHR_create_context implies KHR_surfaceless_context for ES3.x contexts
ext.egl.KHR_surfaceless_context = true;
}
eglCreateSyncKHR = (PFNEGLCREATESYNCKHRPROC) eglGetProcAddress("eglCreateSyncKHR");
eglDestroySyncKHR = (PFNEGLDESTROYSYNCKHRPROC) eglGetProcAddress("eglDestroySyncKHR");
@@ -181,13 +186,6 @@ Driver* PlatformEGL::createDriver(void* sharedContext, const Platform::DriverCon
eglConfig = mEGLConfig;
}
// create the dummy surface, just for being able to make the context current.
mEGLDummySurface = eglCreatePbufferSurface(mEGLDisplay, mEGLConfig, pbufferAttribs);
if (UTILS_UNLIKELY(mEGLDummySurface == EGL_NO_SURFACE)) {
logEglError("eglCreatePbufferSurface");
goto error;
}
for (size_t tries = 0; tries < 3; tries++) {
mEGLContext = eglCreateContext(mEGLDisplay, eglConfig,
(EGLContext)sharedContext, contextAttribs.data());
@@ -220,6 +218,26 @@ Driver* PlatformEGL::createDriver(void* sharedContext, const Platform::DriverCon
goto error;
}
if (ext.egl.KHR_surfaceless_context) {
// Adreno 306 driver advertises KHR_create_context but doesn't support passing
// EGL_NO_SURFACE to eglMakeCurrent with a 3.0 context.
if (UTILS_UNLIKELY(!eglMakeCurrent(mEGLDisplay,
EGL_NO_SURFACE, EGL_NO_SURFACE, mEGLContext))) {
if (eglGetError() == EGL_BAD_MATCH) {
ext.egl.KHR_surfaceless_context = false;
}
}
}
if (UTILS_UNLIKELY(!ext.egl.KHR_surfaceless_context)) {
// create the dummy surface, just for being able to make the context current.
mEGLDummySurface = eglCreatePbufferSurface(mEGLDisplay, mEGLConfig, pbufferAttribs);
if (UTILS_UNLIKELY(mEGLDummySurface == EGL_NO_SURFACE)) {
logEglError("eglCreatePbufferSurface");
goto error;
}
}
if (UTILS_UNLIKELY(!makeCurrent(mEGLDummySurface, mEGLDummySurface))) {
// eglMakeCurrent failed
logEglError("eglMakeCurrent");
@@ -255,11 +273,13 @@ error:
}
bool PlatformEGL::isExtraContextSupported() const noexcept {
return ext.egl.KHR_no_config_context;
return ext.egl.KHR_surfaceless_context;
}
void PlatformEGL::createContext(bool shared) {
EGLContext context = eglCreateContext(mEGLDisplay, EGL_NO_CONFIG_KHR,
EGLConfig config = ext.egl.KHR_no_config_context ? EGL_NO_CONFIG_KHR : mEGLConfig;
EGLContext context = eglCreateContext(mEGLDisplay, config,
shared ? mEGLContext : EGL_NO_CONTEXT, mContextAttribs.data());
if (UTILS_UNLIKELY(context == EGL_NO_CONTEXT)) {
@@ -274,6 +294,22 @@ void PlatformEGL::createContext(bool shared) {
mAdditionalContexts.push_back(context);
}
void PlatformEGL::releaseContext() noexcept {
EGLContext context = eglGetCurrentContext();
eglMakeCurrent(mEGLDisplay, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
if (context != EGL_NO_CONTEXT) {
eglDestroyContext(mEGLDisplay, context);
}
mAdditionalContexts.erase(
std::remove_if(mAdditionalContexts.begin(), mAdditionalContexts.end(),
[context](EGLContext c) {
return c == context;
}), mAdditionalContexts.end());
eglReleaseThread();
}
EGLBoolean PlatformEGL::makeCurrent(EGLSurface drawSurface, EGLSurface readSurface) noexcept {
if (UTILS_UNLIKELY((drawSurface != mCurrentDrawSurface || readSurface != mCurrentReadSurface))) {
mCurrentDrawSurface = drawSurface;
@@ -284,8 +320,11 @@ EGLBoolean PlatformEGL::makeCurrent(EGLSurface drawSurface, EGLSurface readSurfa
}
void PlatformEGL::terminate() noexcept {
// it's always allowed to use EGL_NO_SURFACE, EGL_NO_CONTEXT
eglMakeCurrent(mEGLDisplay, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
eglDestroySurface(mEGLDisplay, mEGLDummySurface);
if (mEGLDummySurface) {
eglDestroySurface(mEGLDisplay, mEGLDummySurface);
}
eglDestroyContext(mEGLDisplay, mEGLContext);
for (auto context : mAdditionalContexts) {
eglDestroyContext(mEGLDisplay, context);

View File

@@ -16,7 +16,10 @@
#include "VulkanBlitter.h"
#include "VulkanContext.h"
#include "VulkanFboCache.h"
#include "VulkanHandles.h"
#include "VulkanSamplerCache.h"
#include "VulkanTexture.h"
#include <utils/FixedCapacityVector.h>
#include <utils/Panic.h>
@@ -134,16 +137,20 @@ void VulkanBlitter::blitColor(BlitArgs args) {
VkFormatProperties info;
vkGetPhysicalDeviceFormatProperties(gpu, src.getFormat(), &info);
if (!ASSERT_POSTCONDITION_NON_FATAL(info.optimalTilingFeatures & VK_FORMAT_FEATURE_BLIT_SRC_BIT,
"Source format is not blittable")) {
"Source format is not blittable %d", src.getFormat())) {
return;
}
vkGetPhysicalDeviceFormatProperties(gpu, dst.getFormat(), &info);
if (!ASSERT_POSTCONDITION_NON_FATAL(info.optimalTilingFeatures & VK_FORMAT_FEATURE_BLIT_DST_BIT,
"Destination format is not blittable")) {
"Destination format is not blittable %d", dst.getFormat())) {
return;
}
#endif
VkCommandBuffer const cmdbuffer = mCommands->get().cmdbuffer;
VulkanCommandBuffer& commands = mCommands->get();
VkCommandBuffer const cmdbuffer = commands.cmdbuffer;
commands.acquire(src.texture);
commands.acquire(dst.texture);
blitFast(cmdbuffer, aspect, args.filter, args.srcTarget->getExtent(), src, dst,
args.srcRectPair, args.dstRectPair);
}
@@ -158,12 +165,12 @@ void VulkanBlitter::blitDepth(BlitArgs args) {
VkFormatProperties info;
vkGetPhysicalDeviceFormatProperties(gpu, src.getFormat(), &info);
if (!ASSERT_POSTCONDITION_NON_FATAL(info.optimalTilingFeatures & VK_FORMAT_FEATURE_BLIT_SRC_BIT,
"Depth format is not blittable")) {
"Depth src format is not blittable %d", src.getFormat())) {
return;
}
vkGetPhysicalDeviceFormatProperties(gpu, dst.getFormat(), &info);
if (!ASSERT_POSTCONDITION_NON_FATAL(info.optimalTilingFeatures & VK_FORMAT_FEATURE_BLIT_DST_BIT,
"Depth format is not blittable")) {
"Depth dst format is not blittable %d", dst.getFormat())) {
return;
}
#endif
@@ -175,7 +182,11 @@ void VulkanBlitter::blitDepth(BlitArgs args) {
args.dstRectPair);
return;
}
VkCommandBuffer const cmdbuffer = mCommands->get().cmdbuffer;
VulkanCommandBuffer& commands = mCommands->get();
VkCommandBuffer const cmdbuffer = commands.cmdbuffer;
commands.acquire(src.texture);
commands.acquire(dst.texture);
blitFast(cmdbuffer, aspect, args.filter, args.srcTarget->getExtent(), src, dst, args.srcRectPair,
args.dstRectPair);
}
@@ -245,13 +256,16 @@ void VulkanBlitter::lazyInit() noexcept {
+1.0f, +1.0f,
};
mTriangleBuffer = new VulkanBuffer(mAllocator, mCommands, mStagePool,
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, sizeof(kTriangleVertices));
VulkanCommandBuffer& commands = mCommands->get();
VkCommandBuffer const cmdbuffer = commands.cmdbuffer;
mTriangleBuffer->loadFromCpu(kTriangleVertices, 0, sizeof(kTriangleVertices));
mTriangleBuffer = new VulkanBuffer(mAllocator, mStagePool, VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
sizeof(kTriangleVertices));
mParamsBuffer = new VulkanBuffer(mAllocator, mCommands, mStagePool,
VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, sizeof(BlitterUniforms));
mTriangleBuffer->loadFromCpu(cmdbuffer, kTriangleVertices, 0, sizeof(kTriangleVertices));
mParamsBuffer = new VulkanBuffer(mAllocator, mStagePool, VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
sizeof(BlitterUniforms));
}
// At a high level, the procedure for resolving depth looks like this:
@@ -263,11 +277,16 @@ void VulkanBlitter::blitSlowDepth(VkFilter filter, const VkExtent2D srcExtent, V
VulkanAttachment dst, const VkOffset3D srcRect[2], const VkOffset3D dstRect[2]) {
lazyInit();
VulkanCommandBuffer* commands = &mCommands->get();
VkCommandBuffer const cmdbuffer = commands->cmdbuffer;
commands->acquire(src.texture);
commands->acquire(dst.texture);
BlitterUniforms const uniforms = {
.sampleCount = src.texture->samples,
.inverseSampleCount = 1.0f / float(src.texture->samples),
.sampleCount = src.texture->samples,
.inverseSampleCount = 1.0f / float(src.texture->samples),
};
mParamsBuffer->loadFromCpu(&uniforms, 0, sizeof(uniforms));
mParamsBuffer->loadFromCpu(cmdbuffer, &uniforms, 0, sizeof(uniforms));
VkImageAspectFlags const aspect = VK_IMAGE_ASPECT_DEPTH_BIT;
@@ -314,8 +333,6 @@ void VulkanBlitter::blitSlowDepth(VkFilter filter, const VkExtent2D srcExtent, V
renderPassInfo.renderArea.extent.width = dstRect[1].x - dstRect[0].x;
renderPassInfo.renderArea.extent.height = dstRect[1].y - dstRect[0].y;
const VkCommandBuffer cmdbuffer = mCommands->get().cmdbuffer;
// We need to transition the source into a sampler since it'll be sampled in the shader.
const VkImageSubresourceRange srcRange = {
.aspectMask = aspect,
@@ -376,6 +393,7 @@ void VulkanBlitter::blitSlowDepth(VkFilter filter, const VkExtent2D srcExtent, V
VkSampler vksampler = mSamplerCache.getSampler({});
VkDescriptorImageInfo samplers[VulkanPipelineCache::SAMPLER_BINDING_COUNT];
VulkanTexture* textures[VulkanPipelineCache::SAMPLER_BINDING_COUNT] = {nullptr};
for (auto& sampler : samplers) {
sampler = {
.sampler = vksampler,
@@ -389,8 +407,9 @@ void VulkanBlitter::blitSlowDepth(VkFilter filter, const VkExtent2D srcExtent, V
.imageView = src.getImageView(VK_IMAGE_ASPECT_DEPTH_BIT),
.imageLayout = ImgUtil::getVkLayout(samplerLayout),
};
textures[0] = src.texture;
mPipelineCache.bindSamplers(samplers,
mPipelineCache.bindSamplers(samplers, textures,
VulkanPipelineCache::getUsageFlags(0, ShaderStageFlags::FRAGMENT));
auto previousUbo = mPipelineCache.getUniformBufferBinding(0);
@@ -407,7 +426,7 @@ void VulkanBlitter::blitSlowDepth(VkFilter filter, const VkExtent2D srcExtent, V
mPipelineCache.bindScissor(cmdbuffer, scissor);
if (!mPipelineCache.bindPipeline(cmdbuffer)) {
if (!mPipelineCache.bindPipeline(commands)) {
assert_invariant(false);
}
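The `commands.acquire(src.texture)` calls added in this file, together with `clearResources()` in `VulkanCommands::gc()`, suggest that a command buffer keeps the resources it references alive until its fence has signaled. The diff does not show how `acquire` is implemented, so the sketch below assumes shared ownership through `std::shared_ptr` purely for illustration. `Texture` and `CommandBuffer` are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical resource type standing in for a GPU texture.
struct Texture {
    int id = 0;
};

// Hypothetical command buffer that holds a reference to every resource it uses.
class CommandBuffer {
public:
    // Keeps `tex` alive for as long as this command buffer may still be executing.
    void acquire(std::shared_ptr<Texture> tex) {
        mHeld.push_back(std::move(tex));
    }

    // Called once the GPU is known to be done with this command buffer.
    void clearResources() {
        mHeld.clear();
    }

    std::size_t heldCount() const {
        return mHeld.size();
    }

private:
    std::vector<std::shared_ptr<Texture>> mHeld;
};
```

With this arrangement the caller may drop its own reference right after recording a command, and the texture is only destroyed after `clearResources()` runs.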

View File

@@ -23,9 +23,11 @@ using namespace bluevk;
namespace filament::backend {
VulkanBuffer::VulkanBuffer(VmaAllocator allocator, VulkanCommands* commands,
VulkanStagePool& stagePool, VkBufferUsageFlags usage, uint32_t numBytes)
: mAllocator(allocator), mCommands(commands), mStagePool(stagePool), mUsage(usage) {
VulkanBuffer::VulkanBuffer(VmaAllocator allocator, VulkanStagePool& stagePool,
VkBufferUsageFlags usage, uint32_t numBytes)
: mAllocator(allocator),
mStagePool(stagePool),
mUsage(usage) {
// for now make sure that only 1 bit is set in usage
// (because loadFromCpu() assumes that somewhat)
@@ -53,7 +55,8 @@ void VulkanBuffer::terminate() {
mGpuBuffer = VK_NULL_HANDLE;
}
void VulkanBuffer::loadFromCpu(const void* cpuData, uint32_t byteOffset, uint32_t numBytes) const {
void VulkanBuffer::loadFromCpu(VkCommandBuffer cmdbuf, const void* cpuData, uint32_t byteOffset,
uint32_t numBytes) const {
assert_invariant(byteOffset == 0);
VulkanStage const* stage = mStagePool.acquireStage(numBytes);
void* mapped;
@@ -62,10 +65,8 @@ void VulkanBuffer::loadFromCpu(const void* cpuData, uint32_t byteOffset, uint32_
vmaUnmapMemory(mAllocator, stage->memory);
vmaFlushAllocation(mAllocator, stage->memory, byteOffset, numBytes);
VkCommandBuffer const cmdbuffer = mCommands->get(true).cmdbuffer;
VkBufferCopy region{ .size = numBytes };
vkCmdCopyBuffer(cmdbuffer, stage->buffer, mGpuBuffer, 1, &region);
vkCmdCopyBuffer(cmdbuf, stage->buffer, mGpuBuffer, 1, &region);
// Firstly, ensure that the copy finishes before the next draw call.
// Secondly, in case the user decides to upload another chunk (without ever using the first one)
@@ -99,7 +100,7 @@ void VulkanBuffer::loadFromCpu(const void* cpuData, uint32_t byteOffset, uint32_
.size = VK_WHOLE_SIZE,
};
vkCmdPipelineBarrier(cmdbuffer, VK_PIPELINE_STAGE_TRANSFER_BIT, dstStageMask, 0, 0, nullptr, 1,
vkCmdPipelineBarrier(cmdbuf, VK_PIPELINE_STAGE_TRANSFER_BIT, dstStageMask, 0, 0, nullptr, 1,
&barrier, 0, nullptr);
}

View File

@@ -25,15 +25,18 @@ namespace filament::backend {
// Encapsulates a Vulkan buffer, its attached DeviceMemory and a staging area.
class VulkanBuffer {
public:
VulkanBuffer(VmaAllocator allocator, VulkanCommands* commands, VulkanStagePool& stagePool,
VkBufferUsageFlags usage, uint32_t numBytes);
VulkanBuffer(VmaAllocator allocator, VulkanStagePool& stagePool, VkBufferUsageFlags usage,
uint32_t numBytes);
~VulkanBuffer();
void terminate();
void loadFromCpu(const void* cpuData, uint32_t byteOffset, uint32_t numBytes) const;
VkBuffer getGpuBuffer() const { return mGpuBuffer; }
void loadFromCpu(VkCommandBuffer cmdbuf, const void* cpuData, uint32_t byteOffset,
uint32_t numBytes) const;
VkBuffer getGpuBuffer() const {
return mGpuBuffer;
}
private:
VmaAllocator mAllocator;
VulkanCommands* mCommands;
VulkanStagePool& mStagePool;
VmaAllocation mGpuMemory = VK_NULL_HANDLE;

View File

@@ -23,7 +23,6 @@
#include "VulkanConstants.h"
#include "VulkanContext.h"
#include "VulkanDriver.h"
#include <utils/Log.h>
#include <utils/Panic.h>
@@ -69,22 +68,35 @@ static VkCommandPool createPool(VkDevice device, uint32_t queueFamilyIndex) {
}
void VulkanGroupMarkers::push(std::string const& marker, Timestamp start) noexcept {
mMarkers.push(marker);
mMarkers.push_back(marker);
#if FILAMENT_VULKAN_VERBOSE
mTimestamps.push(start.time_since_epoch().count() > 0.0
mTimestamps.push_back(start.time_since_epoch().count() > 0.0
? start
: std::chrono::high_resolution_clock::now());
#endif
}
std::pair<std::string, Timestamp> VulkanGroupMarkers::pop() noexcept {
auto const marker = mMarkers.top();
mMarkers.pop();
auto const marker = mMarkers.back();
mMarkers.pop_back();
#if FILAMENT_VULKAN_VERBOSE
auto const topTimestamp = mTimestamps.top();
mTimestamps.pop();
return std::make_pair(marker, topTimestamp);
auto const timestamp = mTimestamps.back();
mTimestamps.pop_back();
return std::make_pair(marker, timestamp);
#else
return std::make_pair(marker, Timestamp{});
#endif
}
std::pair<std::string, Timestamp> VulkanGroupMarkers::pop_bottom() noexcept {
auto const marker = mMarkers.front();
mMarkers.pop_front();
#if FILAMENT_VULKAN_VERBOSE
auto const timestamp = mTimestamps.front();
mTimestamps.pop_front();
return std::make_pair(marker, timestamp);
#else
return std::make_pair(marker, Timestamp{});
#endif
@@ -92,7 +104,7 @@ std::pair<std::string, Timestamp> VulkanGroupMarkers::pop() noexcept {
std::pair<std::string, Timestamp> VulkanGroupMarkers::top() const {
assert_invariant(!empty());
auto const marker = mMarkers.top();
auto const marker = mMarkers.back();
#if FILAMENT_VULKAN_VERBOSE
auto const topTimestamp = mTimestamps.back();
return std::make_pair(marker, topTimestamp);
@@ -106,15 +118,20 @@ bool VulkanGroupMarkers::empty() const noexcept {
}
VulkanCommands::VulkanCommands(VkDevice device, VkQueue queue, uint32_t queueFamilyIndex,
VulkanContext* context)
VulkanContext* context, VulkanResourceAllocator* allocator)
: mDevice(device),
mQueue(queue),
mPool(createPool(mDevice, queueFamilyIndex)),
mContext(context) {
mContext(context),
mStorage(CAPACITY) {
VkSemaphoreCreateInfo sci{.sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO};
for (auto& semaphore: mSubmissionSignals) {
vkCreateSemaphore(mDevice, &sci, nullptr, &semaphore);
}
for (size_t i = 0; i < CAPACITY; ++i) {
mStorage[i] = std::make_unique<VulkanCommandBuffer>(allocator);
}
}
VulkanCommands::~VulkanCommands() {
@@ -126,10 +143,9 @@ VulkanCommands::~VulkanCommands() {
}
}
VulkanCommandBuffer const& VulkanCommands::get(bool blockOnGC) {
if (mCurrent) {
mCurrent->blockOnGC = mCurrent->blockOnGC || blockOnGC;
return *mCurrent;
VulkanCommandBuffer& VulkanCommands::get() {
if (mCurrentCommandBufferIndex >= 0) {
return *mStorage[mCurrentCommandBufferIndex].get();
}
// If we ran out of available command buffers, stall until one finishes. This is very rare.
@@ -145,15 +161,18 @@ VulkanCommandBuffer const& VulkanCommands::get(bool blockOnGC) {
gc();
}
VulkanCommandBuffer* currentbuf = nullptr;
// Find an available slot.
for (VulkanCommandBuffer& wrapper : mStorage) {
if (wrapper.cmdbuffer == VK_NULL_HANDLE) {
mCurrent = &wrapper;
for (size_t i = 0; i < CAPACITY; ++i) {
auto wrapper = mStorage[i].get();
if (wrapper->cmdbuffer == VK_NULL_HANDLE) {
mCurrentCommandBufferIndex = static_cast<int8_t>(i);
currentbuf = wrapper;
break;
}
}
assert_invariant(mCurrent);
assert_invariant(currentbuf);
--mAvailableCount;
// Create the low-level command buffer.
@@ -163,47 +182,46 @@ VulkanCommandBuffer const& VulkanCommands::get(bool blockOnGC) {
.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
.commandBufferCount = 1
};
vkAllocateCommandBuffers(mDevice, &allocateInfo, &mCurrent->cmdbuffer);
mCurrent->blockOnGC = blockOnGC;
vkAllocateCommandBuffers(mDevice, &allocateInfo, &currentbuf->cmdbuffer);
// Note that the fence wrapper uses shared_ptr because a DriverAPI fence can also have ownership
// over it. The destruction of the low-level fence occurs either in VulkanCommands::gc(), or in
// VulkanDriver::destroyFence(), both of which are safe spots.
mCurrent->fence = std::make_shared<VulkanCmdFence>(mDevice);
currentbuf->fence = std::make_shared<VulkanCmdFence>(mDevice);
// Begin writing into the command buffer.
const VkCommandBufferBeginInfo binfo {
.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
};
vkBeginCommandBuffer(mCurrent->cmdbuffer, &binfo);
vkBeginCommandBuffer(currentbuf->cmdbuffer, &binfo);
// Notify the observer that a new command buffer has been activated.
if (mObserver) {
mObserver->onCommandBuffer(*mCurrent);
mObserver->onCommandBuffer(*currentbuf);
}
// We push the current markers onto a temporary stack. This must be placed after mCurrent is set
// to the new command buffer since pushGroupMarker also calls get().
// We push the current markers onto a temporary stack. This must be placed after currentbuf is
// set to the new command buffer since pushGroupMarker also calls get().
while (mCarriedOverMarkers && !mCarriedOverMarkers->empty()) {
auto [marker, time] = mCarriedOverMarkers->pop();
pushGroupMarker(marker.c_str(), time);
}
return *mCurrent;
return *currentbuf;
}
bool VulkanCommands::flush() {
// It's perfectly fine to call flush when no commands have been written.
if (mCurrent == nullptr) {
if (mCurrentCommandBufferIndex < 0) {
return false;
}
const int64_t index = mCurrent - &mStorage[0];
VkSemaphore renderingFinished = mSubmissionSignals[index];
int8_t const index = mCurrentCommandBufferIndex;
VulkanCommandBuffer const* currentbuf = mStorage[index].get();
VkSemaphore const renderingFinished = mSubmissionSignals[index];
vkEndCommandBuffer(mCurrent->cmdbuffer);
vkEndCommandBuffer(currentbuf->cmdbuffer);
// If the injected semaphore is an "image available" semaphore that has not yet been signaled,
// it is sometimes fine to start executing commands anyway, as long as we stall the GPU at the
@@ -227,7 +245,7 @@ bool VulkanCommands::flush() {
.pWaitSemaphores = signals,
.pWaitDstStageMask = waitDestStageMasks,
.commandBufferCount = 1,
.pCommandBuffers = &mCurrent->cmdbuffer,
.pCommandBuffers = &currentbuf->cmdbuffer,
.signalSemaphoreCount = 1u,
.pSignalSemaphores = &renderingFinished,
};
@@ -240,13 +258,18 @@ bool VulkanCommands::flush() {
signals[submitInfo.waitSemaphoreCount++] = mInjectedSignal;
}
if (FILAMENT_VULKAN_VERBOSE) {
slog.i << "Submitting cmdbuffer=" << mCurrent->cmdbuffer
<< " wait=(" << signals[0] << ", " << signals[1] << ") "
<< " signal=" << renderingFinished
<< io::endl;
// To address a validation warning.
if (submitInfo.waitSemaphoreCount == 0) {
submitInfo.pWaitSemaphores = VK_NULL_HANDLE;
}
#if FILAMENT_VULKAN_VERBOSE
slog.i << "Submitting cmdbuffer=" << currentbuf->cmdbuffer
<< " wait=(" << signals[0] << ", " << signals[1] << ") "
<< " signal=" << renderingFinished
<< io::endl;
#endif
// Before actually submitting, we need to pop any leftover group markers.
while (mGroupMarkers && !mGroupMarkers->empty()) {
if (!mCarriedOverMarkers) {
@@ -258,44 +281,51 @@ bool VulkanCommands::flush() {
popGroupMarker();
}
auto& cmdfence = mCurrent->fence;
auto& cmdfence = currentbuf->fence;
std::unique_lock<utils::Mutex> lock(cmdfence->mutex);
cmdfence->status.store(VK_NOT_READY);
UTILS_UNUSED_IN_RELEASE VkResult result = vkQueueSubmit(mQueue, 1, &submitInfo, cmdfence->fence);
cmdfence->condition.notify_all();
lock.unlock();
#if FILAMENT_VULKAN_VERBOSE
if (result != VK_SUCCESS) {
utils::slog.d << "Failed command buffer submission result: " << result << utils::io::endl;
}
#endif
assert_invariant(result == VK_SUCCESS);
mSubmissionSignal = renderingFinished;
mInjectedSignal = VK_NULL_HANDLE;
mCurrent = nullptr;
mCurrentCommandBufferIndex = -1;
return true;
}
VkSemaphore VulkanCommands::acquireFinishedSignal() {
VkSemaphore semaphore = mSubmissionSignal;
mSubmissionSignal = VK_NULL_HANDLE;
if (FILAMENT_VULKAN_VERBOSE) {
slog.i << "Acquiring " << semaphore << " (e.g. for vkQueuePresentKHR)" << io::endl;
}
#if FILAMENT_VULKAN_VERBOSE
slog.i << "Acquiring " << semaphore << " (e.g. for vkQueuePresentKHR)" << io::endl;
#endif
return semaphore;
}
void VulkanCommands::injectDependency(VkSemaphore next) {
assert_invariant(mInjectedSignal == VK_NULL_HANDLE);
mInjectedSignal = next;
if (FILAMENT_VULKAN_VERBOSE) {
slog.i << "Injecting " << next << " (e.g. due to vkAcquireNextImageKHR)" << io::endl;
}
#if FILAMENT_VULKAN_VERBOSE
slog.i << "Injecting " << next << " (e.g. due to vkAcquireNextImageKHR)" << io::endl;
#endif
}
void VulkanCommands::wait() {
VkFence fences[CAPACITY];
uint32_t count = 0;
for (auto& wrapper : mStorage) {
if (wrapper.cmdbuffer != VK_NULL_HANDLE && mCurrent != &wrapper) {
fences[count++] = wrapper.fence->fence;
size_t count = 0;
for (size_t i = 0; i < CAPACITY; i++) {
auto wrapper = mStorage[i].get();
if (wrapper->cmdbuffer != VK_NULL_HANDLE
&& mCurrentCommandBufferIndex != static_cast<int8_t>(i)) {
fences[count++] = wrapper->fence->fence;
}
}
if (count > 0) {
@@ -304,26 +334,34 @@ void VulkanCommands::wait() {
}
void VulkanCommands::gc() {
for (auto& wrapper : mStorage) {
if (wrapper.cmdbuffer != VK_NULL_HANDLE) {
uint64_t const timeout = wrapper.blockOnGC ? UINT64_MAX : 0;
VkResult const result
= vkWaitForFences(mDevice, 1, &wrapper.fence->fence, VK_TRUE, timeout);
if (result == VK_SUCCESS) {
vkFreeCommandBuffers(mDevice, mPool, 1, &wrapper.cmdbuffer);
wrapper.cmdbuffer = VK_NULL_HANDLE;
wrapper.fence->status.store(VK_SUCCESS);
wrapper.fence.reset();
++mAvailableCount;
}
VkCommandBuffer buffers[CAPACITY];
size_t count = 0;
for (size_t i = 0; i < CAPACITY; i++) {
auto wrapper = mStorage[i].get();
if (wrapper->cmdbuffer == VK_NULL_HANDLE) {
continue;
}
VkResult const result = vkWaitForFences(mDevice, 1, &wrapper->fence->fence, VK_TRUE, 0);
if (result != VK_SUCCESS) {
continue;
}
buffers[count++] = wrapper->cmdbuffer;
wrapper->cmdbuffer = VK_NULL_HANDLE;
wrapper->fence->status.store(VK_SUCCESS);
wrapper->fence.reset();
wrapper->clearResources();
++mAvailableCount;
}
if (count > 0) {
vkFreeCommandBuffers(mDevice, mPool, count, buffers);
}
}
void VulkanCommands::updateFences() {
for (auto& wrapper : mStorage) {
if (wrapper.cmdbuffer != VK_NULL_HANDLE) {
VulkanCmdFence* fence = wrapper.fence.get();
for (size_t i = 0; i < CAPACITY; i++) {
auto wrapper = mStorage[i].get();
if (wrapper->cmdbuffer != VK_NULL_HANDLE) {
VulkanCmdFence* fence = wrapper->fence.get();
if (fence) {
VkResult status = vkGetFenceStatus(mDevice, fence->fence);
// This is either VK_SUCCESS, VK_NOT_READY, or VK_ERROR_DEVICE_LOST.
@@ -389,8 +427,9 @@ void VulkanCommands::popGroupMarker() {
}
} else if (mCarriedOverMarkers && !mCarriedOverMarkers->empty()) {
// It could be that pop is called between flush() and get() (new command buffer), in which
// case the marker is in "carried over" state. We'd just remove that
mCarriedOverMarkers->pop();
// case the marker is in the "carried over" state and we just remove it. Since
// mCarriedOverMarkers is in the opposite order, we pop the bottom instead of the top.
mCarriedOverMarkers->pop_bottom();
}
}


@@ -19,15 +19,19 @@
#include <bluevk/BlueVK.h>
#include "DriverBase.h"
#include "VulkanConstants.h"
#include "VulkanResources.h"
#include <utils/Condition.h>
#include <utils/FixedCapacityVector.h>
#include <utils/Mutex.h>
#include <atomic>
#include <chrono>
#include <stack>
#include <list>
#include <string>
#include <utility>
@@ -41,13 +45,14 @@ public:
void push(std::string const& marker, Timestamp start = {}) noexcept;
std::pair<std::string, Timestamp> pop() noexcept;
std::pair<std::string, Timestamp> pop_bottom() noexcept;
std::pair<std::string, Timestamp> top() const;
bool empty() const noexcept;
private:
std::stack<std::string> mMarkers;
std::list<std::string> mMarkers;
#if FILAMENT_VULKAN_VERBOSE
std::stack<Timestamp> mTimestamps;
std::list<Timestamp> mTimestamps;
#endif
};
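The switch from std::stack to std::list above is what makes pop_bottom() possible, since std::stack only exposes its top element. A minimal sketch of the idea follows; the class name MarkerStackSketch is hypothetical, timestamps are left out, and it assumes push() appends at the back, which this diff does not show.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical simplification of the marker container: a stack that can
// also be drained from the bottom because it is backed by std::list.
class MarkerStackSketch {
public:
    // Assumption: new markers are appended at the back (the "top").
    void push(std::string marker) { mMarkers.push_back(std::move(marker)); }

    // Removes and returns the most recently pushed marker.
    std::string pop() {
        assert(!mMarkers.empty());
        std::string marker = std::move(mMarkers.back());
        mMarkers.pop_back();
        return marker;
    }

    // Removes and returns the oldest marker, which std::stack cannot do.
    std::string pop_bottom() {
        assert(!mMarkers.empty());
        std::string marker = std::move(mMarkers.front());
        mMarkers.pop_front();
        return marker;
    }

    bool empty() const noexcept { return mMarkers.empty(); }

private:
    std::list<std::string> mMarkers;
};
```

Both ends of a std::list can be popped in constant time, so either operation stays cheap.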
@@ -66,12 +71,28 @@ struct VulkanCmdFence {
// DriverApi fence object and should not be destroyed until both the DriverApi object is freed and
// we're done waiting on the most recent submission of the given command buffer.
struct VulkanCommandBuffer {
VulkanCommandBuffer() {}
VulkanCommandBuffer(VulkanResourceAllocator* allocator)
: mResourceManager(allocator) {}
VulkanCommandBuffer(VulkanCommandBuffer const&) = delete;
VulkanCommandBuffer& operator=(VulkanCommandBuffer const&) = delete;
VkCommandBuffer cmdbuffer = VK_NULL_HANDLE;
std::shared_ptr<VulkanCmdFence> fence;
bool blockOnGC = false;
inline void acquire(VulkanResource* resource) {
mResourceManager.acquire(resource);
}
inline void acquire(VulkanAcquireOnlyResourceManager* srcResources) {
mResourceManager.acquire(srcResources);
}
inline void clearResources() {
mResourceManager.clear();
}
private:
VulkanAcquireOnlyResourceManager mResourceManager;
};
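The acquire() and clearResources() methods above suggest that a command buffer keeps every resource it touches alive until gc() sees its fence signaled. The sketch below illustrates that pattern with a plain reference count. SketchResource and SketchAcquireOnlyManager are hypothetical names; the real VulkanResource and VulkanAcquireOnlyResourceManager live in VulkanResources.h, which this diff does not show, so the details here are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>

// Hypothetical resource with an intrusive reference count. The callback
// runs when the last reference is dropped.
struct SketchResource {
    explicit SketchResource(std::function<void()> fn) : onDestroy(std::move(fn)) {}

    void ref() { ++refcount; }

    void unref() {
        assert(refcount > 0);
        if (--refcount == 0) {
            onDestroy();
        }
    }

    std::size_t refcount = 0;
    std::function<void()> onDestroy;
};

// Hypothetical "acquire only" manager: it can take references and later
// release all of them at once, but never releases a single one.
class SketchAcquireOnlyManager {
public:
    // Takes one reference per distinct resource, however often it is acquired.
    void acquire(SketchResource* resource) {
        if (mHeld.insert(resource).second) {
            resource->ref();
        }
    }

    // Releases every held reference, e.g. once the buffer's fence has signaled.
    void clear() {
        for (SketchResource* resource : mHeld) {
            resource->unref();
        }
        mHeld.clear();
    }

private:
    std::unordered_set<SketchResource*> mHeld;
};
```

In this sketch a resource whose handle the application has already released stays alive until the command buffer that used it is cleared.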
// Allows classes to be notified after a new command buffer has been activated.
@@ -110,14 +131,11 @@ public:
class VulkanCommands {
public:
VulkanCommands(VkDevice device, VkQueue queue, uint32_t queueFamilyIndex,
VulkanContext* context);
VulkanContext* context, VulkanResourceAllocator* allocator);
~VulkanCommands();
// Creates a "current" command buffer if none exists, otherwise returns the current one.
// `blockOnGC` guarantees that this buffer will be waited on when gc() is called on it so
// that dependent resources can be gc'd safely after the buffer is submitted, completed,
// and gc'd.
VulkanCommandBuffer const& get(bool blockOnGC = false);
VulkanCommandBuffer& get();
// Submits the current command buffer if it exists, then sets "current" to null.
// If there are no outstanding commands then nothing happens and this returns false.
@@ -160,10 +178,12 @@ class VulkanCommands {
VkCommandPool const mPool;
VulkanContext const* mContext;
VulkanCommandBuffer* mCurrent = nullptr;
// int8 only goes up to 127, therefore capacity must be less than that.
static_assert(CAPACITY < 128);
int8_t mCurrentCommandBufferIndex = -1;
VkSemaphore mSubmissionSignal = {};
VkSemaphore mInjectedSignal = {};
VulkanCommandBuffer mStorage[CAPACITY] = {};
utils::FixedCapacityVector<std::unique_ptr<VulkanCommandBuffer>> mStorage;
VkSemaphore mSubmissionSignals[CAPACITY] = {};
size_t mAvailableCount = CAPACITY;
CommandBufferObserver* mObserver = nullptr;


@@ -97,8 +97,7 @@ public:
}
flags >>= 1;
}
ASSERT_POSTCONDITION(false, "Unable to find a memory type that meets requirements.");
return (uint32_t) ~0ul;
return (uint32_t) VK_MAX_MEMORY_TYPES;
}
inline VkFormat getDepthFormat() const {


@@ -1,97 +0,0 @@
/*
* Copyright (C) 2019 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "VulkanDisposer.h"
#include "VulkanConstants.h"
#include <utils/debug.h>
#include <utils/Log.h>
namespace filament::backend {
// Always wait at least 3 frames after a DriverAPI-level resource has been destroyed for safe
// destruction, due to potential usage by outstanding command buffers and triple buffering.
static constexpr uint32_t FRAMES_BEFORE_EVICTION = VK_MAX_COMMAND_BUFFERS;
void VulkanDisposer::createDisposable(Key resource, std::function<void()> destructor) noexcept {
mDisposables[resource].destructor = destructor;
}
void VulkanDisposer::removeReference(Key resource) noexcept {
// Null can be passed in as a no-op, this is not an error.
if (resource == nullptr) {
return;
}
assert_invariant(mDisposables[resource].refcount > 0);
--mDisposables[resource].refcount;
}
void VulkanDisposer::acquire(Key resource) noexcept {
// It's fine to "acquire" a non-managed resource, it's just a no-op.
if (resource == nullptr) {
return;
}
auto iter = mDisposables.find(resource);
if (iter == mDisposables.end()) {
return;
}
Disposable& disposable = iter.value();
assert_invariant(disposable.refcount > 0 && disposable.refcount < 65535);
// If an auto-decrement is already in place, do not increase the ref count.
if (disposable.remainingFrames == 0) {
++disposable.refcount;
}
disposable.remainingFrames = FRAMES_BEFORE_EVICTION;
}
void VulkanDisposer::gc() noexcept {
// First decrement the frame count of all resources that were held by a command buffer.
// If any of these reaches zero, decrement its reference count.
for (auto iter = mDisposables.begin(); iter != mDisposables.end(); ++iter) {
Disposable& disposable = iter.value();
if (disposable.refcount > 0 && disposable.remainingFrames > 0) {
if (--disposable.remainingFrames == 0) {
removeReference(iter.key());
}
}
}
// Next, destroy all resources with a zero refcount.
decltype(mDisposables) disposables;
for (auto iter : mDisposables) {
Disposable& disposable = iter.second;
if (disposable.refcount == 0) {
disposable.destructor();
} else {
disposables.insert({iter.first, disposable});
}
}
disposables.swap(mDisposables);
}
void VulkanDisposer::terminate() noexcept {
#ifndef NDEBUG
utils::slog.i << mDisposables.size() << " disposables are outstanding." << utils::io::endl;
#endif
for (auto iter : mDisposables) {
iter.second.destructor();
}
mDisposables.clear();
}
} // namespace filament::backend


@@ -1,60 +0,0 @@
/*
* Copyright (C) 2019 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef TNT_FILAMENT_BACKEND_VULKANDISPOSER_H
#define TNT_FILAMENT_BACKEND_VULKANDISPOSER_H
#include <tsl/robin_map.h>
#include <functional>
namespace filament::backend {
// VulkanDisposer tracks resources (such as textures or vertex buffers) that need deferred
// destruction due to potential use by a Vulkan command buffer. Resources are represented with void*
// to allow callers to use any type of handle.
class VulkanDisposer {
public:
using Key = const void*;
// Adds the given resource to the disposer and sets its reference count to 1.
void createDisposable(Key resource, std::function<void()> destructor) noexcept;
// Decrements the reference count.
void removeReference(Key resource) noexcept;
// Increments the reference count and auto-decrements it after FRAMES_BEFORE_EVICTION frames.
// This is used to indicate that the current command buffer has a reference to the resource.
void acquire(Key resource) noexcept;
// Invokes the destructor function for each disposable with a 0 refcount.
void gc() noexcept;
// Invokes the destructor function for all disposables, regardless of reference count.
void terminate() noexcept;
private:
struct Disposable {
uint16_t refcount = 1;
uint16_t remainingFrames = 0;
std::function<void()> destructor = []() {};
};
tsl::robin_map<Key, Disposable> mDisposables;
};
} // namespace filament::backend
#endif // TNT_FILAMENT_BACKEND_VULKANDISPOSER_H
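For reference, the frame-counted scheme that the removed VulkanDisposer implemented can be condensed as follows. This is a sketch rather than the original code: it swaps tsl::robin_map for std::unordered_map, merges the two passes of gc() into one, and uses a hypothetical kFramesBeforeEviction of 3 in place of VK_MAX_COMMAND_BUFFERS, whose value is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for VK_MAX_COMMAND_BUFFERS.
constexpr uint16_t kFramesBeforeEviction = 3;

class DisposerSketch {
public:
    using Key = const void*;

    // Registers a resource; its reference count starts at 1.
    void createDisposable(Key resource, std::function<void()> destructor) {
        mDisposables[resource].destructor = std::move(destructor);
    }

    // Drops one reference. Null or unknown keys are ignored.
    void removeReference(Key resource) {
        auto iter = mDisposables.find(resource);
        if (iter == mDisposables.end()) {
            return;
        }
        assert(iter->second.refcount > 0);
        --iter->second.refcount;
    }

    // Marks the resource as used by the current command buffer. The extra
    // reference is taken only once; later calls just restart the countdown.
    void acquire(Key resource) {
        auto iter = mDisposables.find(resource);
        if (iter == mDisposables.end()) {
            return;
        }
        Disposable& d = iter->second;
        if (d.remainingFrames == 0) {
            ++d.refcount;
        }
        d.remainingFrames = kFramesBeforeEviction;
    }

    // Call once per frame: ages held resources, then destroys those whose
    // reference count has reached zero.
    void gc() {
        for (auto iter = mDisposables.begin(); iter != mDisposables.end();) {
            Disposable& d = iter->second;
            if (d.refcount > 0 && d.remainingFrames > 0 && --d.remainingFrames == 0) {
                --d.refcount;
            }
            if (d.refcount == 0) {
                d.destructor();
                iter = mDisposables.erase(iter);
            } else {
                ++iter;
            }
        }
    }

    std::size_t size() const { return mDisposables.size(); }

private:
    struct Disposable {
        uint16_t refcount = 1;
        uint16_t remainingFrames = 0;
        std::function<void()> destructor = [] {};
    };
    std::unordered_map<Key, Disposable> mDisposables;
};
```

The weakness of this scheme is that destruction is tied to a frame count rather than to the completion of the command buffer that actually used the resource.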

File diff suppressed because it is too large.


@@ -17,22 +17,23 @@
#ifndef TNT_FILAMENT_BACKEND_VULKANDRIVER_H
#define TNT_FILAMENT_BACKEND_VULKANDRIVER_H
#include "VulkanPipelineCache.h"
#include "VulkanBlitter.h"
#include "VulkanDisposer.h"
#include "VulkanConstants.h"
#include "VulkanContext.h"
#include "VulkanFboCache.h"
#include "VulkanHandles.h"
#include "VulkanPipelineCache.h"
#include "VulkanReadPixels.h"
#include "VulkanResourceAllocator.h"
#include "VulkanSamplerCache.h"
#include "VulkanStagePool.h"
#include "VulkanUtility.h"
#include "private/backend/Driver.h"
#include "private/backend/HandleAllocator.h"
#include "DriverBase.h"
#include "private/backend/Driver.h"
#include <utils/compiler.h>
#include <utils/Allocator.h>
#include <utils/compiler.h>
namespace filament::backend {
@@ -45,8 +46,8 @@ public:
Platform::DriverConfig const& driverConfig) noexcept;
private:
void debugCommandBegin(CommandStream* cmds, bool synchronous, const char* methodName) noexcept override;
void debugCommandBegin(CommandStream* cmds, bool synchronous,
const char* methodName) noexcept override;
inline VulkanDriver(VulkanPlatform* platform, VulkanContext const& context,
Platform::DriverConfig const& driverConfig) noexcept;
@@ -60,77 +61,24 @@ private:
template<typename T>
friend class ConcreteDispatcher;
#define DECL_DRIVER_API(methodName, paramsDecl, params) \
#define DECL_DRIVER_API(methodName, paramsDecl, params) \
UTILS_ALWAYS_INLINE inline void methodName(paramsDecl);
#define DECL_DRIVER_API_SYNCHRONOUS(RetType, methodName, paramsDecl, params) \
#define DECL_DRIVER_API_SYNCHRONOUS(RetType, methodName, paramsDecl, params) \
RetType methodName(paramsDecl) override;
#define DECL_DRIVER_API_RETURN(RetType, methodName, paramsDecl, params) \
RetType methodName##S() noexcept override; \
#define DECL_DRIVER_API_RETURN(RetType, methodName, paramsDecl, params) \
RetType methodName##S() noexcept override; \
UTILS_ALWAYS_INLINE inline void methodName##R(RetType, paramsDecl);
#include "private/backend/DriverAPI.inc"
VulkanDriver(VulkanDriver const&) = delete;
VulkanDriver& operator = (VulkanDriver const&) = delete;
VulkanDriver& operator=(VulkanDriver const&) = delete;
private:
template<typename D, typename ... ARGS>
Handle<D> initHandle(ARGS&& ... args) noexcept {
return mHandleAllocator.allocateAndConstruct<D>(std::forward<ARGS>(args) ...);
}
template<typename D>
Handle<D> allocHandle() noexcept {
return mHandleAllocator.allocate<D>();
}
template<typename D, typename B, typename ... ARGS>
typename std::enable_if<std::is_base_of<B, D>::value, D>::type*
construct(Handle<B> const& handle, ARGS&& ... args) noexcept {
return mHandleAllocator.construct<D, B>(handle, std::forward<ARGS>(args) ...);
}
template<typename B, typename D,
typename = typename std::enable_if<std::is_base_of<B, D>::value, D>::type>
void destruct(Handle<B> handle, D const* p) noexcept {
return mHandleAllocator.deallocate(handle, p);
}
template<typename Dp, typename B>
typename std::enable_if_t<
std::is_pointer_v<Dp> &&
std::is_base_of_v<B, typename std::remove_pointer_t<Dp>>, Dp>
handle_cast(Handle<B>& handle) noexcept {
return mHandleAllocator.handle_cast<Dp, B>(handle);
}
template<typename Dp, typename B>
inline typename std::enable_if_t<
std::is_pointer_v<Dp> &&
std::is_base_of_v<B, typename std::remove_pointer_t<Dp>>, Dp>
handle_cast(Handle<B> const& handle) noexcept {
return mHandleAllocator.handle_cast<Dp, B>(handle);
}
template<typename D, typename B>
void destruct(Handle<B> handle) noexcept {
destruct(handle, handle_cast<D const*>(handle));
}
// This version of destruct takes a VulkanContext and calls a terminate(VulkanContext&)
// on the handle before calling the dtor
template<typename Dp, typename B>
void destructBuffer(Handle<B> handle) noexcept {
auto ptr = handle_cast<Dp*>(handle);
ptr->terminate();
mHandleAllocator.deallocate(handle, ptr);
}
inline void setRenderPrimitiveBuffer(Handle<HwRenderPrimitive> rph,
Handle<HwVertexBuffer> vbh, Handle<HwIndexBuffer> ibh);
inline void setRenderPrimitiveBuffer(Handle<HwRenderPrimitive> rph, Handle<HwVertexBuffer> vbh,
Handle<HwIndexBuffer> ibh);
inline void setRenderPrimitiveRange(Handle<HwRenderPrimitive> rph, PrimitiveType pt,
uint32_t offset, uint32_t minIndex, uint32_t maxIndex, uint32_t count);
@@ -150,14 +98,20 @@ private:
VkDebugUtilsMessengerEXT mDebugMessenger = VK_NULL_HANDLE;
VulkanContext mContext = {};
HandleAllocatorVK mHandleAllocator;
VulkanResourceAllocator mResourceAllocator;
VulkanResourceManager mResourceManager;
// Used for resources that are created synchronously and used and destroyed on the backend
// thread.
VulkanThreadSafeResourceManager mThreadSafeResourceManager;
VulkanPipelineCache mPipelineCache;
VulkanDisposer mDisposer;
VulkanStagePool mStagePool;
VulkanFboCache mFramebufferCache;
VulkanSamplerCache mSamplerCache;
VulkanBlitter mBlitter;
VulkanSamplerGroup* mSamplerBindings[VulkanPipelineCache::SAMPLER_BINDING_COUNT] = {};
VulkanReadPixels mReadPixels;
};
} // namespace filament::backend


@@ -46,10 +46,12 @@ static void clampToFramebuffer(VkRect2D* rect, uint32_t fbWidth, uint32_t fbHeig
rect->extent.height = std::max(top - y, 0);
}
VulkanProgram::VulkanProgram(VkDevice device, const Program& builder) noexcept :
HwProgram(builder.getName()), mDevice(device) {
VulkanProgram::VulkanProgram(VkDevice device, const Program& builder) noexcept
: HwProgram(builder.getName()),
VulkanResource(VulkanResourceType::PROGRAM),
mDevice(device) {
auto const& blobs = builder.getShadersSource();
VkShaderModule* modules[2] = { &bundle.vertex, &bundle.fragment };
VkShaderModule* modules[2] = {&bundle.vertex, &bundle.fragment};
// TODO: handle compute shaders.
for (size_t i = 0; i < 2; i++) {
const auto& blob = blobs[i];
@@ -113,8 +115,9 @@ VulkanProgram::VulkanProgram(VkDevice device, const Program& builder) noexcept :
}
}
VulkanProgram::VulkanProgram(VkDevice device, VkShaderModule vs, VkShaderModule fs) noexcept :
mDevice(device) {
VulkanProgram::VulkanProgram(VkDevice device, VkShaderModule vs, VkShaderModule fs) noexcept
: VulkanResource(VulkanResourceType::PROGRAM),
mDevice(device) {
bundle.vertex = vs;
bundle.fragment = fs;
}
@@ -126,7 +129,10 @@ VulkanProgram::~VulkanProgram() {
}
// Creates a special "default" render target (i.e. associated with the swap chain)
VulkanRenderTarget::VulkanRenderTarget() : HwRenderTarget(0, 0), mOffscreen(false), mSamples(1) {}
VulkanRenderTarget::VulkanRenderTarget() :
HwRenderTarget(0, 0),
VulkanResource(VulkanResourceType::RENDER_TARGET),
mOffscreen(false), mSamples(1) {}
void VulkanRenderTarget::bindToSwapChain(VulkanSwapChain& swapChain) {
assert_invariant(!mOffscreen);
@@ -138,11 +144,14 @@ void VulkanRenderTarget::bindToSwapChain(VulkanSwapChain& swapChain) {
}
VulkanRenderTarget::VulkanRenderTarget(VkDevice device, VkPhysicalDevice physicalDevice,
VulkanContext const& context, VmaAllocator allocator,
VulkanCommands* commands, uint32_t width, uint32_t height, uint8_t samples,
VulkanContext const& context, VmaAllocator allocator, VulkanCommands* commands,
uint32_t width, uint32_t height, uint8_t samples,
VulkanAttachment color[MRT::MAX_SUPPORTED_RENDER_TARGET_COUNT],
VulkanAttachment depthStencil[2], VulkanStagePool& stagePool)
: HwRenderTarget(width, height), mOffscreen(true), mSamples(samples) {
: HwRenderTarget(width, height),
VulkanResource(VulkanResourceType::RENDER_TARGET),
mOffscreen(true),
mSamples(samples) {
for (int index = 0; index < MRT::MAX_SUPPORTED_RENDER_TARGET_COUNT; index++) {
mColor[index] = color[index];
}
@@ -166,10 +175,11 @@ VulkanRenderTarget::VulkanRenderTarget(VkDevice device, VkPhysicalDevice physica
if (texture && texture->samples == 1) {
auto msTexture = texture->getSidecar();
if (UTILS_UNLIKELY(!msTexture)) {
msTexture = new VulkanTexture(device, physicalDevice, context,
allocator, commands, texture->target,
((VulkanTexture const*) texture)->levels, texture->format, samples,
texture->width, texture->height, texture->depth, texture->usage, stagePool);
// TODO: This should be allocated with the ResourceAllocator.
msTexture = new VulkanTexture(device, physicalDevice, context, allocator, commands,
texture->target, ((VulkanTexture const*) texture)->levels, texture->format,
samples, texture->width, texture->height, texture->depth, texture->usage,
stagePool, true /* heap allocated */);
texture->setSidecar(msTexture);
}
mMsaaAttachments[index] = {.texture = msTexture};
@@ -198,7 +208,7 @@ VulkanRenderTarget::VulkanRenderTarget(VkDevice device, VkPhysicalDevice physica
msTexture = new VulkanTexture(device, physicalDevice, context, allocator,
commands, depthTexture->target, msLevel, depthTexture->format, samples,
depthTexture->width, depthTexture->height, depthTexture->depth, depthTexture->usage,
stagePool);
stagePool, true /* heap allocated */);
depthTexture->setSidecar(msTexture);
}
@@ -257,16 +267,23 @@ uint8_t VulkanRenderTarget::getColorTargetCount(const VulkanRenderPass& pass) co
}
VulkanVertexBuffer::VulkanVertexBuffer(VulkanContext& context, VulkanStagePool& stagePool,
uint8_t bufferCount, uint8_t attributeCount,
uint32_t elementCount, AttributeArray const& attribs) :
HwVertexBuffer(bufferCount, attributeCount, elementCount, attribs),
buffers(bufferCount, nullptr) {}
VulkanResourceAllocator* allocator, uint8_t bufferCount, uint8_t attributeCount,
uint32_t elementCount, AttributeArray const& attribs)
: HwVertexBuffer(bufferCount, attributeCount, elementCount, attribs),
VulkanResource(VulkanResourceType::VERTEX_BUFFER),
buffers(bufferCount, nullptr),
mResources(allocator) {}
VulkanBufferObject::VulkanBufferObject(VmaAllocator allocator,
VulkanCommands* commands, VulkanStagePool& stagePool, uint32_t byteCount,
BufferObjectBinding bindingType, BufferUsage usage)
void VulkanVertexBuffer::setBuffer(VulkanBufferObject* bufferObject, uint32_t index) {
buffers[index] = &bufferObject->buffer;
mResources.acquire(bufferObject);
}
VulkanBufferObject::VulkanBufferObject(VmaAllocator allocator, VulkanStagePool& stagePool,
uint32_t byteCount, BufferObjectBinding bindingType, BufferUsage usage)
: HwBufferObject(byteCount),
buffer(allocator, commands, stagePool, getBufferObjectUsage(bindingType), byteCount),
VulkanResource(VulkanResourceType::BUFFER_OBJECT),
buffer(allocator, stagePool, getBufferObjectUsage(bindingType), byteCount),
bindingType(bindingType) {}
void VulkanRenderPrimitive::setPrimitiveType(PrimitiveType pt) {
@@ -294,10 +311,14 @@ void VulkanRenderPrimitive::setBuffers(VulkanVertexBuffer* vertexBuffer,
VulkanIndexBuffer* indexBuffer) {
this->vertexBuffer = vertexBuffer;
this->indexBuffer = indexBuffer;
mResources.acquire(vertexBuffer);
mResources.acquire(indexBuffer);
}
VulkanTimerQuery::VulkanTimerQuery(std::tuple<uint32_t, uint32_t> indices)
: mStartingQueryIndex(std::get<0>(indices)), mStoppingQueryIndex(std::get<1>(indices)) {}
: VulkanThreadSafeResource(VulkanResourceType::TIMER_QUERY),
mStartingQueryIndex(std::get<0>(indices)),
mStoppingQueryIndex(std::get<1>(indices)) {}
void VulkanTimerQuery::setFence(std::shared_ptr<VulkanCmdFence> fence) noexcept {
std::unique_lock<utils::Mutex> lock(mFenceMutex);


@@ -14,25 +14,28 @@
* limitations under the License.
*/
#ifndef TNT_FILAMENT_BACKEND_VULKANHANDLES_H
#define TNT_FILAMENT_BACKEND_VULKANHANDLES_H
#ifndef TNT_FILAMENT_BACKEND_VULKANHANDLES_H
#define TNT_FILAMENT_BACKEND_VULKANHANDLES_H
// This needs to be at the top
#include "DriverBase.h"
#include "VulkanDriver.h"
#include "VulkanPipelineCache.h"
#include "VulkanBuffer.h"
#include "VulkanPipelineCache.h"
#include "VulkanResources.h"
#include "VulkanSwapChain.h"
#include "VulkanTexture.h"
#include "VulkanUtility.h"
#include "private/backend/SamplerGroup.h"
#include "utils/Mutex.h"
#include <utils/Mutex.h>
namespace filament::backend {
class VulkanTimestamps;
struct VulkanProgram : public HwProgram {
struct VulkanProgram : public HwProgram, VulkanResource {
VulkanProgram(VkDevice device, const Program& builder) noexcept;
VulkanProgram(VkDevice device, VkShaderModule vs, VkShaderModule fs) noexcept;
~VulkanProgram();
@@ -51,7 +54,7 @@ private:
//
// We use private inheritance to shield clients from the width / height fields in HwRenderTarget,
// which are not representative when this is the default render target.
struct VulkanRenderTarget : private HwRenderTarget {
struct VulkanRenderTarget : private HwRenderTarget, VulkanResource {
// Creates an offscreen render target.
VulkanRenderTarget(VkDevice device, VkPhysicalDevice physicalDevice,
VulkanContext const& context, VmaAllocator allocator,
@@ -84,29 +87,43 @@ private:
uint8_t mSamples : 7;
};
struct VulkanVertexBuffer : public HwVertexBuffer {
struct VulkanBufferObject;
struct VulkanVertexBuffer : public HwVertexBuffer, VulkanResource {
VulkanVertexBuffer(VulkanContext& context, VulkanStagePool& stagePool,
uint8_t bufferCount, uint8_t attributeCount, uint32_t elementCount,
AttributeArray const& attributes);
VulkanResourceAllocator* allocator, uint8_t bufferCount, uint8_t attributeCount,
uint32_t elementCount, AttributeArray const& attributes);
void setBuffer(VulkanBufferObject* bufferObject, uint32_t index);
inline void terminate() {
mResources.clear();
}
utils::FixedCapacityVector<VulkanBuffer const*> buffers;
private:
FixedSizeVulkanResourceManager mResources;
};
struct VulkanIndexBuffer : public HwIndexBuffer {
VulkanIndexBuffer(VmaAllocator allocator, VulkanCommands* commands,
VulkanStagePool& stagePool, uint8_t elementSize, uint32_t indexCount)
struct VulkanIndexBuffer : public HwIndexBuffer, VulkanResource {
VulkanIndexBuffer(VmaAllocator allocator, VulkanStagePool& stagePool, uint8_t elementSize,
uint32_t indexCount)
: HwIndexBuffer(elementSize, indexCount),
buffer(allocator, commands, stagePool, VK_BUFFER_USAGE_INDEX_BUFFER_BIT,
elementSize * indexCount),
VulkanResource(VulkanResourceType::INDEX_BUFFER),
buffer(allocator, stagePool, VK_BUFFER_USAGE_INDEX_BUFFER_BIT, elementSize * indexCount),
indexType(elementSize == 2 ? VK_INDEX_TYPE_UINT16 : VK_INDEX_TYPE_UINT32) {}
void terminate() { buffer.terminate(); }
void terminate() {
buffer.terminate();
}
VulkanBuffer buffer;
const VkIndexType indexType;
};
struct VulkanBufferObject : public HwBufferObject {
VulkanBufferObject(VmaAllocator allocator, VulkanCommands* commands,
VulkanStagePool& stagePool, uint32_t byteCount, BufferObjectBinding bindingType,
BufferUsage usage);
struct VulkanBufferObject : public HwBufferObject, VulkanResource {
VulkanBufferObject(VmaAllocator allocator, VulkanStagePool& stagePool, uint32_t byteCount,
BufferObjectBinding bindingType, BufferUsage usage);
void terminate() {
buffer.terminate();
}
@@ -114,32 +131,45 @@ struct VulkanBufferObject : public HwBufferObject {
const BufferObjectBinding bindingType;
};
struct VulkanSamplerGroup : public HwSamplerGroup {
struct VulkanSamplerGroup : public HwSamplerGroup, VulkanResource {
// NOTE: we have to use out-of-line allocation here because the size of a Handle<> is limited
std::unique_ptr<SamplerGroup> sb; // FIXME: this shouldn't depend on filament::SamplerGroup
explicit VulkanSamplerGroup(size_t size) noexcept : sb(new SamplerGroup(size)) { }
std::unique_ptr<SamplerGroup> sb;// FIXME: this shouldn't depend on filament::SamplerGroup
explicit VulkanSamplerGroup(size_t size) noexcept
: VulkanResource(VulkanResourceType::SAMPLER_GROUP),
sb(new SamplerGroup(size)) {}
};
struct VulkanRenderPrimitive : public HwRenderPrimitive {
struct VulkanRenderPrimitive : public HwRenderPrimitive, VulkanResource {
VulkanRenderPrimitive(VulkanResourceAllocator* allocator)
: VulkanResource(VulkanResourceType::RENDER_PRIMITIVE),
mResources(allocator) {}
~VulkanRenderPrimitive() {
mResources.clear();
}
void setPrimitiveType(PrimitiveType pt);
void setBuffers(VulkanVertexBuffer* vertexBuffer, VulkanIndexBuffer* indexBuffer);
VulkanVertexBuffer* vertexBuffer = nullptr;
VulkanIndexBuffer* indexBuffer = nullptr;
VkPrimitiveTopology primitiveTopology;
private:
FixedSizeVulkanResourceManager mResources;
};
struct VulkanFence : public HwFence {
explicit VulkanFence(const VulkanCommandBuffer& commands) : fence(commands.fence) {}
struct VulkanFence : public HwFence, VulkanResource {
VulkanFence()
: VulkanResource(VulkanResourceType::FENCE) {}
explicit VulkanFence(std::shared_ptr<VulkanCmdFence> fence)
: VulkanResource(VulkanResourceType::FENCE),
fence(fence) {}
std::shared_ptr<VulkanCmdFence> fence;
};
struct VulkanSync : public HwSync {
VulkanSync() = default;
explicit VulkanSync(const VulkanCommandBuffer& commands) : fence(commands.fence) {}
std::shared_ptr<VulkanCmdFence> fence;
};
struct VulkanTimerQuery : public HwTimerQuery {
struct VulkanTimerQuery : public HwTimerQuery, VulkanThreadSafeResource {
explicit VulkanTimerQuery(std::tuple<uint32_t, uint32_t> indices);
~VulkanTimerQuery();


@@ -22,6 +22,7 @@
#include "VulkanConstants.h"
#include "VulkanHandles.h"
#include "VulkanTexture.h"
#include "VulkanUtility.h"
// Vulkan functions often immediately dereference pointers, so it's fine to pass in a pointer
@@ -64,7 +65,10 @@ VulkanPipelineCache::getUsageFlags(uint16_t binding, ShaderStageFlags flags, Usa
return src;
}
VulkanPipelineCache::VulkanPipelineCache() : mCurrentRasterState(createDefaultRasterState()) {
VulkanPipelineCache::VulkanPipelineCache(VulkanResourceAllocator* allocator)
: mCurrentRasterState(createDefaultRasterState()),
mResourceAllocator(allocator),
mPipelineBoundResources(allocator) {
mDummyBufferWriteInfo.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
mDummyBufferWriteInfo.pNext = nullptr;
mDummyBufferWriteInfo.dstArrayElement = 0;
@@ -144,6 +148,15 @@ bool VulkanPipelineCache::bindDescriptors(VkCommandBuffer cmdbuffer) noexcept {
cacheEntry->lastUsed = mCurrentTime;
mBoundDescriptor = mDescriptorRequirements;
// This passes the currently "bound" uniform buffer objects to the pipeline that will be used in
// the draw call.
auto resourceEntry = mDescriptorResources.find(cacheEntry->id);
if (resourceEntry == mDescriptorResources.end()) {
mDescriptorResources[cacheEntry->id]
= std::make_unique<VulkanAcquireOnlyResourceManager>(mResourceAllocator);
resourceEntry = mDescriptorResources.find(cacheEntry->id);
}
resourceEntry->second->acquire(&mPipelineBoundResources);
vkCmdBindDescriptorSets(cmdbuffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
getOrCreatePipelineLayout()->handle, 0, VulkanPipelineCache::DESCRIPTOR_TYPE_COUNT,
@@ -152,7 +165,9 @@ bool VulkanPipelineCache::bindDescriptors(VkCommandBuffer cmdbuffer) noexcept {
return true;
}
bool VulkanPipelineCache::bindPipeline(VkCommandBuffer cmdbuffer) noexcept {
bool VulkanPipelineCache::bindPipeline(VulkanCommandBuffer* commands) noexcept {
VkCommandBuffer const cmdbuffer = commands->cmdbuffer;
PipelineMap::iterator pipelineIter = mPipelines.find(mPipelineRequirements);
// Check if the required pipeline is already bound.
@@ -191,7 +206,10 @@ void VulkanPipelineCache::bindScissor(VkCommandBuffer cmdbuffer, VkRect2D scisso
VulkanPipelineCache::DescriptorCacheEntry* VulkanPipelineCache::createDescriptorSets() noexcept {
PipelineLayoutCacheEntry* layoutCacheEntry = getOrCreatePipelineLayout();
DescriptorCacheEntry descriptorCacheEntry = { .pipelineLayout = mPipelineRequirements.layout };
DescriptorCacheEntry descriptorCacheEntry = {
.pipelineLayout = mPipelineRequirements.layout,
.id = mDescriptorCacheEntryCount++,
};
// Each of the arenas for this particular layout are guaranteed to have the same size. Check
// the first arena to see if any descriptor sets are available that can be re-claimed. If not,
@@ -234,9 +252,9 @@ VulkanPipelineCache::DescriptorCacheEntry* VulkanPipelineCache::createDescriptor
// Rewrite every binding in the new descriptor sets.
VkDescriptorBufferInfo descriptorBuffers[UBUFFER_BINDING_COUNT];
VkDescriptorImageInfo descriptorSamplers[SAMPLER_BINDING_COUNT];
VkDescriptorImageInfo descriptorInputAttachments[TARGET_BINDING_COUNT];
VkDescriptorImageInfo descriptorInputAttachments[INPUT_ATTACHMENT_COUNT];
VkWriteDescriptorSet descriptorWrites[UBUFFER_BINDING_COUNT + SAMPLER_BINDING_COUNT +
TARGET_BINDING_COUNT];
INPUT_ATTACHMENT_COUNT];
uint32_t nwrites = 0;
VkWriteDescriptorSet* writes = descriptorWrites;
nwrites = 0;
@@ -286,9 +304,9 @@ VulkanPipelineCache::DescriptorCacheEntry* VulkanPipelineCache::createDescriptor
writeInfo.dstBinding = binding;
}
}
for (uint32_t binding = 0; binding < TARGET_BINDING_COUNT; binding++) {
VkWriteDescriptorSet& writeInfo = writes[nwrites++];
for (uint32_t binding = 0; binding < INPUT_ATTACHMENT_COUNT; binding++) {
if (mDescriptorRequirements.inputAttachments[binding].imageView) {
VkWriteDescriptorSet& writeInfo = writes[nwrites++];
VkDescriptorImageInfo& imageInfo = descriptorInputAttachments[binding];
imageInfo = mDescriptorRequirements.inputAttachments[binding];
writeInfo.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
@@ -299,13 +317,11 @@ VulkanPipelineCache::DescriptorCacheEntry* VulkanPipelineCache::createDescriptor
writeInfo.pImageInfo = &imageInfo;
writeInfo.pBufferInfo = nullptr;
writeInfo.pTexelBufferView = nullptr;
} else {
writeInfo = mDummyTargetWriteInfo;
assert_invariant(mDummyTargetInfo.imageView);
writeInfo.dstSet = descriptorCacheEntry.handles[2];
writeInfo.dstBinding = binding;
}
writeInfo.dstSet = descriptorCacheEntry.handles[2];
writeInfo.dstBinding = binding;
}
vkUpdateDescriptorSets(mDevice, nwrites, writes, 0, nullptr);
return &mDescriptorSets.emplace(mDescriptorRequirements, descriptorCacheEntry).first.value();
@@ -518,14 +534,14 @@ VulkanPipelineCache::PipelineLayoutCacheEntry* VulkanPipelineCache::getOrCreateP
vkCreateDescriptorSetLayout(mDevice, &dlinfo, VKALLOC, &cacheEntry.descriptorSetLayouts[1]);
// Next create the descriptor set layout for input attachments.
VkDescriptorSetLayoutBinding tbindings[TARGET_BINDING_COUNT];
VkDescriptorSetLayoutBinding tbindings[INPUT_ATTACHMENT_COUNT];
binding.descriptorType = VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT;
binding.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT;
for (uint32_t i = 0; i < TARGET_BINDING_COUNT; i++) {
for (uint32_t i = 0; i < INPUT_ATTACHMENT_COUNT; i++) {
binding.binding = i;
tbindings[i] = binding;
}
dlinfo.bindingCount = TARGET_BINDING_COUNT;
dlinfo.bindingCount = INPUT_ATTACHMENT_COUNT;
dlinfo.pBindings = tbindings;
vkCreateDescriptorSetLayout(mDevice, &dlinfo, VKALLOC, &cacheEntry.descriptorSetLayouts[2]);
@@ -604,13 +620,19 @@ void VulkanPipelineCache::unbindImageView(VkImageView imageView) noexcept {
}
}
void VulkanPipelineCache::bindUniformBuffer(uint32_t bindingIndex, VkBuffer uniformBuffer,
void VulkanPipelineCache::bindUniformBufferObject(uint32_t bindingIndex,
VulkanBufferObject* bufferObject, VkDeviceSize offset, VkDeviceSize size) noexcept {
bindUniformBuffer(bindingIndex, bufferObject->buffer.getGpuBuffer(), offset, size);
mPipelineBoundResources.acquire(bufferObject);
}
void VulkanPipelineCache::bindUniformBuffer(uint32_t bindingIndex, VkBuffer buffer,
VkDeviceSize offset, VkDeviceSize size) noexcept {
ASSERT_POSTCONDITION(bindingIndex < UBUFFER_BINDING_COUNT,
"Uniform bindings overflow: index = %d, capacity = %d.",
bindingIndex, UBUFFER_BINDING_COUNT);
"Uniform bindings overflow: index = %d, capacity = %d.", bindingIndex,
UBUFFER_BINDING_COUNT);
auto& key = mDescriptorRequirements;
key.uniformBuffers[bindingIndex] = uniformBuffer;
key.uniformBuffers[bindingIndex] = buffer;
if (size == VK_WHOLE_SIZE) {
size = WHOLE_SIZE;
@@ -624,18 +646,21 @@ void VulkanPipelineCache::bindUniformBuffer(uint32_t bindingIndex, VkBuffer unif
}
void VulkanPipelineCache::bindSamplers(VkDescriptorImageInfo samplers[SAMPLER_BINDING_COUNT],
UsageFlags flags) noexcept {
VulkanTexture* textures[SAMPLER_BINDING_COUNT], UsageFlags flags) noexcept {
for (uint32_t bindingIndex = 0; bindingIndex < SAMPLER_BINDING_COUNT; bindingIndex++) {
mDescriptorRequirements.samplers[bindingIndex] = samplers[bindingIndex];
if (textures[bindingIndex]) {
mPipelineBoundResources.acquire(textures[bindingIndex]);
}
}
mPipelineRequirements.layout = flags;
}
void VulkanPipelineCache::bindInputAttachment(uint32_t bindingIndex,
VkDescriptorImageInfo targetInfo) noexcept {
ASSERT_POSTCONDITION(bindingIndex < TARGET_BINDING_COUNT,
ASSERT_POSTCONDITION(bindingIndex < INPUT_ATTACHMENT_COUNT,
"Input attachment bindings overflow: index = %d, capacity = %d.",
bindingIndex, TARGET_BINDING_COUNT);
bindingIndex, INPUT_ATTACHMENT_COUNT);
mDescriptorRequirements.inputAttachments[bindingIndex] = targetInfo;
}
@@ -645,6 +670,7 @@ void VulkanPipelineCache::terminate() noexcept {
for (auto& iter : mPipelines) {
vkDestroyPipeline(mDevice, iter.second.handle, VKALLOC);
}
mPipelineBoundResources.clear();
mPipelines.clear();
mBoundPipeline = {};
vmaDestroyBuffer(mAllocator, mDummyBuffer, mDummyMemory);
@@ -678,6 +704,7 @@ void VulkanPipelineCache::onCommandBuffer(const VulkanCommandBuffer& cmdbuffer)
arenas[i].push_back(cacheEntry.handles[i]);
}
++mDescriptorArenasCount;
mDescriptorResources.erase(cacheEntry.id);
iter = mDescriptorSets.erase(iter);
} else {
++iter;
@@ -738,6 +765,10 @@ void VulkanPipelineCache::onCommandBuffer(const VulkanCommandBuffer& cmdbuffer)
vkDestroyDescriptorPool(mDevice, pool, VKALLOC);
}
mExtinctDescriptorPools.clear();
for (auto const& entry : mExtinctDescriptorBundles) {
mDescriptorResources.erase(entry.id);
}
mExtinctDescriptorBundles.clear();
}
}
@@ -757,7 +788,7 @@ VkDescriptorPool VulkanPipelineCache::createDescriptorPool(uint32_t size) const
poolSizes[1].type = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
poolSizes[1].descriptorCount = poolInfo.maxSets * SAMPLER_BINDING_COUNT;
poolSizes[2].type = VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT;
poolSizes[2].descriptorCount = poolInfo.maxSets * TARGET_BINDING_COUNT;
poolSizes[2].descriptorCount = poolInfo.maxSets * INPUT_ATTACHMENT_COUNT;
VkDescriptorPool pool;
const UTILS_UNUSED VkResult result = vkCreateDescriptorPool(mDevice, &poolInfo, VKALLOC, &pool);
@@ -800,6 +831,10 @@ void VulkanPipelineCache::destroyLayoutsAndDescriptors() noexcept {
mExtinctDescriptorPools.clear();
mExtinctDescriptorBundles.clear();
// Both mDescriptorSets and mExtinctDescriptorBundles have been cleared, so it's safe to call
// clear() on mDescriptorResources.
mDescriptorResources.clear();
mBoundDescriptor = {};
}
@@ -864,7 +899,7 @@ bool VulkanPipelineCache::DescEqual::operator()(const DescriptorKey& k1,
return false;
}
}
for (uint32_t i = 0; i < TARGET_BINDING_COUNT; i++) {
for (uint32_t i = 0; i < INPUT_ATTACHMENT_COUNT; i++) {
if (k1.inputAttachments[i].imageView != k2.inputAttachments[i].imageView ||
k1.inputAttachments[i].imageLayout != k2.inputAttachments[i].imageLayout) {
return false;


@@ -28,9 +28,11 @@
#include <utils/compiler.h>
#include <utils/Hash.h>
#include <list>
#include <tsl/robin_map.h>
#include <type_traits>
#include <vector>
#include <unordered_map>
#include "VulkanCommands.h"
@@ -41,6 +43,9 @@ VK_DEFINE_HANDLE(VmaPool)
namespace filament::backend {
struct VulkanProgram;
struct VulkanBufferObject;
struct VulkanTexture;
class VulkanResourceAllocator;
// VulkanPipelineCache manages a cache of descriptor sets and pipelines.
//
@@ -58,7 +63,11 @@ public:
static constexpr uint32_t UBUFFER_BINDING_COUNT = Program::UNIFORM_BINDING_COUNT;
static constexpr uint32_t SAMPLER_BINDING_COUNT = MAX_SAMPLER_COUNT;
static constexpr uint32_t TARGET_BINDING_COUNT = MRT::MAX_SUPPORTED_RENDER_TARGET_COUNT;
// We assume only one possible input attachment between two subpasses. See also the subpasses
// definition in VulkanFboCache.
static constexpr uint32_t INPUT_ATTACHMENT_COUNT = 1;
static constexpr uint32_t SHADER_MODULE_COUNT = 2;
static constexpr uint32_t VERTEX_ATTRIBUTE_COUNT = MAX_VERTEX_ATTRIBUTE_COUNT;
@@ -127,7 +136,7 @@ public:
// Upon construction, the pipeCache initializes some internal state but does not make any Vulkan
// calls. On destruction it will free any cached Vulkan objects that haven't already been freed.
VulkanPipelineCache();
VulkanPipelineCache(VulkanResourceAllocator* allocator);
~VulkanPipelineCache();
void setDevice(VkDevice device, VmaAllocator allocator);
@@ -137,7 +146,7 @@ public:
// Creates a new pipeline if necessary and binds it using vkCmdBindPipeline.
// Returns false if an error occurred.
bool bindPipeline(VkCommandBuffer cmdbuffer) noexcept;
bool bindPipeline(VulkanCommandBuffer* commands) noexcept;
// Sets up a new scissor rectangle if it has been dirtied.
void bindScissor(VkCommandBuffer cmdbuffer, VkRect2D scissor) noexcept;
@@ -147,9 +156,12 @@ public:
void bindRasterState(const RasterState& rasterState) noexcept;
void bindRenderPass(VkRenderPass renderPass, int subpassIndex) noexcept;
void bindPrimitiveTopology(VkPrimitiveTopology topology) noexcept;
void bindUniformBuffer(uint32_t bindingIndex, VkBuffer uniformBuffer,
void bindUniformBufferObject(uint32_t bindingIndex, VulkanBufferObject* bufferObject,
VkDeviceSize offset = 0, VkDeviceSize size = VK_WHOLE_SIZE) noexcept;
void bindSamplers(VkDescriptorImageInfo samplers[SAMPLER_BINDING_COUNT], UsageFlags flags) noexcept;
void bindUniformBuffer(uint32_t bindingIndex, VkBuffer buffer,
VkDeviceSize offset = 0, VkDeviceSize size = VK_WHOLE_SIZE) noexcept;
void bindSamplers(VkDescriptorImageInfo samplers[SAMPLER_BINDING_COUNT],
VulkanTexture* textures[SAMPLER_BINDING_COUNT], UsageFlags flags) noexcept;
void bindInputAttachment(uint32_t bindingIndex, VkDescriptorImageInfo imageInfo) noexcept;
void bindVertexArray(const VertexArray& varray) noexcept;
@@ -181,17 +193,22 @@ public:
mDummyTargetInfo.imageView = imageView;
}
// Acquires a resource to be bound to the current pipeline. The ownership of the resource
// will be transferred to the corresponding pipeline when the pipeline is bound.
void acquireResource(VulkanResource* resource) {
mPipelineBoundResources.acquire(resource);
}
inline RasterState getCurrentRasterState() const noexcept {
return mCurrentRasterState;
return mCurrentRasterState;
}
// We need to update this outside of bindRasterState due to VulkanDriver::draw.
inline void setCurrentRasterState(RasterState const& rasterState) noexcept {
mCurrentRasterState = rasterState;
mCurrentRasterState = rasterState;
}
private:
// PIPELINE LAYOUT CACHE KEY
// -------------------------
@@ -298,17 +315,17 @@ private:
// Represents all the Vulkan state that comprises a bound descriptor set.
struct DescriptorKey {
VkBuffer uniformBuffers[UBUFFER_BINDING_COUNT]; // 80 0
DescriptorImageInfo samplers[SAMPLER_BINDING_COUNT]; // 1488 80
DescriptorImageInfo inputAttachments[TARGET_BINDING_COUNT]; // 192 1568
uint32_t uniformBufferOffsets[UBUFFER_BINDING_COUNT]; // 40 1760
uint32_t uniformBufferSizes[UBUFFER_BINDING_COUNT]; // 40 1080
VkBuffer uniformBuffers[UBUFFER_BINDING_COUNT]; // 80 0
DescriptorImageInfo samplers[SAMPLER_BINDING_COUNT]; // 1488 80
DescriptorImageInfo inputAttachments[INPUT_ATTACHMENT_COUNT]; // 24 1568
uint32_t uniformBufferOffsets[UBUFFER_BINDING_COUNT]; // 40 1592
uint32_t uniformBufferSizes[UBUFFER_BINDING_COUNT]; // 40 1632
};
static_assert(offsetof(DescriptorKey, samplers) == 80);
static_assert(offsetof(DescriptorKey, inputAttachments) == 1568);
static_assert(offsetof(DescriptorKey, uniformBufferOffsets) == 1760);
static_assert(offsetof(DescriptorKey, uniformBufferSizes) == 1800);
static_assert(sizeof(DescriptorKey) == 1840, "DescriptorKey must not have implicit padding.");
static_assert(offsetof(DescriptorKey, uniformBufferOffsets) == 1592);
static_assert(offsetof(DescriptorKey, uniformBufferSizes) == 1632);
static_assert(sizeof(DescriptorKey) == 1672, "DescriptorKey must not have implicit padding.");
using DescHashFn = utils::hash::MurmurHashFn<DescriptorKey>;
@@ -333,7 +350,10 @@ private:
std::array<VkDescriptorSet, DESCRIPTOR_TYPE_COUNT> handles;
Timestamp lastUsed;
PipelineLayoutKey pipelineLayout;
uint32_t id;
};
uint32_t mDescriptorCacheEntryCount = 0;
struct PipelineCacheEntry {
VkPipeline handle;
@@ -368,12 +388,15 @@ private:
PipelineLayoutKeyHashFn, PipelineLayoutKeyEqual>;
using PipelineMap = tsl::robin_map<PipelineKey, PipelineCacheEntry,
PipelineHashFn, PipelineEqual>;
using DescriptorMap = tsl::robin_map<DescriptorKey, DescriptorCacheEntry,
DescHashFn, DescEqual>;
using DescriptorMap
= tsl::robin_map<DescriptorKey, DescriptorCacheEntry, DescHashFn, DescEqual>;
using DescriptorResourceMap
= std::unordered_map<uint32_t, std::unique_ptr<VulkanAcquireOnlyResourceManager>>;
PipelineLayoutMap mPipelineLayouts;
PipelineMap mPipelines;
DescriptorMap mDescriptorSets;
DescriptorResourceMap mDescriptorResources;
// These helpers all return unstable pointers that should not be stored.
DescriptorCacheEntry* createDescriptorSets() noexcept;
@@ -421,8 +444,8 @@ private:
// After a growth event (i.e. when the VkDescriptorPool is replaced with a bigger version), all
// currently used descriptors are moved into the "extinct" sets so that they can be safely
// destroyed a few frames later.
std::vector<VkDescriptorPool> mExtinctDescriptorPools;
std::vector<DescriptorCacheEntry> mExtinctDescriptorBundles;
std::list<VkDescriptorPool> mExtinctDescriptorPools;
std::list<DescriptorCacheEntry> mExtinctDescriptorBundles;
VkDescriptorBufferInfo mDummyBufferInfo = {};
VkWriteDescriptorSet mDummyBufferWriteInfo = {};
@@ -431,6 +454,9 @@ private:
VkBuffer mDummyBuffer;
VmaAllocation mDummyMemory;
VulkanResourceAllocator* mResourceAllocator;
VulkanAcquireOnlyResourceManager mPipelineBoundResources;
};
} // namespace filament::backend
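
Note on the `DescriptorKey` offset assertions above: they presumably exist because the key is hashed over its raw bytes, so any implicit padding would feed indeterminate bytes into the hash. A minimal sketch of the same guard, using a hypothetical `ExampleKey` and an FNV-1a byte hash swapped in for the MurmurHash functor used in the diff (both are assumptions for illustration, not Filament code):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a cache key; this is not the real DescriptorKey.
struct ExampleKey {
    std::uint64_t buffers[2];  // 16 bytes at offset 0
    std::uint32_t offsets[2];  //  8 bytes at offset 16
    std::uint32_t sizes[2];    //  8 bytes at offset 24
};

// Compile-time layout checks: if a field is added or resized without updating
// these numbers, the build fails instead of silently hashing padding bytes.
static_assert(offsetof(ExampleKey, offsets) == 16, "unexpected offset for offsets");
static_assert(offsetof(ExampleKey, sizes) == 24, "unexpected offset for sizes");
static_assert(sizeof(ExampleKey) == 32, "ExampleKey must not have implicit padding.");

// FNV-1a over the raw bytes of the key (illustrative replacement for MurmurHash).
inline std::uint32_t hashKeyBytes(const ExampleKey& key) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&key);
    std::uint32_t hash = 2166136261u;
    for (std::size_t i = 0; i < sizeof(ExampleKey); i++) {
        hash = (hash ^ bytes[i]) * 16777619u;
    }
    return hash;
}
```

Shrinking `inputAttachments` from `TARGET_BINDING_COUNT` to `INPUT_ATTACHMENT_COUNT` entries is what moves the later offsets in the diff, which is why every assertion after it had to change in the same commit.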


@@ -0,0 +1,345 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "VulkanReadPixels.h"
#include "DataReshaper.h"
#include "VulkanCommands.h"
#include "VulkanHandles.h"
#include "VulkanImageUtility.h"
#include "VulkanTexture.h"
#include <utils/Log.h>
using namespace bluevk;
namespace filament::backend {
using ImgUtil = VulkanImageUtility;
using TaskHandler = VulkanReadPixels::TaskHandler;
using WorkloadFunc = TaskHandler::WorkloadFunc;
using OnCompleteFunc = TaskHandler::OnCompleteFunc;
TaskHandler::TaskHandler()
: mShouldStop(false),
mThread(&TaskHandler::loop, this) {}
void TaskHandler::post(WorkloadFunc&& workload, OnCompleteFunc&& oncomplete) {
assert_invariant(!mShouldStop);
{
std::unique_lock<std::mutex> lock(mTaskQueueMutex);
mTaskQueue.push(std::make_pair(std::move(workload), std::move(oncomplete)));
}
mHasTaskCondition.notify_one();
}
void TaskHandler::drain() {
assert_invariant(!mShouldStop);
std::mutex syncPointMutex;
std::condition_variable syncCondition;
bool done = false;
post([] {},
[&syncPointMutex, &syncCondition, &done] {
{
std::unique_lock<std::mutex> lock(syncPointMutex);
done = true;
syncCondition.notify_one();
}
});
std::unique_lock<std::mutex> lock(syncPointMutex);
syncCondition.wait(lock, [&done] { return done; });
}
void TaskHandler::shutdown() {
{
std::unique_lock<std::mutex> lock(mTaskQueueMutex);
mShouldStop = true;
}
mHasTaskCondition.notify_one();
mThread.join();
ASSERT_POSTCONDITION(mTaskQueue.empty(),
"ReadPixels handler has tasks in the queue after shutdown");
}
void TaskHandler::loop() {
while (true) {
std::unique_lock<std::mutex> lock(mTaskQueueMutex);
mHasTaskCondition.wait(lock, [this] { return !mTaskQueue.empty() || mShouldStop; });
if (mShouldStop) {
break;
}
auto [workload, oncomplete] = mTaskQueue.front();
mTaskQueue.pop();
lock.unlock();
workload();
oncomplete();
}
// Clean-up: we still need to call oncomplete for clients to do clean-up.
while (true) {
std::unique_lock<std::mutex> lock(mTaskQueueMutex);
if (mTaskQueue.empty()) {
break;
}
auto [workload, oncomplete] = mTaskQueue.front();
mTaskQueue.pop();
lock.unlock();
oncomplete();
}
}
void VulkanReadPixels::terminate() noexcept {
assert_invariant(mDevice != VK_NULL_HANDLE);
if (mCommandPool == VK_NULL_HANDLE) {
return;
}
vkDestroyCommandPool(mDevice, mCommandPool, VKALLOC);
mDevice = VK_NULL_HANDLE;
mTaskHandler->shutdown();
mTaskHandler.reset();
}
VulkanReadPixels::VulkanReadPixels(VkDevice device)
: mDevice(device) {}
void VulkanReadPixels::run(VulkanRenderTarget const* srcTarget, uint32_t const x, uint32_t const y,
uint32_t const width, uint32_t const height, uint32_t const graphicsQueueFamilyIndex,
PixelBufferDescriptor&& pbd, SelecteMemoryFunction const& selectMemoryFunc,
OnReadCompleteFunction const& readCompleteFunc) {
assert_invariant(mDevice != VK_NULL_HANDLE);
VkDevice& device = mDevice;
if (mCommandPool == VK_NULL_HANDLE) {
// Create a command pool if one has not been created.
VkCommandPoolCreateInfo createInfo = {
.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO,
.flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT
| VK_COMMAND_POOL_CREATE_TRANSIENT_BIT,
.queueFamilyIndex = graphicsQueueFamilyIndex,
};
vkCreateCommandPool(device, &createInfo, VKALLOC, &mCommandPool);
}
// We don't create a task handler (start a thread) unless readPixels is called.
if (!mTaskHandler) {
mTaskHandler = std::make_unique<TaskHandler>();
}
VkCommandPool& cmdpool = mCommandPool;
VulkanTexture* srcTexture = srcTarget->getColor(0).texture;
assert_invariant(srcTexture);
VkFormat const srcFormat = srcTexture->getVkFormat();
bool const swizzle
= srcFormat == VK_FORMAT_B8G8R8A8_UNORM || srcFormat == VK_FORMAT_B8G8R8A8_SRGB;
// Create a host visible, linearly tiled image as a staging area.
VkImageCreateInfo const imageInfo{
.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
.imageType = VK_IMAGE_TYPE_2D,
.format = srcFormat,
.extent = {width, height, 1},
.mipLevels = 1,
.arrayLayers = 1,
.samples = VK_SAMPLE_COUNT_1_BIT,
.tiling = VK_IMAGE_TILING_LINEAR,
.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT,
.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
};
VkImage stagingImage;
vkCreateImage(device, &imageInfo, VKALLOC, &stagingImage);
#if FILAMENT_VULKAN_VERBOSE
utils::slog.d << "readPixels created image=" << stagingImage
<< " to copy from image=" << srcTexture->getVkImage()
<< " src-layout=" << srcTexture->getLayout(0, 0) << utils::io::endl;
#endif
VkMemoryRequirements memReqs;
VkDeviceMemory stagingMemory;
vkGetImageMemoryRequirements(device, stagingImage, &memReqs);
uint32_t memoryTypeIndex = selectMemoryFunc(memReqs.memoryTypeBits,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
| VK_MEMORY_PROPERTY_HOST_CACHED_BIT);
// If VK_MEMORY_PROPERTY_HOST_CACHED_BIT is not supported, we try only
// HOST_VISIBLE+HOST_COHERENT. HOST_CACHED helps a lot with readpixels performance.
if (memoryTypeIndex >= VK_MAX_MEMORY_TYPES) {
memoryTypeIndex = selectMemoryFunc(memReqs.memoryTypeBits,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);
utils::slog.w
<< "readPixels is slow because VK_MEMORY_PROPERTY_HOST_CACHED_BIT is not available"
<< utils::io::endl;
}
ASSERT_POSTCONDITION(memoryTypeIndex < VK_MAX_MEMORY_TYPES,
"VulkanReadPixels: unable to find a memory type that meets requirements.");
VkMemoryAllocateInfo const allocInfo = {
.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
.allocationSize = memReqs.size,
.memoryTypeIndex = memoryTypeIndex,
};
vkAllocateMemory(device, &allocInfo, VKALLOC, &stagingMemory);
vkBindImageMemory(device, stagingImage, stagingMemory, 0);
VkCommandBuffer cmdbuffer;
VkCommandBufferAllocateInfo const allocateInfo{
.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO,
.commandPool = cmdpool,
.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
.commandBufferCount = 1,
};
vkAllocateCommandBuffers(device, &allocateInfo, &cmdbuffer);
VkCommandBufferBeginInfo const binfo{
.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
};
vkBeginCommandBuffer(cmdbuffer, &binfo);
ImgUtil::transitionLayout(cmdbuffer, {
.image = stagingImage,
.oldLayout = VulkanLayout::UNDEFINED,
.newLayout = VulkanLayout::TRANSFER_DST,
.subresources = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.baseMipLevel = 0,
.levelCount = 1,
.baseArrayLayer = 0,
.layerCount = 1,
},
});
VulkanAttachment const srcAttachment = srcTarget->getColor(0);
const VkImageSubresourceRange srcRange
= srcAttachment.getSubresourceRange(VK_IMAGE_ASPECT_COLOR_BIT);
srcTexture->transitionLayout(cmdbuffer, srcRange, VulkanLayout::TRANSFER_SRC);
VkImageCopy const imageCopyRegion = {
.srcSubresource = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.mipLevel = srcAttachment.level,
.baseArrayLayer = srcAttachment.layer,
.layerCount = 1,
},
.srcOffset = {
.x = (int32_t)x,
.y = (int32_t)(srcTarget->getExtent().height - (height + y)),
},
.dstSubresource = {
.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
.layerCount = 1,
},
.extent = {
.width = width,
.height = height,
.depth = 1,
},
};
// Perform the copy into the staging area. At this point we know that the src
// layout is TRANSFER_SRC and the staging area is TRANSFER_DST.
UTILS_UNUSED_IN_RELEASE VkExtent2D srcExtent = srcAttachment.getExtent2D();
assert_invariant(imageCopyRegion.srcOffset.x + imageCopyRegion.extent.width <= srcExtent.width);
assert_invariant(
imageCopyRegion.srcOffset.y + imageCopyRegion.extent.height <= srcExtent.height);
vkCmdCopyImage(cmdbuffer, srcAttachment.getImage(),
ImgUtil::getVkLayout(VulkanLayout::TRANSFER_SRC), stagingImage,
ImgUtil::getVkLayout(VulkanLayout::TRANSFER_DST), 1, &imageCopyRegion);
// Restore the source image layout.
srcTexture->transitionLayout(cmdbuffer, srcRange, VulkanLayout::COLOR_ATTACHMENT);
vkEndCommandBuffer(cmdbuffer);
VkQueue queue;
vkGetDeviceQueue(device, graphicsQueueFamilyIndex, 0, &queue);
VkFence readCompleteFence;
VkFenceCreateInfo const fenceCreateInfo{
.sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO,
};
vkCreateFence(device, &fenceCreateInfo, VKALLOC, &readCompleteFence);
VkSubmitInfo const submitInfo{
.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
.waitSemaphoreCount = 0,
.pWaitSemaphores = VK_NULL_HANDLE,
.pWaitDstStageMask = VK_NULL_HANDLE,
.commandBufferCount = 1,
.pCommandBuffers = &cmdbuffer,
.signalSemaphoreCount = 0,
.pSignalSemaphores = VK_NULL_HANDLE,
};
vkQueueSubmit(queue, 1, &submitInfo, readCompleteFence);
auto* const pUserBuffer = new PixelBufferDescriptor(std::move(pbd));
auto cleanPbdFunc = [pUserBuffer, readCompleteFunc]() {
PixelBufferDescriptor& p = *pUserBuffer;
readCompleteFunc(std::move(p));
delete pUserBuffer;
};
auto waitFenceFunc = [device, width, height, swizzle, srcFormat, stagingImage, stagingMemory,
cmdpool, cmdbuffer, pUserBuffer,
fence = readCompleteFence]() mutable {
VkResult status = vkWaitForFences(device, 1, &fence, VK_TRUE, UINT64_MAX);
// The wait did not succeed. Report the error and give up on this readback.
if (status != VK_SUCCESS) {
utils::slog.e << "Failed to wait for readPixels fence" << utils::io::endl;
return;
}
PixelBufferDescriptor& p = *pUserBuffer;
VkImageSubresource subResource{.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT};
VkSubresourceLayout subResourceLayout;
vkGetImageSubresourceLayout(device, stagingImage, &subResource, &subResourceLayout);
// Map image memory so that we can start copying from it.
uint8_t const* srcPixels;
vkMapMemory(device, stagingMemory, 0, VK_WHOLE_SIZE, 0, (void**) &srcPixels);
srcPixels += subResourceLayout.offset;
if (!DataReshaper::reshapeImage(&p, getComponentType(srcFormat),
getComponentCount(srcFormat), srcPixels,
static_cast<int>(subResourceLayout.rowPitch), static_cast<int>(width),
static_cast<int>(height), swizzle)) {
utils::slog.e << "Unsupported PixelDataFormat or PixelDataType" << utils::io::endl;
}
vkUnmapMemory(device, stagingMemory);
vkDestroyImage(device, stagingImage, VKALLOC);
vkFreeMemory(device, stagingMemory, VKALLOC);
vkDestroyFence(device, fence, VKALLOC);
vkFreeCommandBuffers(device, cmdpool, 1, &cmdbuffer);
};
mTaskHandler->post(std::move(waitFenceFunc), std::move(cleanPbdFunc));
}
void VulkanReadPixels::runUntilComplete() noexcept {
if (!mTaskHandler) {
return;
}
mTaskHandler->drain();
}
}// namespace filament::backend
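
Note on `TaskHandler` above: `drain()` works by posting an empty sentinel task and blocking until its completion callback fires, which relies on the queue being processed in order by a single thread. A minimal, self-contained sketch of that pattern follows. `SimpleTaskHandler` and its member names are hypothetical, and unlike the code in the diff it moves tasks out of the queue rather than copying them and makes `shutdown()` idempotent.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical, simplified worker; the names are illustrative, not Filament API.
class SimpleTaskHandler {
public:
    using Func = std::function<void()>;
    using Task = std::pair<Func, Func>;  // (workload, oncomplete)

    SimpleTaskHandler() : mThread(&SimpleTaskHandler::loop, this) {}
    ~SimpleTaskHandler() { shutdown(); }

    void post(Func workload, Func oncomplete) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mQueue.emplace(std::move(workload), std::move(oncomplete));
        }
        mCondition.notify_one();
    }

    // Blocks until every task posted before this call has finished.
    void drain() {
        std::mutex syncMutex;
        std::condition_variable syncCondition;
        bool done = false;
        post([] {}, [&] {
            std::lock_guard<std::mutex> lock(syncMutex);
            done = true;
            syncCondition.notify_one();
        });
        std::unique_lock<std::mutex> lock(syncMutex);
        syncCondition.wait(lock, [&] { return done; });
    }

    // Stops the thread. Pending workloads are skipped, but their oncomplete
    // callbacks still run so that clients can release what they own.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            if (mStop) return;
            mStop = true;
        }
        mCondition.notify_one();
        mThread.join();
    }

private:
    void loop() {
        std::unique_lock<std::mutex> lock(mMutex);
        while (true) {
            mCondition.wait(lock, [this] { return mStop || !mQueue.empty(); });
            if (mStop) break;
            Task task = std::move(mQueue.front());
            mQueue.pop();
            lock.unlock();  // never run client code while holding the queue lock
            task.first();
            task.second();
            lock.lock();
        }
        while (!mQueue.empty()) {
            Task task = std::move(mQueue.front());
            mQueue.pop();
            lock.unlock();
            task.second();
            lock.lock();
        }
    }

    bool mStop = false;
    std::mutex mMutex;
    std::condition_variable mCondition;
    std::queue<Task> mQueue;
    std::thread mThread;  // declared last so the other members exist before it starts
};
```

The sentinel approach keeps `drain()` free of any per-task bookkeeping; the trade-off is that it only works while a single consumer thread preserves FIFO order.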


@@ -0,0 +1,93 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef TNT_FILAMENT_BACKEND_VULKANREADPIXELS_H
#define TNT_FILAMENT_BACKEND_VULKANREADPIXELS_H
#include "private/backend/Driver.h"
#include <bluevk/BlueVK.h>
#include <math/vec4.h>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <set>
#include <thread>
#include <vector>
namespace filament::backend {
struct VulkanRenderTarget;
class VulkanReadPixels {
public:
// A helper class that runs tasks on a separate thread.
class TaskHandler {
public:
using WorkloadFunc = std::function<void()>;
using OnCompleteFunc = std::function<void()>;
using Task = std::pair<WorkloadFunc, OnCompleteFunc>;
TaskHandler();
// In addition to the workload that the handler will call, the client must also provide an
// oncomplete function that the handler will call either when the workload completes or when
// the handler is shut down (so that we can clean up even when the task was not carried out).
void post(WorkloadFunc&& workload, OnCompleteFunc&& oncomplete);
// This will block until all of the tasks are done.
void drain();
// This will quit without running the workloads, but oncomplete callbacks will still be
// called.
void shutdown();
private:
void loop();
bool mShouldStop;
std::condition_variable mHasTaskCondition;
std::mutex mTaskQueueMutex;
std::queue<Task> mTaskQueue;
std::thread mThread;
};
using OnReadCompleteFunction = std::function<void(PixelBufferDescriptor&&)>;
using SelecteMemoryFunction = std::function<uint32_t(uint32_t, VkFlags)>;
explicit VulkanReadPixels(VkDevice device);
void terminate() noexcept;
void run(VulkanRenderTarget const* srcTarget, uint32_t x, uint32_t y, uint32_t width,
uint32_t height, uint32_t graphicsQueueFamilyIndex, PixelBufferDescriptor&& pbd,
SelecteMemoryFunction const& selectMemoryFunc,
OnReadCompleteFunction const& readCompleteFunc);
// This method will block until all of the in-flight requests are complete.
void runUntilComplete() noexcept;
private:
VkDevice mDevice = VK_NULL_HANDLE;
VkCommandPool mCommandPool = VK_NULL_HANDLE;
std::unique_ptr<TaskHandler> mTaskHandler;
};
}// namespace filament::backend
#endif// TNT_FILAMENT_BACKEND_VULKANREADPIXELS_H


@@ -0,0 +1,136 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef TNT_FILAMENT_BACKEND_VULKANRESOURCEALLOCATOR_H
#define TNT_FILAMENT_BACKEND_VULKANRESOURCEALLOCATOR_H
#include "VulkanHandles.h"
#include <private/backend/HandleAllocator.h>
#include <utils/FixedCapacityVector.h>
#include <utils/Log.h>
#include <type_traits>
#include <unordered_set>
namespace filament::backend {
// RESOURCE_TYPE_COUNT matches the count of enum VulkanResourceType.
#define RESOURCE_TYPE_COUNT 12
#define DEBUG_RESOURCE_LEAKS 0
#if DEBUG_RESOURCE_LEAKS
#define TRACK_INCREMENT() \
if (!IS_MANAGED_TYPE(obj->mType)) { \
mDebugOnlyResourceCount[static_cast<size_t>(obj->mType)]++; \
}
#define TRACK_DECREMENT() \
if (!IS_MANAGED_TYPE(obj->mType)) { \
mDebugOnlyResourceCount[static_cast<size_t>(obj->mType)]--; \
}
#else
// No-op
#define TRACK_INCREMENT()
#define TRACK_DECREMENT()
#endif
class VulkanResourceAllocator {
public:
VulkanResourceAllocator(size_t arenaSize)
: mHandleAllocatorImpl("Handles", arenaSize)
#if DEBUG_RESOURCE_LEAKS
, mDebugOnlyResourceCount(RESOURCE_TYPE_COUNT) {
std::memset(mDebugOnlyResourceCount.data(), 0, sizeof(size_t) * RESOURCE_TYPE_COUNT);
}
#else
{}
#endif
template<typename D, typename... ARGS>
inline Handle<D> initHandle(ARGS&&... args) noexcept {
auto handle = mHandleAllocatorImpl.allocateAndConstruct<D>(std::forward<ARGS>(args)...);
auto obj = handle_cast<D*>(handle);
obj->initResource(handle.getId());
TRACK_INCREMENT();
return handle;
}
template<typename D>
inline Handle<D> allocHandle() noexcept {
return mHandleAllocatorImpl.allocate<D>();
}
template<typename D, typename B, typename... ARGS>
inline typename std::enable_if<std::is_base_of<B, D>::value, D>::type* construct(
Handle<B> const& handle, ARGS&&... args) noexcept {
auto obj = mHandleAllocatorImpl.construct<D, B>(handle, std::forward<ARGS>(args)...);
obj->initResource(handle.getId());
TRACK_INCREMENT();
return obj;
}
template<typename Dp, typename B>
inline typename std::enable_if_t<
std::is_pointer_v<Dp> && std::is_base_of_v<B, typename std::remove_pointer_t<Dp>>, Dp>
handle_cast(Handle<B>& handle) noexcept {
return mHandleAllocatorImpl.handle_cast<Dp, B>(handle);
}
template<typename Dp, typename B>
inline typename std::enable_if_t<
std::is_pointer_v<Dp> && std::is_base_of_v<B, typename std::remove_pointer_t<Dp>>, Dp>
handle_cast(Handle<B> const& handle) noexcept {
return mHandleAllocatorImpl.handle_cast<Dp, B>(handle);
}
template<typename D, typename B>
inline void destruct(Handle<B> handle) noexcept {
auto obj = handle_cast<D*>(handle);
if constexpr (std::is_base_of_v<VulkanIndexBuffer, D>
|| std::is_base_of_v<VulkanBufferObject, D>) {
obj->terminate();
}
TRACK_DECREMENT();
mHandleAllocatorImpl.deallocate(handle, obj);
}
private:
HandleAllocatorVK mHandleAllocatorImpl;
#if DEBUG_RESOURCE_LEAKS
public:
void print() {
utils::slog.d << "Resource Allocator state (debug only)" << utils::io::endl;
for (size_t i = 0; i < RESOURCE_TYPE_COUNT; i++) {
utils::slog.d << "[" << i << "]=" << mDebugOnlyResourceCount[i] << utils::io::endl;
}
utils::slog.d << "+++++++++++++++++++++++++++++++++++++" << utils::io::endl;
}
private:
utils::FixedCapacityVector<size_t> mDebugOnlyResourceCount;
#endif
};
#undef TRACK_INCREMENT
#undef TRACK_DECREMENT
#undef DEBUG_RESOURCE_LEAKS
} // namespace filament::backend
#endif // TNT_FILAMENT_BACKEND_VULKANRESOURCEALLOCATOR_H


@@ -0,0 +1,70 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "VulkanResources.h"
#include "VulkanHandles.h"
#include "VulkanResourceAllocator.h"
namespace filament::backend {
void deallocateResource(VulkanResourceAllocator* allocator, VulkanResourceType type,
HandleBase::HandleId id) {
if (IS_HEAP_ALLOC_TYPE(type)) {
return;
}
switch (type) {
case VulkanResourceType::BUFFER_OBJECT:
allocator->destruct<VulkanBufferObject>(Handle<HwBufferObject>(id));
break;
case VulkanResourceType::INDEX_BUFFER:
allocator->destruct<VulkanIndexBuffer>(Handle<HwIndexBuffer>(id));
break;
case VulkanResourceType::PROGRAM:
allocator->destruct<VulkanProgram>(Handle<HwProgram>(id));
break;
case VulkanResourceType::RENDER_TARGET:
allocator->destruct<VulkanRenderTarget>(Handle<HwRenderTarget>(id));
break;
case VulkanResourceType::SAMPLER_GROUP:
allocator->destruct<VulkanSamplerGroup>(Handle<HwSamplerGroup>(id));
break;
case VulkanResourceType::SWAP_CHAIN:
allocator->destruct<VulkanSwapChain>(Handle<HwSwapChain>(id));
break;
case VulkanResourceType::TEXTURE:
allocator->destruct<VulkanTexture>(Handle<HwTexture>(id));
break;
case VulkanResourceType::TIMER_QUERY:
allocator->destruct<VulkanTimerQuery>(Handle<HwTimerQuery>(id));
break;
case VulkanResourceType::VERTEX_BUFFER:
allocator->destruct<VulkanVertexBuffer>(Handle<HwVertexBuffer>(id));
break;
case VulkanResourceType::RENDER_PRIMITIVE:
allocator->destruct<VulkanRenderPrimitive>(Handle<VulkanRenderPrimitive>(id));
break;
// If the resource is heap allocated, then the resource manager just skips refcounted
// destruction.
case VulkanResourceType::FENCE:
case VulkanResourceType::HEAP_ALLOCATED:
break;
}
}
} // namespace filament::backend
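
Note on `deallocateResource` above: it is only meant to run for ref-counted types, once the last owner lets go. A minimal sketch of that ownership model, assuming an acquire-only manager that counts each resource at most once and destroys it when the count reaches zero. `ExampleResource`, `ExampleManager`, and the `destroy` callback are hypothetical names, and the real manager classes in this change may behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>

enum class ExampleType { TEXTURE, HEAP_ALLOCATED };

// Hypothetical ref-counted resource. Heap-allocated types start at 1 and
// ignore ref()/deref(), mirroring the no-op behavior shown in the diff.
struct ExampleResource {
    explicit ExampleResource(ExampleType t)
        : type(t), refCount(t == ExampleType::HEAP_ALLOCATED ? 1 : 0) {}
    void ref() { if (type != ExampleType::HEAP_ALLOCATED) ++refCount; }
    void deref() { if (type != ExampleType::HEAP_ALLOCATED) --refCount; }
    ExampleType type;
    std::size_t refCount;
};

// Hypothetical acquire-only manager: resources are released all at once.
class ExampleManager {
public:
    explicit ExampleManager(std::function<void(ExampleResource*)> destroy)
        : mDestroy(std::move(destroy)) {}
    ~ExampleManager() { clear(); }

    void acquire(ExampleResource* resource) {
        // Count each resource at most once per manager.
        if (mResources.insert(resource).second) {
            resource->ref();
        }
    }

    void clear() {
        for (ExampleResource* resource : mResources) {
            resource->deref();
            if (resource->refCount == 0) {
                mDestroy(resource);  // stands in for the type-based dispatch
            }
        }
        mResources.clear();
    }

private:
    std::function<void(ExampleResource*)> mDestroy;
    std::unordered_set<ExampleResource*> mResources;
};
```

With this shape, a command buffer and a pipeline cache can each hold the same texture, and the texture is destroyed only after both have cleared their sets.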


@@ -0,0 +1,337 @@
/*
* Copyright (C) 2023 The Android Open Source Project
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef TNT_FILAMENT_BACKEND_VULKANRESOURCES_H
#define TNT_FILAMENT_BACKEND_VULKANRESOURCES_H
#include <backend/DriverEnums.h>// For MAX_VERTEX_BUFFER_COUNT
#include <backend/Handle.h>
#include <tsl/robin_set.h>
#include <utils/Mutex.h>
#include <utils/Panic.h>
#include <mutex>
#include <unordered_set>
namespace filament::backend {
class VulkanResourceAllocator;
struct VulkanThreadSafeResource;
// Subclasses of VulkanResource must provide this enum in their construction.
enum class VulkanResourceType : uint8_t {
BUFFER_OBJECT,
INDEX_BUFFER,
PROGRAM,
RENDER_TARGET,
SAMPLER_GROUP,
SWAP_CHAIN,
RENDER_PRIMITIVE,
TEXTURE,
TIMER_QUERY,
VERTEX_BUFFER,
// Below are resources that are managed manually (i.e. not ref counted).
FENCE,
HEAP_ALLOCATED,
};
#define IS_HEAP_ALLOC_TYPE(f) \
(f == VulkanResourceType::FENCE || f == VulkanResourceType::HEAP_ALLOCATED)
// This is a ref-counting base class that tracks how many references to this resource exist. This
// class is paired with VulkanResourceManagerImpl, which is responsible for incrementing or
// decrementing the count. Once mRefCount == 0, VulkanResourceManagerImpl will also call the
// appropriate destructor. VulkanCommandBuffer, VulkanDriver, and composite structures like
// VulkanRenderPrimitive are owners of VulkanResourceManagerImpl instances.
struct VulkanResourceBase {
protected:
explicit VulkanResourceBase(VulkanResourceType type)
: mRefCount(IS_HEAP_ALLOC_TYPE(type) ? 1 : 0),
mType(type),
mHandleId(0) {}
private:
inline VulkanResourceType getType() {
return mType;
}
inline HandleBase::HandleId getId() {
return mHandleId;
}
inline void initResource(HandleBase::HandleId id) noexcept {
mHandleId = id;
}
inline void ref() noexcept {
if (IS_HEAP_ALLOC_TYPE(mType)) {
return;
}
++mRefCount;
}
inline void deref() noexcept {
if (IS_HEAP_ALLOC_TYPE(mType)) {
return;
}
--mRefCount;
}
inline size_t refcount() noexcept {
return mRefCount;
}
size_t mRefCount = 0;
VulkanResourceType mType = VulkanResourceType::BUFFER_OBJECT;
HandleBase::HandleId mHandleId;
friend struct VulkanThreadSafeResource;
friend class VulkanResourceAllocator;
template<typename RT, typename ST>
friend class VulkanResourceManagerImpl;
};
struct VulkanThreadSafeResource {
protected:
explicit VulkanThreadSafeResource(VulkanResourceType type)
: mImpl(type) {}
private:
inline VulkanResourceType getType() {
return mImpl.getType();
}
inline HandleBase::HandleId getId() {
return mImpl.getId();
}
inline void initResource(HandleBase::HandleId id) noexcept {
std::unique_lock<utils::Mutex> lock(mMutex);
mImpl.initResource(id);
}
inline void ref() noexcept {
std::unique_lock<utils::Mutex> lock(mMutex);
mImpl.ref();
}
inline void deref() noexcept {
std::unique_lock<utils::Mutex> lock(mMutex);
mImpl.deref();
}
inline size_t refcount() noexcept {
std::unique_lock<utils::Mutex> lock(mMutex);
return mImpl.refcount();
}
utils::Mutex mMutex;
VulkanResourceBase mImpl;
friend class VulkanResourceAllocator;
template<typename RT, typename ST>
friend class VulkanResourceManagerImpl;
};
using VulkanResource = VulkanResourceBase;
namespace {
// When the size of the resource set is known to be small (for example, for VulkanRenderPrimitive),
// we just use a std::array to back the set.
class FixedCapacityResourceSet {
private:
constexpr static size_t const SIZE = MAX_VERTEX_BUFFER_COUNT;
using FixedSizeArray = std::array<VulkanResource*, SIZE>;
public:
using const_iterator = FixedSizeArray::const_iterator;
inline const_iterator begin() {
if (mInd == 0) {
return mArray.cend();
}
return mArray.cbegin();
}
inline const_iterator end() {
if (mInd == 0) {
return mArray.cend();
}
if (mInd < SIZE) {
return mArray.begin() + mInd;
}
return mArray.cend();
}
inline const_iterator find(VulkanResource* resource) {
return std::find(mArray.begin(), mArray.end(), resource);
}
inline void insert(VulkanResource* resource) {
assert_invariant(mInd < SIZE);
mArray[mInd++] = resource;
}
inline void erase(VulkanResource* resource) {
assert_invariant(false && "FixedCapacityResourceSet::erase should not be called");
}
inline void clear() {
if (mInd == 0) {
return;
}
mInd = 0;
}
private:
FixedSizeArray mArray{nullptr};
size_t mInd = 0;
};
// robin_set/map are useful for sets that are acquire-only and that are iterated only when the set
// is cleared.
using FastIterationResourceSet = tsl::robin_set<VulkanResource*>;
// unordered_set is used in the general case where insert/erase can occur at will. This is useful
// for the basic object ownership count - i.e. VulkanDriver.
using ResourceSet = std::unordered_set<VulkanResource*>;
using ThreadSafeResourceSet = std::unordered_set<VulkanThreadSafeResource*>;
} // anonymous namespace
class VulkanResourceAllocator;
#define LOCK_IF_NEEDED() \
if constexpr (std::is_base_of_v<VulkanThreadSafeResource, ResourceType>) { \
mMutex->lock(); \
}
#define UNLOCK_IF_NEEDED() \
if constexpr (std::is_base_of_v<VulkanThreadSafeResource, ResourceType>) { \
mMutex->unlock(); \
}
void deallocateResource(VulkanResourceAllocator* allocator, VulkanResourceType type,
HandleBase::HandleId id);
template<typename ResourceType, typename SetType>
class VulkanResourceManagerImpl {
public:
explicit VulkanResourceManagerImpl(VulkanResourceAllocator* allocator)
: mAllocator(allocator) {
if constexpr (std::is_base_of_v<VulkanThreadSafeResource, ResourceType>) {
mMutex = std::make_unique<utils::Mutex>();
}
}
VulkanResourceManagerImpl(const VulkanResourceManagerImpl& other) = delete;
void operator=(const VulkanResourceManagerImpl& other) = delete;
VulkanResourceManagerImpl(const VulkanResourceManagerImpl&& other) = delete;
void operator=(const VulkanResourceManagerImpl&& other) = delete;
~VulkanResourceManagerImpl() {
clear();
}
inline void acquire(ResourceType* resource) {
if (IS_HEAP_ALLOC_TYPE(resource->getType())) {
return;
}
LOCK_IF_NEEDED();
if (mResources.find(resource) != mResources.end()) {
UNLOCK_IF_NEEDED();
return;
}
mResources.insert(resource);
UNLOCK_IF_NEEDED();
resource->ref();
}
// Transfers ownership from one resource set to another
inline void acquire(VulkanResourceManagerImpl<ResourceType, SetType>* srcResources) {
LOCK_IF_NEEDED();
for (auto iter = srcResources->mResources.begin(); iter != srcResources->mResources.end();
iter++) {
acquire(*iter);
}
UNLOCK_IF_NEEDED();
srcResources->clear();
}
inline void release(ResourceType* resource) {
if (IS_HEAP_ALLOC_TYPE(resource->getType())) {
return;
}
LOCK_IF_NEEDED();
auto resItr = mResources.find(resource);
if (resItr == mResources.end()) {
UNLOCK_IF_NEEDED();
return;
}
mResources.erase(resItr);
UNLOCK_IF_NEEDED();
derefImpl(resource);
}
inline void clear() {
LOCK_IF_NEEDED();
for (auto iter = mResources.begin(); iter != mResources.end(); iter++) {
derefImpl(*iter);
}
mResources.clear();
UNLOCK_IF_NEEDED();
}
inline size_t size() {
return mResources.size();
}
private:
inline void derefImpl(ResourceType* resource) {
resource->deref();
if (resource->refcount() != 0) {
return;
}
deallocateResource(mAllocator, resource->getType(), resource->getId());
}
VulkanResourceAllocator* mAllocator;
SetType mResources;
std::unique_ptr<utils::Mutex> mMutex;
};
using VulkanAcquireOnlyResourceManager
= VulkanResourceManagerImpl<VulkanResource, FastIterationResourceSet>;
using VulkanResourceManager = VulkanResourceManagerImpl<VulkanResource, ResourceSet>;
using FixedSizeVulkanResourceManager
= VulkanResourceManagerImpl<VulkanResource, FixedCapacityResourceSet>;
using VulkanThreadSafeResourceManager
= VulkanResourceManagerImpl<VulkanThreadSafeResource, ThreadSafeResourceSet>;
#undef LOCK_IF_NEEDED
#undef UNLOCK_IF_NEEDED
} // namespace filament::backend
#endif // TNT_FILAMENT_BACKEND_VULKANRESOURCES_H
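
The ownership scheme described in the header comments above (each manager holds a resource at most once, and the resource is destroyed when the last manager lets go) can be sketched in isolation. This is a minimal illustration, not the code from this commit: `Resource`, `ResourceManager` and the `Deleter` callback are hypothetical stand-ins for `VulkanResourceBase`, `VulkanResourceManagerImpl` and `deallocateResource`, and locking and the heap-allocated special case are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for VulkanResourceBase: it only carries a count.
struct Resource {
    std::size_t refCount = 0;
};

// Hypothetical stand-in for VulkanResourceManagerImpl. A manager holds a
// resource at most once, so a resource's count equals the number of managers
// that currently own it.
class ResourceManager {
public:
    using Deleter = std::function<void(Resource*)>;

    explicit ResourceManager(Deleter deleter) : mDeleter(std::move(deleter)) {}
    ResourceManager(ResourceManager const&) = delete;
    ResourceManager& operator=(ResourceManager const&) = delete;
    ~ResourceManager() { clear(); }

    // Takes a reference unless this manager already owns the resource.
    void acquire(Resource* r) {
        if (mResources.insert(r).second) {
            ++r->refCount;
        }
    }

    // Drops this manager's reference, if it holds one.
    void release(Resource* r) {
        if (mResources.erase(r) == 1) {
            deref(r);
        }
    }

    // Drops every reference held by this manager.
    void clear() {
        for (Resource* r : mResources) {
            deref(r);
        }
        mResources.clear();
    }

    std::size_t size() const { return mResources.size(); }

private:
    // Decrements the count and invokes the deleter when it reaches zero.
    void deref(Resource* r) {
        assert(r->refCount > 0);
        if (--r->refCount == 0) {
            mDeleter(r);
        }
    }

    Deleter mDeleter;
    std::unordered_set<Resource*> mResources;
};
```

In this sketch a resource acquired by both a "driver" manager and a "command buffer" manager survives the driver's `release` and is only handed to the deleter once the second manager clears its set.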

View File

@@ -87,8 +87,7 @@ constexpr inline float getMaxLod(SamplerMinFilter filter) noexcept {
case SamplerMinFilter::LINEAR_MIPMAP_NEAREST:
case SamplerMinFilter::NEAREST_MIPMAP_LINEAR:
case SamplerMinFilter::LINEAR_MIPMAP_LINEAR:
// Assuming our maximum texture size is 4k, we'll never need more than 12 miplevels.
return 12.0f;
return VK_LOD_CLAMP_NONE;
}
}
@@ -99,7 +98,7 @@ constexpr inline VkBool32 getCompareEnable(SamplerCompareMode mode) noexcept {
void VulkanSamplerCache::initialize(VkDevice device) { mDevice = device; }
VkSampler VulkanSamplerCache::getSampler(SamplerParams params) noexcept {
auto iter = mCache.find(params.u);
auto iter = mCache.find(params);
if (UTILS_LIKELY(iter != mCache.end())) {
return iter->second;
}
@@ -123,7 +122,7 @@ VkSampler VulkanSamplerCache::getSampler(SamplerParams params) noexcept {
VkSampler sampler;
VkResult error = vkCreateSampler(mDevice, &samplerInfo, VKALLOC, &sampler);
ASSERT_POSTCONDITION(!error, "Unable to create sampler.");
mCache.insert({params.u, sampler});
mCache.insert({params, sampler});
return sampler;
}

View File

@@ -32,7 +32,7 @@ public:
void terminate() noexcept;
private:
VkDevice mDevice;
tsl::robin_map<uint32_t, VkSampler> mCache;
tsl::robin_map<SamplerParams, VkSampler, SamplerParams::Hasher, SamplerParams::EqualTo> mCache;
};
} // namespace filament::backend
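
The sampler cache change above swaps a `uint32_t` key for the full `SamplerParams` value with explicit `Hasher` and `EqualTo` functors. The pattern can be sketched with the standard library as follows. Everything here is an assumption made for illustration: `Params`, its two fields and `SamplerCacheSketch` are hypothetical, `std::unordered_map` stands in for `tsl::robin_map`, and an `int` stands in for `VkSampler`. The real `SamplerParams` layout is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical key type with nested functors, mirroring the shape of
// SamplerParams::Hasher and SamplerParams::EqualTo.
struct Params {
    std::uint32_t filterBits = 0;
    std::uint32_t wrapBits = 0;

    struct Hasher {
        std::size_t operator()(Params const& p) const noexcept {
            // Pack both fields into one 64-bit value and hash that.
            std::uint64_t const packed = (std::uint64_t(p.filterBits) << 32) | p.wrapBits;
            return std::hash<std::uint64_t>{}(packed);
        }
    };

    struct EqualTo {
        bool operator()(Params const& a, Params const& b) const noexcept {
            return a.filterBits == b.filterBits && a.wrapBits == b.wrapBits;
        }
    };
};

// Hypothetical cache: returns the cached object for a key, creating it on a miss.
class SamplerCacheSketch {
public:
    int getSampler(Params params) {
        auto iter = mCache.find(params);
        if (iter != mCache.end()) {
            return iter->second;
        }
        int const sampler = mNextId++; // stands in for vkCreateSampler
        mCache.insert({params, sampler});
        return sampler;
    }

    std::size_t size() const { return mCache.size(); }

private:
    std::unordered_map<Params, int, Params::Hasher, Params::EqualTo> mCache;
    int mNextId = 1;
};
```

Keying on the whole struct means every field takes part in hashing and equality, so two parameter sets can only share a cached sampler when all of their fields match.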

View File

@@ -28,7 +28,8 @@ namespace filament::backend {
VulkanSwapChain::VulkanSwapChain(VulkanPlatform* platform, VulkanContext const& context,
VmaAllocator allocator, VulkanCommands* commands, VulkanStagePool& stagePool,
void* nativeWindow, uint64_t flags, VkExtent2D extent)
: mPlatform(platform),
: VulkanResource(VulkanResourceType::SWAP_CHAIN),
mPlatform(platform),
mCommands(commands),
mAllocator(allocator),
mStagePool(stagePool),
@@ -67,11 +68,11 @@ void VulkanSwapChain::update() {
for (auto const color: bundle.colors) {
mColors.push_back(std::make_unique<VulkanTexture>(device, mAllocator, mCommands, color,
bundle.colorFormat, 1, bundle.extent.width, bundle.extent.height,
TextureUsage::COLOR_ATTACHMENT, mStagePool));
TextureUsage::COLOR_ATTACHMENT, mStagePool, true /* heap allocated */));
}
mDepth = std::make_unique<VulkanTexture>(device, mAllocator, mCommands, bundle.depth,
bundle.depthFormat, 1, bundle.extent.width, bundle.extent.height,
TextureUsage::DEPTH_ATTACHMENT, mStagePool);
TextureUsage::DEPTH_ATTACHMENT, mStagePool, true /* heap allocated */);
mExtent = bundle.extent;
}

View File

@@ -17,8 +17,10 @@
#ifndef TNT_FILAMENT_BACKEND_VULKANSWAPCHAIN_H
#define TNT_FILAMENT_BACKEND_VULKANSWAPCHAIN_H
#include "DriverBase.h"
#include "VulkanContext.h"
#include "VulkanDriver.h"
#include "VulkanResources.h"
#include <backend/platforms/VulkanPlatform.h>
@@ -35,7 +37,7 @@ struct VulkanHeadlessSwapChain;
struct VulkanSurfaceSwapChain;
// A wrapper around the platform implementation of swapchain.
struct VulkanSwapChain : public HwSwapChain {
struct VulkanSwapChain : public HwSwapChain, VulkanResource {
VulkanSwapChain(VulkanPlatform* platform, VulkanContext const& context, VmaAllocator allocator,
VulkanCommands* commands, VulkanStagePool& stagePool,
void* nativeWindow, uint64_t flags, VkExtent2D extent = {0, 0});

View File

@@ -29,12 +29,16 @@ using namespace bluevk;
namespace filament::backend {
using ImgUtil = VulkanImageUtility;
VulkanTexture::VulkanTexture(VkDevice device, VmaAllocator allocator,
VulkanCommands* commands, VkImage image, VkFormat format, uint8_t samples,
uint32_t width, uint32_t height, TextureUsage tusage, VulkanStagePool& stagePool)
VulkanTexture::VulkanTexture(VkDevice device, VmaAllocator allocator, VulkanCommands* commands,
VkImage image, VkFormat format, uint8_t samples, uint32_t width, uint32_t height,
TextureUsage tusage, VulkanStagePool& stagePool, bool heapAllocated)
: HwTexture(SamplerType::SAMPLER_2D, 1, samples, width, height, 1, TextureFormat::UNUSED,
tusage),
mVkFormat(format), mViewType(ImgUtil::getViewType(target)), mSwizzle({}),
tusage),
VulkanResource(
heapAllocated ? VulkanResourceType::HEAP_ALLOCATED : VulkanResourceType::TEXTURE),
mVkFormat(format),
mViewType(ImgUtil::getViewType(target)),
mSwizzle({}),
mTextureImage(image),
mPrimaryViewRange{
.aspectMask = getImageAspect(),
@@ -43,19 +47,28 @@ VulkanTexture::VulkanTexture(VkDevice device, VmaAllocator allocator,
.baseArrayLayer = 0,
.layerCount = 1,
},
mStagePool(stagePool), mDevice(device), mAllocator(allocator), mCommands(commands) {}
mStagePool(stagePool),
mDevice(device),
mAllocator(allocator),
mCommands(commands) {}
VulkanTexture::VulkanTexture(VkDevice device, VkPhysicalDevice physicalDevice,
VulkanContext const& context, VmaAllocator allocator,
VulkanCommands* commands, SamplerType target, uint8_t levels,
TextureFormat tformat, uint8_t samples, uint32_t w, uint32_t h, uint32_t depth,
TextureUsage tusage, VulkanStagePool& stagePool, VkComponentMapping swizzle)
VulkanContext const& context, VmaAllocator allocator, VulkanCommands* commands,
SamplerType target, uint8_t levels, TextureFormat tformat, uint8_t samples, uint32_t w,
uint32_t h, uint32_t depth, TextureUsage tusage, VulkanStagePool& stagePool,
bool heapAllocated, VkComponentMapping swizzle)
: HwTexture(target, levels, samples, w, h, depth, tformat, tusage),
VulkanResource(
heapAllocated ? VulkanResourceType::HEAP_ALLOCATED : VulkanResourceType::TEXTURE),
// Vulkan does not support 24-bit depth, use the official fallback format.
mVkFormat(tformat == TextureFormat::DEPTH24 ? context.getDepthFormat()
: backend::getVkFormat(tformat)),
mViewType(ImgUtil::getViewType(target)), mSwizzle(swizzle), mStagePool(stagePool),
mDevice(device), mAllocator(allocator), mCommands(commands) {
mViewType(ImgUtil::getViewType(target)),
mSwizzle(swizzle),
mStagePool(stagePool),
mDevice(device),
mAllocator(allocator),
mCommands(commands) {
// Create an appropriately-sized device-only VkImage, but do not fill it yet.
VkImageCreateInfo imageInfo{.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
@@ -154,7 +167,7 @@ VulkanTexture::VulkanTexture(VkDevice device, VkPhysicalDevice physicalDevice,
<< "handle = " << utils::io::hex << mTextureImage << utils::io::dec << ", "
<< "extent = " << w << "x" << h << "x"<< depth << ", "
<< "mipLevels = " << int(levels) << ", "
<< "TextureUsage = " << static_cast<int>(usage) << ", "
<< "TextureUsage = " << static_cast<int>(usage) << ", "
<< "usage = " << imageInfo.usage << ", "
<< "samples = " << imageInfo.samples << ", "
<< "type = " << imageInfo.imageType << ", "
@@ -167,11 +180,17 @@ VulkanTexture::VulkanTexture(VkDevice device, VkPhysicalDevice physicalDevice,
// Allocate memory for the VkImage and bind it.
VkMemoryRequirements memReqs = {};
vkGetImageMemoryRequirements(mDevice, mTextureImage, &memReqs);
uint32_t memoryTypeIndex
= context.selectMemoryType(memReqs.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
ASSERT_POSTCONDITION(memoryTypeIndex < VK_MAX_MEMORY_TYPES,
"VulkanTexture: unable to find a memory type that meets requirements.");
VkMemoryAllocateInfo allocInfo = {
.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
.allocationSize = memReqs.size,
.memoryTypeIndex = context.selectMemoryType(memReqs.memoryTypeBits,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)
.memoryTypeIndex = memoryTypeIndex,
};
error = vkAllocateMemory(mDevice, &allocInfo, nullptr, &mTextureImageMemory);
ASSERT_POSTCONDITION(!error, "Unable to allocate image memory.");
@@ -206,7 +225,11 @@ VulkanTexture::VulkanTexture(VkDevice device, VkPhysicalDevice physicalDevice,
| VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT)) {
const uint32_t layers = mPrimaryViewRange.layerCount;
VkImageSubresourceRange range = { getImageAspect(), 0, levels, 0, layers };
VkCommandBuffer cmdbuf = mCommands->get().cmdbuffer;
VulkanCommandBuffer& commands = mCommands->get();
VkCommandBuffer const cmdbuf = commands.cmdbuffer;
commands.acquire(this);
transitionLayout(cmdbuf, range, ImgUtil::getDefaultLayout(imageInfo.usage));
}
}
@@ -256,7 +279,9 @@ void VulkanTexture::updateImage(const PixelBufferDescriptor& data, uint32_t widt
vmaUnmapMemory(mAllocator, stage->memory);
vmaFlushAllocation(mAllocator, stage->memory, 0, hostData->size);
const VkCommandBuffer cmdbuf = mCommands->get(true).cmdbuffer;
VulkanCommandBuffer& commands = mCommands->get();
VkCommandBuffer const cmdbuf = commands.cmdbuffer;
commands.acquire(this);
VkBufferImageCopy copyRegion = {
.bufferOffset = {},
@@ -317,7 +342,9 @@ void VulkanTexture::updateImageWithBlit(const PixelBufferDescriptor& hostData, u
vmaUnmapMemory(mAllocator, stage->memory);
vmaFlushAllocation(mAllocator, stage->memory, 0, hostData.size);
const VkCommandBuffer cmdbuf = mCommands->get().cmdbuffer;
VulkanCommandBuffer& commands = mCommands->get();
VkCommandBuffer const cmdbuf = commands.cmdbuffer;
commands.acquire(this);
// TODO: support blit-based format conversion for 3D images and cubemaps.
const int layer = 0;

View File

@@ -17,26 +17,29 @@
#ifndef TNT_FILAMENT_BACKEND_VULKANTEXTURE_H
#define TNT_FILAMENT_BACKEND_VULKANTEXTURE_H
#include "VulkanDriver.h"
#include "DriverBase.h"
#include "VulkanBuffer.h"
#include "VulkanResources.h"
#include "VulkanImageUtility.h"
#include <utils/RangeMap.h>
namespace filament::backend {
struct VulkanTexture : public HwTexture {
struct VulkanTexture : public HwTexture, VulkanResource {
// Standard constructor for user-facing textures.
VulkanTexture(VkDevice device, VkPhysicalDevice physicalDevice, VulkanContext const& context,
VmaAllocator allocator, VulkanCommands* commands, SamplerType target, uint8_t levels,
TextureFormat tformat, uint8_t samples, uint32_t w, uint32_t h, uint32_t depth,
TextureUsage tusage, VulkanStagePool& stagePool, VkComponentMapping swizzle = {});
TextureUsage tusage, VulkanStagePool& stagePool, bool heapAllocated = false,
VkComponentMapping swizzle = {});
// Specialized constructor for internally created textures (e.g. from a swap chain)
// The texture will never destroy the given VkImage, but it does manage its subresources.
VulkanTexture(VkDevice device, VmaAllocator allocator, VulkanCommands* commands, VkImage image,
VkFormat format, uint8_t samples, uint32_t width, uint32_t height, TextureUsage tusage,
VulkanStagePool& stagePool);
VulkanStagePool& stagePool, bool heapAllocated = false);
~VulkanTexture();

View File

@@ -57,11 +57,17 @@ std::tuple<VkImage, VkDeviceMemory> createImageAndMemory(VulkanContext const& co
VkDeviceMemory imageMemory;
VkMemoryRequirements memReqs;
vkGetImageMemoryRequirements(device, image, &memReqs);
uint32_t memoryTypeIndex
= context.selectMemoryType(memReqs.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
ASSERT_POSTCONDITION(memoryTypeIndex < VK_MAX_MEMORY_TYPES,
"VulkanPlatformSwapChainImpl: unable to find a memory type that meets requirements.");
VkMemoryAllocateInfo allocInfo = {
.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
.allocationSize = memReqs.size,
.memoryTypeIndex
= context.selectMemoryType(memReqs.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT),
.memoryTypeIndex = memoryTypeIndex,
};
result = vkAllocateMemory(device, &allocInfo, nullptr, &imageMemory);
ASSERT_POSTCONDITION(result == VK_SUCCESS, "Unable to allocate image memory.");

View File

@@ -85,13 +85,10 @@ void BackendTest::executeCommands() {
}
}
void BackendTest::flushAndWait(uint64_t timeout) {
void BackendTest::flushAndWait() {
auto& api = getDriverApi();
auto fence = api.createFence();
api.finish();
executeCommands();
api.wait(fence, timeout);
api.destroyFence(fence);
}
Handle<HwSwapChain> BackendTest::createSwapChain() {

View File

@@ -43,7 +43,7 @@ protected:
void initializeDriver();
void executeCommands();
void flushAndWait(uint64_t timeout = 1000);
void flushAndWait();
filament::backend::Handle<filament::backend::HwSwapChain> createSwapChain();

View File

@@ -87,11 +87,8 @@ void ComputeTest::executeCommands() {
}
}
void ComputeTest::flushAndWait(uint64_t timeout) {
void ComputeTest::flushAndWait() {
auto& api = getDriverApi();
auto fence = api.createFence();
api.finish();
executeCommands();
api.wait(fence, timeout);
api.destroyFence(fence);
}

View File

@@ -34,7 +34,7 @@ protected:
void TearDown() override;
void executeCommands();
void flushAndWait(uint64_t timeout = 1000);
void flushAndWait();
filament::backend::DriverApi& getDriverApi() { return *commandStream; }
filament::backend::Driver& getDriver() { return *driver; }

View File

@@ -34,10 +34,13 @@ class Entity;
namespace filament {
/**
* Camera represents the eye through which the scene is viewed.
* Camera represents the eye(s) through which the scene is viewed.
*
* A Camera has a position and orientation and controls the projection and exposure parameters.
*
* For stereoscopic rendering, a Camera maintains two separate "eyes": Eye 0 and Eye 1. These are
* arbitrary and don't necessarily need to correspond to "left" and "right".
*
* Creation and destruction
* ========================
*
@@ -140,6 +143,18 @@ namespace filament {
* intensity and the Camera exposure interact to produce the final scene's brightness.
*
*
* Stereoscopic rendering
* ======================
*
* The Camera's transform (as set by setModelMatrix or via TransformManager) defines a "head" space,
* which typically corresponds to the location of the viewer's head. Each eye's transform is set
* relative to this head space by setEyeModelMatrix.
*
* Each eye also maintains its own projection matrix. These can be set with setCustomEyeProjection.
* Care must be taken to correctly set the projectionForCulling matrix, as well as its corresponding
* near and far values. The projectionForCulling matrix must define a frustum (in head space) that
* bounds the frustums of both eyes. Alternatively, culling may be disabled with
* View::setFrustumCullingEnabled.
*
* \see Frustum, View
*/
@@ -234,6 +249,24 @@ public:
*/
void setCustomProjection(math::mat4 const& projection, double near, double far) noexcept;
/** Sets a custom projection matrix for each eye.
*
* The projectionForCulling, near, and far parameters establish a "culling frustum" which must
* encompass anything either eye can see.
*
* @param projection an array of projection matrices, only the first
* CONFIG_STEREOSCOPIC_EYES (2) are read
* @param count size of the projection matrix array to set, must be
* >= CONFIG_STEREOSCOPIC_EYES (2)
* @param projectionForCulling custom projection matrix for culling, must encompass both eyes
* @param near distance in world units from the camera to the culling near plane. \p near > 0.
* @param far distance in world units from the camera to the culling far plane. \p far > \p
* near.
* @see setCustomProjection
*/
void setCustomEyeProjection(math::mat4 const* projection, size_t count,
math::mat4 const& projectionForCulling, double near, double far);
/** Sets the projection matrix.
*
* The projection matrices must be of one of the following form:
@@ -309,11 +342,14 @@ public:
* The projection matrix used for rendering always has its far plane set to infinity. This
* is why it may differ from the matrix set through setProjection() or setLensProjection().
*
* @param eyeId the index of the eye to return the projection matrix for, must be <
* CONFIG_STEREOSCOPIC_EYES (2)
* @return The projection matrix used for rendering
*
* @see setProjection, setLensProjection, setCustomProjection, getCullingProjectionMatrix
* @see setProjection, setLensProjection, setCustomProjection, getCullingProjectionMatrix,
* setCustomEyeProjection
*/
math::mat4 getProjectionMatrix() const noexcept;
math::mat4 getProjectionMatrix(uint8_t eyeId = 0) const;
/** Returns the projection matrix used for culling (far plane is finite).
@@ -350,6 +386,26 @@ public:
void setModelMatrix(const math::mat4& model) noexcept;
void setModelMatrix(const math::mat4f& model) noexcept; //!< @overload
/** Set the position of an eye relative to this Camera (head).
*
* By default, both eyes' model matrices are identity matrices.
*
* For example, to position Eye 0 3cm leftwards and Eye 1 3cm rightwards:
* ~~~~~~~~~~~{.cpp}
* const mat4 leftEye = mat4::translation(double3{-0.03, 0.0, 0.0});
* const mat4 rightEye = mat4::translation(double3{ 0.03, 0.0, 0.0});
* camera.setEyeModelMatrix(0, leftEye);
* camera.setEyeModelMatrix(1, rightEye);
* ~~~~~~~~~~~
*
* This method is not intended to be called every frame. Instead, to update the position of the
* head, use Camera::setModelMatrix.
*
* @param eyeId the index of the eye to set, must be < CONFIG_STEREOSCOPIC_EYES (2)
* @param model the model matrix for an individual eye
*/
void setEyeModelMatrix(uint8_t eyeId, math::mat4 const& model);
/** Sets the camera's model matrix
*
* @param eye The position of the camera in world space.
@@ -448,7 +504,9 @@ public:
//! returns this camera's sensitivity in ISO
float getSensitivity() const noexcept;
//! returns the focal length in meters [m] for a 35mm camera
/** Returns the focal length in meters [m] for a 35mm camera.
* Eye 0's projection matrix is used to compute the focal length.
*/
double getFocalLength() const noexcept;
/**
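
The stereoscopic notes in the Camera documentation above describe each eye's transform as relative to a "head" space set by the Camera's model matrix. One way to read that is that an eye's world transform is the head transform composed with the eye's model matrix. The sketch below illustrates this reading with a hypothetical, self-contained `Mat4` (row-major, column-vector convention); it does not use Filament's `math::mat4`, and the composition order is an assumption rather than something the diff states.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical minimal 4x4 matrix, row-major, acting on column vectors.
struct Mat4 {
    std::array<double, 16> m{};

    static Mat4 identity() {
        Mat4 r;
        for (int i = 0; i < 4; ++i) {
            r.m[i * 4 + i] = 1.0;
        }
        return r;
    }

    static Mat4 translation(double x, double y, double z) {
        Mat4 r = identity();
        r.m[3] = x;
        r.m[7] = y;
        r.m[11] = z;
        return r;
    }
};

// Standard matrix product: (a * b) applies b first, then a.
Mat4 operator*(Mat4 const& a, Mat4 const& b) {
    Mat4 r;
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            double sum = 0.0;
            for (int k = 0; k < 4; ++k) {
                sum += a.m[row * 4 + k] * b.m[k * 4 + col];
            }
            r.m[row * 4 + col] = sum;
        }
    }
    return r;
}

struct Vec3 {
    double x, y, z;
};

// Position of a transform's origin, read from the translation column.
Vec3 origin(Mat4 const& t) {
    return { t.m[3], t.m[7], t.m[11] };
}
```

For a head at (1.0, 1.6, 0.0) and the two eye offsets from the documentation example (3cm to either side), `origin(head * eye)` places the eyes at x = 0.97 and x = 1.03 while both keep the head's height. Moving the head then only requires updating the head matrix, which matches the advice not to call `setEyeModelMatrix` every frame.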

View File

@@ -513,6 +513,14 @@ public:
*/
size_t getMaxAutomaticInstances() const noexcept;
/**
* Queries the device and platform for instanced stereo rendering support.
*
* @return true if stereo rendering is supported, false otherwise
* @see View::setStereoscopicOptions
*/
bool isStereoSupported() const noexcept;
/**
* @return EntityManager used by filament
*/
@@ -676,6 +684,25 @@ public:
bool destroy(const InstanceBuffer* p); //!< Destroys an InstanceBuffer object.
void destroy(utils::Entity e); //!< Destroys all filament-known components from this entity
bool isValid(const BufferObject* p); //!< Tells whether a BufferObject object is valid
bool isValid(const VertexBuffer* p); //!< Tells whether a VertexBuffer object is valid
bool isValid(const Fence* p); //!< Tells whether a Fence object is valid
bool isValid(const IndexBuffer* p); //!< Tells whether an IndexBuffer object is valid
bool isValid(const SkinningBuffer* p); //!< Tells whether a SkinningBuffer object is valid
bool isValid(const MorphTargetBuffer* p); //!< Tells whether a MorphTargetBuffer object is valid
bool isValid(const IndirectLight* p); //!< Tells whether an IndirectLight object is valid
bool isValid(const Material* p); //!< Tells whether a Material object is valid
bool isValid(const Renderer* p); //!< Tells whether a Renderer object is valid
bool isValid(const Scene* p); //!< Tells whether a Scene object is valid
bool isValid(const Skybox* p); //!< Tells whether a Skybox object is valid
bool isValid(const ColorGrading* p); //!< Tells whether a ColorGrading object is valid
bool isValid(const SwapChain* p); //!< Tells whether a SwapChain object is valid
bool isValid(const Stream* p); //!< Tells whether a Stream object is valid
bool isValid(const Texture* p); //!< Tells whether a Texture object is valid
bool isValid(const RenderTarget* p); //!< Tells whether a RenderTarget object is valid
bool isValid(const View* p); //!< Tells whether a View object is valid
bool isValid(const InstanceBuffer* p); //!< Tells whether an InstanceBuffer object is valid
/**
* Kicks the hardware thread (e.g. the OpenGL, Vulkan or Metal thread) and blocks until
* all commands to this point are executed. Note that this does guarantee that the

View File

@@ -28,11 +28,7 @@
namespace filament {
/**
* Fence is used to synchronize rendering operations together, with the CPU or with compute.
*
* \note
* Currently Fence only provide client-side synchronization.
*
* Fence is used to synchronize the application main thread with filament's rendering thread.
*/
class UTILS_PUBLIC Fence : public FilamentAPI {
public:

View File

@@ -166,6 +166,25 @@ public:
* many previous frames are enqueued in the backend. This also varies by backend. Therefore,
* it is recommended to only call this method once per material shortly after creation.
*
* If the same variant is scheduled for compilation multiple times, the first scheduling
* takes precedence; later schedulings are ignored.
*
* caveat: A consequence is that if a variant is scheduled on the low priority queue and later
* scheduled again on the high priority queue, the later scheduling is ignored.
* Therefore, the second callback could be called before the variant is compiled.
* However, the first callback, if specified, will trigger as expected.
*
* The callback is guaranteed to be called. If the engine is destroyed while some material
* variants are still compiling or in the queue, these will be discarded and the corresponding
* callback will be called. In that case however the Material pointer passed to the callback
* is guaranteed to be invalid (either because it's been destroyed by the user already, or,
* because it's been cleaned-up by the Engine).
*
* UserVariantFilterMask::ALL should be used with caution. Only variants that an application
* needs should be included in the variants argument. For example, the STE variant is only used
* for stereoscopic rendering. If an application is not planning to render in stereo, this bit
* should be turned off to avoid unnecessary material compilations.
*
* @param priority Which priority queue to use, LOW or HIGH.
* @param variants Variants to include to the compile command.
* @param handler Handler to dispatch the callback or nullptr for the default handler
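
The "first scheduling takes precedence" rule documented above can be sketched as a small de-duplicating scheduler. This is an illustration only, not Filament's implementation: `CompileSchedulerSketch`, `Priority` and the integer variant ids are hypothetical, and draining the high priority queue before the low priority one is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <unordered_set>

enum class Priority { HIGH, LOW };

// Hypothetical scheduler: a variant is queued only the first time it is seen.
class CompileSchedulerSketch {
public:
    // Returns true if the variant was queued, false if an earlier scheduling
    // already exists and this request was ignored.
    bool schedule(std::uint32_t variant, Priority priority) {
        if (!mScheduled.insert(variant).second) {
            return false;
        }
        (priority == Priority::HIGH ? mHigh : mLow).push_back(variant);
        return true;
    }

    // Pops the next variant to compile into `variant`. The high priority queue
    // is drained first (assumption). Returns false when nothing is pending.
    bool next(std::uint32_t& variant) {
        std::deque<std::uint32_t>& queue = !mHigh.empty() ? mHigh : mLow;
        if (queue.empty()) {
            return false;
        }
        variant = queue.front();
        queue.pop_front();
        return true;
    }

private:
    std::unordered_set<std::uint32_t> mScheduled;
    std::deque<std::uint32_t> mHigh;
    std::deque<std::uint32_t> mLow;
};
```

In this sketch, scheduling a variant on the low priority queue and then again on the high priority queue leaves it in the low priority queue, which is the caveat the documentation warns about: the second request does not promote the variant.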

View File

@@ -52,6 +52,7 @@ class UTILS_PUBLIC MaterialInstance : public FilamentAPI {
public:
using CullingMode = filament::backend::CullingMode;
using TransparencyMode = filament::TransparencyMode;
using DepthFunc = filament::backend::SamplerCompareFunc;
using StencilCompareFunc = filament::backend::SamplerCompareFunc;
using StencilOperation = filament::backend::StencilOperation;
using StencilFace = filament::backend::StencilFace;
@@ -367,6 +368,16 @@ public:
*/
void setDepthCulling(bool enable) noexcept;
/**
* Overrides the default depth function state that was set on the material.
*/
void setDepthFunc(DepthFunc depthFunc) noexcept;
/**
* Returns the depth function state.
*/
DepthFunc getDepthFunc() const noexcept;
/**
* Returns whether depth culling is enabled.
*/

View File

@@ -133,9 +133,9 @@ struct BloomOptions {
Texture* dirt = nullptr; //!< user provided dirt texture %codegen_skip_json% %codegen_skip_javascript%
float dirtStrength = 0.2f; //!< strength of the dirt texture %codegen_skip_json% %codegen_skip_javascript%
float strength = 0.10f; //!< bloom's strength between 0.0 and 1.0
uint32_t resolution = 360; //!< resolution of vertical axis (2^levels to 2048)
uint32_t resolution = 384; //!< resolution of vertical axis (2^levels to 2048)
float anamorphism = 1.0f; //!< bloom x/y aspect-ratio (1/32 to 32)
uint8_t levels = 6; //!< number of blur levels (3 to 11)
uint8_t levels = 6; //!< number of blur levels (1 to 11)
BlendMode blendMode = BlendMode::ADD; //!< how the bloom effect is applied
bool threshold = true; //!< whether to threshold the source
bool enabled = false; //!< enable or disable bloom
@@ -541,6 +541,13 @@ struct SoftShadowOptions {
float penumbraRatioScale = 1.0f;
};
/**
* Options for stereoscopic (multi-eye) rendering.
*/
struct StereoscopicOptions {
bool enabled = false;
};
} // namespace filament
#endif //TNT_FILAMENT_OPTIONS_H

View File

@@ -91,8 +91,6 @@ public:
/**
* Sets a texture to a given attachment point.
*
* All RenderTargets must have a non-null COLOR attachment.
*
* When using a DEPTH attachment, it is important to always disable post-processing
* in the View. Failing to do so will cause the DEPTH attachment to be ignored in most
* cases.

Some files were not shown because too many files have changed in this diff.