The default material does not cover all of the depth variants,
and so for the client material's depth variants (with no custom
depth shader), we need to check if the program is allocated for
the material or if it is actually part of the default material.
The previous fix attempt didn't work on some tests. There are no known
bugs with AtomicFreeList other than tripping TSAN, and it's unclear
that TSAN isn't at fault.
However, switching to using a mutex works fine and doesn't appear to
be slower (it's actually faster with synthetic benchmarks on macOS).
BUGS=[377369108]
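A minimal sketch of the mutex-based approach, assuming a singly linked list of free nodes. `MutexFreeList` and its members are hypothetical names used for illustration, not the actual class:

```cpp
#include <mutex>

// Hypothetical illustration: a free list whose push()/pop() are guarded by a mutex.
class MutexFreeList {
public:
    struct Node { Node* next = nullptr; };

    // Returns a free node, or nullptr if the list is empty.
    Node* pop() {
        std::lock_guard<std::mutex> lock(mLock);
        Node* head = mHead;
        if (head) {
            mHead = head->next;
        }
        return head;
    }

    // Puts a node back at the front of the list.
    void push(Node* node) {
        std::lock_guard<std::mutex> lock(mLock);
        node->next = mHead;
        mHead = node;
    }

private:
    std::mutex mLock;
    Node* mHead = nullptr;
};
```

Because both operations hold the lock for their whole duration, there is no window in which a concurrent pop() can observe a stale `next` pointer.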
I do not think there was an actual error with AtomicFreeList, however
TSAN detected a data race when concurrent pop() happened. In that case,
there is indeed a race, where we can end up reading data that is
already corrupted by the concurrent pop. However, that situation is
corrected by the following CAS. Somehow TSAN didn't see that.
The fix is strange and consists in replacing:
```
auto pNext = storage[offset].next;
```
with
```
auto s = storage[offset];
auto pNext = s.next;
```
In this PR we also adjust the memory ordering to be less strong. i.e.
we do not need `memory_order_seq_cst`, only the appropriate acquire or
release semantic.
In addition we also make `Node* next` a non-atomic variable again. It
should have been, but was changed to placate an older version of TSAN.
BUGS=[377369108]
Continuing from PR #8220.
Here we change all of the references from the old ref-counting ways
to the new ref-counting structs and mechanisms. There should be no
functional change.
- Cache the render buffer in the render pass structure
- Enable the protected path in the Render Target
- Make sure we build the protected version of the depth/color in the swap chain
- Minor cleanup/restructuring of the code
On ES2 devices (or in forceES2 mode), we emulate the sRGB swapchain
in the shader if the h/w doesn't support it. In that case, the emulation
is controlled by a uniform that technically lives in the frameUniforms
block. However, the frameUniforms buffer is not updated, instead,
the uniform is manually set. Unfortunately, the UBO emulation
overrides it with the uninitialized variable.
BUGS=[377913730]
Previously, we linked against libOSMesa through the linker and
then used dlopen to link against libGL. This is not how libOSMesa
is intended to be used. Instead, we use dlopen on libOSMesa
(via bluegl), which then will map the correct GL methods for us,
and for the OSMesa functions, we also map those functions
from libOSMesa (instead of relying on compile-time linker).
By providing a queue and the family index, we can now create
protected command buffers that will be submitted to a protected
queue. The main abstraction is in VulkanCommands.h where we
introduced a CommandBufferPool.
Also refactored the "mStorage", fences, and semaphores of
the original VulkanCommands so that each buffer is associated
with one fence and one semaphore.
Feature flags are intended to be used when a new feature is added to
filament and have generally two purposes:
1) during feature development, the feature can be implemented but
disabled which can help developing large features. This way the
feature can be tested by stakeholders while it's being developed
without impacting other clients.
2) once a feature is ready, its flag can be enabled by default, but
in case the feature breaks something or has unintended consequence,
clients have the option to turn it off.
Feature flags are intended to have a relatively short life span, i.e.
once a feature is stable, the flag will be removed.
There are two types of feature flags. Constant feature flags can only be
set during Engine initialization via Engine::Builder. Non-constant
feature flags can be set at any time.
Feature flags SHOULD NOT be used as configuration or settings.
Feature flags are designed with a few ideas in mind:
- they are very cheap to check inside the engine
- non-constant flags can easily be toggled using ImGUI
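The constant/non-constant distinction can be sketched as follows. This is only an illustration under assumptions: `FeatureFlagSet`, `finishInit()` and the flag names in the usage below are hypothetical and are not the engine's API.

```cpp
#include <cstring>
#include <vector>

// Hypothetical sketch; names are illustrative only.
struct FeatureFlag {
    const char* name;
    bool value;
    bool constant;   // constant flags can only be set before initialization ends
};

class FeatureFlagSet {
public:
    void add(const char* name, bool value, bool constant) {
        mFlags.push_back({ name, value, constant });
    }

    // Called once initialization is finished; constant flags are frozen afterwards.
    void finishInit() { mInitialized = true; }

    // Returns true if the flag exists and was updated.
    bool set(const char* name, bool value) {
        for (FeatureFlag& f : mFlags) {
            if (std::strcmp(f.name, name) == 0) {
                if (f.constant && mInitialized) return false;
                f.value = value;
                return true;
            }
        }
        return false;
    }

    // Returns the flag's value, or false if the flag is unknown.
    bool get(const char* name) const {
        for (const FeatureFlag& f : mFlags) {
            if (std::strcmp(f.name, name) == 0) return f.value;
        }
        return false;
    }

private:
    std::vector<FeatureFlag> mFlags;
    bool mInitialized = false;
};
```

In this sketch a constant flag accepts changes only until `finishInit()` is called, while a non-constant flag can be toggled at any time, for instance from a debug UI.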
We add a set of classes to the vk backend that will enable
cleaner ref-counting.
In particular, all allocation from the HandleAllocator will be
wrapped within a resource_ptr<> smart pointer. This struct will
maintain a count that will increment/decrement with respect to
references (similar to std::shared_ptr).
This commit only introduces the new classes/structs and does not
actually make any functional difference. A follow-up commit will
make the switch to use the smart pointer.
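For illustration, the counting behavior of such a smart pointer could look like the sketch below. It assumes an intrusive count stored in the resource; `ref_ptr`, `Resource` and `Texture` are hypothetical stand-ins, not the classes added by this commit.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical resource with an intrusive reference count.
struct Resource {
    uint32_t refCount = 0;
    virtual ~Resource() = default;
};

// Hypothetical smart pointer: the count follows the number of live references.
template<typename T>
class ref_ptr {
public:
    ref_ptr() = default;
    explicit ref_ptr(T* p) : mPtr(p) { acquire(); }
    ref_ptr(const ref_ptr& rhs) : mPtr(rhs.mPtr) { acquire(); }
    ref_ptr(ref_ptr&& rhs) noexcept : mPtr(std::exchange(rhs.mPtr, nullptr)) {}
    // Copy-and-swap: the by-value parameter releases the old pointee on exit.
    ref_ptr& operator=(ref_ptr rhs) noexcept { std::swap(mPtr, rhs.mPtr); return *this; }
    ~ref_ptr() { release(); }

    T* operator->() const { return mPtr; }
    explicit operator bool() const { return mPtr != nullptr; }

private:
    void acquire() { if (mPtr) ++mPtr->refCount; }
    void release() { if (mPtr && --mPtr->refCount == 0) delete mPtr; }
    T* mPtr = nullptr;
};

// Example resource that reports its own destruction.
struct Texture : Resource {
    explicit Texture(bool* destroyed) : destroyed(destroyed) {}
    ~Texture() override { *destroyed = true; }
    bool* destroyed;
};
```

The resource is destroyed when the last `ref_ptr` referencing it goes away, which is the same ownership model as std::shared_ptr.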
We also put VulkanFence and VulkanTimerQuery in a different header
because we will need to depend on them separately in the follow-up
(reason: they are accessed on the backend thread but allocated on
"sync"/filament thread).
Adding the basic mechanisms for creating and managing the protected
state. We also create the queue and get it. Finally we allocate
protected memory. Also removed some of the shared context logic
The goal of this PR is to get closer to being able to run all
FrameGraph passes in parallel. To achieve this we need all data
consumed by the "execute" closure of the FrameGraph passes to be
immutable or thread-safe. We also need the passes to never use the
Engine's global `DriverApi&` object.
Specifically in this PR, we turn as many objects to `const` as possible
without major changes, and we pass the `DriverApi&` object as parameter
to render passes.
This work is far from being complete. So we also annotate with FIXMEs
all the places we can identify will be problematic (there are probably
others).
The main remaining issues are:
- main allocator is not thread-safe
- some places take a non-const View, Scene or Engine
- lazy allocation of materials and material instance usages are not
thread-safe.
This PR shouldn't change any behavior.
Move a lot of the code from beginRenderPass() to the initialization
of the RenderTarget. This will save us a bit of CPU when
RenderTarget is re-used.
We also reduce the size of the VulkanRenderTarget handle and put
most of the caching bits into a heap struct.
When multiview is enabled with the combining debug option toggled on, we
use an intermediate buffer that requires us to use the entire area of
the buffer as the viewport. Use xvp in this case.
* Capture the last win32 error immediately after failing win32 API functions are called in order to log it correctly. Prior to this change, intervening win32 API calls could clear the error code and it would not be logged.
* Oops fix bad whitespace in previous commit
In some cases, users set materials first without providing render primitives, which has incurred the attribute mismatching warning. This isn't helpful because users don't know what action they should take to remove the warning.
Emit the warning only when the primitive handle is initialized so the AttributeBitset is properly populated.
BUGS=[372755205]
RenderPass is now tracking the scissor state locally so it can avoid
re-setting the scissor when it doesn't change.
We also consolidate the scissor override and the scissor-viewport in
RenderPass::Execute.
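The idea of tracking the scissor locally can be sketched like this. `ScissorTracker` and `FakeDriver` are hypothetical names; the real render pass and driver interfaces are not shown here.

```cpp
#include <cstdint>

struct Scissor {
    int32_t left, bottom;
    uint32_t width, height;
    bool operator==(const Scissor& rhs) const {
        return left == rhs.left && bottom == rhs.bottom &&
               width == rhs.width && height == rhs.height;
    }
};

// Stand-in for the backend; counts how many scissor commands are issued.
struct FakeDriver {
    int scissorCalls = 0;
    void scissor(const Scissor&) { ++scissorCalls; }
};

// Remembers the last scissor that was set and skips redundant updates.
class ScissorTracker {
public:
    void set(FakeDriver& driver, const Scissor& s) {
        if (!mValid || !(mCurrent == s)) {
            driver.scissor(s);
            mCurrent = s;
            mValid = true;
        }
    }

private:
    Scissor mCurrent{};
    bool mValid = false;
};
```

Setting the same rectangle twice in a row issues a single driver command in this sketch.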
- the main goal of this change was to move some state changes outside
of loops, usually with mip generation.
- for instance bindPostProcessDesciptorSet() must now be issued
manually (and can be done outside the loop)
- we also don't set the scissor for each pass
- we also move prepareMaterial() outside of getPipelineState(), it's
now done in PostProcessMaterial::getMaterial().
- We use PostProcessVariant instead of uint8_t everywhere
In the end we have three drawing helpers:
- commitAndRenderFullScreenQuad() which updates a material instance,
binds the corresponding material and draws a full screen quad.
- renderFullScreenQuad() which just renders a full screen quad.
- renderFullScreenQuadWithScissor() which does the same but scissored
Additionally, we have the following helpers for getting materials and
instances:
- PostProcessMaterial::getMaterial(): which returns the FMaterial
- PostProcessMaterial::getMaterialInstance(): which is now a helper
returning a material instance from a PostProcessMaterial or FMaterial
Most of the change is plumbing through these API changes.
It always references static data. Additionally, we don't need
to use a vector to store the specialization constants, because
it's also all static data.
And finally, we don't need a boolean to know the state of the
PostProcessMaterial, the mSize field can encode the same
information.
"history" is a map from a DescriptorSet pointer to a set of
bookkeeping values (we delay "binding" until "commit" so need to
keep values until then). Instead of using a map, we can store
these values in the DescriptorSet itself so that we save on a
map look-up.
ANGLE features should be set by apps, the system or developers but it's
not a good idea to set them in a library as it might conflict with other
libs etc.
we did it because it improved performance, but that should be fixed at
the angle level instead.
We are seeing a cluster of crashes that could be due to using an
EGLSurface whose ANativeWindow has become invalid. This could happen if
we continued to use (i.e. draw with) an EGLSurface after
SurfaceHolder::onSurfaceDestroyed() has returned.
This new flag enables an assertion that the native window is valid at
the time of makeCurrent(), which happens early in the frame.
BUG=[330392256]
Previously, the default layout was based on usage, but this actually
has two paths (Filament's TextureUsage and the computed
VkTextureUsage) that do not always agree. We simplify so that
default layout is stored in the texture itself.
Also remove some code that is no longer necessary.
In particular, we shouldn't be doing a flush and wait for the
transition to complete before updating a sampler descriptor.
We just need to make sure the layout before it is accessed is
correctly given in the update struct.
In certain compilers, assignment operators defined as default
don't automatically call the parent's operator if it's
user-defined.
Make this behavior explicit to avoid this edge case.
BUGS=[371980551]
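A minimal sketch of what "explicit" means here, using hypothetical `Base` and `Derived` types rather than the classes touched by this change:

```cpp
struct Base {
    int value = 0;
    int assignments = 0;   // counts calls to the user-defined operator
    Base() = default;
    Base(const Base&) = default;
    // User-defined assignment operator in the parent.
    Base& operator=(const Base& rhs) {
        value = rhs.value;
        ++assignments;
        return *this;
    }
};

struct Derived : Base {
    int extra = 0;
    Derived() = default;
    Derived(const Derived&) = default;
    // Instead of `= default`, spell out the call to the parent's operator.
    Derived& operator=(const Derived& rhs) {
        Base::operator=(rhs);
        extra = rhs.extra;
        return *this;
    }
};
```

Writing the call to `Base::operator=` by hand removes any dependence on how a given compiler generates the defaulted operator.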
To ensure the source of readPixels() is properly copy-able, we
want the backing textures to be created with the right BLIT_SRC usage.
However, this was not documented in the API. We work around the issue
by tagging all color attachment textures as BLIT_SRC.
This workaround will be removed in the future. For now, violations of
this condition will elicit a warning being printed out.
the problem stems from a mismatch between the shader code
and the cpu code. if the shader is configured to read the shadow
map, then the cpu must generate it, otherwise we can get stale data.
Whether the shader reads the shadow map depends on the shadow type.
For directional shadows, the shader needs the SRE variant + a
"shadow enabled" bit per cascade in the main UBO.
For punctual shadows, the shader only needs the SRE variant.
Because of all that, if the conditions are met on the CPU side for
the shader to access the shadow map, we must make sure to generate it,
but in the case the shadow map would be empty (e.g. no shadow receivers),
we need to initialize it (and we can skip some work in the case of VSM).
BUGS=[369908659]
this caused the HAS_SHADOWS flag to not be disabled, this didn't
actually cause a problem because shadowing and SSR share the same
SRE variant bit. But both should never be active together.
Mesa always clears the generic binding if the buffer deleted
is bound to an indexed binding, even if it's not bound to the
generic binding.
BUGS=[371324321]
- We change GLDescriptorSet::Buffer default constructor to
work around a client's compiler set up issue.
- We removed the assert_invariant that checks that ubo/samplers
are not changed after committed in DescriptorSet. This caused
an existing client's build to crash.
We use Mesa's gallium swrast to render as the driver with
Filament's backend set to GL. We provide a few scripts to parse
the tests (as jsons) and run gltf_viewer to produce the rendering.
For GL+Linux, PlatformGLX will try to open an X11 window
regardless of whether we are doing headless/offscreen rendering
or not.
Here we add an OSMesa platform, which will allow us to avoid
opening any window on Linux. This is particularly useful for
situation where a display is not available, like for CI.
One important detail is that even though we are not displaying through
a window, we keep the SDL2 dependency intact for gltf_viewer.
This is due to the fact that gltf_viewer is built upon
FilamentApp, which is heavily integrated with SDL2. This is mostly
ok since we won't be hitting any path for opening a window due to
gltf_viewer's existing support for headless mode.
Currently mLayerCount for RenderTarget is always updated to the depth
of the attachments, which incurs unintended behaviors for some
types of textures, i.e., 2D array, cubemap array, and 3D textures.
Fix this by updating mLayerCount only for multiview use case.
BUGS=[369165616]
scissor() works like on metal now, that is, it is disabled when a
render pass starts.
The GL backend already assumed this in debug mode. We cannot really run
into issues at the moment because every time we get a new
MaterialInstance we set the scissor -- we do this both for the
color pass and the post-process passes. With this change we will be
able to skip setting the scissor altogether in a lot of cases.
- Also used a smaller runner, as the gains from the 32-core runner were
not significant when comparing output times.
- Clean-up:
- Rename the android continuous to a proper name
- set 'echo on' for the Windows release build so we'll know
why the output asset does not get "moved" correctly.
Found through testing that renderStandaloneView+vk+swiftshader
seems to cause synchronization issues, which results in incorrect
rendering. Here we work around the issue by forcibly flushing and
waiting per renderStandaloneView call.
BUG=361822355
- Both Scene and IBL are holding on to a skybox reference. We
need to make sure they are destroyed in the right order.
- Reloading IBL should trigger resetting the indirect light in
gltf_viewer.
FrameSkipper was recently broken because of a typo resulting in
the frame latency being always 3 instead of the default of 2.
This change also makes the maximum latency 2 instead of 3, because on
ANDROID, 3 can cause CPU throttling at seemingly random places on the
GL thread.
Clients can create a multi-layered render target that consists of array
textures, and use it as a custom render target.
A new sample app "hellostereo" demonstrates how to use this feature.
when debug markers are implemented with systrace, begin/end are not
guaranteed to happen in the same C++ scope, so we can't really use
other scoped systraces.
- Pulled the android continuous workflow into an action
- Split the android continuous build into 3 ABIs armv7, armv8a,
and x86_64 (note that x86 is not present).
- Remove the upload artifacts step from previous workflow.
This was meant for letting client try a tip-of-tree Android
build. We can revisit this later. (Also removed mention
in README.md)
- Split the android continuous into debug build and a release
build commands, enabling deletion of intermediate output
directory.
make sure to unset all textures in the per-view sampler group after
they are used, because the resource could be destroyed after the
pass is finished
- unset the fog and ibl_specular after the color pass
- move that cleanup a bit earlier
- in the case of screen-space reflection the structure pass is
set, but might not be used in the color pass, so we also need to
unset it after the SSR pass and before any other passes.
Also tag user texture "FTexture" if user doesn't provide a tag, this
is so that we can distinguish them from internal textures that might
not be tagged.
with a previous change we were too aggressive in falling back to
"no buffer padding", we need to do this only in the case where
"as subpass" would have been used if supported by the h/w.
This is because we had this assumption on vk and mtl backends already
and also because SYSTRACE works this way and we didn't have non-null
terminated strings. This makes things more consistent and less bug
prone.
this is because the "disposer" is now separate from the
ResourceAllocator, so even if we don't use the resource allocator we
need to register the handle with the disposer.
vk, metal and desktop gl all support depth clamp, GLES/android also does
with ANGLE. Add support for it in the backends.
use depth clamp to improve directional shadow quality; this allows
to render everything that's behind the camera at the same "zero" depth,
so we can reduce the depth range we need.
Fixes #6293
`DebugRegistry::setProperty` is no longer just applicable to debug
builds.
This change was previously added in 6c0bd36 but then reverted
in a7317e7. We re-enable it and separate it from the shadow
changes in this commit.
the framegraph blackboard tends to create a lot of problems hard to
debug, so we are now more explicit. here we remove usages of the
blackboard in RenderUtils.cpp.
default parameters were not initialized which could cause them to
be incorrectly evaluated in the shader. this is actually a pretty
crazy bug that has been around since forever and which we were
"lucky" to not run into sooner.
this was exposed with the specular extension that was added recently.
Verify the compatibility between a compiled material and the engine's
setting only when the engine is set up for stereo.
Default materials are always compiled with either 'instanced' or
'multiview'. Therefore Filament will emit warnings unintentionally if the
engine is set up for stereo. This commit fixes it.
It is legal for the app to draw a frame even when FrameSkipper
detects that a frame should be skipped. In that case we used to
overwrite the most recent fence, but this wasn't ideal. We now
proceed as if the fence had signaled, i.e. we destroy it and
move all the fences up, creating a new one at the end of the
delayed array.
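The fence handling described above can be sketched as follows. `Fence` and `FenceQueue` are hypothetical stand-ins; a real implementation would create and query fences through the backend.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for a GPU fence.
struct Fence { bool signaled = false; };

template<size_t LATENCY>
class FenceQueue {
public:
    // True if the oldest fence has signaled (or there is none yet).
    bool shouldDraw() const {
        return !mFences[0] || mFences[0]->signaled;
    }

    // Called whenever a frame is drawn, even if shouldDraw() returned false:
    // the oldest fence is destroyed, the others move up, and a new fence is
    // created at the end of the array.
    Fence* frameDrawn() {
        for (size_t i = 0; i + 1 < LATENCY; ++i) {
            mFences[i] = std::move(mFences[i + 1]);
        }
        mFences[LATENCY - 1] = std::make_unique<Fence>();
        return mFences[LATENCY - 1].get();
    }

private:
    std::array<std::unique_ptr<Fence>, LATENCY> mFences;
};
```

With this scheme, drawing a frame that should have been skipped behaves exactly as if the oldest fence had signaled.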
This is because the shadowmap doesn't store depth values so it doesn't
work to "flatten" all casters behind the camera to the near plane.
The bulk of shadowing works but the filtering/edges don't always.
Disable depth-clamping when VSM is enabled.
shadow cascades were not calculated properly because part of the
calculation took the cascade near/far into account, while another
part didn't. This resulted in cascades being too large. It didn't
create wrong shadows, but reduced (and in some cases canceled) the
usefulness of the cascade.
We fix the problem by always using the projection matrix only for
describing the cascade's frustum, as opposed to just passing the
near/far plane distances.
Now the calculation of each cascade is completely self contained and
identical.
We also improve the orientation of the light frustum:
We can rotate the light frustum around the light direction axis, so
it aligns with the view direction, this generally results in smaller
light frustums. This cannot be used in stable mode.
A recent change broke the optional "depth clamp" as well as the
computation of the far plane of the light frustum. There was also
a case where DEBUG builds could assert.
- The far plane was no longer being "optimized" (i.e. moved as close
as possible), which resulted in less optimal use of the shadow texture.
the far plane can be moved as close as the farthest visible shadow
caster.
- After the camera/light frustums intersection we now see if the 2D
bounds seen from the light are empty and if so we bail, which prevents
an assertion later.
- finally, the "DEPTH_CLAMP" option is also updated for the new code
structure.
- the last View created was always overriding previous View's datasource
- because of lazy registering of the data source it was possible that
the registering lambda was called after the view was destroyed, leading
to crashes
- all views would share the same PID parameters and these would be
initialized to default value instead of the user provided value. so
debug build would behave differently.
With this change we improve things:
- now only the first view gets to publish its data source. it's still
not ideal, but works for our use case with gltf_viewer
- the view can now unregister itself when it's destroyed
- only the view that successfully registered uses the debug PID values
and publishes its data source.
- the normal parameters are used until we query the datasource (from
imgui), so by default the behavior is now identical to release builds
This fixes a crash in gltf_viewer when opening the Debug panel.
Print a warning in case the stereoscopic type in a compiled package is
different than what's in the engine's setting. The application may
proceed, but it could end up with visual glitches when enabling stereoscopic
rendering.
This requires the stereoscopic type to be written into the package,
which needs a material version bump.
This reduces the latency of the timer query result; with the previous
code the latency could only increase, but there is no reason to wait
a whole frame for reading the next available result.
We just loop over them until we find one that has not signaled; instead
of doing one per frame.
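A sketch of that loop, assuming pending queries are kept oldest-first in a queue. `TimerQuery` and `drainQueries()` are hypothetical names used only for this illustration.

```cpp
#include <cstdint>
#include <deque>

// Hypothetical pending timer query.
struct TimerQuery {
    bool available;       // has the GPU produced the result yet?
    uint64_t elapsedNs;   // valid only when `available` is true
};

// Consumes every query whose result is available, oldest first, and stops at
// the first one that is still pending. Returns true if `latest` was updated.
inline bool drainQueries(std::deque<TimerQuery>& pending, uint64_t& latest) {
    bool updated = false;
    while (!pending.empty() && pending.front().available) {
        latest = pending.front().elapsedNs;
        pending.pop_front();
        updated = true;
    }
    return updated;
}
```

Calling this once per frame reports the most recent finished result instead of advancing by a single query each frame.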
we now have two levels of debug markers. Those that come from the "user"
(i.e. filament itself) are now always enabled and generate both
systrace and gl markers. the 2nd level is internal and always
disabled by default. If enabled at compile time it'll emit markers for
each driver API method.
- use the same code on both ends of updating the free space. i.e.
both side compute the "used" space in exactly the same way.
the math was the same before, but the code was different which
could be confusing.
- assert for overflow before queuing the buffer. It wouldn't matter
anyways, because it's done with the condition lock held, so the
consumer would never have a chance to dequeue it. still, less
confusing.
This commit 730bc99025 introduced a new
dependency on ResourceAllocator because of the new field
`std::unique_ptr<ResourceAllocator> mResourceAllocator{};` in
details/Renderer.h
This requires cpp files including details/Renderer.h to include
ResourceAllocator.h as well.
This compile issue only happens on the Windows compiler, Visual Studio.
This function attempts to set texture parameters in these two cases, but the
texture is not guaranteed to be bound. Perhaps it once was, but the assumption
broke at some point.
this change shouldn't have any impact on ARM, however, according
to cppreference it's not safe to mix seq_cst with other memory
orders:
"as soon as atomic operations that are not tagged memory_order_seq_cst
enter the picture, the sequential consistency guarantee for the program
is lost"
The value of mCgltfBuffersLoaded is sometimes retained across creation of FAssetLoader, which skips loading the buffer in AssetLoaderExtended#createPrimitive, leading to a null pointer crash.
This fixes a crash introduced by a8ace2891d
The refactored FrameInfoManager can cause a crash when IBL resource loading
happens because now the getLastFrameInfo() references an invalid value via the
`front` method. Return the default FrameInfo to resolve this.
Also fix a null pointer reference bug for OpenGLTimer::State, which
happens when the renderer for IBLPrefilterContext is destroyed.
Adds multiview support for vulkan. This is done by adding a layerCount to the renderTarget, which is used to determine if multiview is available and being used in the current renderpass.
FIXES=[332425392]
Co-authored-by: Powei Feng <powei@google.com>
if it can't find a name for a node, it will revert to the config's
defaultNodeName, however, if that is nullptr also, a crash will occur.
so we provide a last-resort hardcoded name in that case.
Add Renderer::skipFrame() which should be called when intentionally
skipping frames, for instance because the screen content hasn't changed,
allowing Filament to perform needed periodic tasks, such as
cache garbage collection and callback dispatches.
We also improve the ResourceAllocator cache eviction policy:
- the cache is aggressively purged when skipping a frame
- we aggressively evict entries older than 3 frames
The default Config is now set to a more aggressive setting.
The ResourceAllocator used to be global and owned by the Engine, this
was causing some issues when using several Renderers because each
one could cause the eviction of cache data for another.
We now have a ResourceAllocator per Renderer, which makes more sense
because most resources are allocated by the FrameGraph.
We also introduce a ResourceAllocatorDisposer class, which is used
for checking in and out a texture from the cache, and destroy the
texture when it's checked-out. That object is still global.
* improve parallel_for a bit
We get about 40% performance increase. The gain comes from not having
to copy the JobData structure each time we create a job, by using a
new emplaceJob() method, we can create the structure directly into
its destination.
* avoid calling wakeAll() when possible
wakeAll() is very expensive and not always needed when a job finishes
because there may not be anyone waiting on that job.
We now maintain a waiter count per job, and use that to determine if
we need to notify or not.
And now that the JobSystem overhead is lower, we can decrease the size
of the jobs, which improves the load balancing.
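The waiter-count idea can be sketched with a small, simplified primitive. `JobSignal` is a hypothetical name and this is not the JobSystem code; it only shows how a count of blocked waiters lets the finishing thread skip the notification.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical completion signal for a single job.
class JobSignal {
public:
    // Blocks until finish() has been called.
    void wait() {
        std::unique_lock<std::mutex> lock(mLock);
        ++mWaiters;
        mCondition.wait(lock, [this] { return mDone; });
        --mWaiters;
    }

    // Marks the job as finished. Returns true if a notification was needed,
    // i.e. if at least one thread was waiting at that time.
    bool finish() {
        bool notify;
        {
            std::lock_guard<std::mutex> lock(mLock);
            mDone = true;
            notify = mWaiters > 0;
        }
        // The lock is released before notifying, since notify can be slow.
        if (notify) {
            mCondition.notify_all();
        }
        return notify;
    }

private:
    std::mutex mLock;
    std::condition_variable mCondition;
    int mWaiters = 0;
    bool mDone = false;
};
```

A thread that calls wait() after finish() sees `mDone` already set and returns without blocking, so skipping the notification in that case is safe.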
* mActiveJobs fixes
some comments claimed mActiveJobs needed to be modified before or after
accessing the WorkQueue; this couldn't be correct because there was no
guaranteed global ordering with the workQueue.
- reduce the number of calls to notify_one() and notify_all().
notify_one() is now only called when running a new job, and
notify_all() only when a job finishes.
- don't hold the condition lock while calling notify_*(), as it is not
strictly needed, and because notify_*() can be very slow, there can
be a lot of contention on this lock as a result; blocking the whole
jobsystem thread pool.
- add a new version of run() that takes an opaque thread id that can
be retrieved from a job's execute function; this is especially
intended to be used by parallel_for(); it's just a more efficient
version of run() that avoids a hashmap lookup.
Overall these changes yield a significant performance boost:
- running + waiting a job: +200%
- running many jobs: +150%
- running many jobs in parallel: +50%
- we use a circular buffer for the frame history so that
we don't have to copy the data when inserting a new entry.
This also allows us to keep a reference to an entry, which
doesn't get invalidated when an entry is added/removed.
- we now store the gpu frame time in the correct slot (instead of
always the latest). It didn't matter before because the API wasn't
public and we only needed some recent frame time.
- a new public API now returns the frame history, which now contains
more data; in particular the main and backend thread's begin/end
frame time.
BUGS=[321110544]
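A minimal sketch of a fixed-capacity circular history, to illustrate why inserting does not move existing entries. `FrameHistory` is a hypothetical name and the layout is an assumption, not the actual implementation.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-capacity circular buffer; index 0 is the oldest entry.
template<typename T, size_t CAPACITY>
class FrameHistory {
public:
    // Appends an entry, overwriting the oldest slot once the buffer is full.
    // No existing entry is moved, so references to other slots stay valid.
    T& pushBack(const T& v) {
        T& slot = mData[(mFirst + mSize) % CAPACITY];
        slot = v;
        if (mSize < CAPACITY) {
            ++mSize;
        } else {
            mFirst = (mFirst + 1) % CAPACITY;
        }
        return slot;
    }

    size_t size() const { return mSize; }

    T& operator[](size_t i) { return mData[(mFirst + i) % CAPACITY]; }

private:
    std::array<T, CAPACITY> mData{};
    size_t mFirst = 0;
    size_t mSize = 0;
};
```

A reference returned by `pushBack()` remains usable until its own slot is overwritten, which is what allows a late result (such as a GPU frame time) to be written into the entry it belongs to.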
- in case of failure we were munmap'ing the wrong size for the
guard page (in practice this never happened)
- the post-condition check was incorrect; it checked for nullptr instead
of MAP_FAILED. this also never happened in practice.
Also made a couple of small improvements:
- in case the special circular buffer mapping fails, log a message
as warning instead of debug.
- immediately memset (i.e. populate) the pages for the circular buffer
since they will all be accessed rather quickly.
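For reference, the correct failure check for mmap() looks like the sketch below. `mapAnonymous()` is a hypothetical helper written for this note and assumes a POSIX system with MAP_ANONYMOUS.

```cpp
#include <cstddef>
#include <sys/mman.h>

// Maps `size` bytes of private anonymous memory; returns nullptr on failure.
inline void* mapAnonymous(size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    // mmap() reports failure with MAP_FAILED, i.e. (void*)-1, not with nullptr.
    return p == MAP_FAILED ? nullptr : p;
}
```

Comparing the result against nullptr would let a failed mapping through, because MAP_FAILED is a non-null value.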
- remove deprecated morphing APIs
- repair gltfio, samples and tests
The new API doesn't allow a MorphTargetBuffer per RenderPrimitive,
instead the MorphTargetBuffer is specified per Renderable.
gltfio separates RenderPrimitives from Renderables, in particular all
RenderPrimitives are created before their Renderable; this was
problematic for this change because all primitives must share
a single MorphTargetBuffer living in the Renderable.
To fix this, we're no longer initializing the morphing parameters
at RenderPrimitive creation, instead we store a reference to the
BufferSlot in the Primitive structure, so that later, when the Renderable
is created we can finally retrieve the BufferSlot and initialize its
morphing parameters, which are now available. The "morphing parameters"
are now expanded to contain the MorphTargetBuffer as before (except now
it's always the same for all the primitives of a Renderable), as well
as the offset within the buffer and the vertex count.
Texture handles were resolved to pointers when updating a SamplerGroup,
at that point the handle was checked for use-after-free. However, the
texture could be destroyed later while still active in the SamplerGroup,
this would result in using the pointer which now contains garbage.
We now keep the handle and resolve the texture when binding samplers
to the program; which will also perform the use-after-free check.
If a user is using `setFrameScheduledCallback()`, managing the provided PresentCallable during engine shutdown is tricky -- we'll likely get a final frame scheduled when we flush the engine's work queue, but the PresentCallable will schedule the final CAMetalDrawable to be released on main thread afterwards, even if we call `present(false)` to skip it. If the swap chain is destroyed before that main queue block gets executed, the mutex presenting that drawable will no longer exist, causing a crash.
To make things easier, store the std::mutex in a shared_ptr, so that a PresentCallable can safely outlive the FSwapChain instance that created it and clean itself up afterwards.
Alternatives considered, all of which seem unfortunate:
* Require users to clear out the callback before shutting down the engine, so that any final drawables are immediately discarded instead of using PresentCallable
* Require users to split up the engine teardown across two main queue blocks, ensuring that the PresentCallable cleanup executes before swapchains are destroyed
* Drop the PresentCallable on the ground and leak the memory
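The shared ownership of the mutex can be sketched as follows. `SwapChain` and `makePresentCallable()` here are simplified, hypothetical stand-ins for FSwapChain and PresentCallable, and the counter only exists to make the effect observable.

```cpp
#include <functional>
#include <memory>
#include <mutex>

// Hypothetical, simplified swap chain.
class SwapChain {
public:
    // The returned callable co-owns the mutex, so the mutex stays alive even
    // if the callable runs after the swap chain has been destroyed.
    std::function<void()> makePresentCallable(int* presented) {
        std::shared_ptr<std::mutex> lock = mDrawableLock;
        return [lock, presented] {
            std::lock_guard<std::mutex> guard(*lock);
            ++*presented;
        };
    }

private:
    std::shared_ptr<std::mutex> mDrawableLock = std::make_shared<std::mutex>();
};
```

The mutex is freed when the last owner goes away, whether that is the swap chain or the last outstanding callable.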
When a user drags and drops a gltf file to the gltf_viewer window, it
loads the new asset.
While this happens, the function `checkAsset` tries parsing the gltf
file, but it doesn't free it. Fix this.
The number of SH bands used for the indirect light irradiance
computations can be set to 1, 2 or 3 (default) in Material::Builder.
E.g., on lower-end devices w/ non HDR content, it might be
beneficial to set this value to 2.
BUGS=[341971013]
Abstracts the synchronization of the vulkan swapchain so that it is easier to override during the acquisition and presentation of images. A new structure (ImageSyncData) is created to hold swapchain synchronization data.
FIXES=[338303279]
PresentDrawable was moved to main thread by default in google#7535 and stopped
most crashes when a drawable is released. But there still appear to be crashes
if a drawable is released on main thread at the same time that -nextDrawable is
called from the Filament render thread. (It's likely that the drawable pool in
CAMetalLayer is completely non-thread-safe.)
So, add a mutex to the swapchain and always acquire it before creating or
releasing a CAMetalDrawable.
Users can opt out of this behavior by passing
-DFILAMENT_LOCK_METAL_DRAWABLE_POOL=0.
Currently, if this fails we log the error message to stderr (which
doesn't get captured by most crash reporting systems) and then crash in
a postcondition assert. By including the error message in an exception
reason and throwing an ObjC exception, we get better discoverability of
error causes.
(Building a render pipeline state from shaders is usually when a shader
actually gets JITted from LLVM IR to GPU-specific code, so if we
accidentally used a feature that's not available on the local GPU, we'll
find out about it here.)
Going forward, instead of using the printf-style syntax for panics,
we use the C++ stream syntax.
The new macros that replace ASSERT_*CONDITION are
FILAMENT_CHECK_PRECONDITION
FILAMENT_CHECK_POSTCONDITION
FILAMENT_CHECK_ARITHMETIC
Example usage:
FILAMENT_CHECK_PRECONDITION(condition) << "Message";
It's also now possible to define FILAMENT_PANIC_USES_ABSL=1 to redirect
all these calls to Abseil's CHECK() macro.
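The stream-style syntax can be illustrated with a minimal sketch. This is not Filament's implementation: `PanicStream`, `SKETCH_CHECK_PRECONDITION` and `halveEven` are hypothetical names, and the sketch throws `std::runtime_error` from a temporary's destructor, which is only one possible way to terminate the statement.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: accumulates the streamed message, then throws when the
// temporary is destroyed at the end of the full expression.
class PanicStream {
public:
    explicit PanicStream(const char* condition) {
        mStream << "precondition failed: " << condition << ": ";
    }
    // A throwing destructor must be declared noexcept(false).
    ~PanicStream() noexcept(false) { throw std::runtime_error(mStream.str()); }

    template <typename T>
    PanicStream& operator<<(const T& value) {
        mStream << value;
        return *this;
    }

private:
    std::ostringstream mStream;
};

// The else branch (and therefore the message formatting) only runs when the
// condition is false, so a passing check costs a single branch.
#define SKETCH_CHECK_PRECONDITION(cond) \
    if (cond) {} else PanicStream(#cond)

// Example function using the hypothetical macro.
int halveEven(int value) {
    SKETCH_CHECK_PRECONDITION(value % 2 == 0) << "value must be even, got " << value;
    return value / 2;
}
```

Calling `halveEven(3)` in this sketch throws an exception whose message contains both the failed condition and the streamed text.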
Set combine_multiview_images to false by default as it's the desirable
setting for most Android devices.
Set the flag to true for GUI by default.
Put the `Combine Multiview Images` checkbox under the `Stereo mode` box
for an easier access.
The current API allowed a separate buffer for each primitive in a
renderable. We instead restrict the API so that there is a single
MorphTargetBuffer for the whole renderable, shared by all primitives.
The buffer can be shared thanks to the "offset" parameter on
setMorphTargetBufferAt().
Also
- fix FMorphTargetBuffer::updateDataAt()
- add support for the "offset" parameter of setMorphTargetBufferAt()
- Add option to build.sh to build for particular stereo
techniques (default to NONE). Only applies to samples.
- Consolidate viewer checkbox for debugging stereo rendering
- Add DriverConfig flag for stereoscopic type so that it can
be used to determine availability of the feature and
(to be completed) enable corresponding GPU features.
Co-authored-by: Mathias Agopian <mathias@google.com>
we use a different hook for the draw() call when on an ES2 context,
this eliminates completely the overhead of supporting ES2 for the draw
call. draw calls are expected to be the most common calls.
- Use custom ICD path to enable Swiftshader instead of
specifying direct path to the lib.
- Remove unused `swiftshader` directory in `build`
- Remove swiftshader options in `build.sh` and cmakefiles
- Change BUILD.md
- Correctly handle XCB-only swapchain surface in VulkanPlatform
for swiftshader.
- Refactor `VulkanPlatform::ExtensionSet` so that `utils::CString`
is used instead of string_view, so that we don't get into
tricky lifetime issues with `const char*`
- a field was added, which broke the layout of the structure. We fix it
by adding constructors which will handle the old and new way of
initializing this structure.
- one of the tests needed a hash update
- OpenGLContext wrongly asserted when trying to unbind texture 0
If a .metallib was compiled with a target iOS version that's newer than
the current device, loading the .metallib may succeed, but finding main0
(or any other function in it) will fail. Currently, this causes a crash
due to an assert. Logging the error and returning
MetalFunctionBundle::error() makes the crash slightly easier to
diagnose.
(Note that in practice, this will probably be a useless "Compiler
encountered an internal error" message -- the GPU backend is crashing,
and the Metal stub library sees XPC_ERROR_CONNECTION_INTERRUPTED. It
retries up to 3 times (crashing each time) and then gives up.)
* Add an Engine debug setting to force GLES 2.0
This setting is only meaningful on GLES backends, it's otherwise
ignored. When set to true, the backend will try to force an ES2 context
if supported. If not supported by the platform,
the backend will pretend it's an ES2 context.
This setting is currently only taken into account by the EGL platform.
* Update filament/backend/include/backend/Platform.h
Co-authored-by: Powei Feng <powei@google.com>
---------
Co-authored-by: Powei Feng <powei@google.com>
- make sure we can build with ES2 headers only
- make sure to not use ES3 features when in ES2 mode
(post processing was accidentally creating an R8 texture,
which is not supported)
I was unfortunately naive about the way that Filament handled external textures
on non-GLES platforms. This fix restricts the changes to Android (which is the
only place this change is required in the first place). Long story short, the
change broke WebGL. Desktop seems to be unaffected.
- At the end of a renderpass we use a more fine-grained barrier
for each of the attachments in the render target
- Make sure that buffer updates are barrier'd from previous reads
- Remove previous Mali workaround barriers. Seems to be fine
without them on pixels + Mali.
This change will enable proper flat-shading and MikkTSpace.
Caveats:
- Only for disk-local glTF resources
- iOS, Web, Android do not work as of now
Fixes #6358, #7444
- use namespace for ImageUtility
- use FILAMENT_BACKEND_DEBUG_FLAG
- remove unused usage flag type
- add const list of VKFormat for iteration
- fix sampler name debug
The main goal of this change is to avoid having to select the
"per renderable" UBO at execution time. Instead we want the
PrimitiveInfo structure to already know which UBO will be used.
Concretely, this means that this determination must be done when
the RenderPass is created.
When automatic instancing is used, the RenderPass creates a temporary
UBO to store the instance info. This UBO's life-time is dictated by both
the life-time of the RenderPass and the Executors that were created
from it. For this reason we introduce SharedHandle<> to correctly
account for the owner's lifetime. This fixes a potential bug where
that handle could have been destroyed and used later; in practice this
bug didn't happen however.
A couple other changes:
- RenderPass has a bunch of fields that were actually temporary, so we
removed those.
- The canonical "per-renderable" UBO was owned by View but accessed
through Scene. This was confusing, it's now accessed through View.
It's been a while that generateCommands() can only generate either
the depth or color pass but not both at once; we can simplify/de-dup
the code by leveraging that.
To allow easy enabling/disabling of debug options like vulkan
validation, android systrace, debug printing, and others, we
introduce an option to add a preprocessor flag so that a
backend can (optionally) use it to manage debug options.
* vk: Fix unsupported depth blitting
On certain hardware (pixel 4 for example), blitting of depth texture
is not supported as an "optimalTilingFeature". In these cases, we
need to do a shader-based blit. We
- Add the shader blit in PostProcessingManager
- Add a driver API to check for support for blitting depthStencil
attachments.
- Fix some debugging ifdefs in vk backend.
The validation fixed is:
`[ VUID-vkCmdBlitImage-dstImage-02000 ] Object 0: handle = 0xb400007c300701d0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; Object 1: handle = 0xf2039b0000000771, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0x86bc2a78 | In vkCmdBlitImage, VkFormatFeatureFlags (0x1c601) does not support required feature VK_FORMAT_FEATURE_2_BLIT_DST_BIT for format 126 used by VkImage 0xf2039b0000000771[] with tiling VK_IMAGE_TILING_OPTIMAL. The Vulkan spec states: The format features of dstImage must contain VK_FORMAT_FEATURE_BLIT_DST_BIT`
This largely undoes a change I did recently where PrimitiveInfo has
a FRenderPrimitive* to save some space and keep a command at 64 bytes.
This wasn't a good idea because the inner rendering loop shouldn't
do any dereference in the common case.
This change reorganises PrimitiveInfo such that it stores all the data
necessary to render a primitive in the common case. The less common
cases are when hybrid instancing, morphing or skinning are used; in
those cases, a dereference into the renderable SOA is needed.
PrimitiveInfo currently has 16 bytes free, which we keep for future use.
In opengl it's possible to bind several textures to the same texture
unit as long as they're a different target. Until now we were tracking
that state. In practice it's not very useful to bind several textures
to the same unit (it is a little bit when updating texture data, but
not when rendering). With the coming change to a descriptor set API, it
is better to have a 1-to-1 mapping between bound textures and texture
units.
So with this change, only a single texture can be bound to a texture
unit. If another texture is bound to the same unit with a different
target, we first unbind the texture from the current target.
There is less state to track, and it allows us to
"unbind a texture unit" (whereas before we'd have to iterate through
all the possible targets for that unit and unbind all of them).
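A minimal sketch of the 1-to-1 tracking described above, assuming a hypothetical `TextureUnitCache` that only counts the bind calls it would issue instead of calling into OpenGL. The enum values and the unit count are illustrative, not the backend's actual types.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for GL targets; no GL calls are made in this sketch.
enum class Target : uint8_t { NONE, TEXTURE_2D, TEXTURE_CUBE_MAP };

struct BoundTexture {
    Target target = Target::NONE;
    uint32_t id = 0;
};

class TextureUnitCache {
public:
    static constexpr std::size_t UNIT_COUNT = 8; // illustrative value

    // Binds `id` to `unit` and returns how many bind calls would be issued.
    int bind(std::size_t unit, Target target, uint32_t id) {
        BoundTexture& slot = mUnits[unit];
        int calls = 0;
        if (slot.target != Target::NONE && slot.target != target) {
            // Another target is bound to this unit: unbind it first.
            ++calls; // stands in for glBindTexture(oldTarget, 0)
        }
        if (slot.target != target || slot.id != id) {
            ++calls; // stands in for glBindTexture(target, id)
            slot.target = target;
            slot.id = id;
        }
        return calls;
    }

    // With a single texture per unit, unbinding a unit is a single step.
    void unbind(std::size_t unit) { mUnits[unit] = BoundTexture{}; }

    const BoundTexture& get(std::size_t unit) const { return mUnits[unit]; }

private:
    std::array<BoundTexture, UNIT_COUNT> mUnits{};
};
```

Because each unit holds one `BoundTexture`, there is no per-target array to iterate when a unit is cleared.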
* don't rely on FMaterialInstance having a default ctor
FMaterialInstance needed a default ctor because it is a field of
FMaterial but cannot be initialized before FMaterial itself is
initialized. So we had a default ctor and we'd finish the initialization
later. Conceptually the default material instance should have been
new'ed and a pointer to it stored instead.
That's basically what we do now, but to avoid the extra allocation,
we in-place new and delete the default material instance into an
aligned_storage inside FMaterial.
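The in-place construction pattern can be sketched as follows. `Owner` and `Instance` are hypothetical stand-ins for FMaterial and FMaterialInstance, and the sketch uses an `alignas` byte array rather than `std::aligned_storage`, which serves the same purpose here.

```cpp
#include <new>
#include <string>
#include <utility>

// Hypothetical member type with no default constructor.
class Instance {
public:
    explicit Instance(std::string name) : mName(std::move(name)) {}
    const std::string& name() const { return mName; }

private:
    std::string mName;
};

class Owner {
public:
    Owner() {
        // The owner's other state is ready at this point, so the instance
        // can be constructed in place without a separate heap allocation.
        mInstance = new (mStorage) Instance("default");
    }
    // Placement new requires an explicit destructor call.
    ~Owner() { mInstance->~Instance(); }

    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;

    Instance& instance() { return *mInstance; }

    // True if the instance occupies memory inside this Owner object.
    bool instanceLivesInsideOwner() const {
        const unsigned char* begin = reinterpret_cast<const unsigned char*>(this);
        const unsigned char* p = reinterpret_cast<const unsigned char*>(mInstance);
        return p >= begin && p + sizeof(Instance) <= begin + sizeof(Owner);
    }

private:
    alignas(Instance) unsigned char mStorage[sizeof(Instance)];
    Instance* mInstance = nullptr;
};
```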
* Update filament/src/details/Material.h
Co-authored-by: Ben Doherty <bendoherty@google.com>
---------
Co-authored-by: Ben Doherty <bendoherty@google.com>
the gl backend did some of its cleanup in its destructor,
including calling into OpenGL, however, the destructor is called from
the main thread, not the GL thread, so these calls would be no-ops at
best, and crashes in the worst case.
* don't crash if we don't have a Camera set on View
- also add a method to query if a camera was set
* Update android/filament-android/src/main/java/com/google/android/filament/View.java
Co-authored-by: Powei Feng <powei@google.com>
---------
Co-authored-by: Powei Feng <powei@google.com>
using thread affinity naively on big.little architectures is very flaky,
for now it's better to simplify and not use it at all, let the kernel
figure things out.
BUGS=[333582569]
* gltfio: add Asset/Resource extended implementations
- Add gltfio/src/extended to implement an alternate loader for
primitives. This is largely based on the implementation in
AssetLoader/ResourceLoader
- Able to correctly produce flat shading from gltf that only have
vertex positions and indices.
- This is not hooked into current code and should have no
practical effect on gltfio.
Originally we did this because we wanted to run on a big core on
android. However setting the thread affinity in this way is fragile,
we are not guaranteed to be on a big core, and we don't even know
if some thread is pinned to that core already; which was the case
with some GL drivers. This can also cause scheduling problems with
other threads.
We just remove this logic entirely for now, and we'll figure out
something better later to run on a big core.
Fixes #7748
BUGS=[333949404]
This change introduces a new chunk type to material files for precompiled Metal libraries. Previously, SPIR-V was the only binary type, so there's also a couple of refactor commits present here. Nothing is changed in Filament or matc yet.
BUGS=[333547148]
- Use new descriptor set and layout caching
- Remove descriptor set related code in VulkanPipelineCache
- fix leaks for descriptor sets/layouts
FIXES=248594812,325157400
A MaterialParser could be leaked if several edits happened before they
were latched -- this was because the MaterialParser was stored as
a raw pointer instead of a unique_ptr<>, this was done as an attempt
to avoid using a lock around accessing mPendingEdits.
Added
- Cache for layouts
- Pools for descriptor sets
- Cache for descriptor set updates
- Cache for pipeline layouts
Does not have effect on implementation.
- Move VulkanDescriptorSet to VulkanHandles.h
- Add VulkanDescriptorSetLayout to VulkanHandles.h
- Add "input" descriptor set types to VulkanUtility.h. These are
structs that will be defined in the backend API (shared across
all backends) and eventually passed from the front-end to the
vk backend.
- Logic to parse descriptor set layout from the SPIR-V shaders.
- Move UsageFlags type to VulkanUtility.h
- Just prep work. No effect to current implementations.
We were calculating the shadow visibility of spot/point lights. The
visibility was calculated during the "execute" phase of the FrameGraph
but it was used/needed during the setup phase. The result was that
the visibility was always delayed by one frame (really it was stale
data from the previous calculation).
We are now computing the shadow visibility earlier, during the
setup phase. This is also better because we can now skip culling
of these shadow maps entirely if we know they're not visible.
Fixes #7715
Going forward we want to be able to throw exceptions from the backend.
At the very least, we need to be consistent; currently we are
potentially throwing exceptions from `noexcept` places.
This change makes it possible to throw exceptions from the backend,
during handle construction and conversion to pointers, which wasn't
allowed before.
We still can't throw from dtors because it's generally a bad idea,
better abort in that case.
Replace the num_views for OpenGL multiview only when
- The engine is initialized with multiview stereo
- The variant for the material contains STE flag
- The program is for surface
- It's vertex shader (this is already in)
For OpenGL multiview, it honors the qualifier `layout(num_views = X)`
specified in shader files to determine the number of views for
multiview.
We cannot recompile materials every time the value changes. So replace
the value of num_views with the engine's eye count when shaders compile.
This PR adds a new `pause()` option to the `Engine` `Builder` and a new function
`setPaused()` to the `Engine`. While paused, the rendering thread will wait
indefinitely for commands as if none are available. As soon as the rendering
thread is unpaused, the commands are immediately executed.
in case the swapchain creation fails, it will now return a swapchain
with an EGL_NO_SURFACE handle. this will avoid having to nullptr check
the pointer in various places and will revert to the previous behavior
on failure.
FIXES=[329659681]
We verify that the buffer given to setImage() is at least as large
as needed for the given region to transfer; at least based on the
size given.
This might help catch b/330407429.
BUGS=330407429
This is due to color attachments being set to store=discard when
they are multi-sampled. It's unclear why that condition exists. For
now, removing it will fix the rendering issues with transparent
object + MSAA. We'll keep it as such until an issue surfaces.
Fixes #7674
- dynamic (default): no restrictions apply
- static bounds: bounds and world transform can't be changed
- static: additionally morphing/skinning and vertex/index buffers are
immutable.
This will allow some optimizations in the future. Currently, we just
store the type but don't do anything with it.
Plumb through multiview configurations to the pipeline so that the
engine draws scenes using multiview extension. Users need to prepare
shaders compiled with the `multiview` param and set the
`stereoscopicType` flag to MULTIVIEW in the Engine::Config to enable
multiview feature.
In this change, post-processing effects for multiview are not yet supported. So
we disable them all until they're supported.
The debug option `combineMultiviewImages` combines layers as one image,
which allows us to check the final result.
- TangentsJobExtended extracts data from cgltf accessor and
runs geometry::TangentSpaceMesh on the attributes and computes
the tangent space.
- The /extended folder is meant for running this process. Note that
this API might remesh the input and will require corresponding
changes that might break previous assumptions.
- The general flow of the code is modeled after src/TangentsJob.h
- This is not hooked into current code and should have no
practical effect on gltfio.
This new parameter indicates whether the render target will be created
for multiview.
If the value is greater than 1, the render target is
created for multiview. Otherwise (1 or 0), a single-layer
render target is created.
Instead of holding pointers to class instances in VulkanDriver,
we standardize by making the relevant classes have proper
constructors and initialize in VulkanDriver's constructor
initializer list.
- `isSwapChainProtected()` is now virtual
- `createDefaultRenderTarget()` is renamed to `getDefaultFramebufferObject()`
- new `getCurrentContextType()` returns the current context type
- `makeCurrent` now takes an additional `ContextType`
- `PlatformEGL::getContextForType()` to retrieve the `EGLContext` for
a given type.
- `PlatformEGL::makeCurrent` non-virtual utilities to set only the
context or swapchains.
For reasons unknown, after upgrading to XCode 15.3, dlopen can
no longer find libvulkan.1.dylib. We fix it by explicitly adding
/usr/local/lib to rpath for macos.
PipelineState now holds a handle to a HwVertexBufferInfo.
DriverAPI::draw() is now technically deprecated and replaced by the
more efficient draw2(), which only takes an index offset, index count
and instance count. The Pipeline to use is now specified with a new
API bindPipeline() and the primitive to use with bindRenderPrimitive().
This allows clients to reuse RenderPrimitives and ultimately Pipelines.
This change reduces CPU usage significantly on Metal and Vulkan, by
reducing the need to lookup for a pipeline at every draw call.
The application, however, must be a "good citizen" by reusing
MaterialInstance and RenderPrimitive as much as possible. We do have
RenderPrimitive cache however, so reusing the same VertexBuffer and
associated parameters also works.
Some default materials such as defaultMaterial and skybox have discrete
material file for feature level 0.
Combine these materials as one utilizing the `-P` option of matc.
This adds a new material property (float postLightingMixFactor) which
is used to mix the original color with the post-lighting blended color.
The default value is 1.0, which keeps the current behavior.
FIXES=[328498606]
cmgen mirrors environment maps by default so that the reflection map
appears un-mirrored. IBLPrefilter didn't do that.
EquirectangularToCubemap now takes a Config parameter that allows specifying
specify the mirroring, which is enabled by default.
FIXES=[320856413]
Protected contexts are now supported by the OpenGLPlatform interface
and implemented in EGLPlatform.
Protected contexts can read from regular and protected resources but
can only write to protected resources (e.g. protected swap chains or
textures backed by protected memory. These can be created on Android
via AHardwareBuffer and EGLImage for instance).
The underlying EGL implementation must support protected contexts.
Switching to a protected context is achieved by using a
protected-content SwapChain in Renderer::beginFrame().
A protected-content SwapChain can be created using the new
CONFIG_PROTECTED_CONTENT flag at creation time.
The OpenGL backend implementation will then use a protected context for
rendering until an unprotected SwapChain is used again.
The crux of this implementation is to use different VAOs in
the different contexts, because those can't be shared between contexts.
We also need to synchronize the state with our state cache and ensure
VAOs objects are destructed properly in the right context.
* Add new parameter -P for matc
This new matc parameter `-P` or `--material-parameter` allows users to
set material properties to the specified value.
Values passed through this matc parameters have the highest priorities.
I.e., they overwrite material properties specified in the material file.
Previously the cache would try to keep its size below a user-settable
value. This was not effective because when that value was too small,
it would cause a lot of churn every frame without actually keeping
the memory usage below the specified value.
We now evict buffers aggressively after they've not been used
(for two frames by default), but we don't cap the size of the cache.
The cache will naturally settle at the size it needs. When dynamic
resolution is used, it might be necessary to increase the resources'
maximum age, which is still a user-settable value.
This improves performance on mobile on many scenes because the 64MB
default value was too low, causing the cache to thrash.
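The age-based eviction policy can be sketched as below. `AgeBasedCache`, its method names, and the exact age comparison are assumptions made for illustration; only the idea of evicting entries unused for a configurable number of frames, without a size cap, comes from the description above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct CachedBuffer {
    uint32_t id;
    std::size_t sizeInBytes;
    uint32_t lastUsedFrame;
};

// Hypothetical cache: no size cap, entries are evicted purely by age.
class AgeBasedCache {
public:
    explicit AgeBasedCache(uint32_t maxAge = 2) : mMaxAge(maxAge) {}

    void put(uint32_t id, std::size_t sizeInBytes) {
        mEntries.push_back({ id, sizeInBytes, mFrame });
    }

    // Marks an entry as used during the current frame.
    void touch(uint32_t id) {
        for (CachedBuffer& e : mEntries) {
            if (e.id == id) e.lastUsedFrame = mFrame;
        }
    }

    // Advances to the next frame and evicts entries that have not been used
    // for at least mMaxAge frames.
    void endFrame() {
        ++mFrame;
        mEntries.erase(
                std::remove_if(mEntries.begin(), mEntries.end(),
                        [this](const CachedBuffer& e) {
                            return mFrame - e.lastUsedFrame >= mMaxAge;
                        }),
                mEntries.end());
    }

    std::size_t size() const { return mEntries.size(); }

private:
    std::vector<CachedBuffer> mEntries;
    uint32_t mMaxAge;
    uint32_t mFrame = 0;
};
```

In this sketch the cache grows to whatever a frame needs and shrinks only when entries go unused, which is the "settle at the size it needs" behavior.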
The validation error triggers on hellotriangle using AMD (desktop), QCOM (mobile) and Mali (mobile) GPU.
Before this MR, only a single semaphore object was used to synchronize all the calls to vkAcquireNextImage (signal) and vkQueueSubmit (wait).
The issue is that by the time vkQueueSubmit returns, the semaphore is not necessarily reset.
When multiple frames are in flight, the next call to vkAcquireNextImage might try to reuse the semaphore while it is still in the wait status.
The semaphore is reset at a driver/hardware-dependent timing that's likely to be linked to the GPU queue execution.
The solution proposed by this MR is to use a pool of semaphores big enough to cover all possible queue submissions.
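A minimal sketch of the pooling idea, assuming a hypothetical `SemaphoreRing` that hands out semaphores round-robin. The template parameter stands in for `VkSemaphore`, and the pool size `N` must be chosen by the caller to cover all submissions that can be in flight; the sketch does not verify that.

```cpp
#include <array>
#include <cstddef>

// Hypothetical round-robin pool. Each acquire/submit pair takes the next
// semaphore, so a semaphore is only reused after N - 1 other pairs.
template <typename Semaphore, std::size_t N>
class SemaphoreRing {
public:
    Semaphore& next() {
        Semaphore& semaphore = mSemaphores[mNext];
        mNext = (mNext + 1) % N;
        return semaphore;
    }

private:
    std::array<Semaphore, N> mSemaphores{};
    std::size_t mNext = 0;
};
```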
Previous commit [1] changed the semantic of the index to
mBufferObjects. Here we just make sure that if a buffer has been
allocated, we don't allocate another (otherwise, we'd leak).
Also cleaned up `updateBoneIndicesAndWeights` indexing
[1]: a3131a64b6
- Add cache for ds layouts and ds
- Abstract descriptors API into VulkanDescriptorSetManager
- Note that this is just a draft and not hooked into the current
implementation.
- better naming
- TimerQueryFactory doesn't depend on OpenGLDriver anymore,
only a OpenGLContext.
- timer query factory is now owned by and accessed through OpenGLDriver
- we can't temporarily store a negative number
in the query shared state, because it now indicates an error.
This is supported only by the PlatformEGL currently. There is not much
that can be done with it either at this point. A protected swapchain is
one that can only be written by a protected context, however, there
is currently no way to create such context.
This is currently only implemented in the GLES backend and simply
exposes GL_EXT_protected_textures. There is not much that can be done
with this yet. Protected textures can't be read nor written at the
moment.
* Add a new material param, stereoscopicType
This new parameter allows us to specify which implementation of
stereoscopic rendering Filament uses for the material.
This change just includes material parameter addition and shader code
changes, so it doesn't affect the current rendering behavior.
These changes will follow as separate commits.
- render pipeline changes
- material parameter override via matc parameter
- material document update
We're going to add a new implementation of stereoscopic rendering using
multiview. Thus we want to remove the word `Instanced` from all methods
and properties.
This shader takes an array texture and a layer index to draw to the
current render target.
This will be used for debugging purpose to combine an array texture
rendered from the multiview feature that is going to be implemented
later, so that we can verify the feature properly performed.
When shader compilation happens in threadpool mode, shader source code
isn't stored correctly in the token. This leads to empty error messages
if a shader has problems later on.
- added many precondition checks and asserts to VertexBuffer creation
- simplified code in VertexBuffer as well
- enforce BONE_INDICES to integer when specified by user since that's
what shaders expect.
- better comments about *always* setting BONE_INDICES to integer
- some code simplification in RenderPass + some comments about skinning
- in the GL backend we no longer set the vertex buffer objects at
renderprimitive creation time, because they might not be available
yet. Instead, we let the natural age mechanism update them next
time it's needed. This allows us to add some asserts about the
declared buffer being present
We do this to better match Gl, Vulkan and Metal, which don't need
to specify the scissor in the pipeline. In practice, this will also
allow us to set the scissor less often, saving a bit of CPU.
These are missing parts from the commit
111ad96134.
NDK 26.1.10909125 is used by default
Minimum API level on Android is now API 21 instead of API 19. This allows the use of OpenGL ES 3.1
Throw an NSException when a program fails to compile and then is used for drawing; this helps aid debugging compiler errors in production, where stdout logs are not available.
The OOB would happen if the scene never had any renderables; in that
case the scene's SoA would stay unallocated, but the summedAreaTable
code relies on it having at least a capacity of 1.
It was incorrect to skip the RenderPass entirely because it might have
had some custom commands that needed to be executed (e.g. for applying
post-processing in subpass mode).
This new backend object holds the information needed to create the
pipeline on vulkan/metal relative to draw calls.
It is used to create HwVertexBuffer.
* Add OpenGL extension for multiview
This extension is going to be used for multiview implementation in
OpenGL.
Now the API isStereoSupported takes a stereo type as a parameter.
HwRenderPrimitive doesn't need to know about the index offset and
index count, these parameters are only needed when drawing. draw()
is updated consequently.
This is a first step towards being able to lower the overhead of
similar draw calls.
A side effect of this is that the HwRenderPrimitiveFactory now will
cache buffers regardless of their index count & offset.
This reduces resource utilisation for Views that never need shadows.
It saves a UBO, two Entities and about 10KB memory. We also lazily
allocate the debugging DataSource, which saves about 10K per View
in debug builds.
Overall this change makes "simple" Views less than 4KB heavy down from
about 24KB (debug, 14KB release).
The main changes:
- ShadowMapManager is now allocated lazily
- the ShadowMap cache object is also allocated lazily
- debug DataSource is allocated lazily
- ShadowMaps are prepared/initialized with a Builder, which makes it
clearer that some APIs are only for preparing the ShadowMap cache.
- each handle now has a 4-bits "age", meaning that handles are recycled
only after 16 alloc/free cycles.
This is used to detect double-free and use-after-free.
This should also allow us to compare handles, because freeing and
reallocating an object, won't produce the same Handle (at least
for 16 rounds).
- removed "type safety" checks because it's almost impossible to
get it wrong thanks to our compile time type safety checks. This
didn't provide a useful value added.
- This feature is built on top of being able to set/get a 8 bits tag
associated with the memory block returned by the pool allocator. We
use the "extra" parameter of the allocator to allocate a "hidden"
structure containing the age of that memory block.
- Also we don't allow to compare Handle<> of different types
- update the pools sizes for metal and vulkan, which were very outdated.
- add debug code on all backends to print the size of each handle
(with a compile time switch)
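The handle "age" mechanism can be sketched as follows. This is a simplified model: `AgedHandlePool` keeps ages in a side vector, whereas the change above stores the tag alongside the memory block in the pool allocator. The 28/4 bit split of the handle is also an assumption for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical pool: a handle packs a slot index and a 4-bit age.
class AgedHandlePool {
public:
    using Handle = uint32_t;
    static constexpr uint32_t AGE_SHIFT = 28;
    static constexpr uint32_t AGE_MASK = 0xFu;
    static constexpr uint32_t INDEX_MASK = (1u << AGE_SHIFT) - 1u;

    Handle allocate() {
        uint32_t index;
        if (!mFree.empty()) {
            index = mFree.back();
            mFree.pop_back();
        } else {
            index = static_cast<uint32_t>(mAges.size());
            mAges.push_back(0);
        }
        return (static_cast<uint32_t>(mAges[index]) << AGE_SHIFT) | index;
    }

    // A handle is valid only if its age matches the slot's current age.
    bool isValid(Handle h) const {
        const uint32_t index = h & INDEX_MASK;
        return index < mAges.size() &&
               mAges[index] == ((h >> AGE_SHIFT) & AGE_MASK);
    }

    // Returns false on a double-free or a stale handle.
    bool free(Handle h) {
        if (!isValid(h)) return false;
        const uint32_t index = h & INDEX_MASK;
        // Bumping the age invalidates every outstanding copy of the handle.
        // The age wraps after 16 alloc/free cycles.
        mAges[index] = static_cast<uint8_t>((mAges[index] + 1u) & AGE_MASK);
        mFree.push_back(index);
        return true;
    }

private:
    std::vector<uint8_t> mAges;
    std::vector<uint32_t> mFree;
};
```

Since the age is part of the handle value, a freed-then-reallocated slot yields a different handle, which is what makes handle comparison meaningful for 16 rounds.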
The most important change is that now the 3 pools of HandleAllocator
are sized so that each can accommodate about the same amount of handles.
This makes it easier to reason about. The total amount of handles is
three times that, since there are 3 pools.
We also try to allocate the buckets so that handles are evenly
distributed, however, that's very hand wavy.
With the current setup the number of handles per pool is as follows:
- GL : 3240 / pool / MiB
- VK : 1820 / pool / MiB
- MTL: 1310 / pool / MiB
* Automatically flush CommandStream
When generating commands, we now automatically flush the CommandStream,
so that we're guaranteed to not overrun the circular buffer.
* cleanup CircularBuffer implementation and API
Also fix a bug in DEBUG mode that could corrupt the CircularBuffer, it
was due to a wrong debugging code attempting to clear the unused
area of the buffer (this was wrong because in "ashmem" mode, there are
no guaranteed unused areas).
* Fix a couple threading vs. allocations
- prepareVisibleLights was run on a dedicated thread (via JobSystem),
but was using its own local ArenaScope. This is wrong because it
could reset the root arena at any later point. This is fixed by
just not using a local ArenaScope.
- related to the above, the root Arena (LinearAllocatorArena) didn't
use a locked policy, which could also cause problems since some
allocations are done off the main thread. We now pre-allocate the one
buffer we need.
This PR also renames some variable and types to improve readability.
* Rework RenderPass to improve allocations and API
RenderPass now is a fully immutable object that gets constructed with a
RenderPassBuilder. RenderPassBuilder can be passed around and doesn't
do any (major) allocations.
All RenderPass allocations and heavy lifting is done in
RenderPassBuilder::Build().
Additionally, RenderPass cannot be copied anymore.
Where allocations happen is now much clearer.
* new LinearAllocatorWithFallback
LinearAllocatorWithFallback is a linear allocator that can fall back
to the heap allocator. We use it for the high level command buffer to
avoid crashing when running out of memory.
FIXES=[277115740]
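A linear allocator with a heap fallback can be sketched as below. `LinearWithFallback` and its interface are hypothetical and simplified (for instance, the fallback path relies on `std::malloc` alignment, so it assumes alignments no larger than `alignof(std::max_align_t)`); it is not the utils library implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical allocator: bumps a pointer inside a fixed arena and falls back
// to the heap when the arena is exhausted.
class LinearWithFallback {
public:
    LinearWithFallback(void* begin, std::size_t size)
            : mBegin(static_cast<char*>(begin)), mSize(size) {}
    ~LinearWithFallback() { reset(); }

    LinearWithFallback(const LinearWithFallback&) = delete;
    LinearWithFallback& operator=(const LinearWithFallback&) = delete;

    // `alignment` must be a power of two, at most alignof(std::max_align_t).
    void* alloc(std::size_t size,
            std::size_t alignment = alignof(std::max_align_t)) {
        const uintptr_t base = reinterpret_cast<uintptr_t>(mBegin);
        const uintptr_t aligned =
                (base + mOffset + alignment - 1) & ~(uintptr_t(alignment) - 1);
        const std::size_t end = static_cast<std::size_t>(aligned - base) + size;
        if (end <= mSize) {
            mOffset = end;
            return reinterpret_cast<void*>(aligned);
        }
        // Arena exhausted: use the heap instead of failing.
        void* p = std::malloc(size);
        if (p) mHeap.push_back(p);
        return p;
    }

    // Frees all heap fallbacks and rewinds the arena.
    void reset() {
        for (void* p : mHeap) std::free(p);
        mHeap.clear();
        mOffset = 0;
    }

    std::size_t heapAllocationCount() const { return mHeap.size(); }

private:
    char* mBegin;
    std::size_t mSize;
    std::size_t mOffset = 0;
    std::vector<void*> mHeap;
};
```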
* Update filament/src/RenderPass.h
Co-authored-by: Powei Feng <powei@google.com>
* Update libs/utils/include/utils/Allocator.h
Co-authored-by: Powei Feng <powei@google.com>
---------
Co-authored-by: Powei Feng <powei@google.com>
- Add methods for adding attributes to the input mesh
- Add method in TangentSpaceMesh for when user provides the
tangents
- Separate client-side Algorithm enum from implementation algorithm
(AlgorithmImpl)
- Fix CMake config for combining static libs
32 samples may be more suited to 2x upsampling, it gives 8 samples per
high-res pixel (instead of 4). This is also what FSR 2.0 is using,
which is useful for comparing.
This has caused issues and over time we have reduced the use of
spinlocks, it was only used in few places and we still have evidence
that it's causing ANRs.
We use utils::Mutex instead which is a low overhead mutex implementation
on Linux systems.
FIXES=[321101014]
it was incorrectly mapping the equirect image to a cubemap due to a
typo in our overload of atan2 which was swapping its parameters.
atan2 is now removed, and we use atan(y,x) instead. Also modified the
code slightly so it matches almost exactly cmgen's.
FIXES=[320856413]
When parsing a lexeme, we use one less byte than it's intended to be for
comparing the current string.
This results in a success in cases like:
- true and truX
- false and falsX
- null and nulX
where X means an arbitrary character.
Fix this by comparing the full intended length.
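The off-by-one can be illustrated with a small sketch. The function names are hypothetical and this is not the parser's actual code; it only shows how comparing one byte too few accepts `truX` as `true`.

```cpp
#include <cstring>

// Buggy variant: compares one byte fewer than the keyword length, so the
// last character of the lexeme is never checked.
inline bool matchesKeywordBuggy(const char* text, const char* keyword) {
    return std::strncmp(text, keyword, std::strlen(keyword) - 1) == 0;
}

// Fixed variant: compares the full keyword length.
inline bool matchesKeyword(const char* text, const char* keyword) {
    return std::strncmp(text, keyword, std::strlen(keyword)) == 0;
}
```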
* Bokeh aspect ratio
new DoF option to set the bokeh aspect ratio, this can be used to
simulate anamorphic lenses
* Update android/filament-android/src/main/java/com/google/android/filament/View.java
Co-authored-by: Powei Feng <powei@google.com>
* Update web/filament-js/filament.d.ts
Co-authored-by: Powei Feng <powei@google.com>
---------
Co-authored-by: Powei Feng <powei@google.com>
See #7415 for a more detailed description of why this change is necessary.
The remaining variants which are filtered from FL0 materials are all related to
lighting, so further hacks like this won't be necessary.
Future work involves properly supporting differing sets of variants based on
shader language.
both vk and metal don't support depth resolves, and are currently
implemented in the backend. vk is buggy and they don't resolve the
same way that gl does.
* TAA improvements
- fix variance + AABB history clipping (we were computing the union
of both AABB instead of the intersection).
- added a setting for the variance parameter
- added a setting for the jitter pattern
- improved some default values
- fixed a few comments
- smaller code tweaks to facilitate future improvements
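The difference between the union and the intersection of two AABBs, as mentioned in the first item above, can be sketched as follows. The `Aabb` struct and function names are hypothetical, and the sketch does not handle the case where the boxes do not overlap (the intersection would then have `min > max`).

```cpp
#include <algorithm>

struct Aabb {
    float min[3];
    float max[3];
};

// Intersection: the region contained in both boxes (tighter clipping).
inline Aabb intersect(const Aabb& a, const Aabb& b) {
    Aabb r{};
    for (int i = 0; i < 3; ++i) {
        r.min[i] = std::max(a.min[i], b.min[i]);
        r.max[i] = std::min(a.max[i], b.max[i]);
    }
    return r;
}

// Union: the smallest box containing both boxes (looser clipping).
inline Aabb unite(const Aabb& a, const Aabb& b) {
    Aabb r{};
    for (int i = 0; i < 3; ++i) {
        r.min[i] = std::min(a.min[i], b.min[i]);
        r.max[i] = std::max(a.max[i], b.max[i]);
    }
    return r;
}
```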
This is what the NDK does, it's needed to keep debug infos for the
STL symbols (because the STL is provided by the platform and doesn't
have debug infos).
In practice none of the extensions we would use eglGetProcAddress for
are supported. And we had a case where an extension was reported
supported but eglGetProcAddress didn't return the corresponding
entry point.
update web demos remote ui.
FIXES=[315033914]
We shouldn't remove the listener callbacks when a surface is destroyed
by the system (e.g. screen off/on) because then we won't know when
it comes back.
But we still need to do this when the user calls UiHelper.detach() or
when we are attaching to a new surface.
Fixes #7424
getUserWorldFromWorldMatrix() was always set to identity during the
shadow pass. It needs to be set to the same value as the main
camera.
FIXES=[315504607]
The per-light shadow caster flag wasn't updated when a light was
toggled from casting to non-casting. This resulted in an out of date or
incorrect shadowmap being used.
FIXES=[315859790]
we make PostProcessMaterial, getPostProcessMaterial() as well as
render(), commitAndRender() public (as in public to filament internals).
These methods have no reason to be private to PostProcessManager.
Imported targets always use the imported flags, not the flags that
were specified when the target was created. The clear flags
should be cleared after they've been used once in case that render target
is reused by multiple passes.
For example, if the clear flag is set, the target will be cleared the first
time it is used, but if it's used again, we don't want to clear again;
in that case we'll use the "local" flags used when the target was
created (as opposed to the imported flags).
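A small sketch of the "use once, then fall back" behavior. `ImportedTarget`, `ClearFlags` and `consumeClearFlags` are hypothetical stand-ins, not the framegraph's actual types:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit flags standing in for the backend's clear flags.
enum ClearFlags : uint8_t { CLEAR_NONE = 0, CLEAR_COLOR = 1, CLEAR_DEPTH = 2 };

// Hypothetical model of an imported render target shared by several passes.
struct ImportedTarget {
    uint8_t importedClearFlags;  // flags supplied when the target was imported
};

// Returns the clear flags a pass should use. The imported flags win the first
// time and are then reset, so later passes fall back to their local flags.
inline uint8_t consumeClearFlags(ImportedTarget& target, uint8_t localClearFlags) {
    if (target.importedClearFlags != CLEAR_NONE) {
        const uint8_t flags = target.importedClearFlags;
        target.importedClearFlags = CLEAR_NONE;
        return flags;
    }
    return localClearFlags;
}
```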
Material constants (a.k.a: specialization constants) can only be set
during Material creation through Material::Builder.
This change somewhat relaxes that limitation by allowing constants to
be set at runtime on Material directly.
Currently this new API is still private and only supported on FMaterial.
This feature works by invalidating the HwProgram cache of the concerned
Material, causing a shader recompile per variant; so this API is costly
and should be used only for debugging or during app/game configuration.
The TAA material is modified to use constants instead of #defines for
various settings and those are exposed in TaaOptions as well as in
ViewerGui. So with this change all aspects of the TAA material can
be changed at runtime.
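The cost of changing a constant at runtime can be pictured with a toy model of the program cache. `MaterialModel` below is hypothetical and only counts compilations; it is not FMaterial's implementation:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical model: programs are cached per variant and the whole cache is
// dropped whenever a constant changes, forcing one recompile per variant.
class MaterialModel {
public:
    void setConstant(const std::string& name, int32_t value) {
        auto it = mConstants.find(name);
        if (it != mConstants.end() && it->second == value) {
            return;  // unchanged value: keep the cached programs
        }
        mConstants[name] = value;
        mProgramCache.clear();  // every variant must be recompiled on next use
    }

    // Returns a program id for the variant, "compiling" on a cache miss.
    uint32_t getProgram(uint8_t variant) {
        auto it = mProgramCache.find(variant);
        if (it != mProgramCache.end()) {
            return it->second;
        }
        const uint32_t id = ++mCompileCount;
        mProgramCache.emplace(variant, id);
        return id;
    }

    uint32_t compileCount() const { return mCompileCount; }

private:
    std::unordered_map<std::string, int32_t> mConstants;
    std::unordered_map<uint8_t, uint32_t> mProgramCache;
    uint32_t mCompileCount = 0;
};
```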
This allows turning off the OpenGL backend (using -DFILAMENT_SUPPORTS_OPENGL=OFF).
Previously, disabling the OpenGL backend was leading to a compilation error related to missing level 0 materials.
Swiftshader runs spirv validation before compilation. However,
the validation does not like having Nop (no-op) in the input.
So we skip instructions instead of writing no-op for the
output of `workaroundSpecConstant`.
Also, fix issue to keep the value in the original shader if a
specialization wasn't provided.
Even if skinning is not fully implemented on FL0, we have clients which depend
on materials with skinning variants that otherwise could easily be converted to
FL0 materials. There are two proper ways to deal with this:
1. Support skinning/morphing in Feature Level 0.
2. Allow ESSL 1.0 code and ESSL 3.0 code to support different sets of variants.
However, the simplest solution is to just include skinning/morphing variants,
but disable codegen for ESSL 1.0 code, making them identical to the base
variants. This shouldn't increase the file size much due to the dictionary
deflation. Of course, skinning will not work correctly on FL0, but this has
always been the case. Future work here would be to properly implement one of the
two solutions described above.
- deprecate blit(), renamed to blitDEPRECATED. It's only used in one
place in copyFrame() now. We can't avoid it because we don't have
access to the texture from the RenderTarget.
- add a new blit() api that works with textures instead of render
targets.
- add a new resolve() api that works with textures instead of render
targets. doesn't support scaling.
- always use a shader when scaling in the framegraph
(there was only one place where we used blit)
- use the new blit() for:
- mipmap generation on vk (fixme)
- copying the depth buffer to avoid ssao feedback loop
- use the new resolve() for:
- manually resolving MSAA color/depth
- Simplify the resolve APIs on the filament side
- implement generateMipmaps for the vulkan backend
simplify MetalBlitter
- remove metal blit workarounds
- don't issue a blit from a renderpass, this only affects Renderer::copyFrame
- We only handle a single texture now (instead of color+depth), so we
can simplify the code a lot.
Didn't touch the "slowpath" much, but it now assumes it blits color or
depth, not both.
* on ANGLE we now use a thread pool for parallel shader compilation
in general we now prefer using a thread pool instead of the KHR
extension, because we have less control over how the queue is
managed by the driver.
ANGLE supports many threads very well, so we use cpu_threads/2 for the
pool size, at background priority.
* Update filament/backend/src/opengl/ShaderCompilerService.h
Co-authored-by: Ben Doherty <bendoherty@google.com>
* Update filament/backend/src/opengl/ShaderCompilerService.h
Co-authored-by: Ben Doherty <bendoherty@google.com>
---------
Co-authored-by: Ben Doherty <bendoherty@google.com>
* Add spirv-headers as a separate third_party repo
Previously, we pulled in spirv-headers as part of spirv-tools. We
still keep this behavior but move the content of the repo to
third_party.
We introduce a script for updating spirv-tools, and update the patch
file. The same script will also pull in a dependent spirv-headers.
* better handle invalid programs in release builds
Until now invalid program would basically be undefined behavior,
which in practice was a crash via a null pointer dereference.
With this change, invalid programs cause drawing ops to become no-ops.
Additionally fixed an unsynchronized access of the variable containing
the program id. I don't think it would have caused issues though.
FIXES=[311775564]
Co-authored-by: Powei Feng <powei@google.com>
We were not unregistering the TextureView or SurfaceHolder callbacks on
detach, so they could fire and access a null'ed RenderSurface.
FIXES=[308443790]
The CL introducing the ESSL 1.0 chunk in materials inadvertently disabled
optimizations for said code. This commit reintroduces those optimizations and
fixes associated bugs which manifested. In particular, spirv-cross was
generating uints for bools; this has been fixed with a hack. Additionally,
spirv-cross is now compiled with exceptions enabled so that matc can gracefully
fail and show the code which failed to compile rather than abruptly aborting.
With these fixes failing tests are:
OpenGL:
BackendTest.FeedbackLoops
Metal:
BackendTest.ColorResolve
BackendTest.BufferObjectUpdateWithOffset
BasicStencilBufferTest.StencilBufferMSAA
Vulkan:
Many failures still
Since #7358 is blocked by an upstream spirv-cross issue, we can at least do a
bit of preprocessor optimization for ESSL 1.0 code in the meantime and introduce
the FILAMENT_EFFECTIVE_VERSION preprocessor definitions.
This change in glslang removes the include of "intermediate.h" from
GlslangToSpv.h:
62de186c33
As a result, the definition of "class TIntermediate" is removed, and
will fail compilation of MaterialCompiler.cpp when glslang is updated to
a version including the aforementioned change. We fix this by adding an
explicit include to this header in MaterialCompiler.cpp.
Co-authored-by: Powei Feng <powei@google.com>
- gltfio: Enable -Wall -Werror for gltfio_core
- gltfio: Fix various errors that were missed warnings
- matdbg: switch from std::atomic_uint64_t to
std::atomic<uint64_t> for older clang
* Add Material.compile() Java binding.
* Add Engine.flush() java binding
* Add Scene.forEach java binding
update the Android gltf-viewer sample to precompile all variants of all
materials in the scene, similarly to the desktop sample.
- we were not using the correct field in ShadowMapManager
- we were not computing the transform correctly, it should be applied
after the local transform, not before.
FIXES=[299310624]
we recently added calls to Material::compile in gltfio to precompile
materials as they are discovered. that wasn't a good call, because
this should be the responsibility of the app, not of gltfio, at least
not without an option.
This is now done in gltf_viewer. We need something similar for
Android.
Bugs #7318, #7336
Moving setFrontFaceWindingInverted to MaterialInstance will enable
finer control over face inversion and aligns better with Vulkan's
pipeline definition (see VkGraphicsPipelineCreateInfo).
There's no functional change. Remove an unused local variable
`outgoingEdges` to save CPU resources. Tidy up a usage of a local variable
to slightly improve readability.
We removed the no-op queue submit for headless in PR #7264. This
means that a semaphore was never waited on, which caused a
validation error. Here, we simply don't acquire that semaphore
for present.
Also reorganized the pSempaphores array code for better
readability.
Fixes #7334
Drag and dropping a gltf folder was broken:
- the handle didn't find the gltf file on drag&drop
- the ResourceLoader cached the asset path
- don't exit(1) when drag&dropping an invalid file
- Make sure matinfo works by selecting a default backend in the
absence of activeShaders.
- Add options to select backend in matinfo mode.
- Workaround cursor misplacement for monaco
- Refactor menu sections into a common element.
First, this commit introduces some very simple bugfixes regarding ES2
compatibility related to postprocessing.
Second, this commit adds support for creating textures specified as R8, SRGB8,
and SRGB8_A8 in ES2. R8 is trivial: just use GL_LUMINANCE instead. The sRGB
formats, however, are maybe a bit more controversial. As implemented, they
instead just use the equivalent non-sRGB formats. This is of course technically
incorrect. There are a few approaches to how to add sRGB compatibility for ES2
that I can think of.
1. Do a bunch of complex shader nonsense in matc. Maybe even traversing the AST
and ensuring any texture lookup of a texture flagged as sRGB uses some
compatibility function. This would require static analysis to track if samplers
are reassigned to another variable, for example. This of course also breaks down
if you don't know at compile time if the shader will receive an RGB or an sRGB
sampler, or if the shader should be able to support both RGB or sRGB samplers.
Really only worth mentioning here for the sake of completion.
2. You could also generate simple compatibility functions to look up each
sampler, which would only apply to FL0 materials.
First, we would have to extend the material format to be able to explicitly
"color" a sampler as sRGB or not, like:
```
parameters : [
{
type : sampler2d,
name : albedo,
precision : medium,
colorSpace : srgb,
},
{
type : sampler2d,
name : normal,
precision : medium,
colorSpace : linear,
}
],
```
Then, the following GLSL code would be generated.
```glsl
#if __VERSION__ == 100
vec4 texture_albedo(vec2 position) {
    return sRGBtoLinear(texture2D(materialParams_albedo, position));
}
vec4 texture_normal(vec2 position) {
    return texture2D(materialParams_normal, position);
}
#else
vec4 texture_albedo(vec2 position) {
    return texture(materialParams_albedo, position);
}
vec4 texture_normal(vec2 position) {
    return texture(materialParams_normal, position);
}
#endif
```
Finally, at runtime, if a sampler is "colored" one way or the other, we would
verify that only the appropriate kinds of samplers are bound.
I'm actually very partial to this solution. Since sRGB compatibility is only a
concern on ES2, we can generate this code only for FL0 shaders, which already
require GLSL shader authors to care about ESSL 1.0 compatibility by calling the
appropriate `textureXX` functions. Additionally, it provides a layer of
high-level validation that texture lookups are correct, even if a real ES2
context is not available on the device being tested.
3. Leave it entirely up to the client. (What this commit does.) This leaves
client code ripe for making mistakes, but luckily, we can go back and do
solution 2 whenever. If specifying a color space for a sampler remains optional,
then if this feature is retrofitted in the future, client code will continue to
compile.
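For reference, the conversion that a `sRGBtoLinear` compatibility function in solution 2 would presumably apply to each color channel is the standard piecewise sRGB decoding. The C++ sketch below assumes that standard curve; the name `srgbToLinear` is illustrative and nothing here is taken from the commit itself:

```cpp
#include <cassert>
#include <cmath>

// Standard sRGB decoding for a single channel in [0, 1]. Assumed to be what a
// shader-side sRGBtoLinear() would compute for r, g and b (alpha is untouched).
inline float srgbToLinear(float c) {
    assert(c >= 0.0f && c <= 1.0f);
    return c <= 0.04045f ? c / 12.92f
                         : std::pow((c + 0.055f) / 1.055f, 2.4f);
}
```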
Enable a limited subset of materials in PostProcessManager for FL0.
Create new function Material::getFeatureLevel() in C++ and Java.
Create missing Material::getReflectionMode() method in Java.
This change hard-codes writing the post process output at index 0 (i.e. color)
to gl_FragColor when generating ESSL 1.0 shaders. Any other outputs (besides
depth) are discarded with a warning, but as far as I can tell, no such cases yet
exist in Filament.
Fix edge case where an empty struct could be generated in an ESSL 1.0 shader.
Include _maskThreshold and _doubleSided in ESSL 1.0 shaders.
Add GL_OES_standard_derivatives extension to ESSL 1.0 shaders. According to
gpuinfo.org, this has 96% device coverage and supports both Mali-400 and Adreno
(TM) 304.
Remove 3D sampler support from ESSL 1.0 shaders. This extension is only
supported by 62% of devices.
Change filagui material to a FL0 material.
don't use FixedCapacityVector to store pointers to active shadowmaps,
that's just not needed. They're all stored in a static array already
and directional and spot shadows are partitioned.
This saves a couple of heap allocations as well as a pointer dereference.
compile_commands.json was being generated, but hidden away inside of the cmake
build directories. This change makes build.sh link it to the main project dir
and adds some associated .gitignore entries. Now compile_commands.json is
properly read when starting clangd from Emacs, for example, and probably many
other editors.
When we enable SSR the first time, the SSR buffer is not initialized,
this can result in the color pass fragment shader aborting, which in
turn prevents the SSR history buffer from being initialized (since
it's made from the result of the color pass), repeating the cycle.
In some other case, the system somehow recovers but we still see a
flicker when enabling SSR.
The solution here is to disable SSR in the shader until the history
buffer is ready (i.e. a frame later).
The reason is that some implementations of WGL require all contexts to
be created on the same thread, which we're not necessarily doing here.
fixes #7078
- only check/log in debug builds
- use epsilon = 2e-7 * double(tempPairCount)
- compute boneWeightsSum in double
- don't modify the weights if they're within the threshold
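The points above can be sketched roughly as follows. `normalizeBoneWeights` is a hypothetical helper; the variable names mirror the ones mentioned in this message, and the zero-sum guard is an added assumption:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: renormalizes the weights only when their sum deviates
// from 1 by more than the threshold. Returns true if the weights were modified.
inline bool normalizeBoneWeights(std::vector<float>& weights) {
    const std::size_t tempPairCount = weights.size();
    double boneWeightsSum = 0.0;  // accumulate in double to limit rounding error
    for (float w : weights) {
        boneWeightsSum += w;
    }
    const double epsilon = 2e-7 * double(tempPairCount);
    if (std::abs(boneWeightsSum - 1.0) <= epsilon) {
        return false;  // close enough: leave the weights untouched
    }
    if (boneWeightsSum == 0.0) {
        return false;  // assumption: nothing sensible to normalize
    }
    for (float& w : weights) {
        w = float(w / boneWeightsSum);
    }
    return true;
}
```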
FIXES=[306565054]
We've seen hangs/ANR that are not well understood on that spinlock, so
for now we're going back to mutexes, which, on android, are very
efficient under low contention (no syscall).
FIXES=[308029108]
This change does three main things. First, it adds an option to the Engine
Builder to pick the feature level at which to instantiate Filament. The only
real practical purpose of allowing this is to be able to instantiate at feature
level 0. Secondly, it allows feature level 0 to properly work on non-ES2
devices. Thirdly, it changes both Android and desktop hellotriangle samples to
explicitly opt-in to feature level 0.
Unfortunately, feature levels are used in two different, somewhat contradictory
ways presently in Filament, which can make reasoning about this change a bit
confusing. From a client perspective, feature levels refer to buckets of
capabilities which are guaranteed to be supported. Internally, there is a
separate "feature level" stored at the Driver subclass level which
generally corresponds to the maximum supported feature level, but is also
referenced when activating workarounds for limited devices. For example, Uniform
Buffer Objects are not supported in ES2, however, Filament supports emulating
them such that the client does not need to care at all; a supported feature is a
supported feature. But internally, Filament uses this "Driver" feature level to
determine whether or not a given workaround is needed. There were several cases
where the "active feature level" was being examined in order to activate these
workarounds rather than the "driver feature level", which was incorrect.
Why should non-ES2-only devices want to activate feature level 0? Allowing this
behavior 1. makes feature level 0 more consistent with the behavior of other
feature levels and 2. allows clients a layer of validation that their software
will work on all devices supported by Filament if they explicitly opt into it.
Consistency: Filament guarantees that any given device which supports a given
feature level will also support running on every feature level below, except for
feature level 0. This change removes that exception.
Validation: It's not perfect, and there will likely be bugs and unexpected
differences in behavior between ES2 and non-ES2 devices that crop up in the
future between two devices running on the same feature level. However, it's at
least a basic high level layer of validation that enables more rapid testing
workflows directly via desktop versions of Filament rather than having to fiddle
with something like ANGLE to get perfect GLES 2.0 compliance. Additionally, it
expands options for automated testing (with the same caveats).
This change has been tested on both the desktop and Android versions of
hellotriangle.
* prevent public classes from being created on the stack
- we used to do this by deleting operator delete, but this prevented
the internal "F" classes from being virtual; which can be useful
when using EntityManager::Listener.
now we just make the destructor protected in each class.
- EntityManager::Listener now has a virtual destructor so that
objects could be correctly destroyed from Listener*
* improve EntityManager and Component managers
- all component managers now have the same "base" API
- getComponentCount()
- empty()
- getEntity()
- getEntities()
- Scene now has getEntityCount()
- EntityManager now has getEntityCount()
- all component manager implement gc() the same way, by calling destroy()
- SingleInstanceComponentManager::gc() that calls removeComponent() has
been removed because it's dangerous. removeComponent() is often
not enough, some additional cleanup might be needed.
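The protected-destructor approach from the first item can be sketched like this. `Widget`, `FWidget`, `createWidget` and `destroyWidget` are made-up names; the point is only that a protected destructor blocks stack instances while still letting the internal class be virtual:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical public API class. Client code cannot write `Widget w(1);`
// because neither the constructor nor the destructor is accessible.
class Widget {
public:
    int id() const { return mId; }
protected:
    explicit Widget(int id) : mId(id) {}
    ~Widget() = default;  // protected: no stack instances, no direct delete
private:
    int mId;
};

// Hypothetical internal "F" class. It is free to have virtual members,
// which deleting operator delete in the public class used to prevent.
class FWidget final : public Widget {
public:
    explicit FWidget(int id) : Widget(id) {}
    virtual ~FWidget() = default;
};

// Creation and destruction go through factory functions instead.
inline Widget* createWidget(int id) { return new FWidget(id); }
inline void destroyWidget(Widget* widget) { delete static_cast<FWidget*>(widget); }

// The public type is not destructible from client code.
static_assert(!std::is_destructible<Widget>::value,
        "Widget must not be creatable on the stack");
```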
CameraManager creates a Transform component for each Camera component
if one is not already present. However, it didn't destroy the transform
component when the camera component itself was destroyed. The leaked transform component
would eventually be garbage collected, but caused significant
slow down and memory pressure. This is because camera components are
created every frame for the shadow maps.
FIXES=[303914944]
- the insert and retrieve handlers can now be set/unset independently.
this could be useful for debugging.
- program caching is disabled if the GL implementation doesn't support it.
- removed unused code
FIXES=[307549547]
- Ensure that waiting on lock times out so that we don't lock
up a thread when the client is gone.
- Add an experimental folder to matdbg/web/ for the new
UI work.
The transient property `mRootNotes` in FAssetLoader is built when a new
root asset is created and referenced whenever a new instance is created.
So it incurs undefined behavior when a previously created asset tries
to create a new instance after a newer asset has already been created
via the same asset loader.
Move this transient property to each asset so that they can reference it
when a new instance is created.
This partially fixes #7269
There's no functional change in this commit.
Make some parameter names more legible by renaming them and put output
parameters to the right of their function.
The temporary variable has been used to store the current instance of
FFilamentAsset being loaded for easy access from internal methods. This
causes a crash in a scenario such as the following:
val asset1 = assetLoader.createAsset(assetBuffer1)
val instance1 = assetLoader.createInstance(asset1)
val asset2 = assetLoader.createAsset(assetBuffer2)
val instance2 = assetLoader.createInstance(asset1)
As the first step of fixing this issue, remove the transient property
`mAsset` from FAssetLoader. This commit alone doesn't resolve the issue,
and more commits are following.
Consolidate the low level version of createInstance, which takes a
pointer to cgltf_data type, into the high level version as the latter
one uses a parameter for FFilamentAsset instead of referencing mAsset.
Update all other relevant methods to take a FFilamentAsset pointer
instead of cgltf_data.
This partially fixes #7269
* debugging PCF mode
This mode always uses a hard PCF and takes a
slightly slower code path.
* dynamic shadowmap visualization
The directional shadowmap visualizer is implemented behind a
specialization constant. Add the DebugRegistry infrastructure to be
able to update the spec-constant at runtime and have a subset of
all materials invalidated.
This allows toggling the visualization at runtime using a debug
property.
This is also a proof of concept that we can update spec-constants
at runtime; we could probably leverage this work for engine-wide
shader configurations.
* Update main.fs
* Update filament/src/details/Material.cpp
Co-authored-by: Powei Feng <powei@google.com>
---------
Co-authored-by: Powei Feng <powei@google.com>
- Remove queue submit call when using headless swapchain. It was meant
to emulate a real swapchain, but queue submits are expensive.
- Add option to remove flush and wait when window resizes. If a
headless platform uses this signal to refresh the swapchain, we
don't necessarily need it to also flush and wait before the refresh.
- Refactor VulkanPlatform customizations
- shadows are now stable (in stable mode) when an IBL rotation is
used.
- fix the shadow transform option which didn't work when an IBL rotation
was used
- also use the x-axis as a reference for the "up" direction when
computing the light space matrix so that we don't fall into the
degenerate case when the light points straight down, which is a
common case
FIXES=[299310624]
- Use a hanging-GET approach to reduce dependency on websockets.
- Also add mutex to protect access to MaterialRecords, which is
written to/read from multiple threads.
The websocket code for parsing the EDIT command is pretty verbose.
Proposing that we move to a HTTP POST request instead.
Also moved the API handler code out of DebugServer.h for clarity.
- reverse the link and original relationship between
docs/viewer/filament-viewer.js and
web/filament-js/filament-viewer.js
- symlink in github pages does not seem to link to outside of the
/doc directory (it does not get pulled in during deploy).
- setProjection and setLensProjection are now less special, they can
now be entirely implemented by the user thanks to two new helper
functions. Everything can now be done with setCustomProjection.
- fix some out-dated comments
- remove dead code
- reorder methods in Camera.h
- Pin lit to version 2.8.0 (to fix a breakage caused by new
release).
- Update viewer filament version to latest
- Use symbolic link instead of having two copies of the same
file. (Could we consider removing `filament-viewer.js` in
`web/filament-js/` ?)
- Update `web/filament-js/README.md`
When we update the Far plane in the projection matrix, we assumed the
shape of the matrix. This fell apart when the projection matrix was
(for instance) a blend between an ortho and perspective projection.
We now do this more generally, that is, with fewer assumptions on the
projection matrix shape.
- Return the correct SubresourceRange for depth attachments
- Fix transition for when one layout within multiple mip-levels
is different
- Use implicit layout transition for renderpasses
- Fix access mask for sampler in vertex shaders
- Use unordered_map for VkSubresourceRange in VulkanTexture
This reverts most of commit 9a6b8bf24e. The hello
triangle sample remains unreverted.
The original commit inadvertently broke screen space reflections, and perhaps
other features when the default material was used. The source of the issue is
that MaterialBuilder.cpp (correctly) filters out variants that aren't supported
in feature level 0 materials, including screen space reflections.
While the "feature level 0 compatibility" feature itself was
intended to make creating duplicate materials like this redundant in client
code, it seems the best solution for resolving this issue is to
simply keep these redundant materials in the core.
To elaborate: clients should expect that feature level 0 materials that they
create work on /all/ feature levels /exactly/ or /close to exactly/ identically.
This includes restricting more advanced features that theoretically could be
available on a higher feature level, like SSR. It's already true that if a user
would like to optionally opt-in to a more advanced material which takes
advantage of more advanced features, they would have to maintain two separate
versions of that material: one for feature level 3 and one for feature level 1.
It should be no different in this case.
However, the materials built into the engine core are an exception to this
expectation. Given that feature level 0 was tacked on after the fact with fewer
features, there must /by necessity/ have been a new material introduced for both
the default material and the default skybox specifically for feature level 0
with fewer features than extant client apps expected to be included by default.
I imagine if filament were to be rebuilt from the ground up, this exception
wouldn't exist. However, the end result is this somewhat messy redundancy.
Instead of a hard cutoff, we fade shadows out at
the shadowFar distance if active, fading occurs
over about 10% of the shadowFar distance.
- this works only for the directional shadow
(other lights don't use shadowFar).
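As a rough illustration of the fade, assuming a linear ramp over the last 10% of the range (the actual shader curve and exact width may differ), with `shadowFadeFactor` as a hypothetical helper:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: returns 1.0 where shadows are fully applied and 0.0
// where they have faded out at shadowFar. The linear ramp is an assumption.
inline float shadowFadeFactor(float distance, float shadowFar) {
    assert(shadowFar > 0.0f);
    const float fadeRange = 0.1f * shadowFar;  // fade over ~10% of shadowFar
    const float t = (shadowFar - distance) / fadeRange;
    return std::min(std::max(t, 0.0f), 1.0f);
}
```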
these materials would not generate proper structure or shadow buffer,
because they used a special variant that in most cases removed the
user code.
now when the user code writes the depth or calls discard, the user
shader is kept.
- fix IDE warnings
- rename ASTUtils to ASTHelpers to match filename
- break dependency of ASTHelper on GLSLTools
- break dependency of GLSLTools on MaterialInfo
Previously, when a material was explicitly built for feature level 0, it was
necessary to write ESSL 1.0 code which was incompatible with the OpenGL feature
level 1 implementation in Filament. Rather than adjust the Filament
implementation so that feature level 1 uses the same workarounds as feature
level 0 to emulate GLES 3.0+ features, it's more runtime-efficient to have matc
embed ESSL 1.0 and 3.0 shaders as separate chunks and load the corresponding
one based on the active feature level.
Feature level 0 material shaders must still be written in what is effectively
ESSL 1.0. To assist with this, a small set of compatibility definitions are
introduced when building the ESSL 3.0 variant. These are virtually all
textureXXX() functions, and end up either optimized out or inlined by glslang.
One final effect of this change is that external and 3D samplers are now
properly supported in feature level 0 materials.
This informs a broad variety of text editors of some of the formatting
conventions of the filament project. It may be possible to replicate some of the
more specific formatting conventions like brace placement in a future commit.
all textures declared in a shader must be bound, in the case of the
new skinning API, if fewer than 4 bone weights are used the texture is
not needed, but it must still be bound.
* Add skinning and morphing samples to check functionality
* Implement skinning for more than four bones per vertex
The API allows defining an unlimited number of bone indices and weights for primitives. The data is defined during the building process of the renderable manager. Backward compatibility with the original solution is preserved.
Skinning of vertices is calculated on the GPU; the data is transferred to the vertex shader in a texture.
To reduce CPU on draw(), we move the VertexBuffer related metadata
out of draw into a cache. We store the cache info outside of the
actual VulkanVertexBuffer class since VulkanVertexBuffer subclasses
HwVertexBuffer, which is a Handle meant to be minimal.
This means that we need to cache on the heap, but it should be ok
since the caching is only for scene set-up and not per-frame.
The latest macOS toolchain triggers warnings for duplicate libraries at
link time. This is caused by our dependency chains.
Also remove an inlining warning in Kotlin and unnecessary warnings in
build.sh when doing a clean or generating web docs.
Functionally this shouldn't be too different, but we have some
improvements:
- better detection of "no shadows" cases
- more computations done in light-space, which should result in
better light frustum.
- split the code into several static functions
This feature didn't work well, had a lot of artifacts and generally
wasn't very useful. This kind of effect should be accomplished
differently.
This is an API break because BloomOptions::anamorphism has been removed.
- add a quality option
- remove the ping-pong code, we'll disable it for GPUs that don't work
instead.
- improve quality by doing a better first downscale
(using a 5x5 gaussian).
- improve performance by using a 9 tap filter instead of 13 in
most cases
- fix usages of setMinMaxLevels as it resets the base level to "min"
We allocate all the fences beforehand to reduce calls to
vkCreateFence.
Also remove blocking code in `getFenceStatus` since there is
not a usecase that would require that.
We were calling vkFreeCommandBuffers directly, but resetting
the buffers implicitly (when vkBeginCommandBuffer is called)
seems to be a lot more performant.
Also, cleaned up destructor for VkBuffer to no longer require
a separate terminate() method.
This is admittedly a very nitpicky change.
For most of the changes, I went through the various Markdown files and added
language names to the source blocks for better syntax highlighting on GitHub. It
also makes it easier to copy and paste commands without copying the leading `$`.
I avoided changing anything in `third_party`.
Additionally, I added some instructions for compiling the Android samples on the
command line and fixed some typos.
- use the geometric normal to apply the shadow bias. This affects
cascades > 0 and spot/point lights.
- use the scene's origin as a reference point for stabilizing the
shadowmap, this is more robust.
- clamp directional shadowmap correctly to the 1-texel border, which
needs to be reachable, as it is a valid value.
- don't snap the shadowmap to texel boundaries if stable mode is not
active (before we only didn't do it based on lispsm). Stable mode can
make the shadow unstable when both the camera and the scene move
together, so it's better to have a more predictable API where
"stable" mode means that the snapping occurs and doesn't otherwise.
- add "far origin" distance slider to the debug ui
FIXES=[299310624]
in stable mode the scale was ever so slightly varying with the
camera position, because it was calculated from the camera frustum in
world-space, this variation was amplified when the camera is far from
the origin, which eventually caused the modulo needed for snapping the
shadowmap projection to widely vary, leading to the instability.
We now calculate the camera frustum sphere in view space, which is
guaranteed to be constant. If "shadow caster mode" is chosen, we
quantize the scale a little bit so it stays constant.
The snapping code itself has been cleaned.
We wrote a bool directly into 4 bytes (as the first byte). This has two issues:
- the other 3 bytes are not initialized
- should be writing VK_TRUE/FALSE instead
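A sketch of the corrected write. `Bool32`, `kTrue` and `kFalse` are stand-ins for `VkBool32`, `VK_TRUE` and `VK_FALSE` so that the example builds without the Vulkan headers, and `writeBool32` is a hypothetical helper:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-ins for VkBool32 / VK_TRUE / VK_FALSE (a 32-bit unsigned 1 or 0).
using Bool32 = uint32_t;
constexpr Bool32 kTrue = 1u;
constexpr Bool32 kFalse = 0u;

// Writes the boolean as a full 32-bit value, so all four destination bytes
// are initialized instead of only the first one.
inline void writeBool32(void* dst, bool value) {
    assert(dst != nullptr);
    const Bool32 v = value ? kTrue : kFalse;
    std::memcpy(dst, &v, sizeof(v));
}
```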
* Don't force masked blending for transmission/volume materials
glTF lets you choose your own alpha mode when using the transmission
and volume material extensions. We were forcing the masked mode which
was incorrect, except to pass the standard tests.
* Update release notes
* Properly apply emissive to masked materials
The emissive property should not be multiplied by the color alpha
in masked materials. The alpha is treated as a coverage value in
that case, not an opacity value.
* Update release notes
This reverts commit 58f96be2c4.
This caused material files to increase in size significantly. It turns
out that glslang has to generate a copy for each parameter that is
passed to a function as a non-const parameter.
This revert will break IMG devices again, but that should be the case
only on debug builds. Release builds lose the const qualifier by
virtue of going through spirv. We'll try to address this some other
way later.
- separate out the settings for bloom, ssao and ssr
- update webgl binaries
- change default bloom resolution to 384 from 360 to have up to 7
mipmap levels vertically
- don't rely on it being 32-bits
- update the jni code to store SamplerParams in a long (64 bits)
instead of an int. This gives us some future-proofing on the Java side.
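One way to read the 384 versus 360 bloom resolution choice above (an interpretation, not something the commit states) is to count how many times the height can be halved exactly. `exactHalvings` is a hypothetical helper written for this illustration.

```cpp
// Counts how many times `size` can be divided by two without a remainder.
constexpr int exactHalvings(int size) {
    int n = 0;
    while (size > 1 && size % 2 == 0) {
        size /= 2;
        ++n;
    }
    return n;
}
```

With this helper, 384 yields 7 exact halvings (down to 3), while 360 yields only 3 (down to 45).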
* rework how we initialize the gl context
- early initialization is now implemented with static methods so that
it's very clear which state they need.
- the version number is no longer used outside of initialization,
instead we use the feature level.
- ES3.0 Adreno devices are downgraded to feature level 0
* Update filament/backend/src/opengl/OpenGLContext.cpp
Co-authored-by: Powei Feng <powei@google.com>
---------
Co-authored-by: Powei Feng <powei@google.com>
CompilerThreadPool:
- it now supports a thread cleanup function
- some initialization is moved to the setup function
OpenGLPlatform:
- now cleans up the thread pool threads upon exit
readPixels requests staging memory to be host-visible/coherent/cached.
But "cached" is not supported on Mali (Pixel 6 Pro). We make it a
preferable but optional bit.
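The "preferable but optional" selection can be sketched as follows. This is a simplified model, not the backend's code: the `MemoryProperty` bits and `selectMemoryType` are hypothetical names loosely modeled on Vulkan memory property flags.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical property bits, loosely modeled on Vulkan memory property flags.
enum MemoryProperty : uint32_t {
    HOST_VISIBLE  = 1u << 0,
    HOST_COHERENT = 1u << 1,
    HOST_CACHED   = 1u << 2,
};

// Returns the index of a memory type that has all `required` bits, preferring
// one that also has all `preferred` bits. Returns -1 if no type has the
// required bits.
inline int selectMemoryType(const std::vector<uint32_t>& typeFlags,
                            uint32_t required, uint32_t preferred) {
    int fallback = -1;
    for (std::size_t i = 0; i < typeFlags.size(); ++i) {
        const uint32_t flags = typeFlags[i];
        if ((flags & required) != required) continue;      // unusable
        if ((flags & preferred) == preferred) return int(i); // best match
        if (fallback < 0) fallback = int(i);                // usable, keep looking
    }
    return fallback;
}
```

A device that exposes no cached type still gets a usable host-visible/coherent type instead of failing.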
turns out that KHR_surfaceless_context is implied for ES3.0 when
KHR_create_context is present. However, Adreno 306 fails even if
it advertises it. So, we now reset the value of KHR_surfaceless_context
based on actually calling eglMakeCurrent(EGL_NO_SURFACE).
- remove support for parallel compilation with non-shared contexts.
this wasn't used. we can always revive it later if we need to.
- rework how callbacks work so that we don't have to use a work list
executed at each tick() in the shared context case (common case).
this improves performance significantly on low-end devices, by
not having to go through the list to check if all programs are
compiled, multiple times per frame.
The new CallbackManager handles scheduling the callbacks after all
previous programs are compiled.
The only use of this API was with a timeout 0 to check the fence
status. Timeouts other than zero could be very dangerous and since we're
not using that feature for now, we just get rid of it.
wait() is replaced with getFenceStatus(). It is currently only used by
the FrameSkipper.
This is not a public API.
This is more appropriate (and simple) than runEveryNowAndThen because
the latter doesn't manage a fence, and therefore is more of a superset.
This will allow us to use a shared context implementation in the future.
This is technically forbidden by the GLES 3.x specification but many
GPUs support it, which saves us a depth buffer copy.
Note that this is supported in GL desktop.
Destroying the FBO target of a blit operation causes a stall similar
to calling glFinish().
We work around this by delaying all FBO destructions to after the
GPU is finished with the current frame.
In some situations, functions with const parameters cause the shader
compilation to fail without an error message.
We remove all the `const` qualifiers on functions, assuming this
shouldn't impact code generation a lot.
instead of moving the "urgent" compilation to the head of the queue,
we simply remove it from the queue and process it immediately. This
has the benefit that on drivers that truly support parallel compilation,
the latency will be reduced as we don't need to wait for the current
compile to finish.
* add a GLES compiler unit test
* Update filament/test/compiler_test.cpp
Co-authored-by: Ben Doherty <bendoherty@google.com>
---------
Co-authored-by: Ben Doherty <bendoherty@google.com>
Previously, we had a VulkanSync with a default constructor that
allowed us to have sync objects that return an error when the
actual fences are not yet present. We need to replicate that
with VulkanFence since sync objects have been removed from the
API.
Fixes #7034
We were breaking the promise of pending shader compilation jobs by
destroying the corresponding std::promise embedded in the job queue.
In practice there was no danger of a deadlock by construction, but
std::promise throws an exception in that case. On builds without
exceptions enabled, this would be turned into an abort().
We fix this by using our own mechanism for signaling instead of
std::promise. This ends up being more lightweight anyway.
Fix: #6933
We don't need to convert the object id to float,
instead we can just "reinterpret_cast" it.
With the current possible values of Entities, there was a risk of
overflow once the age gets to 128 (very rare).
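A minimal sketch of the difference between converting and reinterpreting the id, assuming a 32-bit id and a 32-bit float. `idToFloatBits` and `floatBitsToId` are hypothetical helper names; `std::memcpy` is used here as the portable way to reinterpret the bits.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

// Reinterprets the 32 bits of an id as a float without changing them.
inline float idToFloatBits(uint32_t id) {
    float f;
    std::memcpy(&f, &id, sizeof(f));
    return f;
}

// Recovers the id from a float produced by idToFloatBits().
inline uint32_t floatBitsToId(float f) {
    uint32_t id;
    std::memcpy(&id, &f, sizeof(id));
    return id;
}
```

A numeric conversion only keeps 24 bits of precision, so an id such as 16777217 does not survive a round trip through `static_cast<float>`, whereas the bit-level copy preserves all 32 bits.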
The crashes are triggered by spirv-opt's MergeReturnPass, so we
just disable it. This pass also caused issues with AMD drivers on macOS.
fixes b/291140208
We get rid of the backend's HwSync object because on all platforms
but GL it was implemented just like a HwFence. We now use HwFence
instead.
On GL platforms though, HwFence doesn't exist natively; it is instead
provided by the Platform. In that case, we emulate it as with GLSync
objects -- the emulation incurs some latency that can cause frames
to be skipped.
On Android and platforms that provide the Fence functionality, there is
no such issue.
This change significantly improves frame pacing on Android.
The frame latency specified was off-by-one, i.e. a value of 1 meant a
latency of 2. The default was 2, which meant 3. Also it wasn't possible
to specify the max latency of 4, which would go out of bounds.
When the engine is shut down, it's possible for some parallel
compilation jobs (and callbacks) to be queued. We need to make sure
to clear the queues and call the callbacks before destroying the
parallel compilation service.
Fixes b/290388359
We were waiting for programs from both queues to be compiled before
calling the callback associated with one queue. In practice this caused
the callback associated with high priority programs to be called only
after low priority programs were ready.
Also clean up "token" so that it doesn't store the priority.
Update the documentation and sample to better reflect what the
implementation does.
This PR sets up the ability for shaders to use `gl_ClipDistance`, which will be needed in the future. Desktop GL supports this natively. OpenGL ES requires the EXT_clip_cull_distance extension.
Unfortunately glslang does not support this extension, so we have to employ a workaround for mobile when going through glslang. We instead write to `filament_gl_ClipDistance`, and then modify the SPIR-V to decorate this as `gl_ClipDistance`. See the comment in SpirvFixup.h.
Note this PR does not actually use `gl_ClipDistance` yet, so there should be no change to shaders.
- Carry out readPixels without blocking and wait for the read to
complete on a separate thread.
- Add mContext.commands->wait() in finish()
- Wait for readPixels to complete in finish()
- Remove unused commandBuffer in Context
- Before, we supposed that the maximum number of input attachments
should match the maximum number of color attachments. But in
reality, we've only used one input attachment for the second
subpass.
- The problem with the above supposition is that the descriptor
set layout for the input attachment descriptor set must have the
exact number of input attachments specified in the shader. If the
*layout* has more input attachment slots than specified in the
shader, then we'd run into a validation error.
- In this patch, we fix the maximum number of input attachments in
the descriptor set layout to 1, since we only ever make use of
one.
Fixes #6513
The work queue is sorted by priority but when we insert a notification
job we didn't have a priority to use for insertion, in addition the
priority was taken from the token, but for the notification job we don't
have a token.
The fix consists in passing the priority around so we have it when needed.
The scene graph was using the wrong boolean to decide whether to fall back
to low quality upscaling when the render target is translucent. It was
instead only looking at the view's blending mode. We need to check both,
as color grading does in the same function.
- Use of getter method instead of property access syntax (for getInstance)
- 'rangeTo' or the '..' call should be replaced with 'until' (in for)
Co-authored-by: jeongth9446 <taehuniy@gmail.com>
when using the thread pool we were destroying the shaders immediately,
we need to defer this until we query the program link status, so that
in case of failure we can query each shader compile status.
we make the shader handles part of the promise/future so they can be
transferred to the main thread, just like the program id is.
vkCmdEndDebugUtilsLabelEXT expects that a label was "pushed" onto
the command queue (as described in the spec). It is possible to push
labels across command buffers, but the pushed label must still be in
the queue (unexecuted) when End is called. This implies that we need
to make sure the labels are in a good state (all popped) when
vkQueueSubmit is called.
- We add a stack to carry the labels across vkQueueSubmit.
- Also add CPU time durations between push and pop to provide rough
CPU execution times (in debug).
- Add systrace markers for Android systrace
- Validation failing due to incorrect layout wrt Blitter
render pass.
- Clear colors are also not needed for blitter renderpass.
- SAMPLEABLE + DEPTH_ATTACHMENT needs to have the correct layout.
- Timer query stashed a pointer to VulkanCommandBuffer, but this
points to an object that can be reused.
- We use the shared_ptr'd fence object instead to track whether
the query request has been completed.
It supports KHR_parallel_shader_compile as well as a
thread pool of GL contexts.
- we have a new 2-priorities queue for shader compilation
- use this feature in gltfio in the ubershader case
There were two related issues:
- we need to "latch" the new TextureView size when it's resized. That
can only be done by recreating the EGLSurface (i.e. recreating the
SwapChain). UiHelper now calls onNativeWindowChanged in the case of
the TextureView resize, so clients can recreate their SwapChain.
- we also needed to make sure that all current filament frames have
finished to render (i.e. the last eglSwapBuffers has been called) so
that they don't pick-up a new size (this happens after
eglSwapBuffers) that doesn't match the viewport.
Fixes b/282220665
this is needed on armv7 because we use alignas to get structure alignment,
but that also implies (to the compiler) that the structure itself
is aligned properly.
the math needs to be maintained in highp, including during the blur
pass.
we add the ability to specify a "precision" qualifier to the "output"
of a post-process material.
we also remove the mediump clamping we used to do on mobile, it shouldn't
be done automatically behind the scenes, it's up to the shaders to do
it if it makes sense.
- flush() and wait() before destroying a swapchain
- Make sure the debug marker extension is enabled under correct
circumstances.
- Change shared_ptrs to unique_ptrs and raw pointers.
- Rename most teardown methods to terminate()
- Introduce new custom swapchain API for VulkanPlatform.h
- Implement the API for the base VulkanPlatform by refactoring
the existing swapchain code.
- VulkanSwapChain is now a wrapper for VulkanPlatform's
swap chain API.
- Actual implementation is in
vulkan/platform/VulkanPlatformSwapChainImpl.{h,cpp}
this is implemented here by using the skybox texture and blurring it
with the irradiance filter + mipmapping. This only creates a subtle
anisotropic phase-function effect.
- we must sort commands *after* we have added all commands!
- custom commands could change the UBO/Sampler bindings so we need
to make sure to invalidate them after executing the command.
When sampling the fog color from the IBL we need to take into account
the IBL transform. This broke recently when the fog calculation was
moved to user world coordinates.
* froxelizer doesn't use textures anymore
All data is stored in UBOs.
Additionally, the buffer size is no longer hardcoded at compile time.
This CL cuts the number of froxels in half to accommodate devices
limited to 16KiB.
* Use a specialization constant to adjust the size of froxel UBO
This really only affects OpenGL in practice because metal supports 256MB
minimum and only 3% of android devices support less than 32K, which is
what we need.
- Passing filename.bin to --sh-output generates a file containing the SH
as binary floats in native endianness (LE on x86 and arm64)
- Fix --sh-output so it works properly with -x
- Cleanup variable names to avoid shadowing
We need to keep the context handles as part of the platform
class so that we can implement the new swapchain API based
on them.
- Move creation of "context" handles from Context to Platform
- VulkanContext contains immutable data
- Change constructor of classes that depended on VulkanContext
- Move timer query logic from VulkanDriver to VulkanTimerQuery
and VulkanTimestamps
Fog can now be opted-out on a per renderable basis. When fog is disabled
on a renderable it removes the requirement that this renderable's
materials have the FOG variant.
This works by making the fog an entity which can be used to create
a TransformManager component and participate in the transform hierarchy.
This feature can be used as more advanced way to set the fog's floor,
which now can have an orientation (essentially be a plane).
This is useful for coordinate systems that are not y-up.
Material::compile() can be used to asynchronously ask the backend to
compile a subset of the variants of a Material and be notified when
done. This can be used during initialization to avoid hiccups later.
This will also force caching of those material programs if the
Platform provides the blob cache API.
This doesn't add or remove functionality, but merely changes the API
to create an Engine, to be more consistent with how we construct other
objects in filament.
You can now use Engine::Builder to construct an Engine.
By default ancillary buffers are discarded on commit (eg. depth buffer),
but in certain situations the platform may want to preserve them. this
new virtual allows a concrete implementation to specify which buffers
need to be preserved.
Note that in ES2 mode, the depth value returned by the picking API
only has 8-bits precision and incurs a bigger performance penalty,
because glReadPixels is synchronous.
This is only the first step where we:
- clean-up some code to prepare for 2nd step
- add support for the linear->srgb in the shaders
The linear->srgb conversion is protected by a
specialization constant and will be enabled only
if the corresponding EGL extension is not present.
Then, if enabled, the actual conversion is
controlled by a uniform so that it can be
selectively enabled on swapchains that have it
turned on.
In this change, the emulation logic that sets
these gates is not implemented (that's step 2).
This CL contains two parts:
- changes to matc/filamat
- changes to filament itself
Filamat can now generate ES2 compatible shaders. Only the unlit variant
is supported. Fog and picking are supported as well.
post-processing, skinning, instancing, all lighting and shadowing are not supported.
Filament is updated to not issue commands that are not supported in ES2.
Additionally, the hello-triangle sample is updated to work on an ES2 device.
From the backend's point of view, UBOs are emulated with uniforms.
The backend will maintain a data structure that maps an offset into
the UBO to a uniform and will do the appropriate glUniform* calls
at the right time and if needed (e.g. only if the UBO content has
changed).
The mapping from an UBO content to uniforms is passed to Program
upon creation.
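The idea can be sketched as follows. This is a simplified model and not the backend's implementation: `UniformMapping`, `EmulatedUbo` and the `upload` callback are hypothetical, and change detection is done with a plain `memcmp`, which is only one possible way to decide that the content has changed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical description of one uniform that lives inside an emulated UBO.
struct UniformMapping {
    int location;      // uniform location in the program (assumed already resolved)
    uint32_t offset;   // byte offset of the uniform inside the UBO
    uint32_t size;     // size of the uniform in bytes
};

class EmulatedUbo {
public:
    EmulatedUbo(std::vector<UniformMapping> mappings, std::size_t sizeInBytes)
        : mMappings(std::move(mappings)), mData(sizeInBytes, 0) {}

    // Copies new content into the CPU-side copy; marks it dirty only if it changed.
    void update(const void* src, std::size_t size) {
        if (size > mData.size()) size = mData.size();
        if (std::memcmp(mData.data(), src, size) != 0) {
            std::memcpy(mData.data(), src, size);
            mDirty = true;
        }
    }

    // Invokes upload(location, pointer, size) for each uniform, but only if the
    // content changed since the last flush. Returns the number of uploads.
    // A real backend would issue the matching glUniform* call from `upload`.
    template<typename Upload>
    std::size_t flush(Upload&& upload) {
        if (!mDirty) return 0;
        for (const UniformMapping& m : mMappings) {
            upload(m.location, mData.data() + m.offset, m.size);
        }
        mDirty = false;
        return mMappings.size();
    }

private:
    std::vector<UniformMapping> mMappings;
    std::vector<uint8_t> mData;
    bool mDirty = true;   // force an upload the first time
};
```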
This first round is mostly about making the backend compile with the
ES2 header only and use the ES2 code path when running on an ES2
context. We also add feature level 0, which corresponds to ES2
devices.
We introduce the macro FILAMENT_SILENCE_NOT_SUPPORTED_BY_ES2, which is
exclusively used to compile-out code that cannot compile with ES2
headers. This macro is active when ES2 headers are used.
There is also a new `OpenGLContext::isES2()` method that returns
whether we're running an ES2 context, either statically (desktop, ios)
or dynamically (mobile).
This PR should not add or remove any functionality.
A material global is a variable seen by all materials. There are 4 such
variables, which are all vec4, and they can be set on a per-view basis.
All materials used during Renderer::render() will see the same value.
These variables can be accessed in the materials by using
getMaterialGlobal{0|1|2|3}.
* vulkan: Fix Adreno issue with optimized material
Turning off the simplification pass seems to remove all of the
artifacts associated with Adreno GPUs.
- `-g` option was missing from `build.sh`
- `-Pcom.google.android.filament.matnopt` should be passed to
the sample apps as well.
- Add logic to respect `-Pcom.google.android.filament.matnopt`
in `FilamentPlugin.groovy`
They're not hardcoded in a database inside MaterialBuilder and are
generated. This will be needed later for ES2 support.
We could actually imagine something more dynamic in the future too.
- Moved most of the layout transition logic into VulkanImageUtil
so that we'd have a single place to consider if failure arises
- Add an abstraction on top of vk's layouts so that reasoning
about our use cases (and corresponding layouts) is easier.
- Removed the redundant VulkanDepthLayout
- Refactor VulkanTexture::transitionLayout so that most of the
transition paths can be done through this entry point. It also
enables us to handle tracking the current layout.
- Add a special case to transition the depth attachment/texture
if it is both a sampler and an attachment.
- Add a few debug printing markers across the classes - guarded
under the existing FILAMENT_VULKAN_VERBOSE define.
- refactor the code so that all defines are generated in the same place
- generate common_type after all defines are generated
- protect (with defines) structures and UBOs that are not needed, based
on the variant
* Support the external image on macOS
Implement CocoaExternalImage.
* Fix to take ownership of the external image
* Correct incorrect comments
* Rename a function explicitly
Make the function name indicate that it copies RECTANGLE to TEXTURE2D.
* Do lazy initialization
Create CocoaExternalImage::SharedGl when it's needed.
* Fix a crash when engine is terminated
Destroy the external image shared gl before gl context is destroyed.
* Remove a useless variable
* Improve size optimizations when compiling material
This changes the behavior of the size optimizer in matc (-S), but
only for GLSL and MSL. With this change we gain a ~65% size reduction
on a lit material compiled for OpenGL. To get those gains we generate
extra SPIRV debug information to preserve variable names and better
utilize the line dictionary. Unfortunately this breaks the SPIRV
optimizer so we skip it and instead rely on a simple DCE pass provided
by glslang. We also enhance the whitespace removal pass of the GLSL
minifier to move lone { and } to the previous line, which avoids
generating an extra index in each shader variant. Each index being
at least as big as the character itself, this is quite wasteful.
When generating SPIRV for Vulkan, we rely on spirv-opt for size
optimizations as before.
Some shaders can be shared across all materials (e.g. the depth
shaders). We use the filament default material as the "source" of
the cache, but until now we relied on an a priori knowledge of which
variants were present in the default material.
With this change, we now query once the list of variants (of interest)
in the default material and reuse that list for caching these variants
later.
This is better because the cached variants are now entirely driven by
the default material (which they depend on anyways). This is also faster
because we don't need to query which variant we need each time we create
a material.
- Depth attachment layout has generated a lot of errors due to
it being read-only. But the store-ops for the attachment during
the renderpass are all write ops. We set the depth attachment
layout as VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
- Enable extra blitting step for SSAO because of the above layout
conundrum.
- Index buffers did not have a pipeline barrier after loading
them.
- Remove `assert_invariant(utils::popcount(sampleCount) == 1);`
from `reduceSampleCount`. This assert fails when enabling the
duplicate pass for SSAO.
The VSM variant is never needed for unlit materials, it was filtered
out correctly for color shaders but not for the depth shaders.
This removes 4 variants from all unlit materials.
Also improve matinfo variants output.
* Suppress numerous libz warnings
zlib will fix this issue later (see https://github.com/madler/zlib/issues/633).
For now we will just turn off the warning.
* Fix Windows build
The feature-level option sets the maximum feature level allowed for
the material. matc will fail if the specified material has a higher
feature level than the value set with the feature-level option. The
default is 3 (max).
This can be used to ensure that materials don't use features above
a specified level.
... which is needed to avoid compilation error with recent LLVM libc++ (after https://reviews.llvm.org/D146097). Previously, symbols like std::terminate() were available through other headers (e.g. <functional>, <vector>, etc.).
- FogOption::color is now correctly multiplied by the exposure and
environment intensity.
- New option to exclude the skybox from the fog
- better documentation and naming
In device vertex domain it is easy to specify infinite world-space
coordinates (e.g. with the skybox); this results in complications
everywhere. We were mitigating this by essentially moving these
coordinates to around 16000 (give or take depending on the near plane),
but that was way too small.
Now we move them around 1e19, which seems to work. It's important with
applications rendering very large scenes to not use too small of a value.
This happened because the code iterated over the keys of a hashmap,
which obviously were not guaranteed to be in the same order in which the
entries were added to the hashmap.
We fix this by adding a "visitor" to the materialchunk, so we can
iterate through it in order and retrieve the info we need.
When we call TextureProvider::cancelDecoding, we should make sure
that textures that have been decoded, but not yet used (popped)
should be released (i.e. memory freed and the meta data marked
appropriately.)
The PixelBufferDescriptor was not being deallocated properly, which
resulted in the leak. This patch explicitly deletes the
PixelBufferDescriptor at the end of the callback to prevent the leak.
This is necessary as the move constructor does not automatically
deallocate the existing PixelBufferDescriptor.
An attachment would be wrongly discarded if used as read-only: because
it didn't have a dependency to it, the discard flag was set.
This is fixed by setting the discard flag only if a resource is written
and has dependencies to it.
Should fix: #5005
- Add a TaskHandler class for processing events on the backend
thread
- Check fence status when handler runs and copy bits to the client
when the fence is reached.
* Add way to retrieve the user world-space in materials
added `getUserWorldFromWorldMatrix()` and `getUserWorldPosition()` to
retrieve the API-level (user) world position in materials.
Deprecated `getWorldOffset()`
`getWorldOffset` didn't work when an IBL rotation was applied.
* fix large scenes with an ibl rotation
Rotate the IBL around the camera instead of the world so that the camera
is always at the origin regardless of the rotation.
* Add new alphaToCoverage material property
The alphaToCoverage property lets you enable or disable alpha to coverage
in a material. More importantly it lets you override the behavior of
blending: masked which automatically enables alphaToCoverage.
* Update release notes
The fog calculation could fail when the falloff was strong and the
camera z position was far from the fog height. The problem was
that the computation was spread between the cpu and gpu by splitting
an exp(), but with certain parameters the two exp() would independently
blow-up. We fix this by doing the exp() only on the gpu side.
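The failure mode can be reproduced with plain floats. The functions and values below are illustrative only and are not the actual fog equations: `splitExp` models an exponential split into two independently evaluated factors, and `fusedExp` models a single evaluation.

```cpp
#include <cmath>

// Evaluates exp(a) * exp(-b) as two separate exponentials (the "split" form).
// Either factor can overflow or underflow on its own.
inline float splitExp(float a, float b) {
    return std::exp(a) * std::exp(-b);
}

// Evaluates the same quantity with a single exponential.
inline float fusedExp(float a, float b) {
    return std::exp(a - b);
}
```

For example, with a = 100 and b = 80 the first factor of the split form overflows a 32-bit float, so the product is not finite, while the fused form evaluates exp(20), which is finite.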
try to be more explicit about which configurations are supported,
and use the same pattern everywhere for checking the gl version at
either compile and runtime.
This didn't happen in practice, but we would call (null) if the
clip_control extension was available and we were not on GL 4.1.
We "fix" this by not supporting the clip_control extension on desktop.
It doesn't matter because clip_control is core as of 4.5 and is
present in only 8% of non 4.5 GL implementations, and this doesn't
include any macOS versions.
This small abstraction is (will be) needed because GLES 2.0 doesn't
have sync objects, however synchronization is sometimes available
externally, in particular with EGL.
This PR doesn't provide other implementations.
- make sure to initialize all extension booleans, we treat them
as feature flags.
- be more explicit about #define'ing gl tokens, so we can more easily
catch errors later.
- don't blindly use extension tokens that might not be available
(e.g.: GL_TEXTURE_EXTERNAL_OES or GL_TEXTURE_CUBE_MAP_ARRAY). It
would probably cause a spurious gl error.
- iOS is treated the same way as Android now. The only difference
is that iOS only provides prototypes and no typedef, whereas
Android only provides typedefs and no prototypes.
- Timer queries are core in GL and an extension on all versions of
GLES. On iOS it's not available at the header level. An additional
subtlety is that glGetObjectuiv is core in GLES 3.0, so it conflicts
with the extension.
So, now we do things correctly:
- on desktop we use the core methods
- on ios we ifdef out everything related to timer queries
- on gles 2.0 and up we use only the extension entry-points
We were not testing for that case properly. This case is taken when
either:
- depth & stencil textures are the same and not null
- or, only depth is specified but both attachments are requested
Also cleanup the dimension checks in debug builds.
- some of the conveniences are not available in ES2
- it's less efficient
- we can save some PBO space when reading back
a partial framebuffer.
We also avoid using GL_PACK_ROW_LENGTH which technically is not
a convenience, but in our case we are doing an extra copy anyways, so
we can account for the row-length at that point.
- We now call visitScene only once for the directional shadowmap instead of
1 + cascade_count times.
- Don't use visitScene for spot shadows
it was only used to compute the near/far planes, but instead we can
use the radius of the light. This could degrade the quality of the
spot shadows, but this can be corrected by setting a correct radius.
caveat: currently the near is hardcoded to 0.01 units. this should be
user-settable.
- Improve performance of visitScene calls
instead of transforming 8 points of an AABB and finding the min/max,
we transform the AABB and use its minz and maxz. This works for affine
transforms.
This change also cleans-up AABB and Box transform APIs, which also
are now inline.
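The AABB optimization above can be sketched as follows, assuming a box stored as center and half-extent. All types and names here are hypothetical and the real code may be organized differently; the sketch only shows why the z range of an affinely transformed box can be obtained without visiting its 8 corners.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

struct Vec3 { float x, y, z; };

// Axis-aligned box stored as center and half-extent (hypothetical layout).
struct Aabb { Vec3 center; Vec3 halfExtent; };

// The row of an affine transform that produces z: z' = a*x + b*y + c*z + t.
struct ZRow { float a, b, c, t; };

struct ZRange { float minZ, maxZ; };

// Fast path: transform the center, and grow the extent by the absolute
// values of the row's coefficients. Valid for affine transforms only.
inline ZRange zRangeFromBox(const Aabb& box, const ZRow& r) {
    const float center = r.a * box.center.x + r.b * box.center.y
                       + r.c * box.center.z + r.t;
    const float extent = std::abs(r.a) * box.halfExtent.x
                       + std::abs(r.b) * box.halfExtent.y
                       + std::abs(r.c) * box.halfExtent.z;
    return { center - extent, center + extent };
}

// Reference path: transform all 8 corners and take the min/max.
inline ZRange zRangeFromCorners(const Aabb& box, const ZRow& r) {
    ZRange out{ std::numeric_limits<float>::infinity(),
               -std::numeric_limits<float>::infinity() };
    for (int i = 0; i < 8; ++i) {
        const float x = box.center.x + ((i & 1) ? box.halfExtent.x : -box.halfExtent.x);
        const float y = box.center.y + ((i & 2) ? box.halfExtent.y : -box.halfExtent.y);
        const float z = box.center.z + ((i & 4) ? box.halfExtent.z : -box.halfExtent.z);
        const float tz = r.a * x + r.b * y + r.c * z + r.t;
        out.minZ = std::min(out.minZ, tz);
        out.maxZ = std::max(out.maxZ, tz);
    }
    return out;
}
```

Both functions return the same range for an affine transform; the fast path replaces 8 point transforms with one.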
- StructureOfArray: don't initialize trivial ctors
We mimic the behavior of std::vector<> here, where a resize() won't
initialize the array if the type is trivially_default_constructible.
This can reveal existing bugs, where we depended on the initialization
to 0.
- StructureOfArray: add push_back(std::tuple<>)
This basically allows us to push_back() a struct of the SoA.
- Make PerRenderableData trivially constructible
this improves performance when we have tons of objects in the scene
because PerRenderableData is used in arrays.
We use isnan() to detect that the estimation of directional light
parameters from the IBL has failed, however isnan() always returns
false with -ffast-math, which we're using in release builds.
This works around that. The affected code is non essential, performance
is not a concern here.
When loading a glTF file on platforms without a filesystem, a client
calls `addResourceData` to populate `ResourceLoader`'s cache with data.
For example, a web client might make HTTP fetch requests to fill in
buffer data. This internal cache of data is stored in
`ResourceLoader::Impl::mUriDataCache`.
This works well, except the cache must persist until after it has been
uploaded to the GPU.
There was already a mechanism (see `uploadUserdata`) in place to ensure
that glTF data persisted until after it had been uploaded to the GPU.
However, this mechanism did not extend to client-provided data. Thus, a
race occurred between Filament's driver consuming the buffer and it
getting freed.
Instead of storing the arrays into an array of void*, we use a
tuple<> instead. This improves debugging because now the tuple<>
has pointers with the correct types.
It also improves most of the code except `push_back` which now
relies on a hack -- this is the only place where I'm not able to
resolve the array strictly at compile time, even if in practice it is.
UbershaderProvider.getNativeObject() is accessed from AssetLoader.cpp,
so it must be annotated with @UsedByNative("AssetLoader.cpp") to
avoid runtime crashes when minification is applied.
Fixes #5944
`-d` now enables matdbg and adds debugging data, but doesn't affect
material optimization
`-g` disables material optimizations
A similar change is done with gradle options. The new property
`com.google.android.filament.matnopt` is used to disable material
optimizations.
These options mimic `matc` options.
@@ -40,8 +40,8 @@ Here are all the libraries available in the group `com.google.android.filament`:
| Artifact | Description |
| ------------- | ------------- |
| [](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-android) | The Filament rendering engine itself. |
| [](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-android-debug) | Debug version of `filament-android`. |
| [](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/gltfio-android) | A glTF 2.0 loader for Filament, depends on `filament-android`. |
| [](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/gltfio-android-lite) | Trimmed version of `gltfio` that does not support some glTF extensions. |
| [](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filament-utils-android) | KTX loading, Kotlin math, and camera utilities, depends on `gltfio-android`. |
| [](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filamat-android) | A runtime material builder/compiler. This library is large but contains a full shader compiler/validator/optimizer and supports both OpenGL and Vulkan. |
| [](https://maven-badges.herokuapp.com/maven-central/com.google.android.filament/filamat-android-lite) | A much smaller alternative to `filamat-android` that can only generate OpenGL shaders. It does not provide validation or optimizations. |
@@ -50,19 +50,9 @@ Here are all the libraries available in the group `com.google.android.filament`:
iOS projects can use CocoaPods to install the latest release:
```shell
pod 'Filament', '~> 1.56.0'
pod 'Filament', '~> 1.31.3'
```
### Snapshots
If you prefer to live on the edge, you can download a continuous build by following these
steps:
1. Find the [commit](https://github.com/google/filament/commits/main) you're interested in.
2. Click the green check mark under the commit message.
3. Click on the _Details_ link for the platform you're interested in.
4. On the top left click _Summary_, then in the _Artifacts_ section choose the desired artifact.