* HandleAllocator::deallocate() was unsafe
It needs to know the concrete type to call the proper destructor, so
if it was given a base type handle (e.g. Handle<HwFoo>) it would not
destroy it properly.
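
A minimal sketch of why the concrete type matters, assuming backend objects without virtual destructors. `HwBase`, `HwDerived` and `destroyAs` are hypothetical names for illustration, not Filament's classes or API:

```cpp
#include <cassert>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical types for illustration; not Filament's actual classes.
// There is deliberately no virtual destructor.
struct HwBase {
    int id = 0;
};

struct HwDerived : HwBase {
    static inline int alive = 0;   // counts live instances, for the demo only
    std::string name = "derived";  // non-trivial member that must be destroyed
    HwDerived() { ++alive; }
    ~HwDerived() { --alive; }
};

// Destroys the object behind `p` as a Concrete. Because HwBase has no virtual
// destructor, only the caller can say which destructor is the right one.
template<typename Concrete, typename Base>
void destroyAs(Base* p) {
    static_assert(std::is_base_of_v<Base, Concrete>, "Concrete must derive from Base");
    assert(p != nullptr);
    static_cast<Concrete*>(p)->~Concrete();
}
```

Calling `destroyAs<HwDerived>(base)` runs `~HwDerived()`; destroying through the base type alone would skip it and leak `name`.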
* add AsyncJobQueue::cancelAll()
* Fix a race condition when tearing down FrameInfo
It is actually invalid to destroy a Handle<HwFence> while inside
fenceWait().
Updated the HwFence implementations so that they no longer pretend they
can handle being destroyed during fenceWait(); they can't.
FrameInfo now cancels all the pending callbacks and waits for the
currently executing one to terminate, *before* destroying the
handles.
Reenable the gpuFrameComplete metric, as it should be working now.
* Use a custom, non-allocating map for frame ids
* use memory_order_relaxed for the id of heap allocated handles
* Added displayPresent time to FrameInfo
To do this, we added support for queryFrameTiming() to the backend,
as a synchronous API. Then FrameInfo uses it to update the
corresponding history entry.
displayPresent time can be used to detect "buffer stuffing", i.e.
when the GPU gets too far ahead of the display. Currently
the information is returned to the user only. Eventually filament will
make use of it to determine if a frame skip is mandated.
* API BREAK: rename frameTime to gpuFrameDuration
* FrameInfo now contains compositor timings
- the presentation deadline
- the refresh rate from the display
- the composition-display latency
* set FrameInfo data to INVALID if feature is not supported
report presentDeadline properly.
* Add missing includes.
* Add parentheses around operators to suppress compiler warning.
* Remove duplicate definition of is_supported_aux_t.
* Explicitly create descriptions.
* Remove usage of anonymous struct with non-trivially constructible members.
This is an error on some compilers (e.g. gcc).
* Remove unnecessary rvalue-reference on pointer type.
* Explicitly construct SamplerParams to suppress compiler warnings.
* Place attribute specifier before declaration.
* Remove usage of anonymous struct with a non-trivially constructible member.
Replace the `array` union member with an `operator[]` to provide similar
functionality.
Some compilers (e.g. gcc) do not support this non-standard use-case.
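
A sketch of the replacement pattern, under the assumption that the original type had named members aliased by an `array` member in a union. `Channel` and `Color` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element type with a user-provided constructor, which makes it
// non-trivially constructible.
struct Channel {
    float value;
    Channel() : value(0.0f) {}
};

// Instead of `union { struct { Channel r, g, b; }; Channel array[3]; }`,
// keep the named members and expose indexed access through operator[].
struct Color {
    Channel r, g, b;

    Channel& operator[](std::size_t i) {
        assert(i < 3);
        return i == 0 ? r : (i == 1 ? g : b);
    }
    Channel const& operator[](std::size_t i) const {
        assert(i < 3);
        return i == 0 ? r : (i == 1 ? g : b);
    }
};
```

Callers that used `color.array[i]` can switch to `color[i]` with the same meaning.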
* Use same warning settings as main filament project.
* fix a possible deadlock in AsyncJobQueue
drainAndExit() could get stuck because it waited for the job
queue to be empty, but that was not signaled.
in fact, drainAndExit() didn't need to do that.
* Improvements to FrameInfo
- return the GPU Complete timestamp
- return the app VSYNC timestamp
- works when TimerQueries are not supported
The VSYNC time is just a convenience as it is the same value
provided by the application during Renderer::beginFrame() or via
Renderer::setVsyncTime().
* Make JsonishParser use utils::Status
- add unsupported error in utils::Status
- add invalid case in the test
* return a pair of Status and string in resolveEscapes;
use initializer list for JsonishString and move that to the header.
* Use utils::Status in MaterialParser
* Use utils::sstream instead of std::stringstream
* Remove remaining std::cerr and dep; update MaterialParser::reflectParameters
* make error message in utils::Status more generic
---------
Co-authored-by: Powei Feng <powei@google.com>
ImmutableCString is a string class similar to CString except it's
immutable. ImmutableCString occupies 16 bytes instead of 8 for CString.
However, ImmutableCString is able to avoid memory allocation when
constructed from a string literal, and in that way it is similar
to StaticString.
ImmutableCString can be auto converted from StaticString.
The backend tag tracking is updated to use ImmutableCString and
the FrameGraph resource manager is updated to use StaticString.
Together these changes significantly cut down heap allocations due to
internal tagging.
We also add optional tracking to {Immutable}CString.
* Minor changes in utils::Status
- << operator doesn't have to be friend
- simplify getErrorMessage to not use strlen internally
* Replace std::ostream with utils::io::ostream
* utils: RefCountedInternPool/RefCountedMap
First, introduce RefCountedInternPool, a reference counted intern pool of
Slice<const T>. Just acquire() a slice that you want and you're guaranteed to
get exactly one canonical value-equal Slice<const T> back.
Additionally, introduce the concept of NullValue to RefCountedMap. A NullValue
defines what should be considered an uninitialized value; by default, it's the
default value of that type (0 for ints, nullptr for pointers, etc). This allows
us to lazily-initialize values in the map. A client can acquire() a bunch of
different resources which will be initialized only when get(factory) is called.
If a client attempts to get() a value without specifying a factory, and the
value is not initialized (i.e. equal to NullValue{}()), RefCountedMap will
panic.
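
The lazy-initialization behavior described above can be sketched as follows. This is a simplified stand-in, not the actual RefCountedMap: `LazyRefCountedMap`, `DefaultNullValue` and `release()` are hypothetical, and the sketch throws where the real class is said to panic:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unordered_map>

// Default "null" policy: a value-initialized T counts as uninitialized.
template<typename T>
struct DefaultNullValue {
    T operator()() const { return T{}; }
};

// Hypothetical, simplified stand-in for the map described above.
template<typename Key, typename T, typename NullValue = DefaultNullValue<T>>
class LazyRefCountedMap {
public:
    // Adds a reference; creates an uninitialized entry on first use.
    void acquire(Key const& key) {
        auto [it, inserted] = mEntries.try_emplace(key, Entry{ 0, NullValue{}() });
        ++it->second.refs;
    }

    // Drops a reference; erases the entry when the count reaches zero.
    void release(Key const& key) {
        auto it = mEntries.find(key);
        assert(it != mEntries.end());
        if (--it->second.refs == 0) mEntries.erase(it);
    }

    // Returns the value, building it with `factory` if it is still null.
    template<typename Factory>
    T& get(Key const& key, Factory&& factory) {
        Entry& e = mEntries.at(key);
        if (e.value == NullValue{}()) e.value = factory();
        return e.value;
    }

    // Without a factory, an uninitialized value is an error (throws here).
    T& get(Key const& key) {
        Entry& e = mEntries.at(key);
        if (e.value == NullValue{}()) throw std::logic_error("value not initialized");
        return e.value;
    }

    std::size_t size() const { return mEntries.size(); }

private:
    struct Entry { std::uint32_t refs; T value; };
    std::unordered_map<Key, Entry> mEntries;
};
```

The factory runs at most once per entry, so clients can acquire() many keys cheaply and pay for construction only on the first get(factory).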
* utils: add unit tests for ref-counted collections
* utils: remove C++20 features, fix memory issue
* utils: remove RefCounted from InternPool
* slice: fix memory semantics
* slice: prefer passing slice by value
This lets us do nice things like coercing Slice<T> to Slice<const T>, etc.
* slice: fix unit tests
* slice: fix copy/assignment, hash function
Don't attempt to define a copy constructor/assignment operator which would
convert a constant type to a mutable type.
Additionally, fix the hash function such that we're hashing U instead of const
U.
* new utility AsyncJobQueue
this is a very simple job queue, it spawns a thread and runs the jobs
pushed to the queue in sequence.
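
A rough sketch of such a queue, assuming a single worker thread and FIFO execution. `SimpleJobQueue` is a hypothetical name and the real AsyncJobQueue interface may differ:

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical, simplified job queue: one worker thread runs jobs in FIFO order.
class SimpleJobQueue {
public:
    using Job = std::function<void()>;

    SimpleJobQueue() : mThread([this] { loop(); }) {}
    ~SimpleJobQueue() { drainAndExit(); }

    void push(Job job) {
        assert(job);
        {
            std::lock_guard<std::mutex> lock(mLock);
            mJobs.push_back(std::move(job));
        }
        mCondition.notify_one();
    }

    // Asks the worker to finish the queued jobs and exit, then joins it.
    // Joining is enough; there is no separate wait for an "empty" signal.
    void drainAndExit() {
        {
            std::lock_guard<std::mutex> lock(mLock);
            mExitRequested = true;
        }
        mCondition.notify_one();
        if (mThread.joinable()) mThread.join();
    }

private:
    void loop() {
        std::unique_lock<std::mutex> lock(mLock);
        for (;;) {
            mCondition.wait(lock, [this] { return mExitRequested || !mJobs.empty(); });
            if (mJobs.empty()) return;   // only reachable once exit was requested
            Job job = std::move(mJobs.front());
            mJobs.pop_front();
            lock.unlock();
            job();                       // run the job outside the lock
            lock.lock();
        }
    }

    std::mutex mLock;
    std::condition_variable mCondition;
    std::deque<Job> mJobs;
    bool mExitRequested = false;
    std::thread mThread;   // declared last so the other members exist first
};
```

Jobs pushed from one thread run in the order they were pushed, and everything queued before drainAndExit() still runs before the worker exits.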
* use AsyncJobQueue in OpenGLTimerQuery
* materials: introduce MaterialCache
Presently, Filament Materials are instantiated by first parsing a bunch of
read-only data from a material file, then applying a bunch of options from its
Builder before settling on a final, immutable Material object. If two different
Material instances need to be parameterized differently, e.g. setting their spec
constants independently, each has to do all of these steps independently for
each variation.
This change introduces two new concepts: MaterialDefinition, representing the
deserialized, read-only state of a material file, and MaterialCache, a
reference-counted system responsible for managing the lifetimes of
MaterialDefinitions. Now, each Material asks the cache if a MaterialDefinition
exists for the particular UUID of the data it's trying to read; if not,
MaterialCache creates a new entry transparently. If a hundred different
Materials all try to load the same material data, only one
MaterialDefinition (and its associated GPU resources) will be created.
This first PR is the least invasive possible implementation of this feature.
There is a lot of room for improvement (with more planned). For example, each
Material still manages its own compiled shader program cache, but we can easily
move this to the MaterialCache in future PRs, further enabling the planned
mutable spec constants feature.
Additionally, there's room here to add a Material::toBuilder() method, which
could take an extant material and create a Builder object from it already
parameterized with all of the same options, a la the prototype design pattern.
* material cache: key on crc32
* material cache: make materialParser private
* move RefCountedMap to utils and add unit tests
* material cache: make create functions private
* material cache: fix broken tests on iOS/web
* material cache: address more comments
The new cmake FILAMENT_ENABLE_PERFETTO must be set to enable
perfetto traces on Android. It is disabled by default on release
builds, enabled otherwise.
The reason for this is that the perfetto SDK adds about 800K of code
to the library. With perfetto disabled, the text section goes from 2.2 to 1.4 MB.
- libbackend (except webgpu)
- libfilament
- libutils
std::string generates a lot of code bloat, we use CString instead.
We also update CString to be more compatible with std::string's api.
This commit introduces a CRC32 checksum to material packages to ensure
data integrity.
When a material is loaded, this checksum is verified. If the check
fails, an error is logged, and the material fails to load. For older
material packages without a CRC32 checksum, a warning is logged and
loading proceeds.
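
A sketch of the verification logic, assuming the common zlib/PNG CRC-32 variant (the text does not say which variant is used). `computeCrc32`, `verifyPackage` and `CheckResult` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (zlib/PNG variant).
// Assumption: the actual material checksum may use a different implementation.
inline std::uint32_t computeCrc32(const void* data, std::size_t size) {
    const std::uint8_t* bytes = static_cast<const std::uint8_t*>(data);
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < size; ++i) {
        crc ^= bytes[i];
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return ~crc;
}

enum class CheckResult { OK, MISSING_CHECKSUM, MISMATCH };

// Hypothetical verification step: `stored` is null for older packages that
// carry no checksum (the caller would log a warning and keep loading).
inline CheckResult verifyPackage(const void* payload, std::size_t size,
        const std::uint32_t* stored) {
    if (stored == nullptr) return CheckResult::MISSING_CHECKSUM;
    return computeCrc32(payload, size) == *stored ? CheckResult::OK
                                                  : CheckResult::MISMATCH;
}
```

Only `MISMATCH` would make the load fail; `MISSING_CHECKSUM` corresponds to the warn-and-continue path for older packages.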
BUGS=[373396840]
We link the behavior of certain precondition/postcondition checks to feature flag states.
- If the provided *flag* is true, then this macro will assert when the *condition* is false.
- If the provided *flag* is false, then this macro will output a warning when the *condition* is false.
- If the *condition* is true, then neither of the above will happen.
These two macros will enable us to provide less restrictive behavior for certain correctness assertions, allowing clients to modify their code before these assertions are enabled by default.
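
The flag-dependent behavior can be sketched as follows. The macro and helper names are hypothetical (the actual macro names are not given here), and an exception stands in for the assertion failure so the behavior is observable:

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>

// Counts emitted warnings so the behavior can be observed in this sketch.
inline int gWarningCount = 0;

// Hypothetical helper: enforce `condition` strictly only when `flag` is set.
inline void checkWithFlag(bool flag, bool condition, const char* message) {
    if (condition) return;                        // nothing to report
    if (flag) throw std::runtime_error(message);  // stands in for an assertion
    ++gWarningCount;
    std::fprintf(stderr, "warning: %s\n", message);
}

// Hypothetical macro wrapper around the helper above.
#define CHECK_PRECONDITION_WITH_FLAG(flag, condition) \
    checkWithFlag((flag), (condition), "precondition failed: " #condition)
```

With the flag off, a violated condition only increments the warning count; flipping the flag on turns the same call site into a hard failure.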
The addition to JobSystem.cpp allows for defining
FILAMENT_TRACING_ENABLED across targets.
Adding FILAMENT_TRACING_ENABLED to the #if in Tracing.h prevents
perfetto from being included.
* re-do "Use the Perfetto SDK instead of ATRACE" (#8701)
This time we create an entirely new private header, Tracing.h, which
uses the perfetto SDK instead of ATRACE. The old Systrace.h is
unchanged to preserve backward compatibility but is essentially
deprecated and no longer used within the filament repo.
The new TRACING_ macros use an explicit CATEGORY parameter, which is
declared in Tracing.h.
Moreover, tracing can be compiled out by defining
FILAMENT_TRACING_ENABLED to false before including Tracing.h.
iOS tracing is still supported and still controlled via
FILAMENT_APPLE_SYSTRACE.
There are three perfetto categories defined:
- "filament/filament"
- "filament/jobsystem"
- "filament/gltfio"
The "filament/jobsystem" category is compiled out by default.
And they can be enabled in AGI / perfetto by adding:
```
data_sources {
config {
name: "track_event"
track_event_config {
disabled_categories: "*"
enabled_categories: "filament/filament"
enabled_categories: "filament/jobsystem"
enabled_categories: "filament/gltfio"
}
}
}
```
* Update libs/utils/include/private/utils/Tracing.h
Co-authored-by: Powei Feng <powei@google.com>
* Update libs/utils/src/android/Tracing.cpp
Co-authored-by: Powei Feng <powei@google.com>
* remove all references to SYSTRACE_TAG
---------
Co-authored-by: Powei Feng <powei@google.com>
We add rvalue versions of append/insert/replace so that calling
those on a temporary yields a temporary.
Also add string literal versions of those so that we can
append/insert/replace a literal without allocation, e.g.:
`CString foo = CString{ bar }.append("baz");`
This will not end up creating a temporary CString for "baz".
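
A sketch of how such overloads can be expressed with ref-qualified member functions. `MyString` is a hypothetical stand-in that wraps std::string, so unlike CString it does not control its own allocations:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical string wrapper; std::string stands in for the real storage.
class MyString {
public:
    explicit MyString(std::string s) : mData(std::move(s)) {}

    // lvalue overload: modifies in place and returns a reference.
    MyString& append(MyString const& other) & {
        mData.append(other.mData);
        return *this;
    }

    // rvalue overload: called on a temporary, returns a temporary (moved out).
    MyString append(MyString const& other) && {
        mData.append(other.mData);
        return std::move(*this);
    }

    // String-literal overloads: the length is known at compile time and no
    // intermediate MyString is built for the literal.
    template<std::size_t N>
    MyString& append(const char (&literal)[N]) & {
        mData.append(literal, N - 1);
        return *this;
    }
    template<std::size_t N>
    MyString append(const char (&literal)[N]) && {
        mData.append(literal, N - 1);
        return std::move(*this);
    }

    std::string const& str() const { return mData; }

private:
    std::string mData;
};
```

With these overloads, `MyString{ "bar" }.append("baz")` picks the rvalue literal overload, while `a.append("y")` on a named object picks the lvalue one and can be chained.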
clang optimizes code differently with very likely or unlikely
conditions, so we add a `VERY` version of these macros, and we
make use of it for assertions.
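
One way such macros could be written, as a sketch only: the macro names and the 0.995 probability are assumptions, and `__builtin_expect_with_probability` is used only where the compiler reports it as available:

```cpp
#include <cassert>

// Hypothetical macro names. Fall back to a plain boolean expression when the
// builtin is not available.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_expect_with_probability)
#    define MY_VERY_LIKELY(exp)   (__builtin_expect_with_probability(!!(exp), 1, 0.995))
#    define MY_VERY_UNLIKELY(exp) (__builtin_expect_with_probability(!!(exp), 0, 0.995))
#  endif
#endif
#ifndef MY_VERY_LIKELY
#  define MY_VERY_LIKELY(exp)   (!!(exp))
#  define MY_VERY_UNLIKELY(exp) (!!(exp))
#endif

// Example: a bounds check whose failure path is expected to be very rare.
inline int checkedGet(const int* data, int size, int index) {
    assert(data != nullptr);
    if (MY_VERY_UNLIKELY(index < 0 || index >= size)) {
        return -1;   // cold path
    }
    return data[index];
}
```

The hint only affects code layout and optimization; the result of the condition is unchanged.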
Perfetto has significantly less overhead. The User facing API is
mostly unchanged:
Here are the differences:
- SYSTRACE_ENABLE() does nothing on ANDROID, initializes systraces on darwin.
- SYSTRACE_DISABLE() is removed.
- A new "gltfio" tag is added.
- SYSTRACE_TAG *must* be defined before including `utils/Systrace.h`
- `utils/Systrace.h` should not be used from a public header
- the new SYSTRACE_TAG_DISABLE disables systrace at compile time
For android a data source MUST be created in the perfetto config:
```
data_sources {
config {
name: "track_event"
track_event_config {
enabled_categories: ["filament", "jobsystem", "gltfio"]
disabled_categories: "*"
}
}
}
```
This can for example be added to AGI's custom/advanced config.
FIXES=[407572663]
* Remove redundant qualifiers in filament public headers
* remove redundant qualifiers in filament implementation
* remove redundant qualifiers in libutils public headers
* remove redundant qualifier for libutils implementation
* remove redundant qualifiers for libmath
* use is_same_v<> instead of is_same<>
* bring back Builder::name()
we keep Builder::name() on all objects, and forward to the MixIn class
that does the implementation, so that we have correct documentation, and
better IDE completion.
* add missing const parameters in filament's implementation
* various source cleanup
- missing includes
- missing const
- C cast style
- superfluous inline keyword
- when inserting an entry at a root other than zero, we need to update
the children count of the root's parent.
- the QuadTree array nodes need to be able to encode enough indices for
the largest "layer" in the tree. With 7 layers the largest one has
4096 entries, so we need 12 bits, not 8.
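
The bit-width arithmetic above can be checked at compile time. The helper names below are hypothetical; layer 0 is taken to be the root:

```cpp
#include <cassert>
#include <cstddef>

// Number of nodes in layer `layer` of a quadtree (layer 0 is the root).
constexpr std::size_t nodesInLayer(unsigned layer) {
    return std::size_t(1) << (2 * layer);   // 4^layer
}

// Smallest number of bits able to encode indices 0 .. count-1.
constexpr unsigned bitsForIndices(std::size_t count) {
    unsigned bits = 0;
    while ((std::size_t(1) << bits) < count) ++bits;
    return bits;
}

constexpr unsigned kLayers = 7;
static_assert(nodesInLayer(kLayers - 1) == 4096, "deepest of 7 layers has 4^6 nodes");
static_assert(bitsForIndices(nodesInLayer(kLayers - 1)) == 12, "12 bits are required");
static_assert(bitsForIndices(256) == 8, "8 bits only cover 256 entries");
```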
The previous fix attempt didn't work on some tests. There are no known
bugs with AtomicFreeList other than tripping TSAN, and it's unclear
that TSAN isn't at fault.
However, switching to using a mutex works fine and doesn't appear to
be slower (it's actually faster with synthetic benchmarks on macOS)
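
A minimal sketch of a mutex-protected free list, assuming an intrusive design where free blocks store the link to the next free block. `LockedFreeList` is a hypothetical name and not the actual replacement class:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <new>

// Hypothetical mutex-protected intrusive free list over a caller-provided
// buffer. The buffer must be suitably aligned for a pointer.
class LockedFreeList {
public:
    LockedFreeList(void* buffer, std::size_t elementSize, std::size_t count) {
        assert(elementSize >= sizeof(Node));
        char* p = static_cast<char*>(buffer);
        for (std::size_t i = 0; i < count; ++i) {
            push(p + i * elementSize);
        }
    }

    // Returns a free block, or nullptr when the list is empty.
    void* pop() {
        std::lock_guard<std::mutex> lock(mLock);
        Node* head = mHead;
        if (head) mHead = head->next;
        return head;
    }

    // Returns a block to the list; its storage is reused for the link.
    void push(void* p) {
        assert(p != nullptr);
        std::lock_guard<std::mutex> lock(mLock);
        mHead = new (p) Node{ mHead };
    }

private:
    struct Node { Node* next; };
    Node* mHead = nullptr;
    std::mutex mLock;
};
```

Because every access to the head and to `next` happens under the lock, there is no window in which a concurrent pop() can observe a stale link.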
BUGS=[377369108]
I do not think there was an actual error with AtomicFreeList; however,
TSAN detected a data race when concurrent pop() calls happened. In that
case, there is indeed a race, where we can end up reading data that is
already corrupted by the concurrent pop. However, that situation is
corrected by the following CAS. Somehow TSAN didn't see that.
The fix is strange and consists in replacing:
```
auto pNext = storage[offset].next;
```
with
```
auto s = storage[offset];
auto pNext = s.next;
```
In this PR we also adjust the memory ordering to be less strong, i.e.
we do not need `memory_order_seq_cst`, only the appropriate acquire or
release semantics.
In addition we also make `Node* next` a non-atomic variable again. It
should have been, but was changed to placate an older version of TSAN.
BUGS=[377369108]