Commit Graph

4309 Commits

Author SHA1 Message Date
Bartosz Taudul
ec1c2d4267 Display aggregation counts in bottom / up sample trees. 2026-05-24 14:23:59 +02:00
Bartosz Taudul
e003885946 Merge pull request #1369 from siliceum/feature/no-sys-param-for-bsd
Drop sys/param.h dependency for BSD detection
2026-05-21 17:57:16 +02:00
Clément Grégoire
e7c71c991c Drop sys/param.h dependency for BSD detection
Replace `#ifdef BSD` (which requires including `<sys/param.h>` first) with explicit checks for `__FreeBSD__`, `__NetBSD__`, `__OpenBSD__` and `__DragonFly__`, matching how these BSDs are already enumerated elsewhere in the codebase (OS name strings, thread id helpers, etc.).

This also avoids leaking the `sys/param.h` requirement through public headers (`TracySysTime.hpp`, `TracyCallstack.h`), where consumers would otherwise need it to correctly see `TRACY_HAS_SYSTIME` / `TRACY_HAS_CALLSTACK`.
`libbacktrace/config.h` is left as-is — it's third-party and only included from .c files where the `BSD` macro can still be picked up locally.

Note: for `setsockopt( m_sock, IPPROTO_IPV6, IPV6_V6ONLY, (const char*)&val, sizeof( val ) );` I added `__APPLE__` too since this was the only place where it was not checked explicitly.
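
A minimal sketch of the explicit compiler-macro check described above. The names `SKETCH_IS_BSD` and `IsBsdBuild()` are hypothetical and only illustrate the pattern; they are not taken from the Tracy sources.

```cpp
#include <cassert>

// Explicit per-OS checks need no prior #include <sys/param.h>,
// because these macros are predefined by the compiler on each BSD.
// SKETCH_IS_BSD is a hypothetical name used only in this sketch.
#if defined __FreeBSD__ || defined __NetBSD__ || defined __OpenBSD__ || defined __DragonFly__
#  define SKETCH_IS_BSD 1
#else
#  define SKETCH_IS_BSD 0
#endif

// Returns true when compiled for one of the four BSDs named in the commit.
constexpr bool IsBsdBuild()
{
    return SKETCH_IS_BSD != 0;
}
```

Since the check relies only on compiler-predefined macros, a header using it gives the same answer regardless of what the consumer included beforehand.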
2026-05-21 10:18:30 +02:00
Clément Grégoire
69855a1416 Prevent unlimited recursion leading to stack overflow
This can happen notably when the user does not call ZoneEnd.

I used 256 arbitrarily as it seemed higher values would just make the UI freeze anyway due to perf reasons.
I added a warning in the notification area so that users can locate it.
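
A rough sketch of a depth-limited recursive walk, assuming a simplified zone tree. `SketchZone` and `CountZones()` are hypothetical; only the limit of 256 comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical zone node; not Tracy's actual data structure.
struct SketchZone
{
    std::vector<SketchZone> children;
};

constexpr int MaxDepth = 256;   // limit named in the commit message

// Counts visited zones, refusing to descend past MaxDepth.
// Sets `truncated` when the limit is hit, so the caller can warn the user.
inline uint64_t CountZones( const SketchZone& zone, int depth, bool& truncated )
{
    if( depth >= MaxDepth )
    {
        truncated = true;
        return 0;
    }
    uint64_t cnt = 1;
    for( auto& c : zone.children ) cnt += CountZones( c, depth + 1, truncated );
    return cnt;
}
```

Reporting truncation through a flag rather than failing silently mirrors the idea of surfacing a warning in the notification area.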
2026-05-21 09:38:43 +02:00
Bartosz Taudul
60247b68d3 Merge pull request #1364 from siliceum/fix/zone-runtime
Fix and refactor zone running time
2026-05-20 19:29:29 +02:00
Clément Grégoire
91c0b1e42b Fix and refactor zone running time
Many of the zones would have a negative running time due to a missing `cs->IsEndValid()` check.
This could end up reporting context switches before the zone start, due to `cs->End()` returning -1.

This happened when systrace dropped events, or when using Fibers and `TracyFiberEnter` is called on the new thread once the fiber has been scheduled. (The manual does not really hint that this is wrong; we should probably fix the manual or the server code.)

In both cases, we assume the runtime to be 0 for that context switch, since we have no actual information. Both options (counting full runtime or no runtime) are wrong, and most of the code handling `!cs->IsEndValid()` uses `Start` instead, so that's what I did. This is still a net improvement over displaying negative values. If we want to change this handling, we'd need to review the other places that do `it->IsEndValid() ? it->End() : it->Start()` as well.

It also seems two different concepts were being mixed:
1. Do we have any context switch data at all? (`it != ctx->v.end()`, i.e. `count != 0`)
2. Do we have complete data for the last context switch? (`eit != ctx->v.end()`)

This led to some places of the code not displaying or counting running time at all, notably when hovering a zone.

I think most of the time we wanted 1, as it reports correctly and assumes the last context switch is still running, which is a fair assumption if we didn't see one putting the thread to sleep.

I also fixed a case where we were overcounting runtime when range start was during a sleep.
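
A simplified sketch of the running-time calculation described in this commit, assuming a flat list of context switches. `SketchSwitch` and `RunningTime()` are hypothetical stand-ins, not the server code. A switch without a valid end falls back to its start and contributes zero, and each interval is clamped to the queried range so that a range starting during a sleep is not overcounted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a context switch record; end < 0 means "no valid end".
struct SketchSwitch
{
    int64_t start;
    int64_t end;
    bool IsEndValid() const { return end >= 0; }
};

// Sums the time a thread was running inside [rangeStart, rangeEnd].
inline int64_t RunningTime( const std::vector<SketchSwitch>& v, int64_t rangeStart, int64_t rangeEnd )
{
    int64_t total = 0;
    for( auto& cs : v )
    {
        // Without a valid end, use the start: the interval has zero length.
        const int64_t end = cs.IsEndValid() ? cs.end : cs.start;
        // Clamp the running interval to the requested range.
        const int64_t t0 = std::max( cs.start, rangeStart );
        const int64_t t1 = std::min( end, rangeEnd );
        if( t1 > t0 ) total += t1 - t0;
    }
    return total;
}
```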
2026-05-20 12:02:38 +02:00
Bartosz Taudul
f0f579172b Force inline fast check. 2026-05-17 16:06:56 +02:00
Bartosz Taudul
4c6157d249 Remember last retrieved external check result. 2026-05-17 15:43:26 +02:00
Bartosz Taudul
6ae6fb741e Check if image is external once, before checking subframe filenames.
Cache is shared between image names and source file names, because the
underlying StringIdx storage makes indices unique. Both name sets should
be completely separate, but if you have conflicts here, you have much
more pressing problems to solve.
2026-05-17 15:30:03 +02:00
Bartosz Taudul
4ab7ef301e Split IsFrameExternalImpl into image + filename parts. 2026-05-17 15:30:03 +02:00
Bartosz Taudul
41f1172774 Change order of IsFrameExternal checks.
Check image first, then perform the expensive filename check.
2026-05-17 14:30:45 +02:00
Bartosz Taudul
4e0259148f Change IsFrameExternal interface to work with external cache.
Locks are dominating the execution time, making the global cache non-viable.
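
One way to picture a caller-owned cache that needs no locking is sketched below. `ExternalCache`, `IsExternalCached()` and the `slowCheck` callback are hypothetical names; the actual interface in the commit may look different.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Caller-owned cache: string index -> "is external" verdict.
// No lock is needed, because each caller keeps its own instance.
using ExternalCache = std::unordered_map<uint32_t, bool>;

// `slowCheck` stands in for the expensive name inspection; it is a placeholder.
inline bool IsExternalCached( ExternalCache& cache, uint32_t idx, const std::function<bool( uint32_t )>& slowCheck )
{
    const auto it = cache.find( idx );
    if( it != cache.end() ) return it->second;
    const bool result = slowCheck( idx );
    cache.emplace( idx, result );
    return result;
}
```

The trade-off in this sketch is duplicated work across callers in exchange for avoiding contention on a shared lock.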
2026-05-14 22:29:51 +02:00
Bartosz Taudul
0dfd7fb20b Cache IsFrameExternal() queries. 2026-05-14 19:51:41 +02:00
Bartosz Taudul
b90e44a5f1 Add raw data accessor to StringIdx. 2026-05-14 19:33:21 +02:00
Bartosz Taudul
744bd21423 Change IsFrameExternal() interface to operate on StringIdx, move to Worker. 2026-05-14 19:01:04 +02:00
Bartosz Taudul
b0f00d20c8 Proper floating point charconv is available since libstdc++ 11 (April 2021). 2026-05-12 00:30:49 +02:00
Bartosz Taudul
f7ab78893c Don't query inline-symbol frames as native addresses.
Frames whose symbol data is shipped inline with the callstack payload
(sel=1, e.g. Lua-side stack entries) were being passed to
GetCanonicalPointer() in the AddCallstackAllocPayload() query loop,
tripping its sel==0 assertion. They have no native pointer to query
and were already registered in callstackFrameMap earlier in the same
function, so just skip them.

Regression from c704f909, which hoisted the per-call-site dedup into
QueryCallstackFrame(). Three of the four updated call sites were
equivalent before and after, because the old guard and the new one
keyed on the same value. The fourth, this one, was not: the old guard
tested the frame as-is and matched the entry inserted a few lines above,
short-circuiting before GetCanonicalPointer() ran. The new guard keys on
PackPointer(addr), so GetCanonicalPointer() must run first to compute
addr, and the assert fires.
2026-05-09 12:13:45 +02:00
Bartosz Taudul
305382453d Add callstack sample events with 32 and 16 bit timestamps. 2026-05-07 02:16:53 +02:00
Bartosz Taudul
ccaef5ba0b ZoneBegin / ZoneBeginCallstack with 32 and 16 bit time data. 2026-05-07 02:16:53 +02:00
Bartosz Taudul
4d094c108d Add zone end messages with 32 and 16 bit time deltas.
Change in test application:
    compressed data: 130 Mbps -> 105 Mbps
    uncompressed: 830 Mbps -> 740 Mbps
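
The bandwidth saving comes from sending narrower time fields when the delta is small. A hedged sketch of how a sender might pick the field width follows. `DeltaWidth` and `PickDeltaWidth()` are hypothetical, and treating the delta as an unsigned value is an assumption of this sketch, not something the commit states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag for the width of the time field in a message.
enum class DeltaWidth : uint8_t { Bits16, Bits32, Bits64 };

// Picks the narrowest unsigned field that can hold a non-negative time delta.
// Negative deltas fall back to the full-width encoding.
inline DeltaWidth PickDeltaWidth( int64_t dt )
{
    if( dt >= 0 && dt <= static_cast<int64_t>( UINT16_MAX ) ) return DeltaWidth::Bits16;
    if( dt >= 0 && dt <= static_cast<int64_t>( UINT32_MAX ) ) return DeltaWidth::Bits32;
    return DeltaWidth::Bits64;
}
```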
2026-05-06 19:09:19 +02:00
Bartosz Taudul
d6e77b3f40 Remove server query fast path.
The profiler will typically want to send bursts of queries (e.g. 3 queries
to retrieve source location strings, or multiple queries to get all the call
stack frames, etc.).

Each of these queries will be sent immediately, if available space in the
network buffer permits. Each of these sends is a separate syscall.

Remove this and instead batch all queries with the already existing network
buffer overflow handling functionality.
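
The batching idea can be sketched roughly as follows, assuming a simple byte buffer that is flushed with one send call. `SketchQuery`, `QueryBatcher` and the `send` callback are hypothetical; the real code reuses the existing network buffer overflow handling instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical query record; the layout is illustrative only.
struct SketchQuery
{
    uint8_t type;
    uint64_t ptr;
};

class QueryBatcher
{
public:
    // Appends a query to the pending buffer; no I/O happens here.
    void Enqueue( const SketchQuery& q )
    {
        const auto off = m_buf.size();
        m_buf.resize( off + sizeof( q.type ) + sizeof( q.ptr ) );
        std::memcpy( m_buf.data() + off, &q.type, sizeof( q.type ) );
        std::memcpy( m_buf.data() + off + sizeof( q.type ), &q.ptr, sizeof( q.ptr ) );
    }

    // Hands the whole buffer to `send` in one call, standing in for one syscall.
    template<typename Send>
    void Flush( Send&& send )
    {
        if( m_buf.empty() ) return;
        send( m_buf.data(), m_buf.size() );
        m_buf.clear();
    }

private:
    std::vector<char> m_buf;
};
```

With this shape, a burst of three queries results in a single call to `send` rather than three.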
2026-05-06 00:42:24 +02:00
Bartosz Taudul
e300d56f68 Force resort of possibly broken plots. 2026-03-27 20:16:27 +01:00
Bartosz Taudul
8351daea73 Add ability to mark the SortedVector unsorted. 2026-03-27 20:16:27 +01:00
Bartosz Taudul
b5a322d122 Fix SortedVector regression.
Sort unsorted part of the vector, not the already sorted part.
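
A minimal sketch of sorting only the unsorted tail and then merging it with the sorted prefix. `SketchSortedVector` is a hypothetical simplification using `std::vector<int>`; Tracy's `SortedVector` differs in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a vector that tracks how much of its prefix is sorted.
struct SketchSortedVector
{
    std::vector<int> data;
    size_t sortedEnd = 0;   // elements [0, sortedEnd) are known to be sorted

    void push_back( int v ) { data.push_back( v ); }

    void sort()
    {
        const auto mid = data.begin() + sortedEnd;
        std::sort( mid, data.end() );                          // sort only the unsorted tail
        std::inplace_merge( data.begin(), mid, data.end() );   // merge the two sorted runs
        sortedEnd = data.size();
    }
};
```

Sorting the wrong half, as the regression did, would leave the tail unsorted and make the merge step produce an unordered result.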
2026-03-27 17:49:49 +01:00
Bartosz Taudul
d0222ef7d9 Worker::GetZoneEnd() can be const. 2026-03-19 19:53:58 +01:00
Ivan Molodetskikh
16af373a7e Add missing cstdint include
Fixes build on new gcc on Fedora 44.
2026-03-19 17:57:57 +03:00
Bartosz Taudul
49590756a0 Extract broadcast message parsing into TracyBroadcast
Move broadcast message parsing logic from profiler/src/main.cpp into
server/TracyBroadcast.cpp/hpp. This reduces code duplication and enables
reuse by other tools (e.g., multi-capture).

ParseBroadcastMessage() handles all broadcast protocol versions (0-3) and
returns std::optional<BroadcastMessage>. ClientUniqueID() generates a unique
identifier from IP address and port.

Co-authored-by: Grégoire Roussel <gregoire.roussel@wandercraft.eu>
2026-03-02 19:45:48 +01:00
Bartosz Taudul
9442517f30 Remove trailing whitespace. 2026-03-02 19:40:41 +01:00
Naveen Regulla
99b502f9a4 simplify and clarify leading zeros calculation 2026-02-17 17:13:36 +05:30
Naveen Regulla
9879a31fc5 Add Windows on ARM64 with MSVC support for Tracy Profiler
Introduce Windows ARM64 (native) support across ToyPathTracer,
profiler, and server code paths when building with MSVC (_M_ARM64).

Key changes:
- MathSimd.h/Maths.h:
   - Fix NEON movemask constants for MSVC/ARM64 by loading from a uint32_t[]
    via vld1q_u32() and using vdupq_n_u32() for highbit.
- enkiTS/TaskScheduler.cpp:
   - Provide Pause() implementation on _M_ARM64 using __yield().
- profiler/winmain.cpp:
   - Restrict AVX feature checks to x86/x64 only and skip them on ARM64.
- server/TracyPopcnt.hpp:
   - Implement TracyCountBits using ARM NEON intrinsics.
   - Implement TracyLzcnt using _BitScanReverse64().
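
For comparison, portable reference versions of a bit count and a leading-zero count are sketched below. These are not the NEON or `_BitScanReverse64()` implementations from the commit; `SketchCountBits()` and `SketchLzcnt()` are hypothetical names, and returning 64 for a zero input is a choice made for this sketch only.

```cpp
#include <cassert>
#include <cstdint>

// Counts set bits by repeatedly clearing the lowest one.
inline int SketchCountBits( uint64_t v )
{
    int cnt = 0;
    while( v )
    {
        v &= v - 1;
        cnt++;
    }
    return cnt;
}

// Counts leading zero bits, scanning from the most significant bit down.
// Returns 64 when v == 0 (an assumption of this sketch).
inline int SketchLzcnt( uint64_t v )
{
    int cnt = 0;
    for( uint64_t mask = 1ull << 63; mask && !( v & mask ); mask >>= 1 ) cnt++;
    return cnt;
}
```

Slow reference functions like these can be useful for checking an intrinsic-based implementation on a new target.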
2026-02-17 16:42:50 +05:30
Bartosz Taudul
2b201e6f59 RetrieveThread() uses a cache that cannot be shared between threads.
Note: IsThreadFiber() uses the same functionality, but is only called from
the main thread.
2026-02-01 18:01:39 +01:00
Bartosz Taudul
73694c7a24 Cleanup enums. 2026-01-24 01:50:11 +01:00
Bartosz Taudul
b624ada00a Cosmetics. 2026-01-24 01:16:28 +01:00
Clément Grégoire
5ef64841cc Remove MessageMetadata type and replace by uint8_t everywhere 2025-12-28 15:04:18 +01:00
Clément Grégoire
1e61dc88de Add source and severity to the server's MessageData + bump minor version for serialization
Note this does not change `sizeof(MessageData)` as there were 5 bytes left due to alignment. (now 3)
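
The padding argument can be illustrated with a hypothetical layout. The structs below are not the real `MessageData`; they only reproduce the arithmetic of 5 spare bytes shrinking to 3.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: 8 + 3 = 11 bytes of fields, padded to 16 (5 bytes spare).
struct alignas( 8 ) SketchBefore
{
    uint64_t payload;
    uint8_t small[3];
};

// Two more one-byte fields use 13 bytes, still padded to 16 (3 bytes spare).
struct alignas( 8 ) SketchAfter
{
    uint64_t payload;
    uint8_t small[3];
    uint8_t source;
    uint8_t severity;
};

static_assert( sizeof( SketchBefore ) == sizeof( SketchAfter ), "new fields fit in the padding" );
```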
2025-12-28 14:50:38 +01:00
Clément Grégoire
f981330f66 Replace all messages text addr by TaggedUserlandAddress and send metadata over the network
There are two changes to the protocol:

- `QueueMessageLiteral*` were changed and what used to be addresses are now addresses+metadata
- Other messages now send `QueueMessage*Metadata` with added metadata.

This will later be used to store and transmit message sources, level, etc.
2025-12-28 14:44:40 +01:00
Alex Gunnarson
90bc94f237 Use builtins before MSVC intrinsics 2025-12-17 15:24:04 -07:00
Trevor L. McDonell
d1b0406801 Add option to ignore memory free faults
This replaces the IsApple flag, which was previously only used for this purpose.
2025-12-02 16:16:50 +01:00
Clément Grégoire
255f465a8f Fix uninitialized Worker::m_pending* loaded traces.
This was causing issues in the Infos -> Trace Statistics window as `GetCallstackFrameCount` uses `m_pendingCallstackFrames`. Just in case, initialize all those variables where they are declared instead of in the constructor.
2025-10-08 07:47:51 +02:00
whouishere
a755cfab78 Fix usage of stat64 on non-glibc Linux 2025-09-05 10:22:02 -03:00
Antoine Mura
80126ed1e0 Add lock struct with condition variable to main thread 2025-07-27 11:05:45 +02:00
Bartosz Taudul
3c1c444a15 Unbreak loading traces from previous versions. 2025-07-22 20:18:55 +02:00
Bartosz Taudul
c03fdaec1e Merge pull request #1097 from erieaton-amd/rocprofv3-2
Collect dispatches and counter values with Rocprofv3
2025-07-22 13:33:15 +02:00
Eric Eaton
1639598d62 Update documentation
This provides some instructions and tips for the manual. Also:
* Made the calibration feature a CMake option
* Cleaned up some minor code issues
* Fixed an issue with the calibration
* Incremented patch number
2025-07-21 15:30:42 -07:00
Bartosz Taudul
2f17e33851 Prevent duplicate callstack frame queries.
Callstack frames will now have nullptr as the value in the callstackFrameMap
map, as a way to signal that a query for a given key is already pending.
Duplicate queries should no longer happen.
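
A small sketch of using a null map value as a "query pending" marker. `SketchFrame` and `FrameQueries` are hypothetical and much simpler than the worker code; the counters are only there to show how duplicates get skipped.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct SketchFrame { int line; };   // hypothetical frame payload

class FrameQueries
{
public:
    // Returns true if a query should be sent; false if one is pending or already answered.
    bool Query( uint64_t key )
    {
        // emplace() inserts the null marker only when the key is absent.
        const bool inserted = m_map.emplace( key, static_cast<SketchFrame*>( nullptr ) ).second;
        if( inserted ) m_sent++; else m_skipped++;
        return inserted;
    }

    // Called when the reply arrives; replaces the pending marker with real data.
    void Resolve( uint64_t key, SketchFrame* frame ) { m_map[key] = frame; }

    // A key is pending when it is present but still maps to nullptr.
    bool IsPending( uint64_t key ) const
    {
        const auto it = m_map.find( key );
        return it != m_map.end() && it->second == nullptr;
    }

    uint64_t m_sent = 0;
    uint64_t m_skipped = 0;

private:
    std::unordered_map<uint64_t, SketchFrame*> m_map;
};
```

Any code reading the map must then tolerate null values, which is what the neighboring commit about storing null pointers in `callstackFrameMap` addresses.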

@slomp provided an alternative implementation, which produced the following
results:

Queries made: 195,778
Duplicate queries skipped: 9,518,910

Co-authored-by: Marcos Slomp <slomp@adobe.com>
2025-07-18 01:16:46 +02:00
Bartosz Taudul
c704f909be Move duplication check into QueryCallstackFrame() to make things clear. 2025-07-18 01:15:16 +02:00
Bartosz Taudul
c4a6cf3456 Use contains() to check if map contains element. 2025-07-18 00:45:26 +02:00
Bartosz Taudul
f578d14553 Extract callstack frame query to a separate function. 2025-07-18 00:33:09 +02:00
Bartosz Taudul
38ff7a6697 Make sure count is right. 2025-07-18 00:24:23 +02:00
Bartosz Taudul
320eb67581 Assume callstackFrameMap can store null ptrs. 2025-07-18 00:23:43 +02:00