Replace `#ifdef BSD` (which requires including `<sys/param.h>` first) with explicit checks for `__FreeBSD__`, `__NetBSD__`, `__OpenBSD__` and `__DragonFly__`, matching how these BSDs are already enumerated elsewhere in the codebase (OS name strings, thread id helpers, etc.).
This also avoids leaking the `sys/param.h` requirement through public headers (`TracySysTime.hpp`, `TracyCallstack.h`), where consumers would otherwise need it to correctly see `TRACY_HAS_SYSTIME` / `TRACY_HAS_CALLSTACK`.
`libbacktrace/config.h` is left as-is: it's third-party and only included from `.c` files, where the `BSD` macro can still be picked up locally.
Note: for `setsockopt( m_sock, IPPROTO_IPV6, IPV6_V6ONLY, (const char*)&val, sizeof( val ) );` I added `__APPLE__` too, since this was the only place where it was not checked explicitly.
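As a minimal sketch of the explicit check described above: the macro and function names below are hypothetical and used only for illustration, while the four compiler-defined macros are the ones listed in this change. No `<sys/param.h>` include is needed for the check to work.

```cpp
// Hypothetical helper macro for illustration; the name is not taken from the codebase.
// The four tested macros are predefined by the compiler on the respective systems,
// so no header has to be included before this check.
#if defined( __FreeBSD__ ) || defined( __NetBSD__ ) || defined( __OpenBSD__ ) || defined( __DragonFly__ )
#  define EXAMPLE_IS_BSD 1
#else
#  define EXAMPLE_IS_BSD 0
#endif

// Compile-time query usable from ordinary code.
constexpr bool ExampleIsBsdBuild() { return EXAMPLE_IS_BSD != 0; }
```

Because the result depends only on predefined macros, a public header can use this kind of check without imposing include-order requirements on its consumers.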
This can happen notably when the user does not call ZoneEnd.
I used 256 arbitrarily as it seemed higher values would just make the UI freeze anyway due to perf reasons.
I added a warning in the notification area so that users can locate it.
Many of the zones would have a negative running time due to a missing `cs->IsEndValid()` check.
This could end up reporting context switches before the zone start, due to `cs->End()` returning -1.
This happened when systrace dropped events, or when using Fibers and `TracyFiberEnter` is called on the new thread once the fiber has been scheduled. (The manual actually does not really hint that this is wrong; we should probably fix the manual or the server code.)
In both cases, we assume runtime to be 0 for that context switch, since we have no actual information. Both options (counting full runtime or no runtime) are wrong, and most of the code handling `!cs->IsEndValid()` uses `Start` instead, so that's what I did. This is still a net improvement over displaying negative values. If we want to change this handling, we'd need to review the other places that do `it->IsEndValid() ? it->End() : it->Start()` as well.
It also seems two different concepts were being mixed:
1. Do we have any context switch data at all? (`it != ctx->v.end()`, i.e. `count != 0`)
2. Do we have complete data for the last context switch? (`eit != ctx->v.end()`)
This led to some places of the code not displaying or counting running time at all, notably when hovering a zone.
I think most of the time we wanted 1, as it reports correctly and assumes the last context switch is still running, which is a fair assumption if we didn't see one putting the thread to sleep.
I also fixed a case where we were overcounting runtime when range start was during a sleep.
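The end-validity handling and the range clamping can be sketched as follows. This is an illustration under assumptions, not the actual server code: `ExampleSwitch` and `ExampleRunningTime` are hypothetical names, and an invalid end is modeled as a negative value.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a context-switch record; not the real server type.
struct ExampleSwitch
{
    int64_t start;
    int64_t end;   // negative when the end of the slice was never recorded
    bool IsEndValid() const { return end >= 0; }
};

// Sums the time the thread was running inside [rangeStart, rangeEnd].
int64_t ExampleRunningTime( const std::vector<ExampleSwitch>& v, int64_t rangeStart, int64_t rangeEnd )
{
    int64_t total = 0;
    for( const auto& cs : v )
    {
        // With no valid end, fall back to the start: the slice contributes 0.
        const int64_t end = cs.IsEndValid() ? cs.end : cs.start;
        // Clamp to the queried range so a range starting during a sleep
        // does not pick up time from the preceding slice.
        const int64_t from = std::max( cs.start, rangeStart );
        const int64_t to = std::min( end, rangeEnd );
        if( to > from ) total += to - from;
    }
    return total;
}
```

With this shape a slice with a missing end can never produce a negative duration, which is the symptom described above.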
Cache is shared between image names and source file names, because the
underlying StringIdx storage makes indices unique. Both name sets should
be completely separate, but if you have conflicts here, you have much
more pressing problems to solve.
Frames whose symbol data is shipped inline with the callstack payload
(sel=1, e.g. Lua-side stack entries) were being passed to
GetCanonicalPointer() in the AddCallstackAllocPayload() query loop,
tripping its sel==0 assertion. They have no native pointer to query
and were already registered in callstackFrameMap earlier in the same
function, so just skip them.
Regression from c704f909, which hoisted the per-call-site dedup into
QueryCallstackFrame(). Three of the four updated call sites were
equivalent before and after, because the old guard and the new one
keyed on the same value. The fourth, this one, was not: the old guard
tested the frame as-is and matched the entry inserted a few lines above,
short-circuiting before GetCanonicalPointer() ran. The new guard keys on
PackPointer(addr), so GetCanonicalPointer() must run first to compute
addr, and the assert fires.
The profiler will typically want to send bursts of queries (e.g. 3 queries
to retrieve source location strings, or multiple queries to get all the call
stack frames, etc.).
Each of these queries will be sent immediately, if available space in the
network buffer permits. Each of these sends is a separate syscall.
Remove this and instead batch all queries with the already existing network
buffer overflow handling functionality.
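The general idea of batching can be sketched as below. This is only an illustration: `ExampleQueryBatcher` is a hypothetical class, the send is simulated with a counter, and the real change reuses the existing overflow handling rather than a new type.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical batching helper; names and layout are illustrative only.
class ExampleQueryBatcher
{
public:
    explicit ExampleQueryBatcher( size_t capacity ) : m_capacity( capacity ) {}

    // Appends one query to the pending buffer; flushes first if it would not fit.
    void Queue( const uint8_t* data, size_t size )
    {
        if( m_pending.size() + size > m_capacity ) Flush();
        m_pending.insert( m_pending.end(), data, data + size );
    }

    // Stands in for a single send call covering every pending query.
    void Flush()
    {
        if( m_pending.empty() ) return;
        ++m_sendCalls;
        m_bytesSent += m_pending.size();
        m_pending.clear();
    }

    int SendCalls() const { return m_sendCalls; }
    size_t BytesSent() const { return m_bytesSent; }

private:
    std::vector<uint8_t> m_pending;
    size_t m_capacity;
    int m_sendCalls = 0;
    size_t m_bytesSent = 0;
};
```

A burst of three small queries followed by one `Flush()` then costs one simulated send instead of three.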
Move broadcast message parsing logic from profiler/src/main.cpp into
server/TracyBroadcast.cpp/hpp. This reduces code duplication and enables
reuse by other tools (e.g., multi-capture).
ParseBroadcastMessage() handles all broadcast protocol versions (0-3) and
returns std::optional<BroadcastMessage>. ClientUniqueID() generates a unique
identifier from IP address and port.
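One way such an identifier could be built is sketched below. The function name, parameter types, and bit layout are assumptions made for illustration; the actual `ClientUniqueID()` signature and packing may differ.

```cpp
#include <cstdint>

// Hypothetical sketch: packs a numeric IPv4 address and a port into one 64-bit key.
// The layout (address in the low 32 bits, port above it) is an assumption.
inline uint64_t ExampleClientUniqueID( uint32_t ipNumerical, uint16_t port )
{
    return uint64_t( ipNumerical ) | ( uint64_t( port ) << 32 );
}
```

Since the address and the port occupy disjoint bits, two clients on the same host with different ports get different keys.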
Co-authored-by: Grégoire Roussel <gregoire.roussel@wandercraft.eu>
Introduce Windows ARM64 (native) support across ToyPathTracer,
profiler, and server code paths when building with MSVC (_M_ARM64).
Key changes:
- MathSimd.h/Maths.h:
  - Fix NEON movemask constants for MSVC/ARM64 by loading from a uint32_t[]
    via vld1q_u32() and using vdupq_n_u32() for highbit.
- enkiTS/TaskScheduler.cpp:
  - Provide Pause() implementation on _M_ARM64 using __yield().
- profiler/winmain.cpp:
  - Restrict AVX feature checks to x86/x64 and skip them on ARM64.
- server/TracyPopcnt.hpp:
  - Implement TracyCountBits using ARM NEON intrinsics.
  - Implement TracyLzcnt using _BitScanReverse64().
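For illustration, a leading-zero count built on `_BitScanReverse64()` could look like the sketch below. `ExampleLzcnt` is a hypothetical name, the portable fallback loop and the choice to return 64 for a zero input are assumptions, and the real `TracyLzcnt` may differ in both.

```cpp
#include <cstdint>
#if defined( _MSC_VER )
#  include <intrin.h>
#endif

// Hypothetical sketch of a 64-bit leading-zero count.
static inline uint64_t ExampleLzcnt( uint64_t val )
{
    if( val == 0 ) return 64;   // assumed convention for zero input
#if defined( _MSC_VER ) && ( defined( _M_ARM64 ) || defined( _M_X64 ) )
    unsigned long index;
    // Writes the position of the highest set bit (0..63) into index.
    _BitScanReverse64( &index, val );
    return 63 - index;
#else
    // Portable fallback: walk down from the top bit until a set bit is found.
    uint64_t count = 0;
    for( uint64_t mask = uint64_t( 1 ) << 63; ( val & mask ) == 0; mask >>= 1 ) ++count;
    return count;
#endif
}
```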
There are two changes to the protocol:
- `QueueMessageLiteral*` were changed and what used to be addresses are now addresses+metadata
- Other messages now send `QueueMessage*Metadata` with added metadata.
This will later be used to store and transmit message sources, level, etc.
This was causing issues in the Infos -> Trace Statistics window, as `GetCallstackFrameCount` uses `m_pendingCallstackFrames`. Just in case, initialize all those variables where they are declared instead of in the constructor.
This provides some instructions and tips for the manual. Also:
* Made the calibration feature a CMake option
* Cleaned up some minor code issues
* Fixed an issue with the calibration
* Incremented patch number
Callstack frames will now have nullptr as the value in the callstackFrameMap
map, as a way to signal that a query for the given key is already pending.
Duplicate queries should no longer happen.
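The sentinel idea can be sketched as follows. The types and method names are hypothetical stand-ins rather than the real `callstackFrameMap` code; only the use of `nullptr` as the "query pending" marker comes from this change.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for frame data; not the real type.
struct ExampleFrameData { int size; };

class ExampleFrameCache
{
public:
    // Returns true when the caller should send a query for this key.
    bool ShouldQuery( uint64_t key )
    {
        // A nullptr value marks the key as "query already in flight".
        const auto res = m_map.emplace( key, static_cast<ExampleFrameData*>( nullptr ) );
        if( res.second ) ++m_queriesMade; else ++m_duplicatesSkipped;
        return res.second;
    }

    // Called when the reply arrives: replaces the nullptr sentinel with real data.
    void OnReply( uint64_t key, ExampleFrameData* data ) { m_map[key] = data; }

    bool IsPending( uint64_t key ) const
    {
        const auto it = m_map.find( key );
        return it != m_map.end() && it->second == nullptr;
    }

    uint64_t QueriesMade() const { return m_queriesMade; }
    uint64_t DuplicatesSkipped() const { return m_duplicatesSkipped; }

private:
    std::unordered_map<uint64_t, ExampleFrameData*> m_map;
    uint64_t m_queriesMade = 0;
    uint64_t m_duplicatesSkipped = 0;
};
```

Code that reads the map has to treat a `nullptr` value as "not available yet" rather than dereferencing it.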
@slomp provided an alternative implementation, which produced the following
results:
Queries made: 195,778
Duplicate queries skipped: 9,518,910
Co-authored-by: Marcos Slomp <slomp@adobe.com>