Compare commits

19 Commits

Author SHA1 Message Date
Bartosz Taudul
3e5310cac6 Fix sigma conversion in latex -> md script. 2026-06-25 00:14:21 +02:00
Bartosz Taudul
606cd7bdaf Protect against zero frames to show. 2026-06-24 23:49:53 +02:00
Bartosz Taudul
d924b05ab7 Calculate square of averages with double precision. 2026-06-24 23:27:36 +02:00
Bartosz Taudul
d0c7596977 Regenerate markdown manual. 2026-06-24 23:02:25 +02:00
Bartosz Taudul
0bee59ff85 Update manual. 2026-06-24 23:01:34 +02:00
Bartosz Taudul
963a968a9e Update NEWS. 2026-06-24 20:37:51 +02:00
Bartosz Taudul
f700597067 Display P75, P90, P99 and P99.9 for frame statistics. 2026-06-24 20:09:03 +02:00
Bartosz Taudul
a237b009ec Use unified percentile value calculation in find zone. 2026-06-24 19:56:24 +02:00
Bartosz Taudul
5970e90233 Display coefficient of variation in frame statistics. 2026-06-24 19:32:50 +02:00
Bartosz Taudul
5fc4ae8eef Display coefficient of variation in find zone. 2026-06-24 19:30:42 +02:00
Bartosz Taudul
a35f8a738d Display standard deviation in frame statistics. 2026-06-24 19:21:40 +02:00
Bartosz Taudul
3714e36652 Frame count in frame statistics may be not full. 2026-06-24 18:56:08 +02:00
Bartosz Taudul
565f3e5a72 Add mode to frame statistics. 2026-06-24 18:48:38 +02:00
Bartosz Taudul
17720a0cd8 Keep FPS range and max counts in the same line. 2026-06-24 18:48:38 +02:00
Bartosz Taudul
a7b0907d6c Move mean and median in frame statistics to match find zone. 2026-06-24 18:48:38 +02:00
Bartosz Taudul
552159ccf6 Remove tree node for historgram in frame statistics. 2026-06-24 18:48:38 +02:00
Bartosz Taudul
7e8180a2a7 Move frame statistics from trace info to its own window. 2026-06-24 18:48:38 +02:00
Bartosz Taudul
05d34e4daa Make it possible to hide annotation. 2026-06-24 18:48:37 +02:00
Bartosz Taudul
72d45dfdad Fix display of annotations color on annotations list. 2026-06-24 18:48:35 +02:00
16 changed files with 1025 additions and 1025 deletions

23
NEWS
View File

@@ -83,8 +83,10 @@ vx.xx.x (2026-xx-xx)
- Entry call stacks can be now attached (previously it was only regular
call stacks).
- Crash call stack attachments are now annotated with crash info.
- Wait stack attachments now contain wait metadata.
- Source code can be attached (also with execution costs in symbol view).
- Zone histogram data can be attached for analysis.
- Added MCP server.
- Markdown renderer improvements.
- Tables are now properly rendered.
- Tasklist rendering has been implemented.
@@ -108,6 +110,7 @@ vx.xx.x (2026-xx-xx)
locations won't be fixed.
- Call stack window will now display notification if viewing a crash call
stack.
- Call stack window will now show a proper label if viewing a wait stack.
- Removal of Tracy crash handler stack from the reported crash call stack
should now work again on Linux.
- In disassembly line view, source file names are now displayed instead of
@@ -194,11 +197,31 @@ vx.xx.x (2026-xx-xx)
with the TRACY_OPENGL_AUTO_CALIBRATION compilation define. Note that this
requires a full CPU/GPU sync on each calibration event. These events will
not fire more often than once every second.
- Additional hardening of OpenGL tracing.
- Added missing C API for shared locks.
- Implemented semi-unique, nonsense random name generator.
- Can be used to set a trace description.
- Will be used to provide default description for newly added annotations.
- Polished look and feel of annotation regions on the timeline.
- Annotations can now be hidden.
- Expanded available tests.
- The previously existing "test" program now lives in "tests/tracy".
- The tests directory also contains some repro cases for solved problems.
- Expanded examples showing how to use the profiler.
- Includes "Dyna.net", a small 2D game that is a variation of Bomberman.
- Made numeric input field for trace parameters usable again.
- Added new type of trace parameter that exposes a trigger button.
- Added "sections" feature for marking high-level sections of program
execution.
- For example, a game with multiple levels may have each of the levels
marked as a section.
- Each section is a distinct region not restricted by other sections.
- Multiple sections can overlap.
- Find zone window now also displays coefficient of variation, in addition
to standard deviation.
- Moved frame statistics from trace information window to its own window.
- Added various statistical data to bring frame statistics up to par with
the find zone window.
v0.13.1 (2025-12-11)

View File

@@ -12,6 +12,7 @@ sed -i -e 's@\\LMB{}~@@g' _tmp.tex
sed -i -e 's@\\MMB{}~@@g' _tmp.tex
sed -i -e 's@\\RMB{}~@@g' _tmp.tex
sed -i -e 's@\\Scroll{}~@@g' _tmp.tex
sed -i -e 's@\\textsigma@σ@g' _tmp.tex
# Resolve \circled{} markers and lstlisting escapeinside (@...@) snippets, which
# pandoc would otherwise emit verbatim or drop, to their Unicode equivalents.

File diff suppressed because it is too large

View File

@@ -1474,6 +1474,24 @@ Exiting the profiled application from inside a zone is not supported. When the c
As a workaround, you may add a \texttt{try}/\texttt{catch} pair at the bottom of the function stack (for example in the \texttt{main()} function) and replace \texttt{exit()} calls with throwing a custom exception. When this exception is caught, you may call \texttt{exit()}, knowing that the application's data structures (including profiling zones) were properly cleaned up.
\subsection{Marking sections}
\label{markingsections}
To provide a high-level overview of what your program is doing, you may mark sections of execution. For example, in a puzzle game, the main menu would be a section, and each of the levels would be a section on its own.
To mark when a section starts, use the \texttt{TracySectionEnter(fmt, ...)} macro with printf-like formatting. The macro will return an \texttt{uint32\_t} identifier, which you must store and pass to the corresponding section-end macro, \texttt{TracySectionLeave(id)}. Here's an example of a simple RAII wrapper:
\begin{lstlisting}
struct TracySection
{
explicit TracySection(const char* name) : idx(TracySectionEnter("%s", name)) {}
~TracySection() { TracySectionLeave(idx); }
uint32_t idx;
};
\end{lstlisting}
Each of the sections you add is handled independently of any other section. Multiple sections may overlap, either with a hierarchical structure or with each section having a part that does not coincide with other sections.
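A usage sketch of the wrapper above may help show how scopes map to sections. It relies only on the two macros named in the manual text. The stub functions, the counters and `RunGame` are hypothetical and exist only so the snippet builds without Tracy; in a real build the macros would come from the Tracy client header.

```cpp
#include <cstdint>

// Hypothetical stand-ins, used only when the real Tracy macros are absent.
// They count open sections instead of sending anything to a profiler.
#ifndef TracySectionEnter
static int g_open = 0;          // sections currently open
static int g_maxOpen = 0;       // deepest overlap observed
static uint32_t g_nextId = 0;   // next identifier to hand out

static uint32_t StubSectionEnter( const char*, ... )
{
    if( ++g_open > g_maxOpen ) g_maxOpen = g_open;
    return g_nextId++;
}
static void StubSectionLeave( uint32_t ) { --g_open; }

#define TracySectionEnter( ... ) StubSectionEnter( __VA_ARGS__ )
#define TracySectionLeave( id ) StubSectionLeave( id )
#endif

// The RAII wrapper from the manual, made non-copyable so that a section
// cannot be left twice by accident.
struct TracySection
{
    explicit TracySection( const char* name ) : idx( TracySectionEnter( "%s", name ) ) {}
    ~TracySection() { TracySectionLeave( idx ); }
    TracySection( const TracySection& ) = delete;
    TracySection& operator=( const TracySection& ) = delete;
    uint32_t idx;
};

// Hypothetical game flow: a menu section, then an in-game section that
// encloses two level sections.
void RunGame()
{
    { TracySection menu( "Menu" ); }
    TracySection inGame( "In-game" );
    { TracySection level( "Level 1" ); }
    { TracySection level( "Level 2" ); }
}
```

Each block scope ends its section when the wrapper is destroyed, so the nesting of scopes gives the hierarchical overlap described above.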
\subsection{Marking locks}
Modern programs must use multi-threading to achieve the full performance capability of the CPU. However, correct execution requires claiming exclusive access to data shared between threads. When many threads want to simultaneously enter the same critical section, the application's multi-threaded performance advantage is nullified. To help solve this problem, Tracy can collect and display lock interactions in threads.
@@ -3595,7 +3613,7 @@ The main profiler window is split into three sections, as seen in figure~\ref{ma
\draw[rounded corners=5pt] (1.3, -0.1) rectangle+(2.2, -0.5) node [midway] {\faComment{} Messages};
\draw[rounded corners=5pt] (3.6, -0.1) rectangle+(1.3, -0.5) node [midway] {\faMagnifyingGlass{} Find};
\draw[rounded corners=5pt] (5.0, -0.1) rectangle+(2, -0.5) node [midway] {\faArrowUpWideShort{} Statistics};
\draw[rounded corners=5pt] (7.1, -0.1) rectangle+(1.5, -0.5) node [midway] {\faFire{} Flame};
\draw[rounded corners=5pt] (7.1, -0.1) rectangle+(1.5, -0.5) node [midway] {\faFireFlameCurved{} Flame};
\draw[rounded corners=5pt] (8.7, -0.1) rectangle+(2.1, -0.5) node [midway] {\faMemory{} Memory};
\draw[rounded corners=5pt] (10.9, -0.1) rectangle+(2.1, -0.5) node [midway] {\faScaleBalanced{} Compare};
\draw[rounded corners=5pt] (13.1, -0.1) rectangle+(1.2, -0.5) node [midway] {\faFingerprint{} Info};
@@ -3631,7 +3649,7 @@ The control menu (top row of buttons) provides access to various profiler featur
\item \emph{\faComment{} Messages} -- Toggles the message log window (section~\ref{messages}), which displays custom messages sent by the client, as described in section~\ref{messagelog}.
\item \emph{\faMagnifyingGlass{} Find} -- This button toggles the find zone window, which allows inspection of zone behavior statistics (section~\ref{findzone}).
\item \emph{\faArrowUpWideShort{} Statistics} -- Toggles the statistics window, which displays zones sorted by their total time cost (section~\ref{statistics}).
\item \emph{\faFire{} Flame} -- Enables the flame graph window (section~\ref{flamegraph}).
\item \emph{\faFireFlameCurved{} Flame} -- Enables the flame graph window (section~\ref{flamegraph}).
\item \emph{\faMemory{} Memory} -- Various memory profiling options may be accessed here (section~\ref{memorywindow}).
\item \emph{\faScaleBalanced{} Compare} -- Toggles the trace compare window, which allows you to see the performance difference between two profiling runs (section~\ref{compare}).
\item \emph{\faFingerprint{} Info} -- Show general information about the trace (section~\ref{traceinfo}).
@@ -3642,6 +3660,7 @@ The control menu (top row of buttons) provides access to various profiler featur
\item \emph{\faNoteSticky{}~Annotations} -- If annotations have been made (section~\ref{annotatingtrace}), you can open a list of all annotations, described in chapter~\ref{annotationlist}.
\item \emph{\faRuler{}~Limits} -- Displays time range limits window (section~\ref{timeranges}).
\item \emph{\faHourglassHalf{}~Wait stacks} -- If sampling was performed, an option to display wait stacks may be available. See chapter~\ref{waitstacks} for more details.
\item \emph{\faImages{}~Frame statistics} -- Display frame statistics window (section~\ref{framestatistics}).
\end{itemize}
\item \emph{ \faBook{}~User manual} -- Opens the user manual for quick reference. Note that the version of the user manual available directly in the profiler is an inferior quality version compared to the proper PDF.
\item \emph{\faMagnifyingGlassPlus{}~Display scale} -- Enables run-time resizing of the displayed content. This may be useful in environments with potentially reduced visibility, e.g. during a presentation. Note that this setting is independent of the UI scaling coming from the system DPI settings. The scale will be preserved across multiple profiler sessions if the \emph{Save UI scale} option is selected in global settings.
@@ -3741,7 +3760,7 @@ Clicking the \LMB{}~left mouse button on the graph while the \keys{\ctrl}~key is
\subsubsection{Timeline view}
The timeline is the most crucial element of the profiler UI. All the captured data is displayed there, laid out on the horizontal axis, according to time flow. Where there was no profiling performed, the timeline is dimmed out. The view is split into three parts: the time scale, the frame sets, and the combined zones, locks, and plots display.
The timeline is the most crucial element of the profiler UI. All the captured data is displayed there, laid out on the horizontal axis, according to time flow. Where there was no profiling performed, the timeline is dimmed out. The view is split into four parts: time scale, frame sets, sections, and the combined zones, locks, and plots display.
\subparagraph{Collapsed items}
\label{collapseditems}
@@ -3828,6 +3847,38 @@ If a frame has an associated frame image (see chapter~\ref{frameimages}), you ca
If the \emph{\faFlagCheckered{}~Draw frame targets} option is enabled (see section~\ref{options}), time regions in frames exceeding the set target value will be marked with a red background.
\paragraph{Sections}
\label{sections}
If sections of program execution were marked (see chapter~\ref{markingsections}), the profiler will try to reconstruct a child-parent hierarchy between the sections, as presented in figure~\ref{figsections}.
\begin{figure}[h]
\centering\begin{tikzpicture}
\draw(0, 0) rectangle+(2, -0.5) node[midway] {Menu};
\draw(2, 0) rectangle+(10, -0.5) node[midway] {In-game};
\draw(12, 0) rectangle+(1.5, -0.5) node[midway] {Menu};
\draw(2, -0.5) rectangle+(3, -0.5) node[midway] {Level 1};
\draw(5, -0.5) rectangle+(5, -0.5) node[midway] {Level 2};
\draw(10, -0.5) rectangle+(2, -0.5) node[midway] {Level 3};
\end{tikzpicture}
\caption{Sections showing a hierarchy.}
\label{figsections}
\end{figure}
With some data sets, such reconstruction will not be possible, in which case sections may look as shown in figure~\ref{figsections2}.
\begin{figure}[h]
\centering\begin{tikzpicture}
\draw(0, 0) rectangle+(8, -0.5) node[midway] {First section};
\draw(3, -0.5) rectangle+(8, -0.5) node[midway] {Second section};
\end{tikzpicture}
\caption{Sections that cannot form a hierarchy.}
\label{figsections2}
\end{figure}
Clicking the \MMB{}~middle mouse button on a section will zoom the view to the extent of the section.
\paragraph{Zones, locks and plots display}
\label{zoneslocksplots}
@@ -3923,7 +3974,7 @@ In an example in figure~\ref{zoneslocks} you can see that there are two threads:
Meanwhile, the \emph{Streaming thread} is performing some \emph{Streaming jobs}. The first \emph{Streaming job} sent a message (section~\ref{messagelog}). In addition to being listed in the message log, it is indicated by a triangle over the thread separator. When multiple messages are in one place, the triangle outline shape changes to a filled triangle.
The GPU zones are displayed just like CPU zones, with an OpenGL/Vulkan/Direct3D/Metal/OpenCL/CUDA/WebGPU context in place of a thread name.
The GPU zones are displayed just like CPU zones, with an OpenGL / Vulkan / Direct3D / Metal / OpenCL / CUDA / WebGPU context in place of a thread name.
Hovering the \faArrowPointer{} mouse pointer over a zone will highlight all other zones that have the exact source location with a white outline. Clicking the \LMB{}~left mouse button on a zone will open the zone information window (section~\ref{zoneinfo}). Holding the \keys{\ctrl} key and clicking the \LMB{}~left mouse button on a zone will open the zone statistics window (section~\ref{findzone}). Clicking the \MMB{}~middle mouse button on a zone will zoom the view to the extent of the zone.
@@ -4076,15 +4127,18 @@ Sometimes, you may want to specify a time range, such as limiting some statistic
To define a time range, drag the \LMB{}~left mouse button over the timeline view while holding the \keys{\ctrl} key. When the mouse key is released, the profiler will mark the selected time extent with a blue striped pattern, and it will display a context menu with the following options:
\begin{itemize}
\item \emph{\faMagnifyingGlass{}~Limit find zone time range} -- this will limit find zone results. See chapter~\ref{findzone} for more details.
\item \emph{\faArrowUpWideShort{}~Limit statistics time range} -- selecting this option will limit statistics results. See chapter~\ref{statistics} for more details.
\item \emph{\faFire{}~Limit flame graph time range} -- limits flame graph results. Refer to chapter~\ref{flamegraph}.
\item \emph{\faHourglassHalf{}~Limit wait stacks time range} -- limits wait stacks results. Refer to chapter~\ref{stackwindows}.
\item \emph{\faMemory{}~Limit memory time range} -- limits memory results. Read more about this in chapter~\ref{memorywindow}.
\item \faRuler{}~Set time range for:
\begin{itemize}
\item \emph{\faMagnifyingGlass{}~Find zone} -- this will limit find zone results. See chapter~\ref{findzone} for more details.
\item \emph{\faArrowUpWideShort{}~Statistics} -- selecting this option will limit statistics results. See chapter~\ref{statistics} for more details.
\item \emph{\faFireFlameCurved{}~Flame graph} -- limits flame graph results. Refer to chapter~\ref{flamegraph}.
\item \emph{\faHourglassHalf{}~Wait stacks} -- limits wait stacks results. Refer to chapter~\ref{stackwindows}.
\item \emph{\faMemory{}~Memory} -- limits memory results. Read more about this in chapter~\ref{memorywindow}.
\end{itemize}
\item \emph{\faNoteSticky{}~Add annotation} -- use to annotate regions of interest, as described in chapter~\ref{annotatingtrace}.
\end{itemize}
Alternatively, you may specify the time range by clicking the \RMB{}~right mouse button on a zone or a frame. The resulting time extent will match the selected item.
Alternatively, you may specify the time range by clicking the \RMB{}~right mouse button on a zone, section or a frame. The resulting time extent will match the selected item.
To reduce clutter, time range regions are only displayed if the windows they affect are open or if the time range limits control window is open (section~\ref{timerangelimits}). You can access the time range limits window through the \emph{\faScrewdriverWrench{} Tools} button on the control menu.
@@ -4097,15 +4151,14 @@ Tracy allows adding custom notes to the trace. For example, you may want to mark
Methods of specifying the annotation region are described in section~\ref{timeranges}. When a new annotation is added, it is assigned a semi-unique random name to make it distinguishable. The settings window is also opened (section~\ref{annotationsettings}), allowing you to enter your own description of the annotation.
Annotations are displayed on the timeline, as presented in figure~\ref{annotation}. Clicking on the circle next to the text description will open the annotation settings window, in which you can modify or remove the region. A list of all annotations in the trace is available in the annotations list window described in section~\ref{annotationlist}, which is accessible through the \emph{\faScrewdriverWrench{} Tools} button on the control menu.
Annotations are displayed on the timeline, as presented in figure~\ref{annotation}. Clicking on the text description area will open the annotation settings window, in which you can modify or remove the region. A list of all annotations in the trace is available in the annotations list window described in section~\ref{annotationlist}, which is accessible through the \emph{\faScrewdriverWrench{} Tools} button on the control menu.
\begin{figure}[h]
\centering\begin{tikzpicture}
\draw (0, 0.25) -- (0, 1) -- (5, 1) -- (5, 0.25);
\draw[dotted] (0, -0.2) -- (0, 0.25);
\draw[dotted] (5, -0.2) -- (5, 0.25);
\draw (0.25, 0.75) circle(0.15);
\draw (0.4, 0.7) node[anchor=west] {Description};
\draw (0, -0.25) -- (0, -1) -- (5, -1) -- (5, -0.25);
\draw[dotted] (0, 0.2) -- (0, -0.25);
\draw[dotted] (5, 0.2) -- (5, -0.25);
\draw (0, -0.8) node[anchor=west] {\faNoteSticky{}~Description};
\end{tikzpicture}
\caption{Annotation region.}
\label{annotation}
@@ -4367,7 +4420,21 @@ Clicking the \LMB{}~left mouse button on the group name will highlight the group
The call stack grouping mode has a different way of listing groups. Here only one group is displayed at any time due to the need to display the call stack frames. You can switch between call stack groups by using the~\faCaretLeft{}~and~\faCaretRight{} buttons. You can select the group by clicking on the~\emph{\faCheck{}~Select} button. You can open the call stack window (section~\ref{callstackwindow}) by pressing the~\emph{\faAlignJustify{}~Call~stack} button.
Tracy displays a variety of statistical values regarding the selected function: mean (average value), median (middle value), mode (most common value, quantized using histogram bins), and \textsigma{} (standard deviation). The mean and median zone times are also displayed on the histogram as red (mean) and blue (median) vertical bars. Additional bars will indicate the mean group time (orange) and median group time (green). You can disable the drawing of either set of markers by clicking on the check-box next to the color legend.
Tracy displays a variety of statistical values regarding the selected function:
\begin{itemize}
\item \emph{mean} -- average time value,
\item \emph{median} -- middle time value (P50),
\item \emph{mode} -- most common time value, quantized using histogram bins,
\item \emph{\textsigma} -- standard deviation,
\item \emph{coefficient of variation} -- relation of \textsigma{} to mean,
\item \emph{P75} -- 75th percentile time value,
\item \emph{P90} -- 90th percentile time value,
\item \emph{P99} -- 99th percentile time value,
\item \emph{P99.9} -- 99.9th percentile time value.
\end{itemize}
The mean and median zone times are also displayed on the histogram as red (mean) and blue (median) vertical bars. Additional bars will indicate the mean group time (orange) and median group time (green). You can disable the drawing of either set of markers by clicking on the check-box next to the color legend.
Hovering the \faArrowPointer{}~mouse cursor over a zone on the timeline, which is currently selected in the find zone window, will display a pulsing vertical bar on the histogram, highlighting the bin to which the hovered zone has been assigned. In addition, it will also highlight zone entry on the zone list.
@@ -4614,8 +4681,6 @@ If the \emph{\faUserGear{}~Public sidecar} option is selected, the file containi
Open the \emph{Trace statistics} section to see information about the trace, such as achieved timer resolution, number of captured zones, lock events, plot data points, memory allocations, etc.
There's also a section containing the selected frame set timing statistics and histogram\footnote{See section~\ref{findzone} for a description of the histogram. Note that there are subtle differences in the available functionality.}. As a convenience, you can switch the active frame set here and limit the displayed frame statistics to the frame range visible on the screen.
If \emph{CPU topology} data is available (see section~\ref{cputopology}), you will be able to view the package, core, and thread hierarchy.
The \emph{Source location substitutions} section allows adapting the source file paths, as captured by the profiler, to the actual on-disk locations\footnote{This does not affect source files cached during the profiling run.}. You can create a new substitution by clicking the \emph{Add new substitution} button. This will add a new entry, with input fields for ECMAScript-conforming regular expression pattern and its corresponding replacement string. You can quickly test the outcome of substitutions in the \emph{example source location} input field, which will be transformed and displayed below, as \emph{result}.
@@ -4639,6 +4704,11 @@ In this window, you can view the information about the machine on which the prof
If an application should crash during profiling (section~\ref{crashhandling}), the profiler will display the crash information in this window. It provides you information about the thread that has crashed, the crash reason, and the crash call stack (section~\ref{callstackwindow}).
\subsection{Frame statistics window}
\label{framestatistics}
This window shows statistical information about the selected frame set timing, and a histogram. See section~\ref{findzone} for a description of the displayed data. As a convenience, you can switch the active frame set here. Additionally, with the \emph{Limit to view} option you can restrict the displayed frame statistics to the frame range currently visible on the screen.
\subsection{Zone information window}
\label{zoneinfo}
@@ -4950,7 +5020,7 @@ The profiled program is highlighted using green color. Furthermore, the yellow h
\subsection{Annotation settings window}
\label{annotationsettings}
In this window, you may modify how a timeline annotation (section~\ref{annotatingtrace}) is presented by setting its text description or selecting region highlight color. A random annotation description can be set with the \emph{\faDice{}~Generate name} button. If the note is no longer needed, you may also remove it here.
In this window, you may modify how a timeline annotation (section~\ref{annotatingtrace}) is presented by setting its text description or selecting region highlight color. A random annotation description can be set with the \emph{\faDice{}~Generate name} button. If the note is no longer needed, you may also temporarily hide or remove it here.
\subsection{Annotation list window}
\label{annotationlist}
@@ -4958,15 +5028,17 @@ In this window, you may modify how a timeline annotation (section~\ref{annotatin
This window lists all annotations marked on the timeline. Each annotation is presented as shown in figure~\ref{figannlist}. From left to right the elements are:
\begin{itemize}
\item \emph{\faPenToSquare{} Edit} -- Opens the annotation settings window (section~\ref{annotationsettings}).
\item \emph{\faMicroscope{} Zoom} -- Zooms timeline to the annotation extent.
\item \emph{\faTrashCan{} Remove} -- Removes the annotation. You must press the \keys{\ctrl} key to enable this button.
\item \emph{\faCheck{}~Visible} -- Makes the annotation visible or not on the timeline.
\item \emph{\faPenToSquare{}~Edit} -- Opens the annotation settings window (section~\ref{annotationsettings}).
\item \emph{\faMicroscope{}~Zoom} -- Zooms timeline to the annotation extent.
\item \emph{\faTrashCan{}~Remove} -- Removes the annotation. You must press the \keys{\ctrl} key to enable this button.
\item Colored box -- Color of the annotation.
\item Text description of the annotation.
\end{itemize}
\begin{figure}[h]
\centering\begin{tikzpicture}
\draw[rounded corners=5pt] (-0.6, 0) rectangle+(0.5, -0.5) node [midway] {\faCheck};
\draw[rounded corners=5pt] (0.0, 0) rectangle+(0.5, -0.5) node [midway] {\faPenToSquare};
\draw[rounded corners=5pt] (0.6, 0) rectangle+(0.5, -0.5) node [midway] {\faMicroscope};
\draw[rounded corners=5pt] (1.2, 0) rectangle+(0.5, -0.5) node [midway] {\faTrashCan};
@@ -4987,12 +5059,15 @@ This window displays information about time range limits (section~\ref{timerange
\begin{itemize}
\item \emph{Limit to view} -- Set the time range limit to current view.
\item \emph{\faMicroscope{}~Focus} -- Set the timeline view to the time range extent.
\item \emph{\faNoteSticky{}~From annotation} -- Allows using the annotation region for limiting purposes.
\item \emph{\faMagnifyingGlass{}~Copy find zone} -- Copies the find zone time range limit.
\item \emph{\faArrowUpWideShort{}~Copy statistics} -- Copies the statistics time range limit.
\item \emph{\faFireFlameCurved{}~Copy flame} -- Copies the flame graph time range limit.
\item \emph{\faHourglassHalf{}~Copy wait stacks} -- Copies the wait stacks time range limit.
\item \emph{\faMemory{}~Copy memory} -- Copies the memory time range limit.
\item \faCopy{}~Copy from:
\begin{itemize}
\item \emph{\faNoteSticky{}~Annotation} -- Allows using the annotation region for limiting purposes.
\item \emph{\faMagnifyingGlass{}~Find zone} -- Copies the find zone time range limit.
\item \emph{\faArrowUpWideShort{}~Statistics} -- Copies the statistics time range limit.
\item \emph{\faFireFlameCurved{}~Flame graph} -- Copies the flame graph time range limit.
\item \emph{\faHourglassHalf{}~Wait stacks} -- Copies the wait stacks time range limit.
\item \emph{\faMemory{}~Memory} -- Copies the memory time range limit.
\end{itemize}
\end{itemize}
Note that ranges displayed in the window have color hints that match the color of the striped regions on the timeline.

View File

@@ -101,6 +101,7 @@ set(SERVER_FILES
TracyView_FindZone.cpp
TracyView_FlameGraph.cpp
TracyView_FrameOverview.cpp
TracyView_FrameStatistics.cpp
TracyView_FrameTimeline.cpp
TracyView_FrameTree.cpp
TracyView_GpuTimeline.cpp

View File

@@ -199,7 +199,8 @@ bool UserData::Save()
{ "text", v->text },
{ "min", v->range.min },
{ "max", v->range.max },
{ "color", v->color }
{ "color", v->color },
{ "visible", v->visible },
} );
}
}
@@ -295,6 +296,7 @@ bool UserData::Load()
LoadValue( v, "min", a->range.min );
LoadValue( v, "max", a->range.max );
LoadValue( v, "color", a->color );
LoadValue( v, "visible", a->visible );
a->range.active = true;
m_annotations.emplace_back( std::move( a ) );
}

View File

@@ -981,6 +981,10 @@ bool View::DrawImpl()
{
m_showWaitStacks = true;
}
if( ButtonDisablable( ICON_FA_IMAGES " Frame statistics", !m_worker.AreFramesUsed() ) )
{
m_showFrameStatistics = true;
}
ImGui::EndPopup();
}
ImGui::SameLine();
@@ -1209,6 +1213,7 @@ bool View::DrawImpl()
if( m_showRanges ) DrawRanges();
if( m_showWaitStacks ) DrawWaitStacks();
if( m_showManual ) DrawManual();
if( m_showFrameStatistics ) DrawFrameStatistics();
#ifndef __EMSCRIPTEN__
if( m_llm.m_show ) m_llm.Draw();
#endif

View File

@@ -332,6 +332,7 @@ private:
bool ShouldDrawRange( const RangeId& id ) const;
void DrawWaitStacks();
void DrawManual();
void DrawFrameStatistics();
void DrawFlameGraph();
void DrawFlameGraphHeader( int64_t vStart, int64_t vEnd, uint64_t period );
void DrawFlameGraphLevel( const std::vector<FlameGraphItem>& data, FlameGraphContext& ctx, int depth, bool samples );
@@ -607,6 +608,7 @@ private:
bool m_showFlameGraph = false;
bool m_showManual = false;
bool m_manualPositionReset = false;
bool m_showFrameStatistics = false;
AccumulationMode m_statAccumulationMode = AccumulationMode::SelfOnly;
bool m_statSampleTime = true;
@@ -947,6 +949,12 @@ private:
bool limitToView = false;
std::pair<int, int> limitRange = { -1, 0 };
int minBinVal = 1;
double sumSq = 0;
float sd = 0;
int64_t p75 = 0;
int64_t p90 = 0;
int64_t p99 = 0;
int64_t p99_9 = 0;
} m_frameSortData;
struct {

View File

@@ -85,6 +85,7 @@ struct Annotation
std::string text;
Range range;
uint32_t color;
bool visible = true;
};
struct SourceRegex

View File

@@ -30,6 +30,8 @@ void View::DrawSelectedAnnotation()
ImGui::Begin( "Annotation", &show, ImGuiWindowFlags_AlwaysAutoResize );
if( !ImGui::GetCurrentWindowRead()->SkipItems )
{
ImGui::Checkbox( "Visible", &m_selectedAnnotation->visible );
ImGui::SameLine();
if( ImGui::Button( ICON_FA_MICROSCOPE " Zoom to annotation" ) )
{
ZoomToRange( m_selectedAnnotation->range.min, m_selectedAnnotation->range.max );
@@ -127,6 +129,8 @@ void View::DrawAnnotationList()
for( auto& ann : m_annotations )
{
ImGui::PushID( idx );
ImGui::Checkbox( "##visible", &ann->visible );
ImGui::SameLine();
if( ImGui::Button( ICON_FA_PEN_TO_SQUARE ) )
{
m_selectedAnnotation = ann.get();
@@ -143,7 +147,7 @@ void View::DrawAnnotationList()
}
if( !ctrl ) TooltipIfHovered( "Press ctrl key to enable removal" );
ImGui::SameLine();
ImGui::ColorButton( "c", ImGui::ColorConvertU32ToFloat4( ann->color ), ImGuiColorEditFlags_NoTooltip );
ImGui::ColorButton( "c", ImGui::ColorConvertU32ToFloat4( ann->color | 0xFF000000 ), ImGuiColorEditFlags_NoTooltip );
ImGui::SameLine();
if( m_selectedAnnotation == ann.get() )
{
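The `ColorButton` change in this file ORs the annotation color with `0xFF000000` before converting it. A minimal sketch of the effect, assuming the color is a 32-bit packed value with alpha in the top byte (the layout Dear ImGui uses for packed colors) and that a stored annotation color may carry a low or zero alpha byte. `AlphaOf` and `ForceOpaque` are hypothetical helper names, not Tracy functions.

```cpp
#include <cstdint>

// Alpha occupies bits 24..31 of the packed color.
constexpr uint32_t AlphaOf( uint32_t color ) { return color >> 24; }

// Set every alpha bit while leaving the three color channels untouched.
constexpr uint32_t ForceOpaque( uint32_t color ) { return color | 0xFF000000u; }

// A color stored with zero alpha becomes fully opaque for display.
static_assert( AlphaOf( ForceOpaque( 0x00336699u ) ) == 0xFF, "alpha must be forced to 255" );
```

With the alpha byte forced to 255, the swatch in the list shows the annotation's hue even when the stored color is translucent.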

View File

@@ -1001,7 +1001,7 @@ void View::DrawCompare()
{
const auto sz = sorted[0].size();
const auto avg = m_compare.average[0];
const auto ss = sumSq0 - 2. * total0 * avg + avg * avg * sz;
const auto ss = sumSq0 - 2. * total0 * avg + double( avg ) * avg * sz;
const auto sd = sqrt( ss / ( sz - 1 ) );
ImGui::SameLine();
@@ -1027,7 +1027,7 @@ void View::DrawCompare()
{
const auto sz = sorted[1].size();
const auto avg = m_compare.average[1];
const auto ss = sumSq1 - 2. * total1 * avg + avg * avg * sz;
const auto ss = sumSq1 - 2. * total1 * avg + double( avg ) * avg * sz;
const auto sd = sqrt( ss / ( sz - 1 ) );
ImGui::SameLine();
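Both hunks in this file change only the type in which `avg * avg * sz` is evaluated. The sketch below is a standalone version of the same formula. It assumes the average is held in a `float`, as the `float( total ) / vsz` assignment in the find zone code suggests; `SampleStdDev` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Sample standard deviation from running sums, using the identity
//   sum( (x - avg)^2 ) = sumSq - 2 * total * avg + avg^2 * n
// Widening avg to double before squaring keeps the last term out of
// single-precision arithmetic.
double SampleStdDev( double sumSq, int64_t total, float avg, size_t n )
{
    assert( n > 1 );   // the n - 1 divisor needs at least two samples
    const double ss = sumSq - 2. * total * avg + double( avg ) * avg * n;
    return std::sqrt( ss / ( n - 1 ) );
}
```

A `float` carries a 24-bit significand, so without the cast the square of a nanosecond-scale average may be rounded before it is subtracted from terms of similar magnitude, and that rounding error can carry through to the standard deviation.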

View File

@@ -542,12 +542,13 @@ void View::DrawFindZone()
const auto vsz = vec.size();
if( vsz != 0 )
{
auto Percentile = [&vec, vsz]( double p ) { return vec[std::min<size_t>( vsz - 1, p * vsz )]; };
m_findZone.average = float( total ) / vsz;
m_findZone.median = vec[vsz/2];
m_findZone.p75 = vec[3 * (vsz / 4)];
m_findZone.p90 = vec[vsz / 10 * 9];
m_findZone.p99 = vec[size_t(float(vsz * 0.99))];
m_findZone.p99_9 = vec[size_t(float(vsz * 0.999))];
m_findZone.median = Percentile( 0.5 );
m_findZone.p75 = Percentile( 0.75 );
m_findZone.p90 = Percentile( 0.9 );
m_findZone.p99 = Percentile( 0.99 );
m_findZone.p99_9 = Percentile( 0.999 );
m_findZone.total = total;
m_findZone.sortedNum = i;
m_findZone.tmin = tmin;
@@ -1114,7 +1115,7 @@ void View::DrawFindZone()
{
const auto sz = m_findZone.sorted.size();
const auto avg = m_findZone.average;
const auto ss = zoneData.sumSq - 2. * zoneData.total * avg + avg * avg * sz;
const auto ss = zoneData.sumSq - 2. * zoneData.total * avg + double( avg ) * avg * sz;
const auto sd = sqrt( ss / ( sz - 1 ) );
ImGui::SameLine();
@@ -1122,6 +1123,9 @@ void View::DrawFindZone()
ImGui::SameLine();
TextFocused( "\xcf\x83:", TimeToString( sd ) );
TooltipIfHovered( "Standard deviation" );
+ImGui::SameLine();
+ImGui::TextDisabled( "(%.2f%%)", 100.f * sd / avg );
+TooltipIfHovered( "Coefficient of variation" );
}
TextFocused( "P75:", TimeToString( m_findZone.p75 ) );
ImGui::SameLine();
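
Note on the percentile hunk above: every percentile now goes through one lambda that indexes the sorted vector at `p * size`, truncated and clamped to the last element. The removed expressions divided in integer arithmetic before multiplying, so they could land on a different element; for seven samples the old P75 expression `3 * (vsz / 4)` picks index 3, while `0.75 * 7` truncates to index 5. A minimal free-function sketch of the same lookup, assuming an ascending, non-empty input:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Free-function version of the lambda in the diff, for illustration only.
// `sorted` must be ascending and non-empty; `p` is a fraction in [0, 1].
int64_t Percentile( const std::vector<int64_t>& sorted, double p )
{
    assert( !sorted.empty() );
    const auto n = sorted.size();
    // Truncate p * n to an index and clamp it so p == 1.0 stays in range.
    const auto idx = std::min<size_t>( n - 1, size_t( p * n ) );
    return sorted[idx];
}
```

The clamp is what makes a single-element vector safe: any `p` maps to index 0.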

View File

@@ -0,0 +1,635 @@
#include "TracyPrint.hpp"
#include "TracyView.hpp"
#include "../Fonts.hpp"
#include "tracy_pdqsort.h"
namespace tracy
{
extern double s_time;
void View::DrawFrameStatistics()
{
const auto scale = GetScale();
ImGui::SetNextWindowSize( ImVec2( 700 * scale, 500 * scale ), ImGuiCond_FirstUseEver );
ImGui::Begin( "Frame statistics", &m_showFrameStatistics );
auto fsz = m_worker.GetFullFrameCount( *m_frames );
if( fsz != 0 )
{
TextFocused( "Frame set:", GetFrameSetName( *m_frames ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s)", m_frames->continuous ? "continuous" : "discontinuous" );
ImGui::SameLine();
ImGui::PushStyleVar( ImGuiStyleVar_FramePadding, ImVec2( 0, 0 ) );
if( ImGui::BeginCombo( "##frameCombo", nullptr, ImGuiComboFlags_NoPreview ) )
{
auto& frames = m_worker.GetFrames();
for( auto& fd : frames )
{
bool isSelected = m_frames == fd;
if( ImGui::Selectable( GetFrameSetName( *fd ), isSelected ) )
{
m_frames = fd;
fsz = m_worker.GetFullFrameCount( *m_frames );
}
if( isSelected )
{
ImGui::SetItemDefaultFocus();
}
ImGui::SameLine();
ImGui::TextDisabled( "(%s)", RealToString( fd->frames.size() ) );
}
ImGui::EndCombo();
}
ImGui::PopStyleVar();
ImGui::SameLine();
SmallCheckbox( "Limit to view", &m_frameSortData.limitToView );
if( m_frameSortData.limitToView )
{
ImGui::SameLine();
TextColoredUnformatted( 0xFF00FFFF, ICON_FA_TRIANGLE_EXCLAMATION );
}
const auto frameRange = m_worker.GetFrameRange( *m_frames, m_vd.zvStart, m_vd.zvEnd );
if( m_frameSortData.frameSet != m_frames || ( m_frameSortData.limitToView && m_frameSortData.limitRange != frameRange ) || ( !m_frameSortData.limitToView && m_frameSortData.limitRange.first != -1 ) )
{
m_frameSortData.frameSet = m_frames;
m_frameSortData.frameNum = 0;
m_frameSortData.data.clear();
m_frameSortData.total = 0;
m_frameSortData.sumSq = 0;
}
bool recalc = false;
int64_t total = 0;
double sumSq = 0;
if( !m_frameSortData.limitToView )
{
if( m_frameSortData.frameNum != fsz || m_frameSortData.limitRange.first != -1 )
{
auto& vec = m_frameSortData.data;
vec.reserve( fsz );
const auto midSz = vec.size();
total = m_frameSortData.total;
sumSq = m_frameSortData.sumSq;
for( size_t i=m_frameSortData.frameNum; i<fsz; i++ )
{
const auto t = m_worker.GetFrameTime( *m_frames, i );
if( t > 0 )
{
vec.emplace_back( t );
total += t;
sumSq += double( t ) * t;
}
}
auto mid = vec.begin() + midSz;
pdqsort_branchless( mid, vec.end() );
std::inplace_merge( vec.begin(), mid, vec.end() );
recalc = true;
m_frameSortData.limitRange.first = -1;
}
}
else
{
if( m_frameSortData.limitRange != frameRange )
{
auto& vec = m_frameSortData.data;
assert( vec.empty() );
vec.reserve( frameRange.second - frameRange.first );
for( int i=frameRange.first; i<frameRange.second; i++ )
{
const auto t = m_worker.GetFrameTime( *m_frames, i );
if( t > 0 )
{
vec.emplace_back( t );
total += t;
sumSq += double( t ) * t;
}
}
pdqsort_branchless( vec.begin(), vec.end() );
recalc = true;
m_frameSortData.limitRange = frameRange;
}
}
const auto vsz = m_frameSortData.data.size();
if( vsz == 0 )
{
TextFocused( "Count:", "0" );
ImGui::Separator();
ImGui::PushFont( g_fonts.normal, FontBig );
ImGui::Dummy( ImVec2( 0, ( ImGui::GetContentRegionAvail().y - ImGui::GetTextLineHeight() * 2 ) * 0.5f ) );
TextCentered( ICON_FA_COW );
TextCentered( "No frames to show" );
ImGui::PopFont();
ImGui::End();
return;
}
if( recalc )
{
auto& vec = m_frameSortData.data;
auto Percentile = [&vec, vsz]( double p ) { return vec[std::min<size_t>( vsz - 1, p * vsz )]; };
m_frameSortData.average = float( total ) / vsz;
m_frameSortData.median = Percentile( 0.5 );
m_frameSortData.p75 = Percentile( 0.75 );
m_frameSortData.p90 = Percentile( 0.9 );
m_frameSortData.p99 = Percentile( 0.99 );
m_frameSortData.p99_9 = Percentile( 0.999 );
m_frameSortData.total = total;
m_frameSortData.sumSq = sumSq;
m_frameSortData.frameNum = fsz;
if( vsz > 1 )
{
const auto avg = m_frameSortData.average;
const auto ss = m_frameSortData.sumSq - 2. * total * avg + double( avg ) * avg * vsz;
m_frameSortData.sd = sqrt( ss / ( vsz - 1 ) );
}
else
{
m_frameSortData.sd = 0;
}
}
const auto profileSpan = m_worker.GetLastTime();
TextFocused( "Count:", RealToString( vsz ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%.2f%%)", 100.f * vsz / fsz );
if( ImGui::IsItemHovered() )
{
ImGui::BeginTooltip();
TextFocused( "Total count:", RealToString( fsz ) );
ImGui::EndTooltip();
}
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
TextFocused( "Total time:", TimeToString( m_frameSortData.total ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%.2f%% of profile time span)", m_frameSortData.total / float( profileSpan ) * 100.f );
ImGui::Separator();
const auto ty = ImGui::GetTextLineHeight();
auto& frames = m_frameSortData.data;
auto tmin = frames.front();
auto tmax = frames.back();
if( tmin != std::numeric_limits<int64_t>::max() )
{
TextDisabledUnformatted( "Minimum values in bin:" );
ImGui::SameLine();
ImGui::SetNextItemWidth( ImGui::CalcTextSize( "123456890123456" ).x );
ImGui::PushStyleVar( ImGuiStyleVar_FramePadding, ImVec2( 1, 1 ) );
ImGui::InputInt( "##minBinVal", &m_frameSortData.minBinVal );
if( m_frameSortData.minBinVal < 1 ) m_frameSortData.minBinVal = 1;
ImGui::SameLine();
if( ImGui::Button( "Reset" ) ) m_frameSortData.minBinVal = 1;
ImGui::PopStyleVar();
SmallCheckbox( "Log values", &m_frameSortData.logVal );
ImGui::SameLine();
SmallCheckbox( "Log time", &m_frameSortData.logTime );
TextDisabledUnformatted( "FPS range:" );
ImGui::SameLine();
ImGui::Text( "%s FPS - %s FPS", RealToString( round( 1000000000.0 / tmin ) ), RealToString( round( 1000000000.0 / tmax ) ) );
if( tmax - tmin > 0 )
{
const auto w = ImGui::GetContentRegionAvail().x;
const auto numBins = int64_t( w - 4 );
if( numBins > 1 )
{
if( numBins > m_frameSortData.numBins )
{
m_frameSortData.numBins = numBins;
m_frameSortData.bins = std::make_unique<int64_t[]>( numBins );
}
const auto& bins = m_frameSortData.bins;
memset( bins.get(), 0, sizeof( int64_t ) * numBins );
auto framesBegin = frames.begin();
auto framesEnd = frames.end();
while( framesBegin != framesEnd && *framesBegin == 0 ) ++framesBegin;
if( m_frameSortData.minBinVal > 1 )
{
if( m_frameSortData.logTime )
{
const auto tMinLog = log10( tmin );
const auto zmax = ( log10( tmax ) - tMinLog ) / numBins;
int64_t i;
for( i=0; i<numBins; i++ )
{
const auto nextBinVal = int64_t( pow( 10.0, tMinLog + ( i+1 ) * zmax ) );
auto nit = std::lower_bound( framesBegin, framesEnd, nextBinVal );
const auto distance = std::distance( framesBegin, nit );
if( distance >= m_frameSortData.minBinVal ) break;
framesBegin = nit;
}
for( int64_t j=numBins-1; j>i; j-- )
{
const auto nextBinVal = int64_t( pow( 10.0, tMinLog + ( j-1 ) * zmax ) );
auto nit = std::lower_bound( framesBegin, framesEnd, nextBinVal );
const auto distance = std::distance( nit, framesEnd );
if( distance >= m_frameSortData.minBinVal ) break;
framesEnd = nit;
}
}
else
{
const auto zmax = tmax - tmin;
int64_t i;
for( i=0; i<numBins; i++ )
{
const auto nextBinVal = tmin + ( i+1 ) * zmax / numBins;
auto nit = std::lower_bound( framesBegin, framesEnd, nextBinVal );
const auto distance = std::distance( framesBegin, nit );
if( distance >= m_frameSortData.minBinVal ) break;
framesBegin = nit;
}
for( int64_t j=numBins-1; j>i; j-- )
{
const auto nextBinVal = tmin + ( j-1 ) * zmax / numBins;
auto nit = std::lower_bound( framesBegin, framesEnd, nextBinVal );
const auto distance = std::distance( nit, framesEnd );
if( distance >= m_frameSortData.minBinVal ) break;
framesEnd = nit;
}
}
tmin = *framesBegin;
tmax = *(framesEnd-1);
}
if( m_frameSortData.logTime )
{
const auto tMinLog = log10( tmin );
const auto zmax = ( log10( tmax ) - tMinLog ) / numBins;
auto fit = framesBegin;
for( int64_t i=0; i<numBins; i++ )
{
const auto nextBinVal = int64_t( pow( 10.0, tMinLog + ( i+1 ) * zmax ) );
auto nit = std::lower_bound( fit, framesEnd, nextBinVal );
bins[i] = std::distance( fit, nit );
fit = nit;
}
bins[numBins-1] += std::distance( fit, framesEnd );
}
else
{
const auto zmax = tmax - tmin;
auto fit = framesBegin;
for( int64_t i=0; i<numBins; i++ )
{
const auto nextBinVal = tmin + ( i+1 ) * zmax / numBins;
auto nit = std::lower_bound( fit, framesEnd, nextBinVal );
bins[i] = std::distance( fit, nit );
fit = nit;
}
bins[numBins-1] += std::distance( fit, framesEnd );
}
int maxBin = 0;
int64_t maxVal = bins[0];
for( int i=1; i<numBins; i++ )
{
if( maxVal < bins[i] )
{
maxVal = bins[i];
maxBin = i;
}
}
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
TextFocused( "Max counts:", RealToString( maxVal ) );
TextFocused( "Mean:", TimeToString( m_frameSortData.average ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s FPS)", RealToString( round( 1000000000.0 / m_frameSortData.average ) ) );
if( ImGui::IsItemHovered() )
{
ImGui::BeginTooltip();
ImGui::Text( "%s FPS", RealToString( 1000000000.0 / m_frameSortData.average ) );
ImGui::EndTooltip();
}
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
TextFocused( "Median:", TimeToString( m_frameSortData.median ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s FPS)", RealToString( round( 1000000000.0 / m_frameSortData.median ) ) );
if( ImGui::IsItemHovered() )
{
ImGui::BeginTooltip();
ImGui::Text( "%s FPS", RealToString( 1000000000.0 / m_frameSortData.median ) );
ImGui::EndTooltip();
}
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
{
int64_t t0, t1;
if( m_frameSortData.logTime )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
t0 = int64_t( pow( 10, ltmin + double( maxBin ) / numBins * ( ltmax - ltmin ) ) );
t1 = int64_t( pow( 10, ltmin + double( maxBin+1 ) / numBins * ( ltmax - ltmin ) ) );
}
else
{
t0 = int64_t( tmin + double( maxBin ) / numBins * ( tmax - tmin ) );
t1 = int64_t( tmin + double( maxBin+1 ) / numBins * ( tmax - tmin ) );
}
const auto mode = ( t0 + t1 ) / 2;
TextFocused( "Mode:", TimeToString( mode ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s FPS)", RealToString( round( 1000000000.0 / mode ) ) );
if( ImGui::IsItemHovered() )
{
ImGui::BeginTooltip();
ImGui::Text( "%s FPS", RealToString( 1000000000.0 / mode ) );
ImGui::EndTooltip();
}
}
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
TextFocused( "\xcf\x83:", TimeToString( m_frameSortData.sd ) );
TooltipIfHovered( "Standard deviation" );
ImGui::SameLine();
ImGui::TextDisabled( "(%.2f%%)", 100.f * m_frameSortData.sd / m_frameSortData.average );
TooltipIfHovered( "Coefficient of variation" );
constexpr auto PercentileLine = []( const char* label, int64_t t ) {
TextFocused( label, TimeToString( t ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s FPS)", RealToString( round( 1000000000.0 / t ) ) );
if( ImGui::IsItemHovered() )
{
ImGui::BeginTooltip();
ImGui::Text( "%s FPS", RealToString( 1000000000.0 / t ) );
ImGui::EndTooltip();
}
};
PercentileLine( "P75:", m_frameSortData.p75 );
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
PercentileLine( "P90:", m_frameSortData.p90 );
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
PercentileLine( "P99:", m_frameSortData.p99 );
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
PercentileLine( "P99.9:", m_frameSortData.p99_9 );
ImGui::PushStyleVar( ImGuiStyleVar_FramePadding, ImVec2( 0, 0 ) );
ImGui::Checkbox( "###draw1", &m_frameSortData.drawAvgMed );
ImGui::SameLine();
ImGui::ColorButton( "c1", ImVec4( 0xFF/255.f, 0x44/255.f, 0x44/255.f, 1.f ), ImGuiColorEditFlags_NoTooltip | ImGuiColorEditFlags_NoDragDrop );
ImGui::SameLine();
ImGui::TextUnformatted( "Mean time" );
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
ImGui::ColorButton( "c2", ImVec4( 0x44/255.f, 0x88/255.f, 0xFF/255.f, 1.f ), ImGuiColorEditFlags_NoTooltip | ImGuiColorEditFlags_NoDragDrop );
ImGui::SameLine();
ImGui::TextUnformatted( "Median time" );
ImGui::PopStyleVar();
const auto Height = 200 * scale;
const auto wpos = ImGui::GetCursorScreenPos();
const auto dpos = wpos + ImVec2( 0.5f, 0.5f );
ImGui::InvisibleButton( "##histogram", ImVec2( w, Height + round( ty * 2.5 ) ) );
const bool hover = ImGui::IsItemHovered();
auto draw = ImGui::GetWindowDrawList();
draw->AddRectFilled( wpos, wpos + ImVec2( w, Height ), 0x22FFFFFF );
draw->AddRect( wpos, wpos + ImVec2( w, Height ), 0x88FFFFFF );
if( m_frameSortData.logVal )
{
const auto hAdj = double( Height - 4 ) / log10( maxVal + 1 );
for( int i=0; i<numBins; i++ )
{
const auto val = bins[i];
if( val > 0 )
{
DrawLine( draw, dpos + ImVec2( 2+i, Height-3 ), dpos + ImVec2( 2+i, Height-3 - log10( val + 1 ) * hAdj ), 0xFF22DDDD );
}
}
}
else
{
const auto hAdj = double( Height - 4 ) / maxVal;
for( int i=0; i<numBins; i++ )
{
const auto val = bins[i];
if( val > 0 )
{
DrawLine( draw, dpos + ImVec2( 2+i, Height-3 ), dpos + ImVec2( 2+i, Height-3 - val * hAdj ), 0xFF22DDDD );
}
}
}
const auto xoff = 2;
const auto yoff = Height + 1;
DrawHistogramMinMaxLabel( draw, tmin, tmax, wpos + ImVec2( 0, yoff ), w, ty );
const auto ty05 = round( ty * 0.5f );
const auto ty025 = round( ty * 0.25f );
if( m_frameSortData.logTime )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
const auto start = int( floor( ltmin ) );
const auto end = int( ceil( ltmax ) );
const auto range = ltmax - ltmin;
const auto step = w / range;
auto offset = start - ltmin;
int tw = 0;
int tx = 0;
auto tt = int64_t( pow( 10, start ) );
static const double logticks[] = { log10( 2 ), log10( 3 ), log10( 4 ), log10( 5 ), log10( 6 ), log10( 7 ), log10( 8 ), log10( 9 ) };
for( int i=start; i<=end; i++ )
{
const auto x = ( i - start + offset ) * step;
if( x >= 0 )
{
DrawLine( draw, dpos + ImVec2( x, yoff ), dpos + ImVec2( x, yoff + ty05 ), 0x66FFFFFF );
if( tw == 0 || x > tx + tw + ty * 1.1 )
{
tx = x;
auto txt = TimeToString( tt );
draw->AddText( wpos + ImVec2( x, yoff + ty05 ), 0x66FFFFFF, txt );
tw = ImGui::CalcTextSize( txt ).x;
}
}
for( int j=0; j<8; j++ )
{
const auto xoff = x + logticks[j] * step;
if( xoff >= 0 )
{
DrawLine( draw, dpos + ImVec2( xoff, yoff ), dpos + ImVec2( xoff, yoff + ty025 ), 0x66FFFFFF );
}
}
tt *= 10;
}
}
else
{
const auto pxns = numBins / double( tmax - tmin );
const auto nspx = 1.0 / pxns;
const auto scale = std::max<float>( 0.0f, round( log10( nspx ) + 2 ) );
const auto step = pow( 10, scale );
const auto dx = step * pxns;
double x = 0;
int tw = 0;
int tx = 0;
const auto sstep = step / 10.0;
const auto sdx = dx / 10.0;
static const double linelen[] = { 0.5, 0.25, 0.25, 0.25, 0.25, 0.375, 0.25, 0.25, 0.25, 0.25 };
int64_t tt = int64_t( ceil( tmin / sstep ) * sstep );
const auto diff = tmin / sstep - int64_t( tmin / sstep );
const auto xo = ( diff == 0 ? 0 : ( ( 1 - diff ) * sstep * pxns ) ) + xoff;
int iter = int( ceil( ( tmin - int64_t( tmin / step ) * step ) / sstep ) );
while( x < numBins )
{
DrawLine( draw, dpos + ImVec2( xo + x, yoff ), dpos + ImVec2( xo + x, yoff + round( ty * linelen[iter] ) ), 0x66FFFFFF );
if( iter == 0 && ( tw == 0 || x > tx + tw + ty * 1.1 ) )
{
tx = x;
auto txt = TimeToString( tt );
draw->AddText( wpos + ImVec2( xo + x, yoff + ty05 ), 0x66FFFFFF, txt );
tw = ImGui::CalcTextSize( txt ).x;
}
iter = ( iter + 1 ) % 10;
x += sdx;
tt += sstep;
}
}
if( m_frameSortData.drawAvgMed )
{
float ta, tm;
if( m_frameSortData.logTime )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
ta = ( log10( m_frameSortData.average ) - ltmin ) / float( ltmax - ltmin ) * numBins;
tm = ( log10( m_frameSortData.median ) - ltmin ) / float( ltmax - ltmin ) * numBins;
}
else
{
ta = ( m_frameSortData.average - tmin ) / float( tmax - tmin ) * numBins;
tm = ( m_frameSortData.median - tmin ) / float( tmax - tmin ) * numBins;
}
ta = round( ta );
tm = round( tm );
if( ta == tm )
{
DrawLine( draw, ImVec2( dpos.x + ta, dpos.y ), ImVec2( dpos.x + ta, dpos.y+Height-2 ), 0xFFFF88FF );
}
else
{
DrawLine( draw, ImVec2( dpos.x + ta, dpos.y ), ImVec2( dpos.x + ta, dpos.y+Height-2 ), 0xFF4444FF );
DrawLine( draw, ImVec2( dpos.x + tm, dpos.y ), ImVec2( dpos.x + tm, dpos.y+Height-2 ), 0xFFFF8844 );
}
}
if( hover && ImGui::IsMouseHoveringRect( wpos + ImVec2( 2, 2 ), wpos + ImVec2( w-2, Height + round( ty * 1.5 ) ) ) )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
auto& io = ImGui::GetIO();
DrawLine( draw, ImVec2( io.MousePos.x + 0.5f, dpos.y ), ImVec2( io.MousePos.x + 0.5f, dpos.y+Height-2 ), 0x33FFFFFF );
const auto bin = int64_t( io.MousePos.x - wpos.x - 2 );
int64_t t0, t1;
if( m_frameSortData.logTime )
{
t0 = int64_t( pow( 10, ltmin + double( bin ) / numBins * ( ltmax - ltmin ) ) );
// Hackfix for inability to select data in last bin.
// A proper solution would be nice.
if( bin+1 == numBins )
{
t1 = tmax;
}
else
{
t1 = int64_t( pow( 10, ltmin + double( bin+1 ) / numBins * ( ltmax - ltmin ) ) );
}
}
else
{
t0 = int64_t( tmin + double( bin ) / numBins * ( tmax - tmin ) );
t1 = int64_t( tmin + double( bin+1 ) / numBins * ( tmax - tmin ) );
}
ImGui::BeginTooltip();
TextDisabledUnformatted( "Time range:" );
ImGui::SameLine();
ImGui::Text( "%s - %s", TimeToString( t0 ), TimeToString( t1 ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s FPS - %s FPS)", RealToString( round( 1000000000.0 / t0 ) ), RealToString( round( 1000000000.0 / t1 ) ) );
TextFocused( "Count:", RealToString( bins[bin] ) );
ImGui::EndTooltip();
}
if( m_frameHover != -1 )
{
const auto frameTime = m_worker.GetFrameTime( *m_frames, m_frameHover );
float framePos;
if( m_frameSortData.logTime )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
framePos = round( ( log10( frameTime ) - ltmin ) / float( ltmax - ltmin ) * numBins );
}
else
{
framePos = round( ( frameTime - tmin ) / float( tmax - tmin ) * numBins );
}
const auto c = uint32_t( ( sin( s_time * 10 ) * 0.25 + 0.75 ) * 255 );
const auto color = 0xFF000000 | ( c << 16 ) | ( c << 8 ) | c;
DrawLine( draw, ImVec2( dpos.x + framePos, dpos.y ), ImVec2( dpos.x + framePos, dpos.y+Height-2 ), color );
m_wasActive.store( true, std::memory_order_release );
}
}
}
}
}
ImGui::End();
}
}
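
Note on the new file above: the linear-time histogram walks the sorted frame times once, using `std::lower_bound` to find where each bin ends, and the mode is reported as the midpoint of the fullest bin. A minimal standalone sketch of those two steps, without the log-scale path and the "minimum values in bin" trimming; `BinLinear` and `ModeOfBins` are hypothetical names, not functions from the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical helper: count sorted samples per equal-width bin, following
// the linear branch of the histogram code in the diff.
std::vector<int64_t> BinLinear( const std::vector<int64_t>& sorted, int64_t numBins )
{
    assert( numBins > 1 && !sorted.empty() && sorted.back() > sorted.front() );
    const auto tmin = sorted.front();
    const auto zmax = sorted.back() - tmin;
    std::vector<int64_t> bins( numBins, 0 );
    auto fit = sorted.begin();
    for( int64_t i=0; i<numBins; i++ )
    {
        // Upper edge of bin i; everything below it and not yet counted belongs here.
        const auto nextBinVal = tmin + ( i+1 ) * zmax / numBins;
        auto nit = std::lower_bound( fit, sorted.end(), nextBinVal );
        bins[i] = std::distance( fit, nit );
        fit = nit;
    }
    // Samples equal to the maximum sit on the last edge; add them to the last bin.
    bins[numBins-1] += std::distance( fit, sorted.end() );
    return bins;
}

// Hypothetical helper: midpoint of the first fullest bin, as the "Mode:" line computes it.
int64_t ModeOfBins( const std::vector<int64_t>& bins, int64_t tmin, int64_t tmax )
{
    const auto numBins = int64_t( bins.size() );
    const auto maxBin = std::distance( bins.begin(), std::max_element( bins.begin(), bins.end() ) );
    const auto t0 = int64_t( tmin + double( maxBin ) / numBins * ( tmax - tmin ) );
    const auto t1 = int64_t( tmin + double( maxBin+1 ) / numBins * ( tmax - tmin ) );
    return ( t0 + t1 ) / 2;
}
```

Because the mode is a bin midpoint, its resolution depends on the bin count, which in the diff follows the window width.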

View File

@@ -288,6 +288,7 @@ void View::DrawTimeline()
for( auto& r : m_ranges ) HandleRange( *r.range, timespan, ImGui::GetCursorScreenPos(), w );
for( auto& v : m_annotations )
{
+if( !v->visible ) continue;
v->range.StartFrame();
HandleRange( v->range, timespan, ImGui::GetCursorScreenPos(), w );
}
@@ -296,7 +297,7 @@ void View::DrawTimeline()
const auto ty = ImGui::GetTextLineHeight();
for( auto& ann : m_annotations )
{
-if( ann->range.min >= m_vd.zvEnd || ann->range.max <= m_vd.zvStart ) continue;
+if( !ann->visible || ann->range.min >= m_vd.zvEnd || ann->range.max <= m_vd.zvStart ) continue;
const auto aMin = ( ann->range.min - m_vd.zvStart ) * pxns;
const auto aMax = ( ann->range.max - m_vd.zvStart ) * pxns;
if( ImGui::IsMouseHoveringRect( linepos + ImVec2( aMin, lineh - ty * 1.5f ), linepos + ImVec2( aMax, lineh ) ) )
@@ -411,7 +412,7 @@ void View::DrawTimeline()
const auto iconSize = ImGui::CalcTextSize( ICON_FA_NOTE_STICKY );
for( auto& ann : m_annotations )
{
-if( ann->range.min < m_vd.zvEnd && ann->range.max > m_vd.zvStart )
+if( ann->visible && ann->range.min < m_vd.zvEnd && ann->range.max > m_vd.zvStart )
{
uint32_t c0 = ( ann->color & 0xFFFFFF ) | ( m_selectedAnnotation == ann.get() ? 0x22000000 : 0x11000000 );
uint32_t c1 = ( ann->color & 0xFFFFFF ) | ( m_selectedAnnotation == ann.get() ? 0x88000000 : 0x66000000 );

View File

@@ -10,8 +10,6 @@
namespace tracy
{
-extern double s_time;
void View::DrawInfo()
{
const auto scale = GetScale();
@@ -194,510 +192,6 @@ void View::DrawInfo()
ImGui::TreePop();
}
if( m_worker.AreFramesUsed() && ImGui::TreeNode( "Frame statistics" ) )
{
auto fsz = m_worker.GetFullFrameCount( *m_frames );
if( fsz != 0 )
{
TextFocused( "Frame set:", GetFrameSetName( *m_frames ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s)", m_frames->continuous ? "continuous" : "discontinuous" );
ImGui::SameLine();
ImGui::PushStyleVar( ImGuiStyleVar_FramePadding, ImVec2( 0, 0 ) );
if( ImGui::BeginCombo( "##frameCombo", nullptr, ImGuiComboFlags_NoPreview ) )
{
auto& frames = m_worker.GetFrames();
for( auto& fd : frames )
{
bool isSelected = m_frames == fd;
if( ImGui::Selectable( GetFrameSetName( *fd ), isSelected ) )
{
m_frames = fd;
fsz = m_worker.GetFullFrameCount( *m_frames );
}
if( isSelected )
{
ImGui::SetItemDefaultFocus();
}
ImGui::SameLine();
ImGui::TextDisabled( "(%s)", RealToString( fd->frames.size() ) );
}
ImGui::EndCombo();
}
ImGui::PopStyleVar();
ImGui::SameLine();
SmallCheckbox( "Limit to view", &m_frameSortData.limitToView );
if( m_frameSortData.limitToView )
{
ImGui::SameLine();
TextColoredUnformatted( 0xFF00FFFF, ICON_FA_TRIANGLE_EXCLAMATION );
}
const auto frameRange = m_worker.GetFrameRange( *m_frames, m_vd.zvStart, m_vd.zvEnd );
if( m_frameSortData.frameSet != m_frames || ( m_frameSortData.limitToView && m_frameSortData.limitRange != frameRange ) || ( !m_frameSortData.limitToView && m_frameSortData.limitRange.first != -1 ) )
{
m_frameSortData.frameSet = m_frames;
m_frameSortData.frameNum = 0;
m_frameSortData.data.clear();
m_frameSortData.total = 0;
}
bool recalc = false;
int64_t total = 0;
if( !m_frameSortData.limitToView )
{
if( m_frameSortData.frameNum != fsz || m_frameSortData.limitRange.first != -1 )
{
auto& vec = m_frameSortData.data;
vec.reserve( fsz );
const auto midSz = vec.size();
total = m_frameSortData.total;
for( size_t i=m_frameSortData.frameNum; i<fsz; i++ )
{
const auto t = m_worker.GetFrameTime( *m_frames, i );
if( t > 0 )
{
vec.emplace_back( t );
total += t;
}
}
auto mid = vec.begin() + midSz;
pdqsort_branchless( mid, vec.end() );
std::inplace_merge( vec.begin(), mid, vec.end() );
recalc = true;
m_frameSortData.limitRange.first = -1;
}
}
else
{
if( m_frameSortData.limitRange != frameRange )
{
auto& vec = m_frameSortData.data;
assert( vec.empty() );
vec.reserve( frameRange.second - frameRange.first );
for( int i=frameRange.first; i<frameRange.second; i++ )
{
const auto t = m_worker.GetFrameTime( *m_frames, i );
if( t > 0 )
{
vec.emplace_back( t );
total += t;
}
}
pdqsort_branchless( vec.begin(), vec.end() );
recalc = true;
m_frameSortData.limitRange = frameRange;
}
}
if( recalc )
{
auto& vec = m_frameSortData.data;
const auto vsz = vec.size();
m_frameSortData.average = float( total ) / vsz;
m_frameSortData.median = vec[vsz/2];
m_frameSortData.total = total;
m_frameSortData.frameNum = fsz;
}
const auto profileSpan = m_worker.GetLastTime();
TextFocused( "Count:", RealToString( fsz ) );
TextFocused( "Total time:", TimeToString( m_frameSortData.total ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%.2f%% of profile time span)", m_frameSortData.total / float( profileSpan ) * 100.f );
TextFocused( "Mean frame time:", TimeToString( m_frameSortData.average ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s FPS)", RealToString( round( 1000000000.0 / m_frameSortData.average ) ) );
if( ImGui::IsItemHovered() )
{
ImGui::BeginTooltip();
ImGui::Text( "%s FPS", RealToString( 1000000000.0 / m_frameSortData.average ) );
ImGui::EndTooltip();
}
TextFocused( "Median frame time:", TimeToString( m_frameSortData.median ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s FPS)", RealToString( round( 1000000000.0 / m_frameSortData.median ) ) );
if( ImGui::IsItemHovered() )
{
ImGui::BeginTooltip();
ImGui::Text( "%s FPS", RealToString( 1000000000.0 / m_frameSortData.median ) );
ImGui::EndTooltip();
}
if( ImGui::TreeNodeEx( "Histogram", ImGuiTreeNodeFlags_DefaultOpen ) )
{
const auto ty = ImGui::GetTextLineHeight();
auto& frames = m_frameSortData.data;
auto tmin = frames.front();
auto tmax = frames.back();
if( tmin != std::numeric_limits<int64_t>::max() )
{
TextDisabledUnformatted( "Minimum values in bin:" );
ImGui::SameLine();
ImGui::SetNextItemWidth( ImGui::CalcTextSize( "123456890123456" ).x );
ImGui::PushStyleVar( ImGuiStyleVar_FramePadding, ImVec2( 1, 1 ) );
ImGui::InputInt( "##minBinVal", &m_frameSortData.minBinVal );
if( m_frameSortData.minBinVal < 1 ) m_frameSortData.minBinVal = 1;
ImGui::SameLine();
if( ImGui::Button( "Reset" ) ) m_frameSortData.minBinVal = 1;
ImGui::PopStyleVar();
SmallCheckbox( "Log values", &m_frameSortData.logVal );
ImGui::SameLine();
SmallCheckbox( "Log time", &m_frameSortData.logTime );
TextDisabledUnformatted( "FPS range:" );
ImGui::SameLine();
ImGui::Text( "%s FPS - %s FPS", RealToString( round( 1000000000.0 / tmin ) ), RealToString( round( 1000000000.0 / tmax ) ) );
if( tmax - tmin > 0 )
{
const auto w = ImGui::GetContentRegionAvail().x;
const auto numBins = int64_t( w - 4 );
if( numBins > 1 )
{
if( numBins > m_frameSortData.numBins )
{
m_frameSortData.numBins = numBins;
m_frameSortData.bins = std::make_unique<int64_t[]>( numBins );
}
const auto& bins = m_frameSortData.bins;
memset( bins.get(), 0, sizeof( int64_t ) * numBins );
auto framesBegin = frames.begin();
auto framesEnd = frames.end();
while( framesBegin != framesEnd && *framesBegin == 0 ) ++framesBegin;
if( m_frameSortData.minBinVal > 1 )
{
if( m_frameSortData.logTime )
{
const auto tMinLog = log10( tmin );
const auto zmax = ( log10( tmax ) - tMinLog ) / numBins;
int64_t i;
for( i=0; i<numBins; i++ )
{
const auto nextBinVal = int64_t( pow( 10.0, tMinLog + ( i+1 ) * zmax ) );
auto nit = std::lower_bound( framesBegin, framesEnd, nextBinVal );
const auto distance = std::distance( framesBegin, nit );
if( distance >= m_frameSortData.minBinVal ) break;
framesBegin = nit;
}
for( int64_t j=numBins-1; j>i; j-- )
{
const auto nextBinVal = int64_t( pow( 10.0, tMinLog + ( j-1 ) * zmax ) );
auto nit = std::lower_bound( framesBegin, framesEnd, nextBinVal );
const auto distance = std::distance( nit, framesEnd );
if( distance >= m_frameSortData.minBinVal ) break;
framesEnd = nit;
}
}
else
{
const auto zmax = tmax - tmin;
int64_t i;
for( i=0; i<numBins; i++ )
{
const auto nextBinVal = tmin + ( i+1 ) * zmax / numBins;
auto nit = std::lower_bound( framesBegin, framesEnd, nextBinVal );
const auto distance = std::distance( framesBegin, nit );
if( distance >= m_frameSortData.minBinVal ) break;
framesBegin = nit;
}
for( int64_t j=numBins-1; j>i; j-- )
{
const auto nextBinVal = tmin + ( j-1 ) * zmax / numBins;
auto nit = std::lower_bound( framesBegin, framesEnd, nextBinVal );
const auto distance = std::distance( nit, framesEnd );
if( distance >= m_frameSortData.minBinVal ) break;
framesEnd = nit;
}
}
tmin = *framesBegin;
tmax = *(framesEnd-1);
}
if( m_frameSortData.logTime )
{
const auto tMinLog = log10( tmin );
const auto zmax = ( log10( tmax ) - tMinLog ) / numBins;
auto fit = framesBegin;
for( int64_t i=0; i<numBins; i++ )
{
const auto nextBinVal = int64_t( pow( 10.0, tMinLog + ( i+1 ) * zmax ) );
auto nit = std::lower_bound( fit, framesEnd, nextBinVal );
bins[i] = std::distance( fit, nit );
fit = nit;
}
bins[numBins-1] += std::distance( fit, framesEnd );
}
else
{
const auto zmax = tmax - tmin;
auto fit = framesBegin;
for( int64_t i=0; i<numBins; i++ )
{
const auto nextBinVal = tmin + ( i+1 ) * zmax / numBins;
auto nit = std::lower_bound( fit, framesEnd, nextBinVal );
bins[i] = std::distance( fit, nit );
fit = nit;
}
bins[numBins-1] += std::distance( fit, framesEnd );
}
int64_t maxVal = bins[0];
for( int i=1; i<numBins; i++ )
{
maxVal = std::max( maxVal, bins[i] );
}
TextFocused( "Max counts:", RealToString( maxVal ) );
ImGui::PushStyleVar( ImGuiStyleVar_FramePadding, ImVec2( 0, 0 ) );
ImGui::Checkbox( "###draw1", &m_frameSortData.drawAvgMed );
ImGui::SameLine();
ImGui::ColorButton( "c1", ImVec4( 0xFF/255.f, 0x44/255.f, 0x44/255.f, 1.f ), ImGuiColorEditFlags_NoTooltip | ImGuiColorEditFlags_NoDragDrop );
ImGui::SameLine();
ImGui::TextUnformatted( "Mean time" );
ImGui::SameLine();
ImGui::Spacing();
ImGui::SameLine();
ImGui::ColorButton( "c2", ImVec4( 0x44/255.f, 0x88/255.f, 0xFF/255.f, 1.f ), ImGuiColorEditFlags_NoTooltip | ImGuiColorEditFlags_NoDragDrop );
ImGui::SameLine();
ImGui::TextUnformatted( "Median time" );
ImGui::PopStyleVar();
const auto Height = 200 * scale;
const auto wpos = ImGui::GetCursorScreenPos();
const auto dpos = wpos + ImVec2( 0.5f, 0.5f );
ImGui::InvisibleButton( "##histogram", ImVec2( w, Height + round( ty * 2.5 ) ) );
const bool hover = ImGui::IsItemHovered();
auto draw = ImGui::GetWindowDrawList();
draw->AddRectFilled( wpos, wpos + ImVec2( w, Height ), 0x22FFFFFF );
draw->AddRect( wpos, wpos + ImVec2( w, Height ), 0x88FFFFFF );
if( m_frameSortData.logVal )
{
const auto hAdj = double( Height - 4 ) / log10( maxVal + 1 );
for( int i=0; i<numBins; i++ )
{
const auto val = bins[i];
if( val > 0 )
{
DrawLine( draw, dpos + ImVec2( 2+i, Height-3 ), dpos + ImVec2( 2+i, Height-3 - log10( val + 1 ) * hAdj ), 0xFF22DDDD );
}
}
}
else
{
const auto hAdj = double( Height - 4 ) / maxVal;
for( int i=0; i<numBins; i++ )
{
const auto val = bins[i];
if( val > 0 )
{
DrawLine( draw, dpos + ImVec2( 2+i, Height-3 ), dpos + ImVec2( 2+i, Height-3 - val * hAdj ), 0xFF22DDDD );
}
}
}
const auto xoff = 2;
const auto yoff = Height + 1;
DrawHistogramMinMaxLabel( draw, tmin, tmax, wpos + ImVec2( 0, yoff ), w, ty );
const auto ty05 = round( ty * 0.5f );
const auto ty025 = round( ty * 0.25f );
if( m_frameSortData.logTime )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
const auto start = int( floor( ltmin ) );
const auto end = int( ceil( ltmax ) );
const auto range = ltmax - ltmin;
const auto step = w / range;
auto offset = start - ltmin;
int tw = 0;
int tx = 0;
auto tt = int64_t( pow( 10, start ) );
static const double logticks[] = { log10( 2 ), log10( 3 ), log10( 4 ), log10( 5 ), log10( 6 ), log10( 7 ), log10( 8 ), log10( 9 ) };
for( int i=start; i<=end; i++ )
{
const auto x = ( i - start + offset ) * step;
if( x >= 0 )
{
DrawLine( draw, dpos + ImVec2( x, yoff ), dpos + ImVec2( x, yoff + ty05 ), 0x66FFFFFF );
if( tw == 0 || x > tx + tw + ty * 1.1 )
{
tx = x;
auto txt = TimeToString( tt );
draw->AddText( wpos + ImVec2( x, yoff + ty05 ), 0x66FFFFFF, txt );
tw = ImGui::CalcTextSize( txt ).x;
}
}
for( int j=0; j<8; j++ )
{
const auto xoff = x + logticks[j] * step;
if( xoff >= 0 )
{
DrawLine( draw, dpos + ImVec2( xoff, yoff ), dpos + ImVec2( xoff, yoff + ty025 ), 0x66FFFFFF );
}
}
tt *= 10;
}
}
else
{
const auto pxns = numBins / double( tmax - tmin );
const auto nspx = 1.0 / pxns;
const auto scale = std::max<float>( 0.0f, round( log10( nspx ) + 2 ) );
const auto step = pow( 10, scale );
const auto dx = step * pxns;
double x = 0;
int tw = 0;
int tx = 0;
const auto sstep = step / 10.0;
const auto sdx = dx / 10.0;
static const double linelen[] = { 0.5, 0.25, 0.25, 0.25, 0.25, 0.375, 0.25, 0.25, 0.25, 0.25 };
int64_t tt = int64_t( ceil( tmin / sstep ) * sstep );
const auto diff = tmin / sstep - int64_t( tmin / sstep );
const auto xo = ( diff == 0 ? 0 : ( ( 1 - diff ) * sstep * pxns ) ) + xoff;
int iter = int( ceil( ( tmin - int64_t( tmin / step ) * step ) / sstep ) );
while( x < numBins )
{
DrawLine( draw, dpos + ImVec2( xo + x, yoff ), dpos + ImVec2( xo + x, yoff + round( ty * linelen[iter] ) ), 0x66FFFFFF );
if( iter == 0 && ( tw == 0 || x > tx + tw + ty * 1.1 ) )
{
tx = x;
auto txt = TimeToString( tt );
draw->AddText( wpos + ImVec2( xo + x, yoff + ty05 ), 0x66FFFFFF, txt );
tw = ImGui::CalcTextSize( txt ).x;
}
iter = ( iter + 1 ) % 10;
x += sdx;
tt += sstep;
}
}
if( m_frameSortData.drawAvgMed )
{
float ta, tm;
if( m_frameSortData.logTime )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
ta = ( log10( m_frameSortData.average ) - ltmin ) / float( ltmax - ltmin ) * numBins;
tm = ( log10( m_frameSortData.median ) - ltmin ) / float( ltmax - ltmin ) * numBins;
}
else
{
ta = ( m_frameSortData.average - tmin ) / float( tmax - tmin ) * numBins;
tm = ( m_frameSortData.median - tmin ) / float( tmax - tmin ) * numBins;
}
ta = round( ta );
tm = round( tm );
if( ta == tm )
{
DrawLine( draw, ImVec2( dpos.x + ta, dpos.y ), ImVec2( dpos.x + ta, dpos.y+Height-2 ), 0xFFFF88FF );
}
else
{
DrawLine( draw, ImVec2( dpos.x + ta, dpos.y ), ImVec2( dpos.x + ta, dpos.y+Height-2 ), 0xFF4444FF );
DrawLine( draw, ImVec2( dpos.x + tm, dpos.y ), ImVec2( dpos.x + tm, dpos.y+Height-2 ), 0xFFFF8844 );
}
}
if( hover && ImGui::IsMouseHoveringRect( wpos + ImVec2( 2, 2 ), wpos + ImVec2( w-2, Height + round( ty * 1.5 ) ) ) )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
auto& io = ImGui::GetIO();
DrawLine( draw, ImVec2( io.MousePos.x + 0.5f, dpos.y ), ImVec2( io.MousePos.x + 0.5f, dpos.y+Height-2 ), 0x33FFFFFF );
const auto bin = int64_t( io.MousePos.x - wpos.x - 2 );
int64_t t0, t1;
if( m_frameSortData.logTime )
{
t0 = int64_t( pow( 10, ltmin + double( bin ) / numBins * ( ltmax - ltmin ) ) );
// Hackfix for inability to select data in last bin.
// A proper solution would be nice.
if( bin+1 == numBins )
{
t1 = tmax;
}
else
{
t1 = int64_t( pow( 10, ltmin + double( bin+1 ) / numBins * ( ltmax - ltmin ) ) );
}
}
else
{
t0 = int64_t( tmin + double( bin ) / numBins * ( tmax - tmin ) );
t1 = int64_t( tmin + double( bin+1 ) / numBins * ( tmax - tmin ) );
}
ImGui::BeginTooltip();
TextDisabledUnformatted( "Time range:" );
ImGui::SameLine();
ImGui::Text( "%s - %s", TimeToString( t0 ), TimeToString( t1 ) );
ImGui::SameLine();
ImGui::TextDisabled( "(%s FPS - %s FPS)", RealToString( round( 1000000000.0 / t0 ) ), RealToString( round( 1000000000.0 / t1 ) ) );
TextFocused( "Count:", RealToString( bins[bin] ) );
ImGui::EndTooltip();
}
if( m_frameHover != -1 )
{
const auto frameTime = m_worker.GetFrameTime( *m_frames, m_frameHover );
float framePos;
if( m_frameSortData.logTime )
{
const auto ltmin = log10( tmin );
const auto ltmax = log10( tmax );
framePos = round( ( log10( frameTime ) - ltmin ) / float( ltmax - ltmin ) * numBins );
}
else
{
framePos = round( ( frameTime - tmin ) / float( tmax - tmin ) * numBins );
}
const auto c = uint32_t( ( sin( s_time * 10 ) * 0.25 + 0.75 ) * 255 );
const auto color = 0xFF000000 | ( c << 16 ) | ( c << 8 ) | c;
DrawLine( draw, ImVec2( dpos.x + framePos, dpos.y ), ImVec2( dpos.x + framePos, dpos.y+Height-2 ), color );
m_wasActive.store( true, std::memory_order_release );
}
}
}
}
ImGui::TreePop();
}
}
ImGui::TreePop();
}
auto& topology = m_worker.GetCpuTopology();
if( !topology.empty() )
{

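Note on the hover tooltip in the hunk above: it converts the histogram bin under the cursor back into a time range, either linearly or on a log10 scale, and clamps the last log-scale bin to `tmax`. The following is a minimal standalone sketch of that mapping, not the code from the repository. `BinTimeRange` is a hypothetical helper name, and the preconditions in the `assert` are assumptions added for the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>

// Hypothetical helper (not part of the diff): returns the time range, in
// nanoseconds, covered by histogram bin `bin` out of `numBins` bins spanning
// [tmin, tmax].
inline std::pair<int64_t, int64_t> BinTimeRange( int64_t bin, int64_t numBins, int64_t tmin, int64_t tmax, bool logTime )
{
    // Assumed preconditions: a non-empty histogram and a positive, non-empty
    // time span (log10 is undefined for tmin <= 0).
    assert( numBins > 0 && tmin > 0 && tmin < tmax );
    if( logTime )
    {
        const double ltmin = std::log10( double( tmin ) );
        const double ltmax = std::log10( double( tmax ) );
        const auto t0 = int64_t( std::pow( 10.0, ltmin + double( bin ) / numBins * ( ltmax - ltmin ) ) );
        // Clamp the last bin to tmax so that truncation after pow() cannot
        // leave the final sample outside of the selectable range.
        const auto t1 = bin + 1 == numBins
            ? tmax
            : int64_t( std::pow( 10.0, ltmin + double( bin + 1 ) / numBins * ( ltmax - ltmin ) ) );
        return { t0, t1 };
    }
    // Linear scale: bins have equal width in time.
    const auto t0 = int64_t( tmin + double( bin ) / numBins * ( tmax - tmin ) );
    const auto t1 = int64_t( tmin + double( bin + 1 ) / numBins * ( tmax - tmin ) );
    return { t0, t1 };
}
```

For example, `BinTimeRange( 1, 4, 1000, 2000, false )` covers 1250 ns to 1500 ns. On the log scale, the last bin always ends exactly at `tmax`.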
View File

@@ -10,10 +10,6 @@
#define TracyCUDAStartProfiling(ctx)
#define TracyCUDAStopProfiling(ctx)
#define TracyCUDAEnableKernelMetrics(ctx)
#define TracyCUDADisableKernelMetrics(ctx)
#define TracyCUDADumpKernelMetrics(ctx)
#define TracyCUDACollect(ctx)
namespace tracy{
@@ -61,24 +57,6 @@ using CUDACtx = std::nullptr_t;
#define TRACY_CUDA_ENABLE_CUDA_CALL_STATS (0)
#endif//TRACY_CUDA_ENABLE_CUDA_CALL_STATS
// Opt-in CUPTI Range Profiler support: collects hardware metrics for kernel
// dispatches / graph launches and surfaces them as Tracy plots. OFF by default
// because it pulls in an extra dependency (the NVPW host library) and may
// require elevated GPU performance-counter permissions at runtime.
#ifndef TRACY_CUDA_ENABLE_KERNEL_METRICS
#define TRACY_CUDA_ENABLE_KERNEL_METRICS (0)
#endif//TRACY_CUDA_ENABLE_KERNEL_METRICS
#if TRACY_CUDA_ENABLE_KERNEL_METRICS
#if CUDA_VERSION < 12060
#error "kernel metrics require CUDA v12.6 (or later)"
#endif
#include <cupti_target.h>
#include <cupti_profiler_target.h>
#include <cupti_profiler_host.h>
#include <cupti_range_profiler.h>
#endif//TRACY_CUDA_ENABLE_KERNEL_METRICS
namespace {
// TODO(marcos): wrap these in structs for better type safety
@@ -590,22 +568,6 @@ namespace tracy
printStats();
}
void EnableKernelMetrics()
{
ZoneScoped;
#if TRACY_CUDA_ENABLE_KERNEL_METRICS
CUPTI::BeginKernelMetrics(this);
#endif//TRACY_CUDA_ENABLE_KERNEL_METRICS
}
void DisableKernelMetrics()
{
ZoneScoped;
#if TRACY_CUDA_ENABLE_KERNEL_METRICS
CUPTI::EndKernelMetrics();
#endif//TRACY_CUDA_ENABLE_KERNEL_METRICS
}
void Name(const char *name, uint16_t len)
{
auto ptr = (char*)tracyMalloc(len);
@@ -1301,262 +1263,6 @@ namespace tracy
//CUPTI_ACTIVITY_KIND_DRIVER,
};
#if TRACY_CUDA_ENABLE_KERNEL_METRICS
// Curated set of PerfWorks metrics collected per range (dispatch / graph
// launch). Kept deliberately small so the whole set fits in a SINGLE
// collection pass (no kernel replay): all are high-level "% of peak"
// rollups, so each maps to only a few raw counters. BeginKernelMetrics
// asserts the resulting config needs exactly one pass; trim this list if
// a given architecture's counter budget is exceeded. Note that some
// metric names can vary across GPU architectures.
static constexpr const char* kKernelMetricNames[] = {
// Compute efficiency
"sm__throughput.avg.pct_of_peak_sustained_elapsed", // overall SM/compute throughput
"smsp__inst_executed.avg.per_cycle_active", // instructions issued per active cycle (IPC)
// Memory utilization
"gpu__compute_memory_throughput.avg.pct_of_peak_sustained_elapsed", // overall memory-subsystem throughput
"dram__throughput.avg.pct_of_peak_sustained_elapsed", // device-memory (DRAM) bandwidth
"lts__t_sector_hit_rate.pct", // L2 cache hit rate
// Occupancy / latency hiding
"sm__warps_active.avg.pct_of_peak_sustained_active", // achieved occupancy
// Stalls (issue-slot utilization: low value => warps stalled)
"smsp__issue_active.avg.pct_of_peak_sustained_active",
};
static constexpr size_t kKernelMetricCount = sizeof(kKernelMetricNames) / sizeof(kKernelMetricNames[0]);
// Upper bound on ranges (dispatches / graph launches) buffered in one
// counter-data image before it must be decoded and reset. Tunable; sizes
// the counter-data allocation and caps ranges-per-pass at SetConfig time.
static constexpr size_t kMaxRangesPerPass = 64;
// Range Profiler bring-up: one-time process-wide profiler init plus
// creation of a Range Profiler object bound to the current CUDA context.
// Metric configuration (chip name + curated single-pass metric list ->
// config image), cuptiRangeProfilerSetConfig() in user-range mode, and
// cuptiRangeProfilerStart() are wired up in a later step.
static void BeginKernelMetrics(CUDACtx* profilerHost) {
ZoneScoped;
UNREFERENCED(profilerHost);
auto& rangeProfiler = PersistentState::Get().rangeProfiler;
if (rangeProfiler != nullptr) {
return; // already initialized
}
// Process-wide CUPTI profiler initialization (idempotent).
CUpti_Profiler_Initialize_Params initParams = { CUpti_Profiler_Initialize_Params_STRUCT_SIZE };
CUPTI_API_CALL(cuptiProfilerInitialize(&initParams));
// Bind a Range Profiler object to the calling thread's current
// context (a null ctx tells CUPTI to use the current context).
CUcontext cuCtx = nullptr;
DRIVER_API_CALL(cuCtxGetCurrent(&cuCtx));
CUpti_RangeProfiler_Enable_Params createParams = { CUpti_RangeProfiler_Enable_Params_STRUCT_SIZE };
createParams.ctx = cuCtx;
CUPTI_API_CALL(cuptiRangeProfilerEnable(&createParams));
rangeProfiler = createParams.pRangeProfilerObject;
// Query the device chip name: the host profiler is chip-specific and
// needs it to resolve metric names into hardware counters.
CUdevice cuDevice = 0;
DRIVER_API_CALL(cuCtxGetDevice(&cuDevice));
CUpti_Device_GetChipName_Params chipNameParams = { CUpti_Device_GetChipName_Params_STRUCT_SIZE };
chipNameParams.deviceIndex = (size_t)cuDevice;
CUPTI_API_CALL(cuptiDeviceGetChipName(&chipNameParams));
auto& chipName = PersistentState::Get().chipName;
chipName = chipNameParams.pChipName;
// Snapshot which counters are actually available on this context so
// the host can reject unsupported metrics up front. The first call
// reports the image size; the second fills the allocated buffer.
auto& counterAvailabilityImage = PersistentState::Get().counterAvailabilityImage;
CUpti_Profiler_GetCounterAvailability_Params availabilityParams = { CUpti_Profiler_GetCounterAvailability_Params_STRUCT_SIZE };
availabilityParams.ctx = cuCtx;
CUPTI_API_CALL(cuptiProfilerGetCounterAvailability(&availabilityParams));
counterAvailabilityImage.resize(availabilityParams.counterAvailabilityImageSize);
availabilityParams.pCounterAvailabilityImage = counterAvailabilityImage.data();
CUPTI_API_CALL(cuptiProfilerGetCounterAvailability(&availabilityParams));
// Create the host profiler object: it builds metric config images
// and later evaluates counter data into metric values.
CUpti_Profiler_Host_Initialize_Params hostInitParams = { CUpti_Profiler_Host_Initialize_Params_STRUCT_SIZE };
hostInitParams.profilerType = CUPTI_PROFILER_TYPE_RANGE_PROFILER;
hostInitParams.pChipName = chipName.c_str();
hostInitParams.pCounterAvailabilityImage = counterAvailabilityImage.data();
CUPTI_API_CALL(cuptiProfilerHostInitialize(&hostInitParams));
auto* hostObject = hostInitParams.pHostObject;
PersistentState::Get().profilerHostObject = hostObject;
// Add the curated metric list to the host config and bake a config
// image describing the hardware counters to collect.
CUpti_Profiler_Host_ConfigAddMetrics_Params addMetricsParams = { CUpti_Profiler_Host_ConfigAddMetrics_Params_STRUCT_SIZE };
addMetricsParams.pHostObject = hostObject;
addMetricsParams.ppMetricNames = const_cast<const char**>(kKernelMetricNames);
addMetricsParams.numMetrics = kKernelMetricCount;
CUPTI_API_CALL(cuptiProfilerHostConfigAddMetrics(&addMetricsParams));
auto& configImage = PersistentState::Get().configImage;
CUpti_Profiler_Host_GetConfigImageSize_Params configSizeParams = { CUpti_Profiler_Host_GetConfigImageSize_Params_STRUCT_SIZE };
configSizeParams.pHostObject = hostObject;
CUPTI_API_CALL(cuptiProfilerHostGetConfigImageSize(&configSizeParams));
configImage.resize(configSizeParams.configImageSize);
CUpti_Profiler_Host_GetConfigImage_Params configImageParams = { CUpti_Profiler_Host_GetConfigImage_Params_STRUCT_SIZE };
configImageParams.pHostObject = hostObject;
configImageParams.pConfigImage = configImage.data();
configImageParams.configImageSize = configImage.size();
CUPTI_API_CALL(cuptiProfilerHostGetConfigImage(&configImageParams));
// Enforce the single-pass (no-replay) design constraint up front: if
// the selected metrics don't fit one pass, trim kKernelMetricNames.
CUpti_Profiler_Host_GetNumOfPasses_Params numPassesParams = { CUpti_Profiler_Host_GetNumOfPasses_Params_STRUCT_SIZE };
numPassesParams.pConfigImage = configImage.data();
numPassesParams.configImageSize = configImage.size();
CUPTI_API_CALL(cuptiProfilerHostGetNumOfPasses(&numPassesParams));
if (numPassesParams.numOfPasses != 1) {
fprintf(stderr, "ERROR:\tTracyCUDA kernel metrics need a single-pass config, "
"but the selected metrics require %llu passes; trim kKernelMetricNames.\n",
(unsigned long long)numPassesParams.numOfPasses);
assert(numPassesParams.numOfPasses == 1);
}
// Allocate and initialize the counter-data image the Range Profiler
// fills during collection (sized for the metric set and range batch).
auto& counterDataImage = PersistentState::Get().counterDataImage;
CUpti_RangeProfiler_GetCounterDataSize_Params counterDataSizeParams = { CUpti_RangeProfiler_GetCounterDataSize_Params_STRUCT_SIZE };
counterDataSizeParams.pRangeProfilerObject = rangeProfiler;
counterDataSizeParams.pMetricNames = const_cast<const char**>(kKernelMetricNames);
counterDataSizeParams.numMetrics = kKernelMetricCount;
counterDataSizeParams.maxNumOfRanges = kMaxRangesPerPass;
counterDataSizeParams.maxNumRangeTreeNodes = kMaxRangesPerPass;
CUPTI_API_CALL(cuptiRangeProfilerGetCounterDataSize(&counterDataSizeParams));
counterDataImage.resize(counterDataSizeParams.counterDataSize);
CUpti_RangeProfiler_CounterDataImage_Initialize_Params counterDataInitParams = { CUpti_RangeProfiler_CounterDataImage_Initialize_Params_STRUCT_SIZE };
counterDataInitParams.pRangeProfilerObject = rangeProfiler;
counterDataInitParams.counterDataSize = counterDataImage.size();
counterDataInitParams.pCounterData = counterDataImage.data();
CUPTI_API_CALL(cuptiRangeProfilerCounterDataImageInitialize(&counterDataInitParams));
// Bind config + counter-data to the Range Profiler in user-range mode.
// A single pass means the user-replay loop runs exactly once, so there
// is no kernel re-execution and no forced serialization of overlap.
CUpti_RangeProfiler_SetConfig_Params setConfigParams = { CUpti_RangeProfiler_SetConfig_Params_STRUCT_SIZE };
setConfigParams.pRangeProfilerObject = rangeProfiler;
setConfigParams.pConfig = configImage.data();
setConfigParams.configSize = configImage.size();
setConfigParams.pCounterDataImage = counterDataImage.data();
setConfigParams.counterDataImageSize = counterDataImage.size();
setConfigParams.range = CUPTI_UserRange;
setConfigParams.replayMode = CUPTI_UserReplay;
setConfigParams.maxRangesPerPass = kMaxRangesPerPass;
setConfigParams.numNestingLevels = 1;
setConfigParams.minNestingLevel = 1;
CUPTI_API_CALL(cuptiRangeProfilerSetConfig(&setConfigParams));
}
static void EndKernelMetrics() {
ZoneScoped;
auto& rangeProfiler = PersistentState::Get().rangeProfiler;
if (rangeProfiler == nullptr) {
return;
}
// TODO(metrics): cuptiRangeProfilerStop() once Start() is wired up.
CUpti_RangeProfiler_Disable_Params destroyParams = { CUpti_RangeProfiler_Disable_Params_STRUCT_SIZE };
destroyParams.pRangeProfilerObject = rangeProfiler;
CUPTI_API_CALL(cuptiRangeProfilerDisable(&destroyParams));
rangeProfiler = nullptr;
auto& profilerHostObject = PersistentState::Get().profilerHostObject;
if (profilerHostObject != nullptr) {
CUpti_Profiler_Host_Deinitialize_Params hostDeinitParams = { CUpti_Profiler_Host_Deinitialize_Params_STRUCT_SIZE };
hostDeinitParams.pHostObject = profilerHostObject;
CUPTI_API_CALL(cuptiProfilerHostDeinitialize(&hostDeinitParams));
profilerHostObject = nullptr;
}
}
// Debug utility: print host/chip information and the metrics available
// for collection on the current device. Self-contained — spins up a
// temporary host object and tears it down, so it is safe to call without
// an active profiling session. Output goes to stdout and can be verbose.
static void DumpAvailableMetrics() {
ZoneScoped;
CUcontext cuCtx = nullptr;
DRIVER_API_CALL(cuCtxGetCurrent(&cuCtx));
if (cuCtx == nullptr) {
fprintf(stderr, "TracyCUDA: no current CUDA context; cannot list metrics.\n");
return;
}
// Device identity (human-readable), independent of the profiler.
CUdevice cuDevice = 0;
DRIVER_API_CALL(cuCtxGetDevice(&cuDevice));
char deviceName[256] = {};
DRIVER_API_CALL(cuDeviceGetName(deviceName, sizeof(deviceName), cuDevice));
int ccMajor = 0, ccMinor = 0;
DRIVER_API_CALL(cuDeviceGetAttribute(&ccMajor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, cuDevice));
DRIVER_API_CALL(cuDeviceGetAttribute(&ccMinor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, cuDevice));
// Profiler-side chip name (what the host object keys metrics on).
CUpti_Profiler_Initialize_Params initParams = { CUpti_Profiler_Initialize_Params_STRUCT_SIZE };
CUPTI_API_CALL(cuptiProfilerInitialize(&initParams));
CUpti_Device_GetChipName_Params chipNameParams = { CUpti_Device_GetChipName_Params_STRUCT_SIZE };
chipNameParams.deviceIndex = (size_t)cuDevice;
CUPTI_API_CALL(cuptiDeviceGetChipName(&chipNameParams));
fprintf(stdout, "\nTracyCUDA kernel-metrics host info:\n");
fprintf(stdout, " Device : %s (compute capability %d.%d)\n", deviceName, ccMajor, ccMinor);
fprintf(stdout, " Chip : %s\n", chipNameParams.pChipName);
// Listing metrics is a pure host-side query keyed on the chip name,
// so it must NOT touch the device: we deliberately skip
// cuptiProfilerGetCounterAvailability() (which reserves the perf
// counters and is admin-gated -> INSUFFICIENT_PRIVILEGES / HARDWARE_BUSY)
// and pass a null counter-availability image. Per the CUPTI docs the
// image is only required for chips newer than the CUPTI build; for a
// chip known at release time the name alone enumerates every metric.
// The listing therefore reflects everything the chip defines rather
// than what this context can currently collect — which is exactly
// what a static "available metrics" dump should report.
CUpti_Profiler_Host_Initialize_Params hostInitParams = { CUpti_Profiler_Host_Initialize_Params_STRUCT_SIZE };
hostInitParams.profilerType = CUPTI_PROFILER_TYPE_RANGE_PROFILER;
hostInitParams.pChipName = chipNameParams.pChipName;
hostInitParams.pCounterAvailabilityImage = nullptr;
CUPTI_API_CALL(cuptiProfilerHostInitialize(&hostInitParams));
auto* hostObject = hostInitParams.pHostObject;
// Enumerate base metrics per type, and each base metric's submetrics
// (the fully-qualified, collectible names like "<base>.avg.pct_...").
const CUpti_MetricType metricTypes[] = {
CUPTI_METRIC_TYPE_COUNTER,
CUPTI_METRIC_TYPE_RATIO,
CUPTI_METRIC_TYPE_THROUGHPUT,
};
const char* const metricTypeNames[] = { "counter", "ratio", "throughput" };
for (size_t t = 0; t < sizeof(metricTypes) / sizeof(metricTypes[0]); ++t) {
CUpti_Profiler_Host_GetBaseMetrics_Params baseParams = { CUpti_Profiler_Host_GetBaseMetrics_Params_STRUCT_SIZE };
baseParams.pHostObject = hostObject;
baseParams.metricType = metricTypes[t];
CUPTI_API_CALL(cuptiProfilerHostGetBaseMetrics(&baseParams));
fprintf(stdout, "\n %zu %s metrics:\n", (size_t)baseParams.numMetrics, metricTypeNames[t]);
for (size_t i = 0; i < baseParams.numMetrics; ++i) {
const char* baseName = baseParams.ppMetricNames[i];
fprintf(stdout, " %s\n", baseName);
CUpti_Profiler_Host_GetSubMetrics_Params subParams = { CUpti_Profiler_Host_GetSubMetrics_Params_STRUCT_SIZE };
subParams.pHostObject = hostObject;
subParams.metricType = metricTypes[t];
subParams.pMetricName = baseName;
CUPTI_API_CALL(cuptiProfilerHostGetSubMetrics(&subParams));
for (size_t s = 0; s < subParams.numOfSubmetrics; ++s) {
fprintf(stdout, " .%s\n", subParams.ppSubMetrics[s]);
}
}
}
CUpti_Profiler_Host_Deinitialize_Params hostDeinitParams = { CUpti_Profiler_Host_Deinitialize_Params_STRUCT_SIZE };
hostDeinitParams.pHostObject = hostObject;
CUPTI_API_CALL(cuptiProfilerHostDeinitialize(&hostDeinitParams));
}
#endif//TRACY_CUDA_ENABLE_KERNEL_METRICS
static void BeginInstrumentation(CUDACtx* profilerHost) {
auto& currentProfilerHost = PersistentState::Get().profilerHost;
if (currentProfilerHost != nullptr) {
@@ -1568,8 +1274,6 @@ namespace tracy
// CUDA API calls and device activities that happen past this point
cudaDeviceSynchronize();
DumpAvailableMetrics();
auto& subscriber = PersistentState::Get().subscriber;
CUPTI_API_CALL(cuptiSubscribe(&subscriber, CUPTI::OnCallbackAPI, profilerHost));
CUPTI_API_CALL(cuptiActivityRegisterCallbacks(CUPTI::OnBufferRequested, CUPTI::OnBufferCompleted));
@@ -1695,25 +1399,6 @@ namespace tracy
CUpti_SubscriberHandle subscriber = {};
CUDACtx* profilerHost = nullptr;
#if TRACY_CUDA_ENABLE_KERNEL_METRICS
// CUPTI Range Profiler object, created in BeginKernelMetrics() and
// destroyed in EndKernelMetrics(). nullptr when metrics are inactive.
CUpti_RangeProfiler_Object* rangeProfiler = nullptr;
// Host-side profiler object: builds metric config images from a
// metric list and decodes counter data into metric values. It is
// chip-specific, hence the cached chip name and counter availability
// snapshot it is initialized from.
CUpti_Profiler_Host_Object* profilerHostObject = nullptr;
std::string chipName;
std::vector<uint8_t> counterAvailabilityImage;
// Config image: which counters to collect (baked from the metric
// list). Counter-data image: where the Range Profiler writes the
// collected values. Both must outlive the active profiling session,
// as cuptiRangeProfilerSetConfig() retains pointers into them.
std::vector<uint8_t> configImage;
std::vector<uint8_t> counterDataImage;
#endif//TRACY_CUDA_ENABLE_KERNEL_METRICS
Collector collector;
static PersistentState& Get() {
@@ -1806,10 +1491,6 @@ namespace tracy
#define TracyCUDAStartProfiling(ctx) ctx->StartProfiling()
#define TracyCUDAStopProfiling(ctx) ctx->StopProfiling()
#define TracyCUDAEnableKernelMetrics(ctx) ctx->EnableKernelMetrics()
#define TracyCUDADisableKernelMetrics(ctx) ctx->DisableKernelMetrics()
#define TracyCUDADumpKernelMetrics(ctx) ctx->DumpKernelMetrics()
#define TracyCUDACollect(ctx) ctx->Collect()
#endif
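Note on the removed counter-availability query: its comment describes a two-call pattern, where the first call reports the image size and the second fills the allocated buffer. The sketch below illustrates that pattern in isolation. It does not call CUPTI: `ImageParams`, `GetImage` and `FetchImage` are hypothetical stand-ins invented for this example, and the payload bytes are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical parameter block, standing in for a CUPTI-style params struct.
struct ImageParams
{
    size_t imageSize = 0;       // in: buffer capacity when pImage is set; out: required size
    uint8_t* pImage = nullptr;  // destination buffer, or nullptr to query the size only
};

// Hypothetical size-then-fill API: always reports the required size and
// copies the data only when a large enough buffer is supplied.
inline bool GetImage( ImageParams& params )
{
    static const uint8_t payload[] = { 0xDE, 0xAD, 0xBE, 0xEF };
    if( params.pImage != nullptr )
    {
        if( params.imageSize < sizeof( payload ) ) return false;
        std::memcpy( params.pImage, payload, sizeof( payload ) );
    }
    params.imageSize = sizeof( payload );
    return true;
}

// Two-call pattern: query the size, allocate, then call again to fill.
inline std::vector<uint8_t> FetchImage()
{
    ImageParams params;
    if( !GetImage( params ) ) return {};    // first call: size only
    assert( params.imageSize > 0 );
    std::vector<uint8_t> image( params.imageSize );
    params.pImage = image.data();
    if( !GetImage( params ) ) return {};    // second call: fill the buffer
    return image;
}
```

Keeping the buffer in a `std::vector` owned by longer-lived state, as the removed code did with `PersistentState`, matters whenever the API retains a pointer into the image after the call returns.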