Compare commits
357 Commits
fix/timeli
...
slomp/cuda
| Author | SHA1 | Date | |
|---|---|---|---|
| | 57ac18bc83 | | |
| | 280475ff3d | | |
| | 922604fe6a | | |
| | 5b29550ded | | |
| | 56eb3b776f | | |
| | 98ad778495 | | |
| | fa4c28af8a | | |
| | a73a733644 | | |
| | d4cceb3e0d | | |
| | b715c0c32e | | |
| | f71620c8c8 | | |
| | 51e467d7b9 | | |
| | eba8c1e99a | | |
| | 9a785976c0 | | |
| | 711e3a7bcf | | |
| | d84a085b71 | | |
| | ae38665b95 | | |
| | 800e415400 | | |
| | f1dfedffaf | | |
| | 67bb927d61 | | |
| | 415197bd5d | | |
| | 5177e587e9 | | |
| | ee7d2b4fd8 | | |
| | 1d0806c374 | | |
| | 9cc8f3752e | | |
| | 79e0efac96 | | |
| | 8531e879a9 | | |
| | 89a75d5c71 | | |
| | e9e85ca6ee | | |
| | 3a6a23ca98 | | |
| | 3157dfae3e | | |
| | 978b720759 | | |
| | 77c655eb7d | | |
| | bee6ac566e | | |
| | 6e4041b14d | | |
| | 1706ac57ac | | |
| | 72d62bea25 | | |
| | 0ae37172dc | | |
| | 9502ab728d | | |
| | 7dd4dbc8a5 | | |
| | 96c681cf61 | | |
| | 2052aa2ab8 | | |
| | 977218ea04 | | |
| | fcd5944d5e | | |
| | 34395f97ed | | |
| | c897729b74 | | |
| | f8208ce5ef | | |
| | 3516c96afd | | |
| | 3bb5027565 | | |
| | eefc7aee4b | | |
| | 3ccda1d3c4 | | |
| | 802e88585a | | |
| | 74bbe5988b | | |
| | 066cb89f75 | | |
| | 8c289104d6 | | |
| | 274e453bb6 | | |
| | d89055908b | | |
| | 801a34ea66 | | |
| | 0cdb6d365e | | |
| | e3f4c72c85 | | |
| | eeab1bf1a9 | | |
| | d26176baae | | |
| | 021aa6a9e6 | | |
| | 8b86ada40e | | |
| | 616f33ff65 | | |
| | 023cb20ba9 | | |
| | 073cd266ac | | |
| | 2e252d3988 | | |
| | 77978b68ca | | |
| | d9939362f5 | | |
| | bba05bd3ad | | |
| | 83a4a13cbd | | |
| | 9e9aabe9a1 | | |
| | 382a887ce9 | | |
| | fbd1c55151 | | |
| | 398bab8041 | | |
| | 69245e751b | | |
| | d571f2bd59 | | |
| | 781938317d | | |
| | f9365abe4f | | |
| | 3f91f35d59 | | |
| | 1de94aa856 | | |
| | ec1d5bd3d7 | | |
| | 69af195c98 | | |
| | 60699c4a92 | | |
| | cc45cf6046 | | |
| | 62560a6429 | | |
| | f7b4e177ff | | |
| | 084daf0516 | | |
| | a98956f2d9 | | |
| | ac6f0f88fa | | |
| | 33fccb3530 | | |
| | 45576f6972 | | |
| | 17e13bc2e0 | | |
| | ee0c73bf25 | | |
| | 343567a3f2 | | |
| | 20b3535623 | | |
| | 5298316480 | | |
| | 83719fb29b | | |
| | f7d789eddb | | |
| | 3816b2485e | | |
| | f8aa88d522 | | |
| | b5ae187f76 | | |
| | 3f203806e2 | | |
| | 15c6b49de2 | | |
| | a153f3a562 | | |
| | c2998310cf | | |
| | a43b74ed8f | | |
| | d3047f8069 | | |
| | 3804b2580a | | |
| | 329ac6c9f1 | | |
| | a091bb4ad2 | | |
| | 86b5f43959 | | |
| | 39dc688340 | | |
| | 832234838b | | |
| | daba5acfbc | | |
| | 07bfe3465e | | |
| | 0544440a34 | | |
| | f287508772 | | |
| | f622b97436 | | |
| | dfded9d55d | | |
| | a2555fbb33 | | |
| | 7180ea381f | | |
| | 0c74658dd3 | | |
| | debda1df55 | | |
| | d98608b022 | | |
| | eb88c6eba0 | | |
| | e83429c926 | | |
| | cc091a99a2 | | |
| | 1b207d3e2a | | |
| | f89709e99e | | |
| | a4c5f15312 | | |
| | 3455fd9f82 | | |
| | cfc046abcd | | |
| | 9ab39d8af3 | | |
| | bfab6d03f4 | | |
| | 0d848c3042 | | |
| | 54270d3fd5 | | |
| | 1341f98c61 | | |
| | 6fc279eef4 | | |
| | 66e4f5cef7 | | |
| | 7637971e9e | | |
| | 4e3cffc4ba | | |
| | 28d3a91980 | | |
| | 3956616fc2 | | |
| | 0fbb2eaaa4 | | |
| | b27dab4584 | | |
| | 75bee5370f | | |
| | e7499458e9 | | |
| | d34c45fa5a | | |
| | 8fe5a511c9 | | |
| | afdd2e2f81 | | |
| | 3c1b1b2f80 | | |
| | 992134f85e | | |
| | 37bc986584 | | |
| | feb4e7c989 | | |
| | 4a8fe6f56e | | |
| | a960a25285 | | |
| | 958cb8d7f8 | | |
| | 59f17794a5 | | |
| | 3b2c7dbacb | | |
| | 56ed480ed2 | | |
| | 0572c86551 | | |
| | 6499e3383b | | |
| | 8278ace0c1 | | |
| | 5981eca141 | | |
| | 1b2856b885 | | |
| | 118f18cf4b | | |
| | bfbc1d3bee | | |
| | 831779508f | | |
| | 286309af3f | | |
| | 3db70a2237 | | |
| | da952f3f38 | | |
| | efba4685ef | | |
| | 598984c45d | | |
| | 860011c604 | | |
| | 0cdcbfc75d | | |
| | e5d4be95df | | |
| | 7b3863d93d | | |
| | de2a18d964 | | |
| | 9588912aa9 | | |
| | 7ee4380f64 | | |
| | 01e639db97 | | |
| | 030e699eb5 | | |
| | 16cdf3d645 | | |
| | 2f143491eb | | |
| | 796050ac1e | | |
| | 31dbfef97d | | |
| | 19519bbeb0 | | |
| | fc4f52e61d | | |
| | e2ac8f7973 | | |
| | e5aa8eba51 | | |
| | 7437c41514 | | |
| | f441a5070b | | |
| | 00b6abd67b | | |
| | e4e3d75eb8 | | |
| | fc5318dcad | | |
| | 661c664b75 | | |
| | 6dbebca666 | | |
| | 73d78ad517 | | |
| | e5371d7987 | | |
| | 9806f35714 | | |
| | d40289d594 | | |
| | 86fbe529ed | | |
| | 9b169ef3f9 | | |
| | 64797dc735 | | |
| | 76797799c0 | | |
| | 19549693a0 | | |
| | 10d64d69b5 | | |
| | d89c956394 | | |
| | 79467b4b31 | | |
| | ae275f239d | | |
| | 77fb86155f | | |
| | e627fcce98 | | |
| | e80893ac20 | | |
| | 912f8c048c | | |
| | d16f627cbc | | |
| | 7cb98245ce | | |
| | 3974cc8026 | | |
| | 55d5436fb9 | | |
| | 2b11785b05 | | |
| | 715815374d | | |
| | 4f64b974c6 | | |
| | 7c58db4c0a | | |
| | f9f772e507 | | |
| | af2a369dff | | |
| | bebf20846f | | |
| | 5f6bc2238a | | |
| | 4564a626b2 | | |
| | e95a757e6c | | |
| | 2ebf18be7f | | |
| | fc97af4c68 | | |
| | 8eada19734 | | |
| | a56b08e539 | | |
| | 9c9dca8ea5 | | |
| | 268cab7f89 | | |
| | aeda64d36b | | |
| | f5526d01a2 | | |
| | 20e6835da6 | | |
| | 6b92ec1a23 | | |
| | 45e90549aa | | |
| | 1690ac0d9d | | |
| | a8fab8c977 | | |
| | a0b50ee68e | | |
| | 896a59d00b | | |
| | d3db50c201 | | |
| | 21434e9877 | | |
| | 185482c0d9 | | |
| | a27dae3e88 | | |
| | 7d139a7bf1 | | |
| | 02e279bd38 | | |
| | 4e593d91f5 | | |
| | 75e721cf0c | | |
| | c4321bc83c | | |
| | 224ff6d0e8 | | |
| | cc76d0f60e | | |
| | d4d1f78263 | | |
| | c570288145 | | |
| | 98b61e0096 | | |
| | 5c75032cad | | |
| | 1381b427db | | |
| | 21b9f50cea | | |
| | cbad012980 | | |
| | d4298f7794 | | |
| | f113bbb212 | | |
| | 0b4128f76c | | |
| | bab8c8eb54 | | |
| | cbe5347593 | | |
| | 189e8a1a89 | | |
| | a5316d525c | | |
| | c3e9ff17da | | |
| | 0cf438a78e | | |
| | e37b12aacb | | |
| | dd53f721ac | | |
| | 058d5ca7c3 | | |
| | eef525243d | | |
| | 1f2bbe918f | | |
| | 0089fab94c | | |
| | 0778ef85c6 | | |
| | 5e6d872940 | | |
| | b7ed5bd9ef | | |
| | de1f84d52b | | |
| | 9f88bc6d04 | | |
| | bd0ba00513 | | |
| | 0e52e387bd | | |
| | e93ddd2aa7 | | |
| | 33905b2f15 | | |
| | d06755652f | | |
| | dbdbd710d8 | | |
| | 15ee99ae41 | | |
| | 17f6be4ad4 | | |
| | 30fd92de0f | | |
| | 0b27b9ec1a | | |
| | cc4b7dcea9 | | |
| | 4a9e3ea095 | | |
| | 55de5bc5ca | | |
| | 2e87eecc67 | | |
| | 74eea83051 | | |
| | c9fa58f2bb | | |
| | b7fdc8c0eb | | |
| | f5581d7dcb | | |
| | b682a77f82 | | |
| | 800334a953 | | |
| | 9da66d4c6b | | |
| | 04c1a84159 | | |
| | b736e4590e | | |
| | 6b28296ef3 | | |
| | c11fd010d8 | | |
| | 121e10c837 | | |
| | e57c0869df | | |
| | 4cf754f3fe | | |
| | 22d1c2d3c3 | | |
| | 884415264b | | |
| | ea8cbc849f | | |
| | 20d7135c24 | | |
| | aaf1304308 | | |
| | 37750e27ab | | |
| | ab62c00be5 | | |
| | 757b582ae7 | | |
| | b81ef061f8 | | |
| | 702b16977b | | |
| | 3f1e572a23 | | |
| | 00d4e9ba21 | | |
| | ee37bb40f0 | | |
| | 215199239f | | |
| | f93d17a96f | | |
| | 84570487bf | | |
| | 03227f7ae1 | | |
| | 05c2638467 | | |
| | fd68959223 | | |
| | 5c0263c2f0 | | |
| | 5fd0dc569b | | |
| | ec1c2d4267 | | |
| | b7e405ebbd | | |
| | e8a0676c0b | | |
| | 32b46a3b90 | | |
| | cffd0007c5 | | |
| | 564e0e1e71 | | |
| | 6eb9bf81ad | | |
| | f99a02505a | | |
| | f63c25fb79 | | |
| | a0af290db5 | | |
| | 65db47ed0d | | |
| | d06c4ce816 | | |
| | c91215bd99 | | |
| | dfac4cb8a4 | | |
| | 34a1bb6107 | | |
| | 6957f968a4 | | |
| | 896cdc9462 | | |
| | efc33a09d9 | | |
| | 61875e1bd0 | | |
| | e003885946 | | |
| | d40806caff | | |
| | e7c71c991c | | |
| | ebd3d9c3e6 | | |
| | bc8d8f5302 | | |
| | b049746853 | | |
35
.github/actions/test-tracy/action.yml
vendored
Normal file
@@ -0,0 +1,35 @@
+name: 'Test Tracy'
+description: 'Build the Tracy test application with various cmake flag combinations'
+
+inputs:
+  extra_cmake_flags:
+    description: 'Additional cmake flags appended to each configure command (e.g. cross-compilation flags)'
+    required: false
+    default: ''
+
+runs:
+  using: 'composite'
+  steps:
+    - name: Test application
+      shell: bash
+      run: |
+        # test compilation with different flags
+        # we clean the build folder to reset cached variables between runs
+        cmake -B tests/tracy/build -S tests/tracy -DCMAKE_BUILD_TYPE=Release ${{ inputs.extra_cmake_flags }}
+        cmake --build tests/tracy/build --parallel
+        cmake -E rm -rf tests/tracy/build
+
+        # same with TRACY_ON_DEMAND
+        cmake -B tests/tracy/build -S tests/tracy -DCMAKE_BUILD_TYPE=Release -DTRACY_ON_DEMAND=ON ${{ inputs.extra_cmake_flags }}
+        cmake --build tests/tracy/build --parallel
+        cmake -E rm -rf tests/tracy/build
+
+        # same with TRACY_DELAYED_INIT and TRACY_MANUAL_LIFETIME
+        cmake -B tests/tracy/build -S tests/tracy -DCMAKE_BUILD_TYPE=Release -DTRACY_DELAYED_INIT=ON -DTRACY_MANUAL_LIFETIME=ON ${{ inputs.extra_cmake_flags }}
+        cmake --build tests/tracy/build --parallel
+        cmake -E rm -rf tests/tracy/build
+
+        # same with TRACY_DEMANGLE
+        cmake -B tests/tracy/build -S tests/tracy -DCMAKE_BUILD_TYPE=Release -DTRACY_DEMANGLE=ON ${{ inputs.extra_cmake_flags }}
+        cmake --build tests/tracy/build --parallel
+        cmake -E rm -rf tests/tracy/build
2
.github/workflows/emscripten.yml
vendored
@@ -20,7 +20,7 @@ jobs:
     - name: Setup emscripten
       uses: emscripten-core/setup-emsdk@v16
       with:
-        version: 4.0.10
+        version: 5.0.7
     - name: Trust git repo
       run: git config --global --add safe.directory '*'
     - uses: actions/checkout@v4
24
.github/workflows/linux.yml
vendored
@@ -32,7 +32,7 @@ jobs:
         if [ "${ACT:-}" != "true" ] && [ "${FORGEJO_ACTIONS:-}" != "true" ]; then
           cmake --build profiler/build
         else
-          cmake --build profiler/build --parallel
+          cmake --build profiler/build --parallel 2
         fi
     - name: Update utility
       run: |
@@ -66,27 +66,7 @@ jobs:
         meson setup -Dprefix=$GITHUB_WORKSPACE/bin/lib -Dtracy_enable=true build-meson
         meson compile -C build-meson
     - name: Test application
-      run: |
-        # test compilation with different flags
-        # we clean the build folder to reset cached variables between runs
-        cmake -B test/build -S test -DCMAKE_BUILD_TYPE=Release
-        cmake --build test/build --parallel
-        rm -rf test/build
-
-        # same with TRACY_ON_DEMAND
-        cmake -B test/build -S test -DCMAKE_BUILD_TYPE=Release -DTRACY_ON_DEMAND=ON .
-        cmake --build test/build --parallel
-        rm -rf test/build
-
-        # same with TRACY_DELAYED_INIT TRACY_MANUAL_LIFETIME
-        cmake -B test/build -S test -DCMAKE_BUILD_TYPE=Release -DTRACY_DELAYED_INIT=ON -DTRACY_MANUAL_LIFETIME=ON .
-        cmake --build test/build --parallel
-        rm -rf test/build
-
-        # same with TRACY_DEMANGLE
-        cmake -B test/build -S test -DCMAKE_BUILD_TYPE=Release -DTRACY_DEMANGLE=ON .
-        cmake --build test/build --parallel
-        rm -rf test/build
+      uses: ./.github/actions/test-tracy
     - name: Find Artifacts
       id: find_artifacts
       run: |
4
.github/workflows/macos.yml
vendored
@@ -28,7 +28,7 @@ jobs:
     - name: Build profiler
       run: |
         cmake -B profiler/build -S profiler -DCMAKE_BUILD_TYPE=Release -DGIT_REV=${{ github.sha }}
-        cmake --build profiler/build --parallel --config Release
+        cmake --build profiler/build --parallel 2 --config Release
     - name: Build update
       run: |
         cmake -B update/build -S update -DCMAKE_BUILD_TYPE=Release -DGIT_REV=${{ github.sha }}
@@ -51,6 +51,8 @@ jobs:
         cmake --build merge/build --parallel --config Release
     - name: Build library
       run: meson setup -Dprefix=$GITHUB_WORKSPACE/bin/lib -Dtracy_enable=true build && meson compile -C build && meson install -C build
+    - name: Test application
+      uses: ./.github/actions/test-tracy
     - name: Package artifacts
       run: |
         mkdir -p bin
28
.github/workflows/mingw.yml
vendored
@@ -53,27 +53,9 @@ jobs:
         meson setup build-meson --cross-file mingw-cross.txt -Ddefault_library=static -Dtracy_enable=true
         meson compile -C build-meson
     - name: Test application
-      run: |
-        cmake -B test/build -S test -DCMAKE_BUILD_TYPE=Release \
-          -DCMAKE_SYSTEM_NAME=Windows \
-          -DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc \
+      uses: ./.github/actions/test-tracy
+      with:
+        extra_cmake_flags: >-
+          -DCMAKE_SYSTEM_NAME=Windows
+          -DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc
           -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-g++
-        cmake --build test/build --parallel
-        rm -rf test/build
-        cmake -B test/build -S test -DCMAKE_BUILD_TYPE=Release -DTRACY_ON_DEMAND=ON \
-          -DCMAKE_SYSTEM_NAME=Windows \
-          -DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc \
-          -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-g++
-        cmake --build test/build --parallel
-        rm -rf test/build
-        cmake -B test/build -S test -DCMAKE_BUILD_TYPE=Release -DTRACY_DELAYED_INIT=ON -DTRACY_MANUAL_LIFETIME=ON \
-          -DCMAKE_SYSTEM_NAME=Windows \
-          -DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc \
-          -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-g++
-        cmake --build test/build --parallel
-        rm -rf test/build
-        cmake -B test/build -S test -DCMAKE_BUILD_TYPE=Release -DTRACY_DEMANGLE=ON \
-          -DCMAKE_SYSTEM_NAME=Windows \
-          -DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc \
-          -DCMAKE_CXX_COMPILER=x86_64-w64-mingw32-g++
-        cmake --build test/build --parallel
4
.github/workflows/windows.yml
vendored
@@ -32,7 +32,7 @@ jobs:
     - name: Build profiler
       run: |
         cmake -B profiler/build -S profiler -DCMAKE_BUILD_TYPE=Release -DGIT_REV=${{ github.sha }}
-        cmake --build profiler/build --parallel --config Release
+        cmake --build profiler/build --parallel 2 --config Release
     - name: Build update
       run: |
         cmake -B update/build -S update -DCMAKE_BUILD_TYPE=Release -DGIT_REV=${{ github.sha }}
@@ -53,6 +53,8 @@ jobs:
       run: |
         cmake -B merge/build -S merge -DCMAKE_BUILD_TYPE=Release -DGIT_REV=${{ github.sha }}
         cmake --build merge/build --parallel --config Release
+    - name: Test application
+      uses: ./.github/actions/test-tracy
     - name: Package artifacts
       run: |
         mkdir bin
2
.gitignore
vendored
@@ -30,6 +30,8 @@ profiler/build/win32/Tracy.aps
 extra/vswhere.exe
 extra/tracy-build
 .cache
+.uv-cache/
+.venv/
 compile_commands.json
 profiler/build/wasm/Tracy-release.*
 profiler/build/wasm/Tracy-debug.*
2
.vscode/settings.json
vendored
@@ -6,7 +6,7 @@
         "${workspaceFolder}/import",
         "${workspaceFolder}/merge",
         "${workspaceFolder}/update",
-        "${workspaceFolder}/test",
+        "${workspaceFolder}/tests/tracy",
         "${workspaceFolder}",
     ],
     "cmake.buildDirectory": "${sourceDirectory}/build",
@@ -1,4 +1,4 @@
-cmake_minimum_required(VERSION 3.10)
+cmake_minimum_required(VERSION 3.13)
 
 # Run version helper script
 include(cmake/version.cmake)
@@ -108,60 +108,46 @@ endif()
 
 include(cmake/options.cmake)
 
-# Local wrapper that also sets compile definitions for TracyClient
-macro(tracy_set_option option help value)
-    set_option(${option} "${help}" ${value})
-    if(${option})
-        target_compile_definitions(TracyClient PUBLIC ${option})
-    endif()
-endmacro()
-
-# Local wrapper for value options that also sets compile definitions for TracyClient
-macro(tracy_set_option_value var help default)
-    set_option_value(${var} "${help}" "${default}")
-    if(${var})
-        target_compile_definitions(TracyClient PUBLIC ${var}=${${var}})
-    endif()
-endmacro()
-
-tracy_set_option(TRACY_ENABLE "Enable profiling" OFF)
-tracy_set_option(TRACY_ON_DEMAND "On-demand profiling" OFF)
-tracy_set_option_value(TRACY_CALLSTACK "Override the callstack collection depth for tracy zones" "")
-tracy_set_option(TRACY_NO_CALLSTACK "Disable all callstack related functionality" OFF)
-tracy_set_option(TRACY_NO_CALLSTACK_INLINES "Disables the inline functions in callstacks" OFF)
-tracy_set_option(TRACY_ONLY_LOCALHOST "Only listen on the localhost interface" OFF)
-tracy_set_option(TRACY_NO_BROADCAST "Disable client discovery by broadcast to local network" OFF)
-tracy_set_option(TRACY_ONLY_IPV4 "Tracy will only accept connections on IPv4 addresses (disable IPv6)" OFF)
-tracy_set_option(TRACY_NO_CODE_TRANSFER "Disable collection of source code" OFF)
-tracy_set_option(TRACY_NO_CONTEXT_SWITCH "Disable capture of context switches" OFF)
-tracy_set_option(TRACY_NO_EXIT "Client executable does not exit until all profile data is sent to server" OFF)
-tracy_set_option(TRACY_NO_SAMPLING "Disable call stack sampling" OFF)
-tracy_set_option(TRACY_NO_VERIFY "Disable zone validation for C API" OFF)
-tracy_set_option(TRACY_NO_VSYNC_CAPTURE "Disable capture of hardware Vsync events" OFF)
-tracy_set_option(TRACY_NO_FRAME_IMAGE "Disable the frame image support and its thread" OFF)
-tracy_set_option(TRACY_NO_SYSTEM_TRACING "Disable systrace sampling" OFF)
-tracy_set_option(TRACY_PATCHABLE_NOPSLEDS "Enable nopsleds for efficient patching by system-level tools (e.g. rr)" OFF)
-tracy_set_option(TRACY_DELAYED_INIT "Enable delayed initialization of the library (init on first call)" OFF)
-tracy_set_option(TRACY_MANUAL_LIFETIME "Enable the manual lifetime management of the profile" OFF)
-tracy_set_option(TRACY_FIBERS "Enable fibers support" OFF)
-tracy_set_option(TRACY_NO_CRASH_HANDLER "Disable crash handling" OFF)
-tracy_set_option(TRACY_TIMER_FALLBACK "Use lower resolution timers" OFF)
-tracy_set_option(TRACY_DISALLOW_HW_TIMER "Disallow hardware timer (may be useful on VMs). Requires TRACY_TIMER_FALLBACK=ON" OFF)
-tracy_set_option(TRACY_LIBUNWIND_BACKTRACE "Use libunwind backtracing where supported" OFF)
-tracy_set_option(TRACY_SYMBOL_OFFLINE_RESOLVE "Instead of full runtime symbol resolution, only resolve the image path and offset to enable offline symbol resolution" OFF)
-tracy_set_option(TRACY_LIBBACKTRACE_ELF_DYNLOAD_SUPPORT "Enable libbacktrace to support dynamically loaded elfs in symbol resolution resolution after the first symbol resolve operation" OFF)
-tracy_set_option(TRACY_DEBUGINFOD "Enable debuginfod support" OFF)
-tracy_set_option(TRACY_IGNORE_MEMORY_FAULTS "Ignore instrumentation errors from memory free events that do not have a matching allocation" OFF)
+set_option(TRACY_ENABLE "Enable profiling" OFF TracyClient)
+set_option(TRACY_ON_DEMAND "On-demand profiling" OFF TracyClient)
+set_option_value(TRACY_CALLSTACK "Override the callstack collection depth for tracy zones" "" TracyClient)
+set_option_value_as_string(TRACY_PLATFORM_HEADER "Path to a header providing TRACY_HAS_CUSTOM_* hooks for an unsupported platform" "" TracyClient)
+set_option(TRACY_NO_CALLSTACK "Disable all callstack related functionality" OFF TracyClient)
+set_option(TRACY_NO_CALLSTACK_INLINES "Disables the inline functions in callstacks" OFF TracyClient)
+set_option(TRACY_ONLY_LOCALHOST "Only listen on the localhost interface" OFF TracyClient)
+set_option(TRACY_NO_BROADCAST "Disable client discovery by broadcast to local network" OFF TracyClient)
+set_option(TRACY_ONLY_IPV4 "Tracy will only accept connections on IPv4 addresses (disable IPv6)" OFF TracyClient)
+set_option(TRACY_NO_CODE_TRANSFER "Disable collection of source code" OFF TracyClient)
+set_option(TRACY_NO_CONTEXT_SWITCH "Disable capture of context switches" OFF TracyClient)
+set_option(TRACY_NO_EXIT "Client executable does not exit until all profile data is sent to server" OFF TracyClient)
+set_option(TRACY_NO_SAMPLING "Disable call stack sampling" OFF TracyClient)
+set_option(TRACY_NO_VERIFY "Disable zone validation for C API" OFF TracyClient)
+set_option(TRACY_NO_VSYNC_CAPTURE "Disable capture of hardware Vsync events" OFF TracyClient)
+set_option(TRACY_NO_FRAME_IMAGE "Disable the frame image support and its thread" OFF TracyClient)
+set_option(TRACY_NO_SYSTEM_TRACING "Disable systrace sampling" OFF TracyClient)
+set_option(TRACY_PATCHABLE_NOPSLEDS "Enable nopsleds for efficient patching by system-level tools (e.g. rr)" OFF TracyClient)
+set_option(TRACY_DELAYED_INIT "Enable delayed initialization of the library (init on first call)" OFF TracyClient)
+set_option(TRACY_MANUAL_LIFETIME "Enable the manual lifetime management of the profile" OFF TracyClient)
+set_option(TRACY_FIBERS "Enable fibers support" OFF TracyClient)
+set_option(TRACY_NO_CRASH_HANDLER "Disable crash handling" OFF TracyClient)
+set_option(TRACY_TIMER_FALLBACK "Use lower resolution timers" OFF TracyClient)
+set_option(TRACY_DISALLOW_HW_TIMER "Disallow hardware timer (may be useful on VMs). Requires TRACY_TIMER_FALLBACK=ON" OFF TracyClient)
+set_option(TRACY_LIBUNWIND_BACKTRACE "Use libunwind backtracing where supported" OFF TracyClient)
+set_option(TRACY_SYMBOL_OFFLINE_RESOLVE "Instead of full runtime symbol resolution, only resolve the image path and offset to enable offline symbol resolution" OFF TracyClient)
+set_option(TRACY_LIBBACKTRACE_ELF_DYNLOAD_SUPPORT "Enable libbacktrace to support dynamically loaded elfs in symbol resolution resolution after the first symbol resolve operation" OFF TracyClient)
+set_option(TRACY_DEBUGINFOD "Enable debuginfod support" OFF TracyClient)
+set_option(TRACY_IGNORE_MEMORY_FAULTS "Ignore instrumentation errors from memory free events that do not have a matching allocation" OFF TracyClient)
+set_option(TRACY_OPENGL_AUTO_CALIBRATION "Periodically recalibrate OpenGL GPU/CPU clock drift (forces a CPU/GPU sync each time)" OFF TracyClient)
 
 # advanced
-tracy_set_option(TRACY_VERBOSE "[advanced] Verbose output from the profiler" OFF)
+set_option(TRACY_VERBOSE "[advanced] Verbose output from the profiler" OFF TracyClient)
 mark_as_advanced(TRACY_VERBOSE)
-tracy_set_option(TRACY_NO_INTERNAL_MESSAGE "[advanced] Prevent the profiler from logging messages" OFF)
+set_option(TRACY_NO_INTERNAL_MESSAGE "[advanced] Prevent the profiler from logging messages" OFF TracyClient)
 mark_as_advanced(TRACY_NO_INTERNAL_MESSAGE)
-tracy_set_option(TRACY_DEMANGLE "[advanced] Don't use default demangling function - You'll need to provide your own" OFF)
+set_option(TRACY_DEMANGLE "[advanced] Don't use default demangling function - You'll need to provide your own" OFF TracyClient)
 mark_as_advanced(TRACY_DEMANGLE)
 if(rocprofiler-sdk_FOUND)
-    tracy_set_option(TRACY_ROCPROF_CALIBRATION "[advanced] Use continuous calibration of the Rocprof GPU time." OFF)
+    set_option(TRACY_ROCPROF_CALIBRATION "[advanced] Use continuous calibration of the Rocprof GPU time." OFF TracyClient)
     mark_as_advanced(TRACY_ROCPROF_CALIBRATION)
 endif()
 
@@ -298,3 +284,7 @@ if(TRACY_CLIENT_PYTHON)
 
     add_subdirectory(python)
 endif()
+
+if(PROJECT_IS_TOP_LEVEL)
+    set(CMAKE_COLOR_DIAGNOSTICS ON)
+endif()
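Because the options in the hunk above are attached to `TracyClient` as `PUBLIC` compile definitions, code that links against that target should see each enabled option as a preprocessor macro. The following is a minimal sketch of what that looks like from the consumer side; the `BuildFlags` struct and `tracy_build_flags` function are hypothetical names used for illustration and are not part of Tracy.

```cpp
#include <cassert>

// Hypothetical consumer-side helper (not part of Tracy): records which of two
// Tracy options were visible as macros when this translation unit was built.
struct BuildFlags {
    bool enabled;    // true if TRACY_ENABLE was defined
    bool on_demand;  // true if TRACY_ON_DEMAND was defined
};

constexpr BuildFlags tracy_build_flags()
{
    return BuildFlags{
#ifdef TRACY_ENABLE
        true,
#else
        false,
#endif
#ifdef TRACY_ON_DEMAND
        true
#else
        false
#endif
    };
}
```

Built without any Tracy definitions, both fields are `false`; configuring with `-DTRACY_ON_DEMAND=ON` would be expected to flip `on_demand` in every target that links `TracyClient`.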
50
NEWS
@@ -5,9 +5,16 @@ here.
 vx.xx.x (2026-xx-xx)
 --------------------
 
+- API break: removed "secure" variants of memory alloc and free macros. The
+  secure code path is now always enabled. Migrate by removing "Secure" from
+  the macros you use, e.g. TracySecureAlloc(...) -> TracyAlloc(...).
 - Added tracy-capture-daemon for automated multi-client trace capture.
 - Added tracy-merge utility for combining multiple trace files into one.
 - Added support for Windows on ARM64 with MSVC.
+- Added support for WebGPU.
+- Trace-specific settings storage has been completely overhauled. It is now
+  possible to make the settings sidecar file public, saved next to the trace
+  file.
 - External frames are now omitted in the single-line call stack list visible
   in messages list, or in memory allocation info window.
 - External frames are now hidden by default in various contexts where they
@@ -15,7 +22,7 @@ vx.xx.x (2026-xx-xx)
   - Flame graph window.
   - Call stack window.
   - Statistics window (sampling mode).
-- External frames are now dimmed out in call stacks.
+- External frames are now dimmed out in call stacks in various parts of UI.
 - Single-line call stacks now have ellipsis at the end, if there are frames
   remaining.
 - System tracing on Windows has been refactored to be more robust.
@@ -42,10 +49,15 @@ vx.xx.x (2026-xx-xx)
 - The protocol has been updated to use model templates. As a result, tools
   are now specified in a common way and the reasoning is performed in a
   separate content stream.
+- Several new tools were added, which in concert enable the assistant to
+  answer very general questions, such as "how to optimize this program?".
 - Smaller models are now viable to use. Models as small as 4B parameters do
   now work really well. You can run such models on virtually all hardware.
 - Added horizontal scroll bars to code segments.
 - LLM thinking regions are now hidden by default.
+- The assistant may notify the user about its current findings, then
+  resume thinking, after which it may give a more complete answer. In
+  such cases, the initial part of the reply will be faded out.
 - Sampled execution costs are now included in assembly attachments.
 - Source code retrieval now has an optional line context parameter.
 - Added ability to search the code for keywords.
@@ -138,7 +150,17 @@ vx.xx.x (2026-xx-xx)
   options for the entire program.
 - Message windows will now properly show full message in a tooltip for
   multi-line messages.
-- The in-profiler user manual now properly handles links to chapters.
+- Greatly improved the in-profiler user manual.
+  - There is now chapter tree and the manual contents are displayed section
+    by section.
+  - Links to chapters are now properly working.
+  - The "bclogo" blocks are now correctly processed and displayed as proper
+    admonitions.
+  - The font awesome icons now show as in the rest of the UI.
+  - Footnotes are now rendered as proper footnotes.
+  - Tables are now rendered as intended.
+  - LaTeX math is now converted to readable form.
+- Added a button to download the full PDF manual to the user manual window.
 - Call stack window will now show the thread viewed call stack originates
   from (if possible).
 - "Visible threads" checkboxes in messages, flame graph and wait stacks
@@ -153,6 +175,30 @@ vx.xx.x (2026-xx-xx)
 - Prototype implementation of system tracing on Apple devices.
 - Local (inline) call stack printouts were added to tooltips in statistics
   window, in sampling mode.
+- Ironed out some code corners to make integration of closed gaming console
+  platforms easier. Added support for custom platform headers.
+- Bottom and top sample trees (in wait stacks, or in entry call stacks)
+  now display aggregation counts if "group by function name" is enabled.
+- HW sample view in symbol view are now disabled by default.
+- The profiler can no longer be built with the statistics disabled.
+- Fixed NVCC builds.
+- Fixed possible lockups in Vulkan timer calibration loop.
+- The flame graph view now supports zooming in and panning with the mouse.
+- General application crash information polish in the profiler UI.
+- The achievements system has been converted to use markdown renderer.
+- Offline symbol resolution with the update utility now supports custom
+  addr2line-compatible tools via -a and -A command line parameters.
+  Additionally, it is now possible to reset all call stack frame symbols to
+  unresolved with the -R parameter.
+- Periodic recalibration of the clock drift in OpenGL contexts can be enabled
+  with the TRACY_OPENGL_AUTO_CALIBRATION compilation define. Note that this
+  requires a full CPU/GPU sync on each calibration event. These events will
+  not fire more often than once every second.
+- Added missing C API for shared locks.
+- Implemented semi-unique, nonsense random name generator.
+  - Can be used to set a trace description.
+  - Will be used to provide default description for newly added annotations.
+- Polished look and feel of annotation regions on the timeline.
 
 
 v0.13.1 (2025-12-11)
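The NEWS entry about the removed "secure" macros describes a mechanical rename. As a minimal sketch of that migration, assuming an application that wraps `malloc`/`free` in its own helpers (`tracked_malloc` and `tracked_free` are hypothetical names, and the no-op fallback macros exist only so the snippet compiles without Tracy on the include path):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Use the real Tracy header when it is available; otherwise fall back to
// no-op stand-ins so this sketch still builds on its own.
#if defined(__has_include)
#  if __has_include(<tracy/Tracy.hpp>)
#    include <tracy/Tracy.hpp>
#  endif
#endif
#ifndef TracyAlloc
#  define TracyAlloc(ptr, size) ((void)(ptr), (void)(size))
#  define TracyFree(ptr) ((void)(ptr))
#endif

// Hypothetical allocation wrapper; not part of Tracy.
inline void* tracked_malloc(std::size_t size)
{
    void* ptr = std::malloc(size);
    // Before the API break: TracySecureAlloc(ptr, size);
    TracyAlloc(ptr, size);
    return ptr;
}

// Hypothetical deallocation wrapper; not part of Tracy.
inline void tracked_free(void* ptr)
{
    // Before the API break: TracySecureFree(ptr);
    TracyFree(ptr);
    std::free(ptr);
}
```

Only the macro names change at the call sites; per the NEWS text, the formerly "secure" code path is now what the plain macros do.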
@@ -4,7 +4,7 @@
 
 ### A real time, nanosecond resolution, remote telemetry, hybrid frame and sampling profiler for games and other applications.
 
-Tracy supports profiling CPU (Direct support is provided for C, C++, Lua, Python and Fortran integration. At the same time, third-party bindings to many other languages exist on the internet, such as [Rust](https://github.com/nagisa/rust_tracy_client), [Zig](https://github.com/tealsnow/zig-tracy), [C#](https://github.com/clibequilibrium/Tracy-CSharp), [OCaml](https://github.com/imandra-ai/ocaml-tracy), [Odin](https://github.com/oskarnp/odin-tracy), etc.), GPU (All major graphic APIs: OpenGL, Vulkan, Direct3D 11/12, Metal, OpenCL, CUDA.), memory allocations, locks, context switches, automatically attribute screenshots to captured frames, and much more.
+Tracy supports profiling CPU (Direct support is provided for C, C++, Lua, Python and Fortran integration. At the same time, third-party bindings to many other languages exist on the internet, such as [Rust](https://github.com/nagisa/rust_tracy_client), [Zig](https://github.com/tealsnow/zig-tracy), [C#](https://github.com/clibequilibrium/Tracy-CSharp), [OCaml](https://github.com/imandra-ai/ocaml-tracy), [Odin](https://github.com/oskarnp/odin-tracy), etc.), GPU (All major graphics/compute APIs: OpenGL, Vulkan, Direct3D 11/12, Metal, OpenCL, CUDA, WebGPU.), memory allocations, locks, context switches, automatically attribute screenshots to captured frames, and much more.
 
 - [Documentation](https://github.com/wolfpld/tracy/releases/latest/download/tracy.pdf) for usage and build process instructions
 - [Releases](https://github.com/wolfpld/tracy/releases) containing the documentation (`tracy.pdf`) and compiled Windows x64 binaries (`Tracy-<version>.7z`) as assets
13
cmake/imgui-no-samplers.patch
Normal file
@@ -0,0 +1,13 @@
+diff --git a/backends/imgui_impl_opengl3.cpp b/backends/imgui_impl_opengl3.cpp
+index a9e32b7ac..2cdbc4812 100644
+--- a/backends/imgui_impl_opengl3.cpp
++++ b/backends/imgui_impl_opengl3.cpp
+@@ -1069,7 +1069,7 @@ bool ImGui_ImplOpenGL3_Init(const char* glsl_version)
+     bd->HasPolygonMode = (!bd->GlProfileIsES2 && !bd->GlProfileIsES3);
+ #endif
+ #ifdef IMGUI_IMPL_OPENGL_MAY_HAVE_BIND_SAMPLER
+-    bd->HasBindSampler = (bd->GlVersion >= 330 || bd->GlProfileIsES3);
++    //bd->HasBindSampler = (bd->GlVersion >= 330 || bd->GlProfileIsES3);
+ #endif
+     bd->HasClipOrigin = (bd->GlVersion >= 450);
+ #ifdef IMGUI_IMPL_OPENGL_HAS_EXTENSIONS
@@ -1,24 +1,47 @@
-# Reusable option macros for CMake projects
+# Reusable option macros for Tracy CMake projects
 #
 # Usage:
-#   set_option(OPTION_NAME "Help text" ON/OFF) - for boolean options
-#   set_option_value(VAR_NAME "Help text" "value") - for value options (CACHE STRING)
+#   set_option(OPTION_NAME "Help text" ON/OFF [TARGET]) - for boolean options
+#   set_option_value(VAR_NAME "Help text" "value" [TARGET]) - for value options (CACHE STRING)
+#   set_option_value_as_string(VAR_NAME "Help text" "value" [TARGET]) - for value options as C string literals
+#
+# [TARGET] is optional and specifies a target to which the option will
+# be added as a compile definition (e.g., -DOPTION_NAME or -DVAR_NAME=value).
 
-# Boolean options (ON/OFF)
+# Boolean option (ON/OFF).
 macro(set_option option help value)
     option(${option} ${help} ${value})
     if(${option})
         message(STATUS "${option}: ON")
+        if(${ARGC} GREATER 3)
+            target_compile_definitions(${ARGV3} PUBLIC ${option})
+        endif()
     else()
         message(STATUS "${option}: OFF")
     endif()
 endmacro()
 
-# Value options (strings, numbers, etc.)
+# Value option (string/number).
 macro(set_option_value var help default)
     set(${var} ${default} CACHE STRING "${help}")
     if(${var})
         message(STATUS "${var}: ${${var}}")
+        if(${ARGC} GREATER 3)
+            target_compile_definitions(${ARGV3} PUBLIC ${var}=${${var}})
+        endif()
     else()
         message(STATUS "${var}: (not set)")
     endif()
+endmacro()
+
+# Value option embedded as a C string literal (VAR="value").
+macro(set_option_value_as_string var help default)
+    set(${var} ${default} CACHE STRING "${help}")
+    if(${var})
+        message(STATUS "${var}: ${${var}}")
+        if(${ARGC} GREATER 3)
+            target_compile_definitions(${ARGV3} PUBLIC "${var}=\"${${var}}\"")
+        endif()
+    else()
+        message(STATUS "${var}: (not set)")
+    endif()
@@ -26,7 +26,7 @@ else()
|
||||
CPMAddPackage(
|
||||
NAME capstone
|
||||
GITHUB_REPOSITORY capstone-engine/capstone
|
||||
GIT_TAG 6.0.0-Alpha7
|
||||
GIT_TAG 6.0.0-Alpha9
|
||||
OPTIONS
|
||||
"CAPSTONE_X86_ATT_DISABLE ON"
|
||||
"CAPSTONE_ALPHA_SUPPORT OFF"
|
||||
@@ -142,6 +142,7 @@ CPMAddPackage(
|
||||
PATCHES
|
||||
"${CMAKE_CURRENT_LIST_DIR}/imgui-emscripten.patch"
|
||||
"${CMAKE_CURRENT_LIST_DIR}/imgui-loader.patch"
|
||||
"${CMAKE_CURRENT_LIST_DIR}/imgui-no-samplers.patch"
|
||||
)
|
||||
|
||||
set(IMGUI_SOURCES
|
||||
@@ -217,7 +218,9 @@ CPMAddPackage(
|
||||
CPMAddPackage(
|
||||
NAME md4c
|
||||
GITHUB_REPOSITORY mity/md4c
|
||||
GIT_TAG release-0.5.2
|
||||
GIT_TAG 755ce49acdc7cd682d4502b4796db5ed6a1230fb
|
||||
OPTIONS
|
||||
"BUILD_SHARED_LIBS OFF"
|
||||
EXCLUDE_FROM_ALL TRUE
|
||||
)
|
||||
|
||||
@@ -254,7 +257,7 @@ if(NOT EMSCRIPTEN)
|
||||
CPMAddPackage(
|
||||
NAME usearch
|
||||
GITHUB_REPOSITORY unum-cloud/usearch
|
||||
GIT_TAG v2.23.0
|
||||
GIT_TAG v2.25.2
|
||||
EXCLUDE_FROM_ALL TRUE
|
||||
)
|
||||
|
||||
@@ -269,7 +272,7 @@ if(NOT EMSCRIPTEN)
|
||||
CPMAddPackage(
|
||||
NAME pugixml
|
||||
GITHUB_REPOSITORY zeux/pugixml
|
||||
GIT_TAG v1.15
|
||||
GIT_TAG v1.16
|
||||
EXCLUDE_FROM_ALL TRUE
|
||||
)
|
||||
add_library(TracyPugixml INTERFACE)
|
||||
@@ -287,7 +290,7 @@ if(NOT EMSCRIPTEN)
|
||||
CPMAddPackage(
|
||||
NAME libcurl
|
||||
GITHUB_REPOSITORY curl/curl
|
||||
GIT_TAG curl-8_19_0
|
||||
GIT_TAG curl-8_20_0
|
||||
OPTIONS
|
||||
"BUILD_STATIC_LIBS ON"
|
||||
"BUILD_SHARED_LIBS OFF"
|
||||
|
||||
@@ -1,49 +0,0 @@
|
||||
TRACY_PUBLIC := ../../public
|
||||
NVCC := nvcc
|
||||
CXX := g++
|
||||
CUPTI_INC := /usr/local/cuda/include
|
||||
CUPTI_LIB := /usr/local/cuda/lib64
|
||||
|
||||
TRACY_SRCS := $(TRACY_PUBLIC)/TracyClient.cpp
|
||||
INCLUDES := -I$(TRACY_PUBLIC) -I$(CUPTI_INC)
|
||||
LIBS := -L$(CUPTI_LIB) -lcuda -lcupti -lpthread -ldl
|
||||
|
||||
CXXFLAGS_REL := -O2 -DTRACY_ENABLE
|
||||
CXXFLAGS_DBG := -g -O0 -DTRACY_ENABLE
|
||||
NVCCFLAGS_REL := -arch=native -O2 -DTRACY_ENABLE
|
||||
NVCCFLAGS_DBG := -arch=native -g -O0 -DTRACY_ENABLE
|
||||
|
||||
.PHONY: all debug investigate investigate2 clean
|
||||
|
||||
all: repro
|
||||
|
||||
debug: repro_debug
|
||||
|
||||
investigate: test_corr_reuse
|
||||
|
||||
investigate2: test_graphid_recycle
|
||||
|
||||
# Release build
|
||||
repro: repro.cu tracy_client.o
|
||||
$(NVCC) $(NVCCFLAGS_REL) $(INCLUDES) -o $@ $< tracy_client.o $(LIBS)
|
||||
|
||||
tracy_client.o: $(TRACY_SRCS)
|
||||
$(CXX) $(CXXFLAGS_REL) $(INCLUDES) -c -o $@ $<
|
||||
|
||||
# Debug build (asserts enabled, no NDEBUG)
|
||||
repro_debug: repro.cu tracy_client_debug.o
|
||||
$(NVCC) $(NVCCFLAGS_DBG) $(INCLUDES) -o $@ $< tracy_client_debug.o $(LIBS)
|
||||
|
||||
tracy_client_debug.o: $(TRACY_SRCS)
|
||||
$(CXX) $(CXXFLAGS_DBG) $(INCLUDES) -c -o $@ $<
|
||||
|
||||
# Investigation: correlationId uniqueness per graph launch (no Tracy dependency)
|
||||
test_corr_reuse: test_corr_reuse.cu
|
||||
$(NVCC) $(NVCCFLAGS_REL) $(INCLUDES) -o $@ $< $(LIBS)
|
||||
|
||||
# Investigation: does CUPTI recycle graphId values after cudaGraphExecDestroy?
|
||||
test_graphid_recycle: test_graphid_recycle.cu
|
||||
$(NVCC) $(NVCCFLAGS_REL) $(INCLUDES) -o $@ $< $(LIBS)
|
||||
|
||||
clean:
|
||||
rm -f repro repro_debug test_corr_reuse test_graphid_recycle tracy_client.o tracy_client_debug.o
|
||||
57
examples/CustomPlatform/CustomPlatform.cpp
Normal file
@@ -0,0 +1,57 @@
|
||||
// Template implementations of the tracy::Platform* hooks. Pair with the
|
||||
// platform header (see CustomPlatform.h) and link this into your final
|
||||
// binary.
|
||||
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
|
||||
#include "CustomPlatform.h"
|
||||
|
||||
namespace tracy
|
||||
{
|
||||
|
||||
uint32_t PlatformGetThreadId()
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
void PlatformGetHostname( char* buf, size_t size )
|
||||
{
|
||||
const char* placeholder = "(?)";
|
||||
if( size == 0 ) return;
|
||||
const size_t n = strlen( placeholder );
|
||||
const size_t copy = n < size - 1 ? n : size - 1;
|
||||
memcpy( buf, placeholder, copy );
|
||||
buf[copy] = '\0';
|
||||
}
|
||||
|
||||
const char* PlatformGetUserLogin()
|
||||
{
|
||||
return "(?)";
|
||||
}
|
||||
|
||||
const char* PlatformGetUserFullName()
|
||||
{
|
||||
return nullptr;
|
||||
}
|
||||
|
||||
bool PlatformSafeMemcpy( void* dst, const void* src, size_t size )
|
||||
{
|
||||
// Stub: report failure so Tracy skips the snapshot. Real impls use SEH
|
||||
// on Win32, pipe(2) on POSIX, or an equivalent probe-and-copy primitive.
|
||||
(void)dst; (void)src; (void)size;
|
||||
return false;
|
||||
}
|
||||
|
||||
// Stubs forward to the C runtime. Swap in the allocator you actually want.
|
||||
|
||||
void* PlatformMalloc( size_t size ) { return malloc( size ); }
|
||||
void PlatformFree( void* ptr ) { free( ptr ); }
|
||||
void* PlatformRealloc( void* ptr, size_t size ) { return realloc( ptr, size ); }
|
||||
|
||||
void PlatformAllocatorInit() {}
|
||||
void PlatformAllocatorThreadInit() {}
|
||||
void PlatformAllocatorFinalize() {}
|
||||
void PlatformAllocatorThreadFinalize(){}
|
||||
|
||||
}
|
||||
73
examples/CustomPlatform/CustomPlatform.h
Normal file
@@ -0,0 +1,73 @@
|
||||
// Template platform header for unsupported targets.
|
||||
//
|
||||
// Copy into your project, fill in the sections you need, and point Tracy at
|
||||
// it via -DTRACY_PLATFORM_HEADER="\"my_platform.h\"". Provide the
|
||||
// implementations in any TU linked into your final binary (see
|
||||
// CustomPlatform.cpp).
|
||||
//
|
||||
// Use this only for the TRACY_HAS_CUSTOM_* hooks and matching Platform*
|
||||
// declarations — don't set unrelated TRACY_* options here. Some are checked
|
||||
// before this header is included, so the result would depend on which TU
|
||||
// consulted them; set those at the build system level instead.
|
||||
//
|
||||
// For platform-specific features without a custom hook (call stacks,
|
||||
// context switches, crash handling, system tracing, etc.), disable them at
|
||||
// the build system level with the matching TRACY_NO_* macro.
|
||||
|
||||
#ifndef __MY_TRACY_PLATFORM_H__
|
||||
#define __MY_TRACY_PLATFORM_H__
|
||||
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
|
||||
namespace tracy
|
||||
{
|
||||
|
||||
// --- Thread id --------------------------------------------------------------
|
||||
//
|
||||
// Required if the defaults in TracySystem.cpp do not match your platform.
|
||||
// Note that pthread_self() is NOT suitable: it returns a library handle, not a kernel id.
|
||||
//#define TRACY_HAS_CUSTOM_THREAD_ID
|
||||
uint32_t PlatformGetThreadId();
|
||||
|
||||
|
||||
// --- User info --------------------------------------------------------------
|
||||
//
|
||||
// Identifies the machine and user in the trace header. Return placeholder
|
||||
// strings (e.g. "(?)") from any of these if your platform has no equivalent
|
||||
// notion.
|
||||
//#define TRACY_HAS_CUSTOM_USER_INFO
|
||||
void PlatformGetHostname( char* buf, size_t size );
|
||||
const char* PlatformGetUserLogin();
|
||||
const char* PlatformGetUserFullName();
|
||||
|
||||
|
||||
// --- Safe memory copy -------------------------------------------------------
|
||||
//
|
||||
// Tracy uses this to snapshot potentially-unmapped memory during sampling.
|
||||
// Must not crash on unreadable input — return false instead. Plain memcpy()
|
||||
// is NOT a valid implementation.
|
||||
//#define TRACY_HAS_CUSTOM_SAFE_COPY
|
||||
bool PlatformSafeMemcpy( void* dst, const void* src, size_t size );
|
||||
|
||||
|
||||
// --- Allocator --------------------------------------------------------------
|
||||
//
|
||||
// Replaces Tracy's internal allocator. Drop in the system allocator, an
|
||||
// in-house one, or any third-party allocator you like. Malloc/Free/Realloc
|
||||
// must be thread-safe; ThreadInit is an optional prime, not a precondition.
|
||||
// Finalize must also tear down the calling thread's per-thread state, the
|
||||
// way rpmalloc_finalize() does — Tracy does not call ThreadFinalize for the
|
||||
// shutdown thread before Finalize.
|
||||
//#define TRACY_HAS_CUSTOM_ALLOCATOR
|
||||
void* PlatformMalloc( size_t size );
|
||||
void PlatformFree( void* ptr );
|
||||
void* PlatformRealloc( void* ptr, size_t size );
|
||||
void PlatformAllocatorInit();
|
||||
void PlatformAllocatorThreadInit();
|
||||
void PlatformAllocatorFinalize();
|
||||
void PlatformAllocatorThreadFinalize();
|
||||
|
||||
}
|
||||
|
||||
#endif
|
||||
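The thread id section of the header asks for a kernel-level id and rules out `pthread_self()`. As an illustration of the difference, here is a minimal sketch of what such a hook body could look like on Linux, where the kernel id is available through the `gettid` system call. `ExampleKernelThreadId` is a hypothetical name; a real port would put the equivalent call for its own platform into `PlatformGetThreadId()`.

```cpp
#include <cstdint>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical Linux-only stand-in for a PlatformGetThreadId() body.
// Assumption: the kernel thread id fits the uint32_t the hook returns.
uint32_t ExampleKernelThreadId()
{
    // SYS_gettid returns the id the kernel scheduler uses for this thread,
    // unlike pthread_self(), which returns an opaque library handle.
    return static_cast<uint32_t>( syscall( SYS_gettid ) );
}
```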
0
examples/cuda/README.md
Normal file
39
examples/cuda/graph/CMakeLists.txt
Normal file
@@ -0,0 +1,39 @@
|
||||
cmake_minimum_required(VERSION 3.18)
|
||||
project(CUDAGraphDemo LANGUAGES CXX CUDA)
|
||||
|
||||
set(CMAKE_CXX_STANDARD 17)
|
||||
set(CMAKE_CUDA_STANDARD 17)
|
||||
|
||||
if(CMAKE_VERSION VERSION_GREATER_EQUAL "3.24")
|
||||
set(CMAKE_CUDA_ARCHITECTURES native)
|
||||
endif()
|
||||
|
||||
set(TRACY_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../.."
|
||||
CACHE PATH "Root of the Tracy repository")
|
||||
set(TRACY_PUBLIC "${TRACY_PATH}/public")
|
||||
|
||||
find_package(CUDAToolkit REQUIRED)
|
||||
find_package(Threads REQUIRED)
|
||||
|
||||
# cuda-graph-demo.cu embeds Tracy via #include <TracyClient.cpp> (unity build),
|
||||
# so no separate TracyClient library is needed — just expose the public headers.
|
||||
add_executable(cuda-graph-demo cuda-graph-demo.cu)
|
||||
target_include_directories(cuda-graph-demo PRIVATE ${TRACY_PUBLIC})
|
||||
target_link_libraries(cuda-graph-demo PRIVATE
|
||||
CUDA::cupti CUDA::cuda_driver Threads::Threads ${CMAKE_DL_LIBS})
|
||||
|
||||
# ctest-related integration below
|
||||
# to run the binaries via ctest:
|
||||
# ctest --test-dir <cmake-build-dir> -R <binary-name> -C <build-config>
|
||||
|
||||
enable_testing()
|
||||
add_test(NAME cuda-graph-demo COMMAND cuda-graph-demo)
|
||||
|
||||
# On Windows, CUPTI's DLL must be on PATH at runtime.
|
||||
if(WIN32)
|
||||
set(_cupti_dir "$<TARGET_FILE_DIR:CUDA::cupti>")
|
||||
set_target_properties(cuda-graph-demo PROPERTIES
|
||||
VS_DEBUGGER_ENVIRONMENT "PATH=${_cupti_dir};$ENV{PATH}")
|
||||
set_tests_properties(cuda-graph-demo PROPERTIES
|
||||
ENVIRONMENT "PATH=${_cupti_dir};$ENV{PATH}")
|
||||
endif()
|
||||
11
examples/cuda/graph/build.sh
Normal file
@@ -0,0 +1,11 @@
|
||||
TRACY_PATH=<path-to-tracy>
|
||||
CUDA_TOOLKIT_PATH=/usr/local/cuda
|
||||
CUDA_CUPTI_PATH=${CUDA_TOOLKIT_PATH}/extras/CUPTI
|
||||
|
||||
# pass -v to nvcc for verbose build information
|
||||
nvcc -O2 -std=c++17 cuda-graph-demo.cu \
|
||||
-o cuda-graph-demo \
|
||||
-I "${TRACY_PATH}/public" \
|
||||
-I "${CUDA_CUPTI_PATH}/include" -I "${CUDA_TOOLKIT_PATH}/include" \
|
||||
-L "${CUDA_CUPTI_PATH}/lib64" -L "${CUDA_TOOLKIT_PATH}/lib64" \
|
||||
-lcupti -lcuda
|
||||
146
examples/cuda/graph/cuda-graph-demo.cu
Normal file
@@ -0,0 +1,146 @@
|
||||
#include <cuda_runtime.h>
|
||||
|
||||
// WARN: for simplicity, we enable and "embed" the Tracy client directly into the code
|
||||
#define TRACY_ENABLE
|
||||
#include <TracyClient.cpp>
|
||||
|
||||
#include <tracy/Tracy.hpp>
|
||||
#include <tracy/TracyCUDA.hpp>
|
||||
|
||||
#include <cstdio>
|
||||
#include <cstdlib>
|
||||
#include <vector>
|
||||
|
||||
#define CUDA_CHECK(call) \
|
||||
do { \
|
||||
cudaError_t err__ = (call); \
|
||||
if (err__ != cudaSuccess) { \
|
||||
std::fprintf(stderr, "CUDA error %s at %s:%d: %s\n", \
|
||||
cudaGetErrorName(err__), __FILE__, __LINE__, \
|
||||
cudaGetErrorString(err__)); \
|
||||
std::exit(EXIT_FAILURE); \
|
||||
} \
|
||||
} while (0)
|
||||
|
||||
__global__ void saxpy(float a, const float* x, float* y, int n)
|
||||
{
|
||||
int i = blockIdx.x * blockDim.x + threadIdx.x;
|
||||
if (i < n) y[i] = a * x[i] + y[i];
|
||||
}
|
||||
|
||||
int main()
|
||||
{
|
||||
// CUPTI-backed Tracy context. Auto-captures all CUDA activity from the
|
||||
// point StartProfiling() is called until StopProfiling(). The background
|
||||
// collector thread flushes activity into Tracy; the explicit Collect()
|
||||
// calls below just force a flush at known phase boundaries.
|
||||
auto* cudaCtx = TracyCUDAContext();
|
||||
{
|
||||
constexpr char ctxName[] = "CUDA Graph Demo";
|
||||
TracyCUDAContextName(cudaCtx, ctxName, sizeof(ctxName) - 1);
|
||||
}
|
||||
TracyCUDAStartProfiling(cudaCtx);
|
||||
|
||||
constexpr int N = 1 << 16; // small N => kernel is short => launch overhead dominates
|
||||
constexpr int KERNELS_PER_GRAPH = 32; // chain length captured into the graph
|
||||
constexpr int OUTER_ITERS = 2000; // how many times we replay the chain
|
||||
|
||||
// allocate device buffers
|
||||
float *dX = nullptr, *dY = nullptr;
|
||||
CUDA_CHECK(cudaMalloc(&dX, N * sizeof(float)));
|
||||
CUDA_CHECK(cudaMalloc(&dY, N * sizeof(float)));
|
||||
|
||||
std::vector<float> hX(N, 1.0f);
|
||||
CUDA_CHECK(cudaMemcpy(dX, hX.data(), N * sizeof(float), cudaMemcpyHostToDevice));
|
||||
|
||||
cudaStream_t stream = nullptr;
|
||||
CUDA_CHECK(cudaStreamCreate(&stream));
|
||||
|
||||
const dim3 block(256);
|
||||
const dim3 grid((N + block.x - 1) / block.x);
|
||||
|
||||
cudaEvent_t evStart, evStop;
|
||||
CUDA_CHECK(cudaEventCreate(&evStart));
|
||||
CUDA_CHECK(cudaEventCreate(&evStop));
|
||||
|
||||
// warm-up (so first-launch lazy-init and/or JIT doesn't bias the measurement)
|
||||
saxpy<<<grid, block, 0, stream>>>(0.0f, dX, dY, N);
|
||||
CUDA_CHECK(cudaStreamSynchronize(stream));
|
||||
|
||||
// baseline: launch each kernel directly on the stream
|
||||
float msStream = 0.0f;
|
||||
{
|
||||
ZoneScopedN("stream-launches");
|
||||
CUDA_CHECK(cudaMemsetAsync(dY, 0, N * sizeof(float), stream));
|
||||
CUDA_CHECK(cudaEventRecord(evStart, stream));
|
||||
for (int outer = 0; outer < OUTER_ITERS; ++outer) {
|
||||
for (int k = 0; k < KERNELS_PER_GRAPH; ++k) {
|
||||
saxpy<<<grid, block, 0, stream>>>(1.0e-6f, dX, dY, N);
|
||||
}
|
||||
}
|
||||
CUDA_CHECK(cudaEventRecord(evStop, stream));
|
||||
CUDA_CHECK(cudaEventSynchronize(evStop));
|
||||
CUDA_CHECK(cudaEventElapsedTime(&msStream, evStart, evStop));
|
||||
TracyCUDACollect(cudaCtx);
|
||||
}
|
||||
|
||||
// capture: record the same kernel chain into a graph
|
||||
cudaGraph_t graph = nullptr;
|
||||
cudaGraphExec_t graphExec = nullptr;
|
||||
{
|
||||
ZoneScopedN("graph-capture");
|
||||
// cudaStreamCaptureModeRelaxed allows the calling thread to perform
|
||||
// unrelated CUDA work during capture; ThreadLocal is stricter if you need
|
||||
// isolation. Most short, single-stream captures work fine in either mode.
|
||||
CUDA_CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeRelaxed));
|
||||
for (int k = 0; k < KERNELS_PER_GRAPH; ++k) {
|
||||
saxpy<<<grid, block, 0, stream>>>(1.0e-6f, dX, dY, N);
|
||||
}
|
||||
CUDA_CHECK(cudaStreamEndCapture(stream, &graph));
|
||||
|
||||
// Instantiate once -> reusable executable graph.
|
||||
CUDA_CHECK(cudaGraphInstantiate(&graphExec, graph, nullptr, nullptr, 0));
|
||||
|
||||
// The template graph isn't needed once instantiated.
|
||||
CUDA_CHECK(cudaGraphDestroy(graph));
|
||||
}
|
||||
|
||||
// replay: launch the instantiated graph OUTER_ITERS times
|
||||
float msGraph = 0.0f;
|
||||
{
|
||||
ZoneScopedN("graph-launches");
|
||||
CUDA_CHECK(cudaMemsetAsync(dY, 0, N * sizeof(float), stream));
|
||||
CUDA_CHECK(cudaEventRecord(evStart, stream));
|
||||
for (int outer = 0; outer < OUTER_ITERS; ++outer) {
|
||||
CUDA_CHECK(cudaGraphLaunch(graphExec, stream));
|
||||
}
|
||||
CUDA_CHECK(cudaEventRecord(evStop, stream));
|
||||
CUDA_CHECK(cudaEventSynchronize(evStop));
|
||||
CUDA_CHECK(cudaEventElapsedTime(&msGraph, evStart, evStop));
|
||||
TracyCUDACollect(cudaCtx);
|
||||
}
|
||||
|
||||
// sanity check: y[i] = OUTER_ITERS * KERNELS_PER_GRAPH * 1e-6 * x[i]
|
||||
std::vector<float> hY(N);
|
||||
CUDA_CHECK(cudaMemcpy(hY.data(), dY, N * sizeof(float), cudaMemcpyDeviceToHost));
|
||||
const float expected = float(OUTER_ITERS) * float(KERNELS_PER_GRAPH) * 1.0e-6f;
|
||||
|
||||
std::printf("Stream launches: %8.3f ms (%d kernels)\n",
|
||||
msStream, OUTER_ITERS * KERNELS_PER_GRAPH);
|
||||
std::printf("Graph launches: %8.3f ms (%d graph launches x %d kernels)\n",
|
||||
msGraph, OUTER_ITERS, KERNELS_PER_GRAPH);
|
||||
std::printf("Speedup : %8.2fx\n", msStream / msGraph);
|
||||
std::printf("hY[0] = %.6e (expected %.6e)\n", hY[0], expected);
|
||||
|
||||
// shutdown
|
||||
CUDA_CHECK(cudaGraphExecDestroy(graphExec));
|
||||
CUDA_CHECK(cudaEventDestroy(evStart));
|
||||
CUDA_CHECK(cudaEventDestroy(evStop));
|
||||
CUDA_CHECK(cudaStreamDestroy(stream));
|
||||
CUDA_CHECK(cudaFree(dX));
|
||||
CUDA_CHECK(cudaFree(dY));
|
||||
|
||||
TracyCUDAStopProfiling(cudaCtx);
|
||||
TracyCUDAContextDestroy(cudaCtx);
|
||||
return 0;
|
||||
}
|
||||
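The demo prints `hY[0]` next to the closed-form value `OUTER_ITERS * KERNELS_PER_GRAPH * 1e-6`, but the device reaches its result through 64000 single-precision additions, so the two numbers need not agree bit for bit. The sketch below replays that accumulation on the CPU for one element. `replay_saxpy_chain` is a hypothetical helper written for this illustration and is not part of the demo.

```cpp
#include <cmath>

// Hypothetical helper: repeats y = a * x + y `steps` times starting from
// y = 0, which is what the chained saxpy kernels do to each element of dY.
float replay_saxpy_chain( float a, float x, int steps )
{
    float y = 0.0f;
    for( int i = 0; i < steps; ++i )
    {
        y = a * x + y;   // each addition rounds to the nearest float
    }
    return y;
}

// Compares a measured value against the replayed one with an absolute
// tolerance instead of expecting exact equality with the closed form.
bool matches_replay( float measured, float a, float x, int steps, float tolerance )
{
    return std::fabs( measured - replay_saxpy_chain( a, x, steps ) ) <= tolerance;
}
```

With `a = 1.0e-6f`, `x = 1.0f` and `steps = 2000 * 32`, the replayed value stays close to 0.064. Each step can lose at most half a unit in the last place, so the worst-case drift over the whole chain is bounded well below 1e-3.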
63
examples/dyna/CMakeLists.txt
Normal file
@@ -0,0 +1,63 @@
|
||||
cmake_minimum_required(VERSION 3.29)
|
||||
project(dyna LANGUAGES C CXX)
|
||||
|
||||
option(TRACY_ENABLE "Enable Tracy" ON)
|
||||
|
||||
set(CMAKE_CXX_STANDARD 20)
|
||||
set(CMAKE_CXX_STANDARD_REQUIRED ON)
|
||||
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
|
||||
set(CMAKE_COLOR_DIAGNOSTICS ON)
|
||||
|
||||
include(cmake/CPM.cmake)
|
||||
|
||||
CPMAddPackage(
|
||||
NAME glad
|
||||
VERSION 2.0.8
|
||||
GIT_REPOSITORY https://github.com/Dav1dde/glad.git
|
||||
GIT_TAG glad2
|
||||
)
|
||||
|
||||
add_subdirectory(${glad_SOURCE_DIR}/cmake ${CMAKE_CURRENT_BINARY_DIR}/glad)
|
||||
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/../.. client/)
|
||||
|
||||
glad_add_library(glad_gl_core_33 STATIC API gl:core=3.3)
|
||||
|
||||
find_package(SDL3 REQUIRED)
|
||||
find_package(SDL3_image REQUIRED)
|
||||
|
||||
add_executable(dyna
|
||||
src/main.cpp
|
||||
src/datapath.cpp
|
||||
src/timer.cpp
|
||||
src/gfx.cpp
|
||||
src/texture.cpp
|
||||
src/entity.cpp
|
||||
src/world.cpp
|
||||
src/map.cpp
|
||||
src/player.cpp
|
||||
src/monster.cpp
|
||||
src/bomb.cpp
|
||||
src/bonus.cpp
|
||||
src/game.cpp
|
||||
)
|
||||
|
||||
target_link_libraries(dyna
|
||||
PRIVATE
|
||||
glad_gl_core_33
|
||||
SDL3::SDL3
|
||||
SDL3_image::SDL3_image
|
||||
Tracy::TracyClient
|
||||
)
|
||||
|
||||
target_include_directories(dyna PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/src)
|
||||
|
||||
# Mirror the data/ tree next to the executable so the game finds its assets
|
||||
# when launched from the build directory (paths are resolved via SDL_GetBasePath).
|
||||
add_custom_command(TARGET dyna POST_BUILD
|
||||
COMMAND ${CMAKE_COMMAND} -E copy_directory
|
||||
${CMAKE_CURRENT_SOURCE_DIR}/data
|
||||
$<TARGET_FILE_DIR:dyna>/data
|
||||
COMMENT "Copying data/ next to dyna executable"
|
||||
)
|
||||
|
||||
file(GENERATE OUTPUT .gitignore CONTENT "*")
|
||||
7
examples/dyna/LICENSE
Normal file
@@ -0,0 +1,7 @@
|
||||
Dyna.net copyright 2005 by Bartosz Taudul and Ralf Wrześniewski.
|
||||
|
||||
This program (including source code and the asset it uses) is NOT licensed
|
||||
for any use other than being an example of how to integrate Tracy Profiler.
|
||||
|
||||
The license terms written in other parts of this repository DO NOT apply
|
||||
here.
|
||||
24
examples/dyna/cmake/CPM.cmake
Normal file
@@ -0,0 +1,24 @@
|
||||
# SPDX-License-Identifier: MIT
|
||||
#
|
||||
# SPDX-FileCopyrightText: Copyright (c) 2019-2023 Lars Melchior and contributors
|
||||
|
||||
set(CPM_DOWNLOAD_VERSION 0.42.3)
|
||||
set(CPM_HASH_SUM "a609e875fd532b067174250f6abbc3dac22fe2d64869783fb1e80bda1625c844")
|
||||
|
||||
if(CPM_SOURCE_CACHE)
|
||||
set(CPM_DOWNLOAD_LOCATION "${CPM_SOURCE_CACHE}/cpm/CPM_${CPM_DOWNLOAD_VERSION}.cmake")
|
||||
elseif(DEFINED ENV{CPM_SOURCE_CACHE})
|
||||
set(CPM_DOWNLOAD_LOCATION "$ENV{CPM_SOURCE_CACHE}/cpm/CPM_${CPM_DOWNLOAD_VERSION}.cmake")
|
||||
else()
|
||||
set(CPM_DOWNLOAD_LOCATION "${CMAKE_BINARY_DIR}/cmake/CPM_${CPM_DOWNLOAD_VERSION}.cmake")
|
||||
endif()
|
||||
|
||||
# Expand relative path. This is important if the provided path contains a tilde (~)
|
||||
get_filename_component(CPM_DOWNLOAD_LOCATION ${CPM_DOWNLOAD_LOCATION} ABSOLUTE)
|
||||
|
||||
file(DOWNLOAD
|
||||
https://github.com/cpm-cmake/CPM.cmake/releases/download/v${CPM_DOWNLOAD_VERSION}/CPM.cmake
|
||||
${CPM_DOWNLOAD_LOCATION} EXPECTED_HASH SHA256=${CPM_HASH_SUM}
|
||||
)
|
||||
|
||||
include(${CPM_DOWNLOAD_LOCATION})
|
||||
BIN
examples/dyna/data/gfx/Bomb.png
Normal file
Size: 78 KiB
BIN
examples/dyna/data/gfx/Player.png
Normal file
Size: 93 KiB
BIN
examples/dyna/data/gfx/bonusy.png
Normal file
Size: 16 KiB
BIN
examples/dyna/data/gfx/crate.png
Normal file
Size: 768 B
BIN
examples/dyna/data/gfx/menu.png
Normal file
Size: 144 KiB
BIN
examples/dyna/data/gfx/monster1.png
Normal file
Size: 42 KiB
BIN
examples/dyna/data/gfx/monster2.png
Normal file
Size: 56 KiB
BIN
examples/dyna/data/gfx/monster3.png
Normal file
Size: 54 KiB
BIN
examples/dyna/data/gfx/portal.png
Normal file
Size: 54 KiB
BIN
examples/dyna/data/gfx/sand.png
Normal file
Size: 667 B
BIN
examples/dyna/data/gfx/wall.png
Normal file
Size: 490 B
12
examples/dyna/data/levels/1
Normal file
@@ -0,0 +1,12 @@
|
||||
10 1 0 0
|
||||
@............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
12
examples/dyna/data/levels/2
Normal file
@@ -0,0 +1,12 @@
|
||||
20 4 0 0
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#@#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
12
examples/dyna/data/levels/3
Normal file
@@ -0,0 +1,12 @@
|
||||
40 3 2 0
|
||||
@............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
12
examples/dyna/data/levels/4
Normal file
@@ -0,0 +1,12 @@
|
||||
40 3 3 0
|
||||
@............
|
||||
.###.#.#.###.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.###.#.#.###.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.###.#.#.###.
|
||||
.............
|
||||
12
examples/dyna/data/levels/5
Normal file
@@ -0,0 +1,12 @@
|
||||
40 2 4 1
|
||||
@............
|
||||
.###.#.#.###.
|
||||
.#.........#.
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.#.........#.
|
||||
.###.#.#.###.
|
||||
.............
|
||||
12
examples/dyna/data/levels/6
Normal file
@@ -0,0 +1,12 @@
|
||||
50 2 2 3
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.....#.#.....
|
||||
.#.###.###.#.
|
||||
.............
|
||||
.#.###.###.#.
|
||||
.....#.#.....
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#.#.#.#.#.#@
|
||||
12
examples/dyna/data/levels/7
Normal file
@@ -0,0 +1,12 @@
|
||||
60 3 3 3
|
||||
@............
|
||||
.#.#.###.#.#.
|
||||
.............
|
||||
.###.#.#.###.
|
||||
.............
|
||||
.#.#.###.#.#.
|
||||
.............
|
||||
.###.#.#.###.
|
||||
.............
|
||||
.#.#.###.#.#.
|
||||
.............
|
||||
12
examples/dyna/data/levels/8
Normal file
@@ -0,0 +1,12 @@
|
||||
60 5 3 3
|
||||
@............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#...#.#...#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#...#.#...#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
12
examples/dyna/data/levels/9
Normal file
@@ -0,0 +1,12 @@
|
||||
90 5 5 5
|
||||
@............
|
||||
.#.........#.
|
||||
.............
|
||||
.............
|
||||
.............
|
||||
.............
|
||||
.............
|
||||
.............
|
||||
.............
|
||||
.#.........#.
|
||||
.............
|
||||
12
examples/dyna/data/levels/menu
Normal file
@@ -0,0 +1,12 @@
|
||||
30 4 4 4
|
||||
@............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#...#.#...#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
.#...#.#...#.
|
||||
.............
|
||||
.#.#.#.#.#.#.
|
||||
.............
|
||||
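The level files above share one shape: a first line with four integers, followed by grid rows of equal width. The loader itself is not shown in this part of the diff, so the following is only a sketch of how such a file could be read. `LevelSketch` and `parse_level` are hypothetical names, the meaning of the four header integers is deliberately left open, and treating `@` as a start position is an assumption.

```cpp
#include <array>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one level file.
struct LevelSketch
{
    std::array<int, 4> header{};     // four integers from the first line; meaning not assumed
    std::vector<std::string> rows;   // grid rows, one character per tile
    int start_x = -1;                // column of '@' (assumed start marker), -1 if absent
    int start_y = -1;                // row of '@', -1 if absent
};

// Reads the header and all rows; returns nullopt on a short header,
// an empty grid, or rows of unequal width.
std::optional<LevelSketch> parse_level( std::istream& in )
{
    LevelSketch lvl;
    for( int& v : lvl.header )
    {
        if( !( in >> v ) ) return std::nullopt;
    }

    std::string row;
    while( in >> row )
    {
        if( !lvl.rows.empty() && row.size() != lvl.rows.front().size() ) return std::nullopt;

        const auto at = row.find( '@' );
        if( at != std::string::npos )
        {
            lvl.start_x = static_cast<int>( at );
            lvl.start_y = static_cast<int>( lvl.rows.size() );
        }
        lvl.rows.push_back( row );
    }

    if( lvl.rows.empty() ) return std::nullopt;
    return lvl;
}
```

Rejecting ragged rows up front keeps later tile lookups from needing per-row bounds checks.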
143
examples/dyna/src/bomb.cpp
Normal file
@@ -0,0 +1,143 @@
|
||||
#include "bomb.hpp"
|
||||
|
||||
#include "gfx.hpp"
|
||||
#include "map.hpp"
|
||||
#include "texture.hpp"
|
||||
#include "timer.hpp"
|
||||
#include "world.hpp"
|
||||
|
||||
#include <tracy/Tracy.hpp>
|
||||
|
||||
namespace dyna
|
||||
{
|
||||
|
||||
Bomb::Bomb( int x_, int y_ )
|
||||
: x( x_ )
|
||||
, y( y_ )
|
||||
, left( 9 )
|
||||
{
|
||||
}
|
||||
|
||||
void Bomb::draw()
|
||||
{
|
||||
ZoneScoped;
|
||||
if( stage == Stage::exploding )
|
||||
return;
|
||||
|
||||
if( stage == Stage::appear )
|
||||
{
|
||||
Textures::bomb_appear.bind( 9 - left );
|
||||
}
|
||||
else
|
||||
{
|
||||
int frame = static_cast<int>( ( time - left ) / static_cast<float>( time ) * 8 );
|
||||
if( Timer::get_timestamp() / 100 % 2 == 0 )
|
||||
frame++;
|
||||
Textures::bomb.bind( frame );
|
||||
}
|
||||
|
||||
Gfx::draw_square( x, y );
|
||||
}
|
||||
|
||||
void Bomb::tick( World& world )
|
||||
{
|
||||
ZoneScoped;
|
||||
delta += Timer::delta;
|
||||
|
||||
while( delta > 10 )
|
||||
{
|
||||
delta -= 10;
|
||||
|
||||
if( stage == Stage::appear )
|
||||
{
|
||||
if( left > 0 )
|
||||
{
|
||||
delta -= 10; // each fade-in frame consumes 20 units of delta, not 10
|
||||
left--;
|
||||
}
|
||||
else
|
||||
{
|
||||
stage = Stage::ticking;
|
||||
left = time;
|
||||
}
|
||||
}
|
||||
else if( left > 0 )
|
||||
{
|
||||
left--;
|
||||
}
|
||||
else if( stage == Stage::ticking )
|
||||
{
|
||||
explode( world );
|
||||
}
|
||||
else
|
||||
{
|
||||
die( world );
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void Bomb::explode( World& world )
|
||||
{
|
||||
ZoneScoped;
|
||||
stage = Stage::exploding;
|
||||
left = 200;
|
||||
|
||||
Map& map = world.map();
|
||||
map.at( x, y ) = Field::explosion( Field::ExplosionType::center );
|
||||
|
||||
struct Dir
|
||||
{
|
||||
int dx, dy;
|
||||
Field::ExplosionType through, tip;
|
||||
};
|
||||
const Dir dirs[4] = {
|
||||
{ -1, 0, Field::ExplosionType::horizontal, Field::ExplosionType::left },
|
||||
{ 1, 0, Field::ExplosionType::horizontal, Field::ExplosionType::right },
|
||||
{ 0, -1, Field::ExplosionType::vertical, Field::ExplosionType::up },
|
||||
{ 0, 1, Field::ExplosionType::vertical, Field::ExplosionType::down },
|
||||
};
|
||||
|
||||
for( const Dir& d : dirs )
|
||||
{
|
||||
for( int i = 1; i <= maxrange; i++ )
|
||||
{
|
||||
int tx = x + d.dx * i;
|
||||
int ty = y + d.dy * i;
|
||||
|
||||
if( tx < 0 || tx > map.getx() - 1 || ty < 0 || ty > map.gety() - 1 )
|
||||
break;
|
||||
|
||||
Destruction destr = map.at( tx, ty ).destructible();
|
||||
if( destr == Destruction::none )
|
||||
break;
|
||||
|
||||
etiles.emplace_back( tx, ty );
|
||||
|
||||
if( map.at( tx, ty ).kind == Field::Kind::crate )
|
||||
world.crates_left--;
|
||||
|
||||
if( i == maxrange || destr == Destruction::single )
|
||||
{
|
||||
map.at( tx, ty ) = Field::explosion( d.tip );
|
||||
break;
|
||||
}
|
||||
else
|
||||
{
|
||||
map.at( tx, ty ) = Field::explosion( d.through );
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void Bomb::die( World& world )
|
||||
{
|
||||
ZoneScoped;
|
||||
dead = true;
|
||||
|
||||
Map& map = world.map();
|
||||
map.at( x, y ) = Field::floor();
|
||||
for( const auto& [tx, ty] : etiles )
|
||||
map.at( tx, ty ) = Field::floor();
|
||||
}
|
||||
|
||||
}
|
||||
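`Bomb::tick` advances its state with a fixed-step accumulator: elapsed time is added to `delta`, and the state machine runs once for every 10 units that can be taken back out. The pattern in isolation looks roughly like the sketch below. `consume_fixed_steps` is a hypothetical helper, not a function from this example, and it keeps the same strict `>` comparison as the loop in the source.

```cpp
// Hypothetical helper showing the accumulator loop on its own.
// Adds `elapsed` to `accumulator`, then removes whole steps while more than
// one step remains. Returns how many steps were consumed.
// `step` must be positive, otherwise the loop would never terminate.
int consume_fixed_steps( int& accumulator, int elapsed, int step )
{
    accumulator += elapsed;

    int steps = 0;
    while( accumulator > step )
    {
        accumulator -= step;
        ++steps;
    }
    return steps;
}
```

The remainder stays in the accumulator, so time that does not fill a whole step is carried into the next call instead of being dropped. This keeps the simulation rate independent of the frame rate.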
44
examples/dyna/src/bomb.hpp
Normal file
@@ -0,0 +1,44 @@
|
||||
#pragma once
|
||||
|
||||
#include <utility>
|
||||
#include <vector>
|
||||
|
||||
namespace dyna
|
||||
{
|
||||
|
||||
class World;
|
||||
|
||||
// A bomb on the grid: fades in, counts down, then paints a cross-shaped
|
||||
// explosion onto the map and clears it again. Ported from bomb.cs.
|
||||
class Bomb
|
||||
{
|
||||
public:
|
||||
Bomb( int x, int y );
|
||||
|
||||
void draw();
|
||||
void tick( World& world );
|
||||
|
||||
bool is_dead() const { return dead; }
|
||||
|
||||
private:
|
||||
void explode( World& world );
|
||||
void die( World& world );
|
||||
|
||||
enum class Stage
|
||||
{
|
||||
appear,
|
||||
ticking,
|
||||
exploding
|
||||
};
|
||||
|
||||
int x, y; // grid coordinates
|
||||
Stage stage = Stage::appear;
|
||||
int left;
|
||||
int delta = 0;
|
||||
static constexpr int time = 150;
|
||||
static constexpr int maxrange = 1;
|
||||
std::vector<std::pair<int, int>> etiles; // tiles to revert to floor
|
||||
bool dead = false;
|
||||
};
|
||||
|
||||
}
|
||||
58
examples/dyna/src/bonus.cpp
Normal file
@@ -0,0 +1,58 @@
|
||||
#include "bonus.hpp"
|
||||
|
||||
#include "gfx.hpp"
|
||||
#include "texture.hpp"
|
||||
#include "timer.hpp"
|
||||
|
||||
#include <tracy/Tracy.hpp>
|
||||
|
||||
namespace dyna
|
||||
{
|
||||
|
||||
Vortex::Vortex( int gx, int gy )
|
||||
{
|
||||
x = gx; // stored in grid units, drawn via draw_square
|
||||
y = gy;
|
||||
set_action( Action::appear );
|
||||
left = 79;
|
||||
}
|
||||
|
||||
void Vortex::draw()
|
||||
{
|
||||
ZoneScoped;
|
||||
int frame = static_cast<int>( ( Timer::get_timestamp() - action_start ) / 40 );
|
||||
|
||||
switch( action )
|
||||
{
|
||||
case Action::appear:
|
||||
Textures::vortex_appear.bind( frame );
|
||||
break;
|
||||
case Action::wait:
|
||||
Textures::vortex.bind( frame );
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
|
||||
Gfx::draw_square( x, y );
|
||||
}
|
||||
|
||||
void Vortex::tick( World& )
|
||||
{
|
||||
ZoneScoped;
|
||||
delta += Timer::delta;
|
||||
|
||||
while( delta > 10 )
|
||||
{
|
||||
delta -= 10;
|
||||
|
||||
if( left > 0 )
|
||||
left--;
|
||||
else if( action == Action::appear )
|
||||
set_action( Action::wait );
|
||||
}
|
||||
}
|
||||
|
||||
void Vortex::die( World& ) {}
|
||||
|
||||
}
|
||||
20
examples/dyna/src/bonus.hpp
Normal file
@@ -0,0 +1,20 @@
#pragma once

#include "entity.hpp"

namespace dyna
{

// The level-exit portal. Unlike the other entities its coordinates are stored in
// grid units (it draws via draw_square), matching bonus.cs.
class Vortex : public Entity
{
public:
    Vortex( int gx, int gy );

    void draw() override;
    void tick( World& world ) override;
    void die( World& world ) override;
};

}
25
examples/dyna/src/datapath.cpp
Normal file
@@ -0,0 +1,25 @@
#include "datapath.hpp"

#include <SDL3/SDL.h>

#include <tracy/Tracy.hpp>

namespace dyna
{

std::string data_path( const std::string& rel )
{
    ZoneScoped;
    ZoneText( rel.c_str(), rel.size() );

    // SDL_GetBasePath returns the executable's directory (with a trailing
    // separator). The string is owned by SDL, so copy it once and reuse the
    // copy for the program's lifetime.
    static const std::string base = []
    {
        const char* p = SDL_GetBasePath();
        return std::string( p ? p : "" );
    }();
    return base + rel;
}

}
14
examples/dyna/src/datapath.hpp
Normal file
@@ -0,0 +1,14 @@
#pragma once

#include <string>

namespace dyna
{

// Resolve a path relative to the directory containing the executable, so the
// game finds its data files regardless of the current working directory (e.g.
// when launched from the build tree). The data/ tree is copied next to the
// binary at build time; see CMakeLists.txt.
std::string data_path( const std::string& rel );

}
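The resolve-once-then-concatenate behaviour described for `data_path` can be sketched without SDL. In the sketch below, `fake_base_path`, `g_base_queries` and `resolve` are hypothetical names; `fake_base_path` only stands in for the real base-path query so that the caching is observable.

```cpp
#include <string>

// Hypothetical stand-in for the base-path query; counts how often it runs.
int g_base_queries = 0;

const char* fake_base_path()
{
    ++g_base_queries;
    return "/opt/dyna/"; // assumed to end with a separator, like the real call
}

// Resolve `rel` against the cached base directory.
std::string resolve( const std::string& rel )
{
    // Function-local static: the lambda runs once, on the first call only.
    static const std::string base = []
    {
        const char* p = fake_base_path();
        return std::string( p ? p : "" );
    }();
    return base + rel;
}
```

Since the base already carries its trailing separator, plain string concatenation is enough and no path-joining helper is needed.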
39
examples/dyna/src/entity.cpp
Normal file
@@ -0,0 +1,39 @@
#include "entity.hpp"

#include "map.hpp"
#include "timer.hpp"

namespace dyna
{

void Entity::set_action( Action a )
{
    action = a;
    action_start = Timer::get_timestamp();
}

bool Entity::can_move( Action a, const Map& map ) const
{
    switch( a )
    {
    case Action::up:
        return y > 0 && !map.at( x / 64, y / 64 - 1 ).solid();
    case Action::down:
        return y / 64 < map.gety() - 1 && !map.at( x / 64, y / 64 + 1 ).solid();
    case Action::left:
        return x > 0 && !map.at( x / 64 - 1, y / 64 ).solid();
    case Action::right:
        return x / 64 < map.getx() - 1 && !map.at( x / 64 + 1, y / 64 ).solid();
    default:
        return true;
    }
}

bool Entity::killed( const Map& map ) const
{
    int tx = ( x + 32 ) / 64;
    int ty = ( y + 32 ) / 64;
    return map.at( tx, ty ).kind == Field::Kind::explosion;
}

}
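`can_move` and `killed` convert pixel coordinates to tiles in two different ways: the first uses the tile containing the sprite's top-left corner, the second the tile under the sprite's centre. A small sketch of both conversions, with hypothetical helper names and the 64-pixel tile size taken from the code above:

```cpp
constexpr int TILE = 64; // pixels per tile, as used throughout entity.cpp

// Tile containing the sprite's top-left corner (the lookup in can_move).
constexpr int tile_of_origin( int px ) { return px / TILE; }

// Tile under the sprite's centre (the lookup in killed).
constexpr int tile_of_center( int px ) { return ( px + TILE / 2 ) / TILE; }

// Both helpers assume non-negative coordinates: integer division truncates
// toward zero, so negative inputs would not round down.
static_assert( tile_of_origin( 127 ) == 1, "corner is still in tile 1" );
static_assert( tile_of_center( 127 ) == 2, "centre has crossed into tile 2" );
```

The two results differ whenever an entity is more than half a tile past a tile boundary.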
52
examples/dyna/src/entity.hpp
Normal file
@@ -0,0 +1,52 @@
#pragma once

#include <cstdint>

namespace dyna
{

class Map;
class World;

// Movement/state verbs shared by the player and monsters. In the C# source this
// lived as Entity.Action; promoted to namespace scope so Game can refer to it.
enum class Action
{
    wait,
    up,
    down,
    left,
    right,
    death,
    place_bomb,
    appear
};

// Base for everything that moves on the grid. Coordinates are in pixels
// (64 per tile) and laid out top-left origin, matching entity.cs.
class Entity
{
public:
    virtual ~Entity() = default;

    virtual void set_action( Action a );

    int getx() const { return x; }
    int gety() const { return y; }

    virtual void draw() = 0;
    virtual void tick( World& world ) = 0;
    virtual void die( World& world ) = 0;

protected:
    bool can_move( Action a, const Map& map ) const;
    virtual bool killed( const Map& map ) const;

    int x = 0, y = 0;
    std::int64_t action_start = 0;
    int delta = 0;
    Action action = Action::wait;
    int left = 0;
};

}
210
examples/dyna/src/game.cpp
Normal file
@@ -0,0 +1,210 @@
#include "game.hpp"

#include "datapath.hpp"
#include "gfx.hpp"
#include "map.hpp"
#include "player.hpp"
#include "timer.hpp"
#include "world.hpp"

#include <SDL3/SDL.h>
#include <tracy/Tracy.hpp>

#include <string>

namespace dyna
{

namespace Game
{

namespace
{

struct TracySection
{
    explicit TracySection( const char* name ) { Enter( name ); }
    ~TracySection() { Leave(); }

    void Enter( const char* name )
    {
        idx = TracySectionEnter( "%s", name );
    }

    void Leave()
    {
        if( idx > 0 )
        {
            TracySectionLeave( idx );
            idx = 0;
        }
    }

private:
    uint32_t idx;
};
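`TracySection` is a scope guard that can also be left and re-entered by hand, which `menu_loop` uses to suspend the menu section while a game runs. The sketch below shows the same shape with a plain event log in place of the profiler. `Section` and `g_log` are hypothetical, and the early `leave()` inside `enter()` is an extra safeguard added for this sketch, not something the original does.

```cpp
#include <string>
#include <vector>

// Hypothetical event log standing in for the profiler calls.
std::vector<std::string> g_log;

struct Section
{
    explicit Section( const char* name ) { enter( name ); }
    ~Section() { leave(); }
    Section( const Section& ) = delete;
    Section& operator=( const Section& ) = delete;

    void enter( const char* name )
    {
        leave(); // sketch-only safeguard against entering twice
        current = name;
        active = true;
        g_log.push_back( "enter " + current );
    }

    void leave()
    {
        if( !active ) return; // already left: nothing to close
        g_log.push_back( "leave " + current );
        active = false;
    }

private:
    std::string current;
    bool active = false;
};
```

The `active` flag plays the role of the `idx > 0` check: the destructor can always call `leave()` safely, whether or not the section was closed manually beforehand.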

SDL_Keycode key = 0; // most recently pressed movement key
bool help = false;

// Run one level to completion. Returns true if the player asked to quit the
// whole application (window close), false if the level simply ended (death,
// escape, or reaching the exit) and control should return to the caller.
bool level_loop( World& world )
{
    TracySection section( ( std::string( "Level " ) + world.name() ).c_str() );

    Player* p = world.player();

    for( ;; )
    {
        SDL_Event ev;
        while( SDL_PollEvent( &ev ) )
        {
            if( ev.type == SDL_EVENT_QUIT )
                return true;

            if( ev.type == SDL_EVENT_KEY_DOWN && !ev.key.repeat )
            {
                switch( ev.key.key )
                {
                case SDLK_ESCAPE:
                    world.killed = true;
                    return false;
                case SDLK_LEFT:
                    key = SDLK_LEFT;
                    p->move( Action::left );
                    break;
                case SDLK_RIGHT:
                    key = SDLK_RIGHT;
                    p->move( Action::right );
                    break;
                case SDLK_UP:
                    key = SDLK_UP;
                    p->move( Action::up );
                    break;
                case SDLK_DOWN:
                    key = SDLK_DOWN;
                    p->move( Action::down );
                    break;
                case SDLK_SPACE:
                    world.map().place_bomb( ( p->getx() + 32 ) / 64, ( p->gety() + 32 ) / 64 );
                    break;
                default:
                    break;
                }
            }

            if( ev.type == SDL_EVENT_KEY_UP )
            {
                switch( ev.key.key )
                {
                case SDLK_LEFT:
                    if( key == SDLK_LEFT ) p->move( Action::wait );
                    break;
                case SDLK_RIGHT:
                    if( key == SDLK_RIGHT ) p->move( Action::wait );
                    break;
                case SDLK_UP:
                    if( key == SDLK_UP ) p->move( Action::wait );
                    break;
                case SDLK_DOWN:
                    if( key == SDLK_DOWN ) p->move( Action::wait );
                    break;
                default:
                    break;
                }
            }
        }

        Gfx::clear();

        Timer::tick();

        world.tick();
        world.draw();

        Gfx::swap();

        if( world.killed || world.next_level )
            return false;
    }
}

// Play through the levels in order. Returns true if the application should quit.
bool new_game()
{
    TracySection section( "In-game" );

    int level = 1;

    for( ;; )
    {
        World world( data_path( "data/levels/" + std::to_string( level ) ), true );

        if( level_loop( world ) )
            return true; // window closed

        if( world.killed )
            return false; // died or escaped to the menu
        if( ++level >= 10 )
            return false; // cleared the last level
    }
}

} // namespace

void menu_loop()
{
    constexpr const char* sectionName = "Main menu";
    TracySection section( sectionName );

    World world( data_path( "data/levels/menu" ), false );

    for( ;; )
    {
        SDL_Event ev;
        while( SDL_PollEvent( &ev ) )
        {
            if( ev.type == SDL_EVENT_QUIT )
                return;

            if( ev.type == SDL_EVENT_KEY_DOWN && !ev.key.repeat )
            {
                switch( ev.key.key )
                {
                case SDLK_ESCAPE:
                    return;
                case SDLK_SPACE:
                    section.Leave();
                    if( new_game() )
                        return; // window closed during play
                    section.Enter( sectionName );
                    break;
                case SDLK_H:
                    help = !help;
                    break;
                default:
                    break;
                }
            }
        }

        Gfx::clear();

        Timer::tick();
        world.tick();
        world.draw();

        if( help )
            Gfx::show_help();
        else
            Gfx::show_menu();

        Gfx::swap();
    }
}

} // namespace Game

}
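The `key` variable in `level_loop` makes a key release stop the player only when it belongs to the most recently pressed direction, so rolling from one arrow key to another does not interrupt movement. A stand-alone sketch of that rule follows; `Dir` and `MoveInput` are hypothetical types, not part of the game.

```cpp
// Hypothetical model of the "latest key wins" rule in level_loop.
enum class Dir { none, left, right, up, down };

struct MoveInput
{
    Dir last = Dir::none;   // most recently pressed direction
    Dir moving = Dir::none; // direction the player is currently moving in

    // A key press always takes over the movement.
    void press( Dir d )
    {
        last = d;
        moving = d;
    }

    // A key release stops movement only if it is the latest key pressed.
    void release( Dir d )
    {
        if( d == last )
            moving = Dir::none;
    }
};
```

With this rule, pressing left, pressing up and then releasing left keeps the player moving up.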
14
examples/dyna/src/game.hpp
Normal file
@@ -0,0 +1,14 @@
#pragma once

namespace dyna
{

// Top-level game flow, ported from game.cs. The C# original kept the running
// game's state (player, map, win/lose flags) in static fields; that state now
// lives in a World object owned by the loops below, so nothing leaks out here.
namespace Game
{
void menu_loop();
}

}
517
examples/dyna/src/gfx.cpp
Normal file
@@ -0,0 +1,517 @@
#include "gfx.hpp"

#include "texture.hpp"
#include "timer.hpp"

#include <SDL3/SDL.h>
#include <tracy/Tracy.hpp>

#include <cassert>
#include <cstdio>
#include <vector>

namespace dyna
{

namespace
{

SDL_Window* g_window = nullptr;
SDL_GLContext g_gl_context = nullptr;

GLuint g_program = 0;
GLuint g_vao = 0;
GLuint g_vbo = 0;

// Current draw state, applied to every quad appended to the batch.
GLuint g_current_tex = 0;
int g_current_layer = 0;
float g_alpha = 1.0f;

// One vertex of the streaming batch: screen position, atlas-array texcoord,
// the array layer to sample and a per-vertex alpha multiplier.
struct GlVert
{
    float px, py, tx, ty, layer, a;
};

// A run of consecutive vertices that share one texture, drawn in a single call.
struct DrawCmd
{
    GLuint tex;
    GLsizei count;
};

std::vector<GlVert> g_verts;
std::vector<DrawCmd> g_cmds;

const char* VERT_SRC = R"(
#version 330 core
uniform mat4 uProjection;
layout(location = 0) in vec2 aPosition;
layout(location = 1) in vec2 aTexCoord;
layout(location = 2) in float aLayer;
layout(location = 3) in float aAlpha;
out vec3 vTexCoord;
out float vAlpha;
void main() {
    gl_Position = uProjection * vec4(aPosition, 0.0, 1.0);
    vTexCoord = vec3(aTexCoord, aLayer);
    vAlpha = aAlpha;
}
)";

const char* FRAG_SRC = R"(
#version 330 core
uniform sampler2DArray uTexture;
in vec3 vTexCoord;
in float vAlpha;
out vec4 fragColor;
void main() {
    fragColor = texture(uTexture, vTexCoord) * vec4(1.0, 1.0, 1.0, vAlpha);
}
)";

GLuint compile_shader( GLenum type, const char* src )
{
    ZoneScoped;
    GLuint s = glCreateShader( type );
    glShaderSource( s, 1, &src, nullptr );
    glCompileShader( s );
    GLint ok = 0;
    glGetShaderiv( s, GL_COMPILE_STATUS, &ok );
    if( !ok )
    {
        char log[512];
        glGetShaderInfoLog( s, 512, nullptr, log );
        std::fprintf( stderr, "Shader compile error: %s\n", log );
        glDeleteShader( s );
        return 0;
    }
    return s;
}

bool init_shaders()
{
    ZoneScoped;
    GLuint vs = compile_shader( GL_VERTEX_SHADER, VERT_SRC );
    if( !vs ) return false;
    GLuint fs = compile_shader( GL_FRAGMENT_SHADER, FRAG_SRC );
    if( !fs )
    {
        glDeleteShader( vs );
        return false;
    }

    g_program = glCreateProgram();
    glAttachShader( g_program, vs );
    glAttachShader( g_program, fs );
    glLinkProgram( g_program );
    glDeleteShader( vs );
    glDeleteShader( fs );

    GLint ok = 0;
    glGetProgramiv( g_program, GL_LINK_STATUS, &ok );
    if( !ok )
    {
        char log[512];
        glGetProgramInfoLog( g_program, 512, nullptr, log );
        std::fprintf( stderr, "Program link error: %s\n", log );
        glDeleteProgram( g_program );
        g_program = 0;
        return false;
    }

    // Bottom-left origin orthographic projection, matching the original
    // gluOrtho2D(0, w, 0, h) so the ported draw code carries over verbatim.
    float l = 0.0f, r = static_cast<float>( Gfx::w );
    float b = 0.0f, t = static_cast<float>( Gfx::h );
    float proj[16] = {
        2.0f / ( r - l ), 0.0f, 0.0f, 0.0f,
        0.0f, 2.0f / ( t - b ), 0.0f, 0.0f,
        0.0f, 0.0f, -1.0f, 0.0f,
        -( r + l ) / ( r - l ), -( t + b ) / ( t - b ), 0.0f, 1.0f };

    glUseProgram( g_program );
    glUniformMatrix4fv( glGetUniformLocation( g_program, "uProjection" ), 1, GL_FALSE, proj );
    glUniform1i( glGetUniformLocation( g_program, "uTexture" ), 0 );
    glUseProgram( 0 );
    return true;
}
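The `proj` array in `init_shaders` is stored column by column, which is why the translation terms sit in the last four entries. The sketch below rebuilds the same matrix on the CPU and applies it to a 2D point, to show how screen pixels map to normalized device coordinates. `ortho2d`, `Ndc` and `to_ndc` are hypothetical helpers written for this illustration.

```cpp
#include <array>

// Column-major 4x4 orthographic matrix for a 2D view, with the same entries
// as the proj array above.
std::array<float, 16> ortho2d( float l, float r, float b, float t )
{
    return { 2.0f / ( r - l ), 0.0f, 0.0f, 0.0f,
             0.0f, 2.0f / ( t - b ), 0.0f, 0.0f,
             0.0f, 0.0f, -1.0f, 0.0f,
             -( r + l ) / ( r - l ), -( t + b ) / ( t - b ), 0.0f, 1.0f };
}

struct Ndc
{
    float x, y;
};

// Multiply the matrix by (x, y, 0, 1). In column-major storage the element
// at (row, col) is m[col * 4 + row].
Ndc to_ndc( const std::array<float, 16>& m, float x, float y )
{
    return { m[0] * x + m[4] * y + m[12],
             m[1] * x + m[5] * y + m[13] };
}
```

For the 832x704 screen, the bottom-left pixel corner lands on (-1, -1) and the top-right corner on (1, 1), up to floating-point rounding.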

void init_quad_vao()
{
    ZoneScoped;
    glGenVertexArrays( 1, &g_vao );
    glGenBuffers( 1, &g_vbo );

    glBindVertexArray( g_vao );
    glBindBuffer( GL_ARRAY_BUFFER, g_vbo );

    const GLsizei stride = sizeof( GlVert );
    glEnableVertexAttribArray( 0 );
    glVertexAttribPointer( 0, 2, GL_FLOAT, GL_FALSE, stride, (void*)0 );
    glEnableVertexAttribArray( 1 );
    glVertexAttribPointer( 1, 2, GL_FLOAT, GL_FALSE, stride, (void*)8 );
    glEnableVertexAttribArray( 2 );
    glVertexAttribPointer( 2, 1, GL_FLOAT, GL_FALSE, stride, (void*)16 );
    glEnableVertexAttribArray( 3 );
    glVertexAttribPointer( 3, 1, GL_FLOAT, GL_FALSE, stride, (void*)20 );

    glBindVertexArray( 0 );
    glBindBuffer( GL_ARRAY_BUFFER, 0 );
}

// Draw and clear everything accumulated since the last flush, in submission
// order. Consecutive quads that share a texture collapse into one draw call.
void flush_batch()
{
    ZoneScoped;
    if( g_verts.empty() )
        return;

    glBindBuffer( GL_ARRAY_BUFFER, g_vbo );
    glBufferData( GL_ARRAY_BUFFER,
                  static_cast<GLsizeiptr>( g_verts.size() * sizeof( GlVert ) ),
                  g_verts.data(), GL_STREAM_DRAW );

    glUseProgram( g_program );
    glBindVertexArray( g_vao );

    GLint offset = 0;
    for( const DrawCmd& cmd : g_cmds )
    {
        glBindTexture( GL_TEXTURE_2D_ARRAY, cmd.tex );
        glDrawArrays( GL_TRIANGLES, offset, cmd.count );
        offset += cmd.count;
    }

    glBindVertexArray( 0 );
    glUseProgram( 0 );
    glBindBuffer( GL_ARRAY_BUFFER, 0 );

    g_verts.clear();
    g_cmds.clear();
}
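The batching scheme amounts to run-length grouping by texture: a quad extends the last command when the texture matches and starts a new command otherwise, so submission order is never changed. A GL-free sketch of that bookkeeping is below; `Cmd`, `add_quad` and `draw_offsets` are hypothetical names.

```cpp
#include <vector>

// One run of vertices sharing a texture (mirrors DrawCmd, without GL types).
struct Cmd
{
    unsigned tex;
    int count;
};

// Record one quad (two triangles, six vertices) drawn with `tex`.
void add_quad( std::vector<Cmd>& cmds, unsigned tex )
{
    if( !cmds.empty() && cmds.back().tex == tex )
        cmds.back().count += 6; // same texture as the previous quad: extend the run
    else
        cmds.push_back( { tex, 6 } ); // texture changed: start a new run
}

// First-vertex offset each run would use when the batch is drawn.
std::vector<int> draw_offsets( const std::vector<Cmd>& cmds )
{
    std::vector<int> offsets;
    int offset = 0;
    for( const Cmd& c : cmds )
    {
        offsets.push_back( offset );
        offset += c.count;
    }
    return offsets;
}
```

Note that only adjacent quads merge: drawing textures 1, 1, 2, 1 yields three runs, not two, because reordering would break painter ordering.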

// Frame image capture, following the OpenGL example in the Tracy manual. The
// backbuffer is downscaled on the GPU to a small fixed size and read back
// asynchronously, so a screenshot can be attached to every frame without
// stalling the CPU on the GPU. Several buffer sets are cycled because rendering
// runs a few frames ahead of the GPU.
// Half the render resolution, preserving its aspect ratio; both dimensions
// stay divisible by 4 as FrameImage requires.
constexpr int FI_W = Gfx::w / 2;
constexpr int FI_H = Gfx::h / 2;
constexpr int FI_COUNT = 4;

GLuint g_fi_texture[FI_COUNT];
GLuint g_fi_framebuffer[FI_COUNT];
GLuint g_fi_pbo[FI_COUNT];
GLsync g_fi_fence[FI_COUNT] = {};
int g_fi_idx = 0;
std::vector<int> g_fi_queue;

void init_frame_images()
{
    ZoneScoped;
    glGenTextures( FI_COUNT, g_fi_texture );
    glGenFramebuffers( FI_COUNT, g_fi_framebuffer );
    glGenBuffers( FI_COUNT, g_fi_pbo );
    for( int i = 0; i < FI_COUNT; i++ )
    {
        glBindTexture( GL_TEXTURE_2D, g_fi_texture[i] );
        glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST );
        glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST );
        glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA, FI_W, FI_H, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr );

        glBindFramebuffer( GL_FRAMEBUFFER, g_fi_framebuffer[i] );
        glFramebufferTexture2D( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, g_fi_texture[i], 0 );

        glBindBuffer( GL_PIXEL_PACK_BUFFER, g_fi_pbo[i] );
        glBufferData( GL_PIXEL_PACK_BUFFER, FI_W * FI_H * 4, nullptr, GL_STREAM_READ );
    }
    glBindFramebuffer( GL_FRAMEBUFFER, 0 );
    glBindBuffer( GL_PIXEL_PACK_BUFFER, 0 );
}

void shutdown_frame_images()
{
    ZoneScoped;
    glDeleteTextures( FI_COUNT, g_fi_texture );
    glDeleteFramebuffers( FI_COUNT, g_fi_framebuffer );
    glDeleteBuffers( FI_COUNT, g_fi_pbo );
}

// Send any captures the GPU has already finished, then queue a capture of the
// frame just rendered. Call after the batch is flushed but before swapping.
void capture_frame_image()
{
    ZoneScoped;

    // Hand finished captures from earlier frames to the profiler. The queue
    // size is the number of frames we are still ahead of the GPU, which is the
    // frame lag Tracy needs as the FrameImage offset.
    while( !g_fi_queue.empty() )
    {
        const int idx = g_fi_queue.front();
        if( glClientWaitSync( g_fi_fence[idx], 0, 0 ) == GL_TIMEOUT_EXPIRED ) break;
        glDeleteSync( g_fi_fence[idx] );
        glBindBuffer( GL_PIXEL_PACK_BUFFER, g_fi_pbo[idx] );
        void* ptr = glMapBufferRange( GL_PIXEL_PACK_BUFFER, 0, FI_W * FI_H * 4, GL_MAP_READ_BIT );
        FrameImage( ptr, FI_W, FI_H, g_fi_queue.size(), true );
        glUnmapBuffer( GL_PIXEL_PACK_BUFFER );
        g_fi_queue.erase( g_fi_queue.begin() );
    }

    // Downscale the current backbuffer into the next buffer set and start an
    // asynchronous read-back, signalled by a fence.
    assert( g_fi_queue.empty() || g_fi_queue.front() != g_fi_idx ); // buffer overrun
    glBindFramebuffer( GL_DRAW_FRAMEBUFFER, g_fi_framebuffer[g_fi_idx] );
    glBlitFramebuffer( 0, 0, Gfx::w, Gfx::h, 0, 0, FI_W, FI_H, GL_COLOR_BUFFER_BIT, GL_LINEAR );
    glBindFramebuffer( GL_DRAW_FRAMEBUFFER, 0 );
    glBindFramebuffer( GL_READ_FRAMEBUFFER, g_fi_framebuffer[g_fi_idx] );
    glBindBuffer( GL_PIXEL_PACK_BUFFER, g_fi_pbo[g_fi_idx] );
    glReadPixels( 0, 0, FI_W, FI_H, GL_RGBA, GL_UNSIGNED_BYTE, nullptr );
    glBindFramebuffer( GL_READ_FRAMEBUFFER, 0 );
    g_fi_fence[g_fi_idx] = glFenceSync( GL_SYNC_GPU_COMMANDS_COMPLETE, 0 );
    g_fi_queue.emplace_back( g_fi_idx );
    g_fi_idx = ( g_fi_idx + 1 ) % FI_COUNT;
}
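The queue bookkeeping in `capture_frame_image` can be followed without a GPU. In the sketch below the number of captures the GPU has finished is passed in by hand instead of being polled through fences. `CaptureRing` and its `lags` record are hypothetical; the recorded lag is the queue size at the moment a capture is retired, as in the code above.

```cpp
#include <cassert>
#include <deque>
#include <vector>

constexpr int SLOTS = 4; // same count as FI_COUNT

// Hypothetical CPU-only model of the capture ring.
struct CaptureRing
{
    std::deque<int> in_flight; // slot indices awaiting read-back, oldest first
    int next = 0;              // slot the next capture will use
    std::vector<int> lags;     // lag reported for each retired capture

    // Simulate one frame. `finished` is how many of the oldest captures the
    // GPU has completed since the previous frame.
    void frame( int finished )
    {
        while( finished-- > 0 && !in_flight.empty() )
        {
            lags.push_back( static_cast<int>( in_flight.size() ) );
            in_flight.pop_front();
        }
        // Reusing a slot that is still pending would overwrite its pixels.
        assert( in_flight.empty() || in_flight.front() != next );
        in_flight.push_back( next );
        next = ( next + 1 ) % SLOTS;
    }
};
```

If the GPU fell four or more frames behind, the assertion would fire, which corresponds to the buffer overrun check in the original.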

} // namespace

namespace Render
{

bool init()
{
    ZoneScoped;
    if( !init_shaders() ) return false;
    init_quad_vao();
    init_frame_images();
    return true;
}

void shutdown()
{
    ZoneScoped;
    shutdown_frame_images();
    if( g_vbo ) glDeleteBuffers( 1, &g_vbo );
    if( g_vao ) glDeleteVertexArrays( 1, &g_vao );
    if( g_program ) glDeleteProgram( g_program );
    g_vbo = g_vao = g_program = 0;
}

void use_texture( GLuint tex, int layer )
{
    g_current_tex = tex;
    g_current_layer = layer;
}

GLuint make_texture( int w, int h, int layers, const void* rgba )
{
    ZoneScoped;
    GLuint tex = 0;
    glGenTextures( 1, &tex );
    glBindTexture( GL_TEXTURE_2D_ARRAY, tex );
    glTexParameteri( GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_NEAREST );
    glTexParameteri( GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_NEAREST );
    glTexParameteri( GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
    glTexParameteri( GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
    glPixelStorei( GL_UNPACK_ALIGNMENT, 1 );
    glTexImage3D( GL_TEXTURE_2D_ARRAY, 0, GL_RGBA, w, h, layers, 0, GL_RGBA, GL_UNSIGNED_BYTE, rgba );
    return tex;
}

} // namespace Render

namespace Gfx
{

void clear()
{
    glClear( GL_COLOR_BUFFER_BIT );
}

void swap()
{
    ZoneScoped;
    flush_batch();
    capture_frame_image();
    SDL_GL_SwapWindow( g_window );
    FrameMark;
}

void alpha( float a )
{
    g_alpha = a;
}

void draw_quad( const Vertex corners[4] )
{
    ZoneScoped;
    // Two triangles, vertices appended in submission order so painter ordering
    // (and the transient per-monster alpha) is preserved by the batch.
    const int idx[6] = { 0, 1, 2, 0, 2, 3 };
    for( int i : idx )
    {
        const Vertex& c = corners[i];
        g_verts.push_back( { c.x, c.y, c.u, c.v,
                             static_cast<float>( g_current_layer ), g_alpha } );
    }

    if( !g_cmds.empty() && g_cmds.back().tex == g_current_tex )
        g_cmds.back().count += 6;
    else
        g_cmds.push_back( { g_current_tex, 6 } );
}

void draw_sprite( int x, int y )
{
    ZoneScoped;
    float fx = static_cast<float>( x );
    float fy = static_cast<float>( y );
    float top = static_cast<float>( h ) - fy;
    float bottom = static_cast<float>( h ) - ( fy + 64.0f );
    Vertex corners[4] = {
        { fx, top, 0.0f, 0.0f },
        { fx + 64.0f, top, 1.0f, 0.0f },
        { fx + 64.0f, bottom, 1.0f, 1.0f },
        { fx, bottom, 0.0f, 1.0f },
    };
    draw_quad( corners );
}

void draw_square( int x, int y )
{
    draw_sprite( x * 64, y * 64 );
}
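`draw_sprite` takes `y` measured from the top of the screen and flips it into the bottom-left-origin space the projection uses. A short sketch of that flip follows, with the screen height and tile size copied from the code above and `Span` / `sprite_span` as hypothetical names.

```cpp
constexpr int SCREEN_H = 704; // same value as Gfx::h
constexpr int TILE = 64;      // sprite edge length in pixels

// Vertical extent of a sprite in bottom-left-origin coordinates.
struct Span
{
    float top, bottom;
};

// Convert a top-left based y into the y range covered on screen.
Span sprite_span( int y_from_top )
{
    const float fy = static_cast<float>( y_from_top );
    return { SCREEN_H - fy, SCREEN_H - ( fy + TILE ) };
}
```

A sprite at `y = 0` therefore covers screen rows 704 down to 640, and one on the last tile row (`y = 640`) covers 64 down to 0.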

void show_help()
{
    ZoneScoped;
    Textures::menu.bind();

    const float fw = static_cast<float>( w );
    const float fh = static_cast<float>( h );
    Vertex bg[4] = {
        { 0.0f, fh, 0.0f, 0.0f },
        { fw, fh, 832.0f / 1024, 0.0f },
        { fw, 0.0f, 832.0f / 1024, 704.0f / 1024 },
        { 0.0f, 0.0f, 0.0f, 704.0f / 1024 },
    };
    draw_quad( bg );

    int t = static_cast<int>( Timer::get_timestamp() / 40 );

    Textures::p_r.bind( t );
    draw_sprite( 150, 85 );
    Textures::m1_r.bind( t );
    draw_sprite( 75, 160 );
    Textures::m2_r.bind( t );
    draw_sprite( 150, 160 );
    Textures::m3_r.bind( t );
    draw_sprite( 225, 160 );
    Textures::bomb.bind( static_cast<int>( Timer::get_timestamp() / 100 % 2 ) );
    draw_sprite( 150, 235 );
    Textures::wall.bind();
    draw_sprite( 150, 310 );
    Textures::crate.bind();
    draw_sprite( 150, 385 );
    Textures::vortex.bind( t );
    draw_sprite( 150, 460 );
    Textures::bonus1.bind( t );
    draw_sprite( 112, 535 );
    Textures::bonus2.bind( t );
    draw_sprite( 187, 535 );
}

void show_menu()
{
    ZoneScoped;
    Textures::menu.bind();

    Vertex logo[4] = {
        { float( ( w - 594 ) / 2 ), float( h - 50 ), 1.0f, 0.0f },
        { float( ( w + 594 ) / 2 ), float( h - 50 ), 1.0f, 594.0f / 1024 },
        { float( ( w + 594 ) / 2 ), float( h - 50 - 180 ), 1.0f - 180.0f / 1024, 594.0f / 1024 },
        { float( ( w - 594 ) / 2 ), float( h - 50 - 180 ), 1.0f - 180.0f / 1024, 0.0f },
    };
    draw_quad( logo );

    Vertex prompt[4] = {
        { float( ( w - 527 ) / 2 ), 335.0f, 0.0f, 704.0f / 1024 },
        { float( ( w + 527 ) / 2 ), 335.0f, 527.0f / 1024, 704.0f / 1024 },
        { float( ( w + 527 ) / 2 ), 20.0f, 527.0f / 1024, 1019.0f / 1024 },
        { float( ( w - 527 ) / 2 ), 20.0f, 0.0f, 1019.0f / 1024 },
    };
    draw_quad( prompt );
}

} // namespace Gfx

namespace Init
{

bool all()
{
    ZoneScoped;
    if( !SDL_Init( SDL_INIT_VIDEO ) )
    {
        std::fprintf( stderr, "SDL_Init failed: %s\n", SDL_GetError() );
        return false;
    }

    SDL_GL_SetAttribute( SDL_GL_DOUBLEBUFFER, 1 );
    SDL_GL_SetAttribute( SDL_GL_CONTEXT_MAJOR_VERSION, 3 );
    SDL_GL_SetAttribute( SDL_GL_CONTEXT_MINOR_VERSION, 3 );
    SDL_GL_SetAttribute( SDL_GL_CONTEXT_PROFILE_MASK, SDL_GL_CONTEXT_PROFILE_CORE );

    g_window = SDL_CreateWindow( "Dyna.net", Gfx::w, Gfx::h, SDL_WINDOW_OPENGL );
    if( !g_window )
    {
        std::fprintf( stderr, "SDL_CreateWindow failed: %s\n", SDL_GetError() );
        return false;
    }

    g_gl_context = SDL_GL_CreateContext( g_window );
    if( !g_gl_context )
    {
        std::fprintf( stderr, "SDL_GL_CreateContext failed: %s\n", SDL_GetError() );
        return false;
    }

    int version = gladLoadGL( (GLADloadfunc)SDL_GL_GetProcAddress );
    if( version == 0 )
    {
        std::fprintf( stderr, "gladLoadGL failed\n" );
        return false;
    }

    SDL_GL_SetSwapInterval( 1 ); // vsync; the game is time-based so speed is unaffected

    glViewport( 0, 0, Gfx::w, Gfx::h );
    glClearColor( 0.0f, 0.0f, 0.0f, 1.0f );
    glEnable( GL_BLEND );
    glBlendFunc( GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA );

    if( !Render::init() ) return false;

    Timer::reset();
    Textures::preload();
    return true;
}

void shutdown()
{
    ZoneScoped;
    Render::shutdown();
    if( g_gl_context ) SDL_GL_DestroyContext( g_gl_context );
    if( g_window ) SDL_DestroyWindow( g_window );
    SDL_Quit();
}

} // namespace Init

}
59
examples/dyna/src/gfx.hpp
Normal file
@@ -0,0 +1,59 @@
#pragma once

#include <glad/gl.h>

namespace dyna
{

// Screen dimensions, matching the original 13x11 grid of 64px tiles.
namespace Gfx
{
constexpr int w = 832;
constexpr int h = 704;

void clear();
void swap();

// Drawing primitives ported from gfx.cs. They render with the currently
// bound texture (see Texture::bind) and the current alpha. The coordinate
// system is bottom-left origin with y growing upward, exactly as the C#
// gluOrtho2D setup; draw_sprite/draw_square take y measured from the top
// and flip internally, so game-side coordinates stay top-left based.
void alpha( float a );
void draw_sprite( int x, int y ); // pixel position of the top-left corner
void draw_square( int x, int y ); // grid position (multiplied by 64)

// A single textured quad given four explicit (position, texcoord) corners,
// used by the menu/help screens which sample rotated regions of the atlas.
struct Vertex
{
    float x, y, u, v;
};
void draw_quad( const Vertex corners[4] );

void show_help();
void show_menu();
}

// Renderer back end shared by the texture loaders.
namespace Render
{
bool init();     // shaders + streaming VBO/VAO
void shutdown(); // delete the program and buffers

// Select the array texture (and layer within it) used by subsequent draws.
void use_texture( GLuint tex, int layer );

// Upload `layers` tightly packed RGBA8 images of size w*h as one
// GL_TEXTURE_2D_ARRAY and return its name (0 on failure).
GLuint make_texture( int w, int h, int layers, const void* rgba );
}

// One-time startup/shutdown, ported from the Init class in gfx.cs.
namespace Init
{
bool all(); // SDL, GL context, renderer, textures, timer
void shutdown();
}

}
38
examples/dyna/src/main.cpp
Normal file
@@ -0,0 +1,38 @@
#include "game.hpp"
#include "gfx.hpp"

#include <SDL3/SDL_main.h>
#include <tracy/Tracy.hpp>

#include <cstdlib>
#include <new>

// Route every heap allocation through Tracy so the profiler can track memory
// usage. The default array forms (operator new[]/delete[]) and the nothrow
// forms forward to these, so overriding the scalar operators covers them too.
void* operator new( std::size_t count )
{
    void* ptr = std::malloc( count );
    if( !ptr ) throw std::bad_alloc();
    TracyAlloc( ptr, count );
    return ptr;
}

void operator delete( void* ptr ) noexcept
{
    TracyFree( ptr );
    std::free( ptr );
}

int main( int /*argc*/, char* /*argv*/[] )
{
    TracyNoop;

    if( !dyna::Init::all() )
        return 1;

    dyna::Game::menu_loop();

    dyna::Init::shutdown();
    return 0;
}
334
examples/dyna/src/map.cpp
Normal file
@@ -0,0 +1,334 @@
#include "map.hpp"

#include "bomb.hpp"
#include "bonus.hpp"
#include "gfx.hpp"
#include "monster.hpp"
#include "player.hpp"
#include "texture.hpp"
#include "timer.hpp"
#include "world.hpp"

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <fstream>
#include <sstream>

#include <tracy/Tracy.hpp>

namespace dyna
{

// ---- Field --------------------------------------------------------------

Field Field::explosion( ExplosionType t )
{
    Field f;
    f.kind = Kind::explosion;
    f.etype = t;
    f.tstart = Timer::get_timestamp();
    return f;
}

bool Field::solid() const
{
    switch( kind )
    {
    case Kind::wall:
    case Kind::crate:
    case Kind::bomb:
        return true;
    default:
        return false;
    }
}

Destruction Field::destructible() const
{
    switch( kind )
    {
    case Kind::floor:
        return Destruction::multi;
    case Kind::crate:
        return Destruction::single;
    default:
        return Destruction::none;
    }
}

void Field::draw( int x, int y ) const
{
    switch( kind )
    {
    case Kind::wall:
        Textures::wall.bind();
        Gfx::draw_square( x, y );
        break;

    case Kind::crate:
        Textures::sand.bind();
        Gfx::draw_square( x, y );
        Textures::crate.bind();
        Gfx::draw_square( x, y );
        break;

    case Kind::explosion: {
        Textures::sand.bind();
        Gfx::draw_square( x, y );

        int frame = static_cast<int>( ( Timer::get_timestamp() - tstart ) / 40 % 8 );
        if( frame > 4 ) frame = 8 - frame;

        switch( etype )
        {
        case ExplosionType::center: Textures::e_c.bind( frame ); break;
        case ExplosionType::vertical: Textures::e_v.bind( frame ); break;
        case ExplosionType::horizontal: Textures::e_h.bind( frame ); break;
        case ExplosionType::left: Textures::e_le.bind( frame ); break;
        case ExplosionType::right: Textures::e_re.bind( frame ); break;
        case ExplosionType::up: Textures::e_ue.bind( frame ); break;
        case ExplosionType::down: Textures::e_de.bind( frame ); break;
        }
        Gfx::draw_square( x, y );
        break;
    }

    // floor, bomb and vortex tiles all show plain sand; the bomb and vortex
    // sprites themselves are drawn by their entities.
    case Kind::floor:
    case Kind::bomb:
    case Kind::vortex:
    default:
        Textures::sand.bind();
        Gfx::draw_square( x, y );
        break;
    }
}

// ---- Map ----------------------------------------------------------------

Map::Map( const std::string& fn )
{
    ZoneScoped;
    ZoneText( fn.c_str(), fn.size() );

    load( fn );
    generate_destructibles();
    populate_map();
}

Map::~Map() = default;

void Map::load( const std::string& fn )
{
    ZoneScoped;
    std::ifstream f( fn );
    if( !f )
    {
        std::fprintf( stderr, "Cannot open level %s\n", fn.c_str() );
        grid.assign( X * Y, Field::floor() );
        return;
    }

    std::stringstream buf;
    buf << f.rdbuf();
    std::string content = buf.str();

    size_t nl = content.find( '\n' );
    std::string header = ( nl == std::string::npos ) ? content : content.substr( 0, nl );
    std::sscanf( header.c_str(), "%d %d %d %d", &destructibles, &m1, &m2, &m3 );

    grid.assign( X * Y, Field::floor() );
    px = -1;

    size_t p = ( nl == std::string::npos ) ? content.size() : nl + 1;
    for( int ry = 0; ry < Y; ry++ )
    {
        for( int rx = 0; rx < X; rx++ )
        {
            char c = ( p < content.size() ) ? content[p++] : '\0';
            switch( c )
            {
            case '.':
                at( rx, ry ) = Field::floor();
                break;
            case '#':
                at( rx, ry ) = Field::wall();
                break;
            case '@':
                at( rx, ry ) = Field::floor();
                px = rx;
                py = ry;
                break;
            case '\n':
                rx--; // newlines don't consume a grid cell
                break;
            default:
                break;
            }
        }
    }
}

bool Map::monster_ok( int rx, int ry, int pxx, int pyy, int r ) const
{
    const Field& f = at( rx, ry );
    return f.is_floor_family() && f.kind != Field::Kind::crate &&
           ( std::abs( rx - pxx ) > r || std::abs( ry - pyy ) > r );
}

void Map::generate_destructibles()
{
    ZoneScoped;
    int i = destructibles;
    while( i != 0 )
    {
        int rx = RNG::next( X );
        int ry = RNG::next( Y );
        if( monster_ok( rx, ry, px, py, 1 ) )
        {
            at( rx, ry ) = Field::crate();
            i--;
        }
    }
}

void Map::populate_map()
{
    ZoneScoped;
    for( int type = 1; type <= 3; type++ )
    {
        int count = ( type == 1 ) ? m1 : ( type == 2 ) ? m2
                                                       : m3;
        while( count != 0 )
        {
            int rx = RNG::next( X );
            int ry = RNG::next( Y );
            if( monster_ok( rx, ry, px, py, 2 ) )
            {
                monsters.push_back( std::make_unique<Monster>( type, rx, ry ) );
                count--;
            }
        }
    }
}

void Map::draw()
{
    ZoneScoped;
    for( int ry = 0; ry < Y; ry++ )
        for( int rx = 0; rx < X; rx++ )
            at( rx, ry ).draw( rx, ry );

    for( auto& b : bombs ) b->draw();
    for( auto& e : monsters ) e->draw();
    for( auto& e : bonuses ) e->draw();
}

void Map::tick( World& world )
{
    ZoneScoped;
    // Bombs.
    for( auto& b : bombs ) b->tick( world );
    bombs.erase( std::remove_if( bombs.begin(), bombs.end(),
                                 []( const std::unique_ptr<Bomb>& b ) { return b->is_dead(); } ),
                 bombs.end() );

    // Monsters: tick, then retire the dead and queue their respawn timers.
    for( auto& e : monsters ) e->tick( world );
    for( auto& e : monsters )
    {
        if( e->is_dead() )
        {
            int delay = ( e->type() == 1 ) ? 10000 : ( e->type() == 2 ) ? 20000
                                                                        : 30000;
            mwait.push_back( { e->type(), Timer::get_timestamp() + delay } );
        }
    }
    monsters.erase( std::remove_if( monsters.begin(), monsters.end(),
                                    []( const std::unique_ptr<Monster>& e ) { return e->is_dead(); } ),
                    monsters.end() );

    // The respawn and exit-portal placement below need the player's position;
    // they only fire during gameplay (a monster died, or every crate is gone),
    // never on the player-less menu screen.
    Player* player = world.player();

    // Respawn monsters whose wait has elapsed.
    std::int64_t now = Timer::get_timestamp();
    std::vector<MWait> still_waiting;
    for( const MWait& m : mwait )
    {
        if( m.time < now && player )
        {
            int rx = 0, ry = 0;
            bool ok = false;
            while( !ok )
            {
                rx = RNG::next( X );
                ry = RNG::next( Y );
                if( monster_ok( rx, ry, player->getx() / 64, player->gety() / 64, 3 ) )
                    ok = true;
            }
            auto monster = std::make_unique<Monster>( m.type, rx, ry );
            monster->set_action( Action::appear );
            monsters.push_back( std::move( monster ) );
        }
        else
        {
            still_waiting.push_back( m );
        }
    }
    mwait = std::move( still_waiting );

    // Bonuses.
    for( auto& e : bonuses ) e->tick( world );

    // Once every crate is gone, open the exit portal somewhere clear.
    if( world.crates_left == 0 && player )
    {
        world.crates_left--;

        int rx = 0, ry = 0;
        bool ok = false;
        while( !ok )
        {
            rx = RNG::next( X );
            ry = RNG::next( Y );
            if( monster_ok( rx, ry, player->getx() / 64, player->gety() / 64, 4 ) )
                ok = true;
        }

        at( rx, ry ) = Field::vortex();
        bonuses.push_back( std::make_unique<Vortex>( rx, ry ) );
    }
}

std::unique_ptr<Player> Map::create_player() const
{
    return std::make_unique<Player>( px, py );
}

void Map::place_bomb( int x, int y )
{
    Field& f = at( x, y );
    if( f.is_floor_family() && f.kind != Field::Kind::bomb )
    {
        f = Field::bomb();
        bombs.push_back( std::make_unique<Bomb>( x, y ) );
    }
}

bool Map::monster_collide( int tx, int ty ) const
{
    for( const auto& e : monsters )
    {
        if( ( e->getx() + 32 ) / 64 == ( tx + 32 ) / 64 &&
            ( e->gety() + 32 ) / 64 == ( ty + 32 ) / 64 )
            return true;
    }
    return false;
}

}
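Note on the explosion branch of `Field::draw` in map.cpp: it folds an 8-step counter onto the 5 sprite frames, so the blast grows and then shrinks again. A minimal standalone sketch of that arithmetic, assuming the same 40 ms step; the free function `explosion_frame` is a hypothetical name used only for illustration and is not part of the diff.

```cpp
#include <cstdint>

// Hypothetical helper (not in the diff): maps elapsed milliseconds to a
// sprite frame index, assuming 40 ms per step and 5 frames (0..4).
int explosion_frame( std::int64_t elapsed_ms )
{
    // Eight 40 ms steps per cycle: 0 1 2 3 4 5 6 7
    int frame = static_cast<int>( elapsed_ms / 40 % 8 );
    // Mirror the second half back down: 5 -> 3, 6 -> 2, 7 -> 1
    if( frame > 4 ) frame = 8 - frame;
    return frame;
}
```

The resulting sequence over one cycle is 0 1 2 3 4 3 2 1, which is why the explosion textures only need 5 frames each.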
120
examples/dyna/src/map.hpp
Normal file
@@ -0,0 +1,120 @@
#pragma once

#include <cstdint>
#include <memory>
#include <string>
#include <vector>

namespace dyna
{

class Player;
class Bomb;
class Monster;
class Vortex;
class World;

// How a tile reacts to an explosion sweeping through it.
enum class Destruction
{
    none,   // blocks the blast (wall, bomb, existing explosion, vortex)
    single, // destroyed and stops the blast (crate)
    multi   // passable, blast continues (floor)
};

// A grid cell. The C# version used a small class hierarchy rooted at a Field
// interface; since the variants differ only in a couple of flags and how they
// draw, this collapses them into one value type tagged by Kind. Note that in
// the original everything except Wall derived from Floor, so the "is Floor"
// checks there map to "kind != Wall" here.
struct Field
{
    enum class Kind
    {
        floor,
        wall,
        crate,
        bomb,      // tile occupied by a live bomb (solid, indestructible)
        explosion, // transient blast tile
        vortex     // level exit portal
    };

    enum class ExplosionType
    {
        center,
        vertical,
        horizontal,
        left,
        right,
        down,
        up
    };

    Kind kind = Kind::floor;
    ExplosionType etype = ExplosionType::center;
    std::int64_t tstart = 0; // explosion animation start, set on creation

    static Field floor() { return Field{}; }
    static Field wall() { return Field{ Kind::wall, {}, 0 }; }
    static Field crate() { return Field{ Kind::crate, {}, 0 }; }
    static Field bomb() { return Field{ Kind::bomb, {}, 0 }; }
    static Field vortex() { return Field{ Kind::vortex, {}, 0 }; }
    static Field explosion( ExplosionType t );

    bool solid() const;
    Destruction destructible() const;
    void draw( int x, int y ) const;

    bool is_floor_family() const { return kind != Kind::wall; }
};

class Map
{
public:
    explicit Map( const std::string& fn );
    ~Map(); // defined in map.cpp where the entity types are complete

    Field& at( int x, int y ) { return grid[index( x, y )]; }
    const Field& at( int x, int y ) const { return grid[index( x, y )]; }

    void draw();
    void tick( World& world );

    int getx() const { return X; }
    int gety() const { return Y; }
    int get_crates() const { return destructibles; }

    std::unique_ptr<Player> create_player() const;

    void place_bomb( int x, int y );
    bool monster_collide( int tx, int ty ) const;

private:
    static constexpr int X = 13, Y = 11;

    // Deferred monster respawn timer, mirroring Map.MWait.
    struct MWait
    {
        int type;          // 1, 2 or 3
        std::int64_t time; // timestamp at which it respawns
    };

    static int index( int x, int y ) { return x * Y + y; }

    void load( const std::string& fn );
    void generate_destructibles();
    void populate_map();
    bool monster_ok( int rx, int ry, int px, int py, int r ) const;

    std::vector<Field> grid;
    int px = -10, py = -10;
    int destructibles = 0;
    int m1 = 0, m2 = 0, m3 = 0;

    std::vector<std::unique_ptr<Bomb>> bombs;
    std::vector<std::unique_ptr<Monster>> monsters;
    std::vector<std::unique_ptr<Vortex>> bonuses;
    std::vector<MWait> mwait;
};

}
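Note on `Map::index` in map.hpp: `x * Y + y` stores the grid column by column, so the `Y` cells of column 0 come first, then column 1, and so on. A small sketch of the same scheme, assuming the 13 x 11 grid from the diff; `tile_index` and the namespace-level `X`/`Y` are illustrative stand-ins for the private members, not new API.

```cpp
// Hypothetical standalone copy of the indexing scheme, assuming the same
// 13 x 11 grid as Map::X and Map::Y.
constexpr int X = 13, Y = 11;

// Column-major: cells with the same x are adjacent in memory.
constexpr int tile_index( int x, int y ) { return x * Y + y; }
```

With this layout a loop that keeps `x` fixed and advances `y` walks consecutive elements of the vector.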
227
examples/dyna/src/monster.cpp
Normal file
@@ -0,0 +1,227 @@
#include "monster.hpp"

#include "gfx.hpp"
#include "map.hpp"
#include "texture.hpp"
#include "timer.hpp"
#include "world.hpp"

#include <tracy/Tracy.hpp>

namespace dyna
{

namespace
{

bool is_opposite( Action a, Action b )
{
    return ( a == Action::up && b == Action::down ) ||
           ( a == Action::down && b == Action::up ) ||
           ( a == Action::left && b == Action::right ) ||
           ( a == Action::right && b == Action::left );
}

} // namespace

Monster::Monster( int type, int gx, int gy )
    : mtype( type )
    , t( type == 1 ? 14 : type == 2 ? 11
                                    : 7 )
{
    x = gx * 64;
    y = gy * 64;
}

void Monster::set_action( Action a )
{
    Entity::set_action( a );
    if( action == Action::appear )
        left = 200;
}

std::vector<Action> Monster::possible_dirs( const Map& map ) const
{
    std::vector<Action> dirs;

    if( x / 64 > 0 && !map.at( x / 64 - 1, y / 64 ).solid() )
        dirs.push_back( Action::left );
    if( x / 64 < map.getx() - 1 && !map.at( x / 64 + 1, y / 64 ).solid() )
        dirs.push_back( Action::right );
    if( y / 64 > 0 && !map.at( x / 64, y / 64 - 1 ).solid() )
        dirs.push_back( Action::up );
    if( y / 64 < map.gety() - 1 && !map.at( x / 64, y / 64 + 1 ).solid() )
        dirs.push_back( Action::down );

    return dirs;
}

bool Monster::straight( const std::vector<Action>& dirs )
{
    return is_opposite( dirs[0], dirs[1] );
}

Action Monster::any_dir( const Map& map )
{
    std::vector<Action> dirs = possible_dirs( map );
    if( dirs.empty() )
        return Action::wait;
    return dirs[RNG::next( static_cast<int>( dirs.size() ) )];
}

Action Monster::rand_dir( const Map& map )
{
    Action tmp = any_dir( map );
    if( is_opposite( action, tmp ) )
        tmp = any_dir( map );
    return tmp;
}

void Monster::think( const Map& map )
{
    ZoneScoped;
    if( action == Action::wait || action == Action::appear )
    {
        set_action( rand_dir( map ) );
        return;
    }

    std::vector<Action> dirs = possible_dirs( map );

    if( dirs.size() == 2 && straight( dirs ) )
    {
        left = 64;
    }
    else
    {
        Action tmp = rand_dir( map );

        if( tmp == action )
        {
            left = 64;
        }
        else
        {
            set_action( tmp );
            if( tmp != Action::wait )
                left = 64;
        }
    }
}

void Monster::tick( World& world )
{
    ZoneScoped;
    Map& map = world.map();

    delta += Timer::delta;

    while( delta > t )
    {
        delta -= t;

        if( action == Action::wait )
        {
            think( map );
        }
        else if( left > 0 )
        {
            left--;

            switch( action )
            {
            case Action::down: y++; break;
            case Action::up: y--; break;
            case Action::left: x--; break;
            case Action::right: x++; break;
            default: break;
            }
        }
        else
        {
            if( action == Action::death )
                die( world );
            else
                think( map );
        }

        if( action != Action::death && killed( map ) )
        {
            set_action( Action::death );
            left = 790 / t;
        }
    }
}

void Monster::die( World& )
{
    dead = true;
}

const AnimTexture& Monster::texture_for( Action a ) const
{
    struct Set
    {
        const AnimTexture* wait;
        const AnimTexture* up;
        const AnimTexture* down;
        const AnimTexture* left;
        const AnimTexture* right;
        const AnimTexture* death;
    };

    Set s;
    if( mtype == 1 )
        s = { &Textures::m1_d, &Textures::m1_u, &Textures::m1_d, &Textures::m1_l, &Textures::m1_r, &Textures::m1_death };
    else if( mtype == 2 )
        s = { &Textures::m2_d, &Textures::m2_u, &Textures::m2_d, &Textures::m2_l, &Textures::m2_r, &Textures::m2_death };
    else
        s = { &Textures::m3_d, &Textures::m3_u, &Textures::m3_d, &Textures::m3_l, &Textures::m3_r, &Textures::m3_death };

    switch( a )
    {
    case Action::up: return *s.up;
    case Action::down: return *s.down;
    case Action::left: return *s.left;
    case Action::right: return *s.right;
    case Action::death: return *s.death;
    case Action::wait:
    case Action::appear:
    default: return *s.wait; // wait/appear use the "down" sprite
    }
}

void Monster::draw()
{
    ZoneScoped;
    // The original returns without drawing for unexpected actions; monsters only
    // ever hold the actions handled by texture_for, so always draw.
    generic_draw( texture_for( action ) );
}

void Monster::generic_draw( const AnimTexture& tex )
{
    int frame;

    if( action == Action::wait )
    {
        frame = 0;
    }
    else if( action == Action::appear )
    {
        frame = 0;
        Gfx::alpha( static_cast<float>( 200 - left ) / 200.0f );
    }
    else
    {
        frame = static_cast<int>( ( Timer::get_timestamp() - action_start ) / 40 );
    }

    tex.bind( frame );
    Gfx::draw_sprite( x, y );

    if( action == Action::appear )
        Gfx::alpha( 1.0f );
}

}
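Note on `Monster::tick` in monster.cpp: the `delta` accumulator turns a variable frame time into a whole number of fixed sub-steps of `t` milliseconds, carrying the remainder into the next frame. A minimal sketch of just that loop, assuming a positive step size; `consume_steps` is a hypothetical name and the real code performs the movement inside the loop rather than counting.

```cpp
// Hypothetical helper: feed one frame's elapsed time into an accumulator and
// return how many fixed sub-steps of `step_ms` should run this frame.
// Assumes step_ms > 0. The comparison is strict, as in the diff, so a
// remainder exactly equal to step_ms waits for the next frame.
int consume_steps( int& accumulator, int frame_ms, int step_ms )
{
    accumulator += frame_ms;
    int steps = 0;
    while( accumulator > step_ms )
    {
        accumulator -= step_ms;
        steps++;
    }
    return steps;
}
```

Because the remainder is kept, a monster with a 7 ms step moves at the same average speed whether frames arrive every 8 ms or every 33 ms.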
41
examples/dyna/src/monster.hpp
Normal file
@@ -0,0 +1,41 @@
#pragma once

#include "entity.hpp"

#include <vector>

namespace dyna
{

class AnimTexture;

// The three monster variants from monster.cs differed only in speed, sprite set
// and respawn delay, so they fold into one class parameterised by `type` (1-3).
class Monster : public Entity
{
public:
    Monster( int type, int gx, int gy );

    void set_action( Action a ) override;
    void tick( World& world ) override;
    void draw() override;
    void die( World& world ) override;

    bool is_dead() const { return dead; }
    int type() const { return mtype; }

private:
    std::vector<Action> possible_dirs( const Map& map ) const;
    static bool straight( const std::vector<Action>& dirs );
    Action rand_dir( const Map& map );
    Action any_dir( const Map& map ); // __rand_dir in the original
    void think( const Map& map );
    void generic_draw( const AnimTexture& tex );
    const AnimTexture& texture_for( Action a ) const;

    int mtype; // 1, 2 or 3
    int t;     // ms per movement sub-step (per-type speed)
    bool dead = false;
};

}
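Note on the per-type parameters in monster.hpp: the step time lives in the `Monster` constructor (14, 11 or 7 ms) and the respawn delay in `Map::tick` (10, 20 or 30 s). One possible way to see them side by side is a small lookup, sketched below; `MonsterParams` and `monster_params` are hypothetical names, and the diff itself keeps the two ternaries separate.

```cpp
// Hypothetical lookup gathering the per-type constants that the diff spreads
// across Monster's constructor (step time) and Map::tick (respawn delay).
struct MonsterParams
{
    int step_ms;    // ms per movement sub-step
    int respawn_ms; // delay before a killed monster reappears
};

// Any type other than 1 or 2 falls through to the type-3 values, matching
// the ternaries in the diff.
constexpr MonsterParams monster_params( int type )
{
    return type == 1 ? MonsterParams{ 14, 10000 }
         : type == 2 ? MonsterParams{ 11, 20000 }
                     : MonsterParams{ 7, 30000 };
}
```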
127
examples/dyna/src/player.cpp
Normal file
@@ -0,0 +1,127 @@
#include "player.hpp"

#include "gfx.hpp"
#include "map.hpp"
#include "texture.hpp"
#include "timer.hpp"
#include "world.hpp"

#include <tracy/Tracy.hpp>

namespace dyna
{

Player::Player( int gx, int gy )
{
    x = gx * 64;
    y = gy * 64;
    set_action( Action::wait );
    queue = Action::wait;
}

void Player::tick( World& world )
{
    ZoneScoped;
    Map& map = world.map();

    delta += Timer::delta;

    while( delta > t )
    {
        delta -= t;

        if( left > 0 )
        {
            left--;

            switch( action )
            {
            case Action::down: y++; break;
            case Action::up: y--; break;
            case Action::left: x--; break;
            case Action::right: x++; break;
            case Action::place_bomb:
                if( left == 0 )
                    map.place_bomb( x / 64, y / 64 );
                break;
            default:
                break;
            }
        }
        else
        {
            if( action == Action::death )
            {
                die( world );
                return;
            }
            if( map.at( x / 64, y / 64 ).kind == Field::Kind::vortex )
            {
                world.next_level = true;
                return;
            }
            if( !can_move( queue, map ) )
                queue = Action::wait;

            if( action != queue )
                set_action( queue );

            if( action != Action::wait )
                left = 64;
            if( action == Action::place_bomb )
                left = 32;
        }

        if( action != Action::death && killed( map ) )
        {
            set_action( Action::death );
            left = 1140 / t;
        }
    }
}

void Player::draw()
{
    ZoneScoped;
    const AnimTexture* tex = nullptr;

    switch( action )
    {
    case Action::wait: tex = &Textures::p_wait; break;
    case Action::up: tex = &Textures::p_u; break;
    case Action::down: tex = &Textures::p_d; break;
    case Action::left: tex = &Textures::p_l; break;
    case Action::right: tex = &Textures::p_r; break;
    case Action::death: tex = &Textures::p_death; break;
    case Action::place_bomb: tex = &Textures::p_wait; break;
    default:
        return;
    }

    int frame = static_cast<int>( Timer::get_timestamp() - action_start );
    frame /= ( action == Action::death ) ? 60 : 40;
    tex->bind( frame );

    Gfx::draw_sprite( x, y );
}

void Player::move( Action a )
{
    queue = a;
}

void Player::die( World& world )
{
    world.killed = true;
}

bool Player::killed( const Map& map ) const
{
    if( Entity::killed( map ) )
        return true;
    if( map.monster_collide( x, y ) )
        return true;
    return false;
}

}
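Note on `Player::killed` in player.cpp: it relies on `Map::monster_collide`, which treats two sprites as colliding when the tiles containing their centres match. The `+ 32` shifts a sprite's top-left pixel coordinate to its centre before dividing by the 64 px tile size. A short sketch of that rounding, assuming non-negative pixel coordinates; `tile_of_centre` and `same_tile` are hypothetical helper names.

```cpp
// Hypothetical helpers, assuming 64 px tiles and non-negative pixel
// coordinates measured at the sprite's top-left corner.
constexpr int tile_of_centre( int pixel ) { return ( pixel + 32 ) / 64; }

// True when both sprite centres fall inside the same tile.
constexpr bool same_tile( int ax, int ay, int bx, int by )
{
    return tile_of_centre( ax ) == tile_of_centre( bx ) &&
           tile_of_centre( ay ) == tile_of_centre( by );
}
```

A sprite halfway between two tiles therefore counts as being on the tile it is moving into once it has crossed the midpoint.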
27
examples/dyna/src/player.hpp
Normal file
@@ -0,0 +1,27 @@
#pragma once

#include "entity.hpp"

namespace dyna
{

class Player : public Entity
{
public:
    Player( int gx, int gy );

    void tick( World& world ) override;
    void draw() override;
    void die( World& world ) override;

    void move( Action a ); // queues the next direction; applied between tiles

protected:
    bool killed( const Map& map ) const override;

private:
    static constexpr int t = 6; // ms per movement sub-step
    Action queue = Action::wait;
};

}
221
examples/dyna/src/texture.cpp
Normal file
@@ -0,0 +1,221 @@
#include "texture.hpp"

#include "datapath.hpp"
#include "gfx.hpp"

#include <SDL3/SDL.h>
#include <SDL3_image/SDL_image.h>
#include <tracy/Tracy.hpp>

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <memory>
#include <vector>

namespace dyna
{

void GlTexture::reset()
{
    if( id_ )
    {
        // The texture globals outlive main(), so their destructors can run after
        // the GL context is already gone (which frees its textures anyway). Only
        // call into GL while a context is current; otherwise just drop the name.
        if( SDL_GL_GetCurrentContext() )
            glDeleteTextures( 1, &id_ );
        id_ = 0;
    }
}

namespace
{

struct SurfaceDeleter
{
    void operator()( SDL_Surface* s ) const { SDL_DestroySurface( s ); }
};
using SurfacePtr = std::unique_ptr<SDL_Surface, SurfaceDeleter>;

// Convert an arbitrary surface to tightly addressable RGBA8. Returns null on
// failure; the result owns its pixels.
SurfacePtr to_rgba( SDL_Surface* src )
{
    ZoneScoped;
    if( !src ) return nullptr;
    return SurfacePtr{ SDL_ConvertSurface( src, SDL_PIXELFORMAT_RGBA32 ) };
}

} // namespace

bool Texture::load( const char* fn )
{
    ZoneScoped;
    ZoneText( fn, std::strlen( fn ) );

    SurfacePtr image{ IMG_Load( fn ) };
    if( !image )
    {
        std::fprintf( stderr, "Cannot open texture %s: %s\n", fn, SDL_GetError() );
        return false;
    }

    SurfacePtr rgba = to_rgba( image.get() );
    if( !rgba )
    {
        std::fprintf( stderr, "Cannot convert texture %s: %s\n", fn, SDL_GetError() );
        return false;
    }

    // Pack the surface into a tight RGBA8 block, skipping any per-row padding.
    const int w = rgba->w, h = rgba->h;
    std::vector<std::uint8_t> packed( static_cast<size_t>( w ) * h * 4 );
    const auto* pixels = static_cast<const std::uint8_t*>( rgba->pixels );
    for( int row = 0; row < h; row++ )
    {
        std::memcpy( &packed[static_cast<size_t>( row ) * w * 4],
                     pixels + static_cast<size_t>( row ) * rgba->pitch,
                     static_cast<size_t>( w ) * 4 );
    }

    tex_ = GlTexture{ Render::make_texture( w, h, 1, packed.data() ) };
    return static_cast<bool>( tex_ );
}

void Texture::bind() const
{
    Render::use_texture( tex_.get(), 0 );
}

void AnimTexture::load( SDL_Surface* sheet, int tilex, int tiley, int n )
{
    ZoneScoped;

    SurfacePtr rgba = to_rgba( sheet );
    if( !rgba )
    {
        std::fprintf( stderr, "Cannot convert sprite sheet: %s\n", SDL_GetError() );
        return;
    }

    const auto* pixels = static_cast<const std::uint8_t*>( rgba->pixels );
    const int pitch = rgba->pitch;

    // Lay the n frames out back to back as the layers of an array texture.
    constexpr int frame_bytes = 64 * 64 * 4;
    std::vector<std::uint8_t> frames( static_cast<size_t>( n ) * frame_bytes );
    for( int i = 0; i < n; i++ )
    {
        for( int fy = 0; fy < 64; fy++ )
        {
            int srcy = 64 * ( tiley + i ) + fy;
            int srcx = 64 * tilex;
            std::memcpy( &frames[static_cast<size_t>( i ) * frame_bytes + static_cast<size_t>( fy ) * 64 * 4],
                         pixels + static_cast<size_t>( srcy ) * pitch + static_cast<size_t>( srcx ) * 4,
                         static_cast<size_t>( 64 ) * 4 );
        }
    }

    tex_ = GlTexture{ Render::make_texture( 64, 64, n, frames.data() ) };
    frames_ = n;
}

void AnimTexture::bind( int frame ) const
{
    if( frames_ <= 0 ) return;
    int layer = frame % frames_;
    if( layer < 0 ) layer += frames_;
    Render::use_texture( tex_.get(), layer );
}

namespace Textures
{
Texture menu, sand, wall, crate;

AnimTexture p_wait, p_u, p_d, p_l, p_r, p_death;

AnimTexture bomb, bomb_appear, e_c, e_h, e_v, e_le, e_re, e_de, e_ue;

AnimTexture m1_death, m1_l, m1_r, m1_d, m1_u;
AnimTexture m2_death, m2_l, m2_r, m2_d, m2_u;
AnimTexture m3_death, m3_l, m3_r, m3_d, m3_u;

AnimTexture bonus1, bonus2;

AnimTexture vortex_appear, vortex;

void preload()
{
    ZoneScoped;

    menu.load( data_path( "data/gfx/menu.png" ).c_str() );
    sand.load( data_path( "data/gfx/sand.png" ).c_str() );
    wall.load( data_path( "data/gfx/wall.png" ).c_str() );
    crate.load( data_path( "data/gfx/crate.png" ).c_str() );

    {
        SurfacePtr img{ IMG_Load( data_path( "data/gfx/Player.png" ).c_str() ) };
        p_wait.load( img.get(), 0, 0, 20 );
        p_d.load( img.get(), 1, 0, 20 );
        p_u.load( img.get(), 2, 0, 20 );
        p_l.load( img.get(), 3, 0, 20 );
        p_r.load( img.get(), 4, 0, 20 );
        p_death.load( img.get(), 5, 0, 20 );
    }

    {
        SurfacePtr img{ IMG_Load( data_path( "data/gfx/Bomb.png" ).c_str() ) };
        bomb.load( img.get(), 0, 0, 10 );
        bomb_appear.load( img.get(), 5, 0, 10 );
        e_c.load( img.get(), 1, 0, 5 );
        e_h.load( img.get(), 2, 0, 5 );
        e_v.load( img.get(), 1, 5, 5 );
        e_le.load( img.get(), 3, 0, 5 );
        e_re.load( img.get(), 2, 5, 5 );
        e_de.load( img.get(), 4, 0, 5 );
        e_ue.load( img.get(), 3, 5, 5 );
    }

    {
        SurfacePtr img{ IMG_Load( data_path( "data/gfx/monster1.png" ).c_str() ) };
        m1_death.load( img.get(), 0, 0, 20 );
        m1_u.load( img.get(), 1, 0, 10 );
        m1_l.load( img.get(), 2, 0, 10 );
        m1_d.load( img.get(), 1, 10, 10 );
        m1_r.load( img.get(), 2, 10, 10 );
    }

    {
        SurfacePtr img{ IMG_Load( data_path( "data/gfx/monster2.png" ).c_str() ) };
        m2_death.load( img.get(), 0, 0, 20 );
        m2_d.load( img.get(), 1, 0, 20 );
        m2_u.load( img.get(), 2, 0, 20 );
        m2_l.load( img.get(), 3, 0, 20 );
        m2_r.load( img.get(), 4, 0, 20 );
    }

    {
        SurfacePtr img{ IMG_Load( data_path( "data/gfx/monster3.png" ).c_str() ) };
        m3_death.load( img.get(), 0, 0, 20 );
        m3_d.load( img.get(), 1, 0, 9 );
        m3_u.load( img.get(), 2, 0, 9 );
        m3_l.load( img.get(), 1, 10, 9 );
        m3_r.load( img.get(), 2, 10, 9 );
    }

    {
        SurfacePtr img{ IMG_Load( data_path( "data/gfx/bonusy.png" ).c_str() ) };
        bonus1.load( img.get(), 0, 0, 20 );
        bonus2.load( img.get(), 1, 0, 20 );
    }

    {
        SurfacePtr img{ IMG_Load( data_path( "data/gfx/portal.png" ).c_str() ) };
        vortex_appear.load( img.get(), 0, 0, 20 );
        vortex.load( img.get(), 1, 0, 20 );
    }
}
}

}
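Note on the row copy in `Texture::load` in texture.cpp: a surface's rows are `pitch` bytes apart, which can be more than `w * 4` when the rows are padded, so the pixels are copied one row at a time into a tight buffer. A minimal sketch of that step without any SDL dependency, assuming RGBA8 pixels, positive dimensions and `pitch >= w * 4`; `pack_rows` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy `h` rows of `w` RGBA8 pixels out of a buffer whose
// rows start `pitch` bytes apart into a tightly packed vector.
// Assumes w > 0, h > 0 and pitch >= w * 4.
std::vector<std::uint8_t> pack_rows( const std::uint8_t* src, int w, int h, int pitch )
{
    const std::size_t row_bytes = static_cast<std::size_t>( w ) * 4;
    std::vector<std::uint8_t> packed( row_bytes * static_cast<std::size_t>( h ) );
    for( int row = 0; row < h; row++ )
    {
        // Destination rows are contiguous; source rows skip the padding.
        std::memcpy( packed.data() + static_cast<std::size_t>( row ) * row_bytes,
                     src + static_cast<std::size_t>( row ) * pitch,
                     row_bytes );
    }
    return packed;
}
```

`AnimTexture::load` uses the same idea with an extra column offset, copying 64 pixels per row from the selected tile.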
91
examples/dyna/src/texture.hpp
Normal file
@@ -0,0 +1,91 @@
#pragma once

#include <glad/gl.h>

struct SDL_Surface;

namespace dyna
{

// Move-only RAII owner of a GL texture name. Every texture in the game is a
// GL_TEXTURE_2D_ARRAY (static images use a single layer, animations use one
// layer per frame) so the renderer only ever has to deal with one sampler type.
class GlTexture
{
public:
    GlTexture() = default;
    explicit GlTexture( GLuint id ) noexcept : id_( id ) {}
    ~GlTexture() { reset(); }

    GlTexture( GlTexture&& o ) noexcept : id_( o.id_ ) { o.id_ = 0; }
    GlTexture& operator=( GlTexture&& o ) noexcept
    {
        if( this != &o )
        {
            reset();
            id_ = o.id_;
            o.id_ = 0;
        }
        return *this;
    }

    GlTexture( const GlTexture& ) = delete;
    GlTexture& operator=( const GlTexture& ) = delete;

    GLuint get() const { return id_; }
    explicit operator bool() const { return id_ != 0; }

    void reset(); // glDeleteTextures; safe on an empty handle

private:
    GLuint id_ = 0;
};

// A single static texture loaded from a whole image file. Ported from
// texture.cs; binding just records the texture for the next draw call.
class Texture
{
public:
    bool load( const char* fn );
    void bind() const;

private:
    GlTexture tex_;
};

// A vertical strip of 64x64 animation frames cut out of a sprite sheet, stored
// as the layers of one array texture. Mirrors AnimTexture in texture.cs.
class AnimTexture
{
public:
    // Extract n frames from column `tilex`, starting at row `tiley`, where each
    // coordinate is in 64px tile units. Mirrors AnimTexture.load in texture.cs.
    void load( SDL_Surface* sheet, int tilex, int tiley, int n );
    void bind( int frame ) const; // frame is taken modulo the frame count

private:
    GlTexture tex_;
    int frames_ = 0;
};

// All game textures, loaded once at startup. Mirrors the Textures class.
namespace Textures
{
extern Texture menu, sand, wall, crate;

extern AnimTexture p_wait, p_u, p_d, p_l, p_r, p_death;

extern AnimTexture bomb, bomb_appear, e_c, e_h, e_v, e_le, e_re, e_de, e_ue;

extern AnimTexture m1_death, m1_l, m1_r, m1_d, m1_u;
extern AnimTexture m2_death, m2_l, m2_r, m2_d, m2_u;
extern AnimTexture m3_death, m3_l, m3_r, m3_d, m3_u;

extern AnimTexture bonus1, bonus2;

extern AnimTexture vortex_appear, vortex;

void preload();
}

}
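Note on `AnimTexture::bind` as declared in texture.hpp: the frame counter is wrapped into the valid layer range, and because `%` in C++ takes the sign of the dividend, a negative counter needs one extra correction. A small sketch of that wrap in isolation; `wrap_frame` is a hypothetical name, and returning -1 for an empty animation is a choice made here for testability (the diff simply returns without binding).

```cpp
// Hypothetical helper: wrap any frame counter into [0, frames), or return -1
// when there are no frames to show.
int wrap_frame( int frame, int frames )
{
    if( frames <= 0 ) return -1;
    int layer = frame % frames; // sign follows the dividend in C++
    if( layer < 0 ) layer += frames;
    return layer;
}
```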
51
examples/dyna/src/timer.cpp
Normal file
@@ -0,0 +1,51 @@
#include "timer.hpp"

#include <SDL3/SDL.h>

#include <random>

namespace dyna
{

namespace Timer
{
int delta = 0;
static std::int64_t timestamp = 0;

void reset()
{
    delta = 0;
    timestamp = static_cast<std::int64_t>( SDL_GetTicks() );
}

int tick()
{
    std::int64_t tmp = timestamp;
    timestamp = static_cast<std::int64_t>( SDL_GetTicks() );
    delta = static_cast<int>( timestamp - tmp );
    return delta;
}

std::int64_t get_timestamp()
{
    return timestamp;
}
}

namespace RNG
{
static std::mt19937& engine()
{
    static std::mt19937 e{ std::random_device{}() };
    return e;
}

int next( int n )
{
    if( n <= 0 ) return 0;
    std::uniform_int_distribution<int> dist( 0, n - 1 );
    return dist( engine() );
}
}

}
|
||||
25
examples/dyna/src/timer.hpp
Normal file
@@ -0,0 +1,25 @@
#pragma once

#include <cstdint>

namespace dyna
{

// Frame timing, ported from timer.cs. Timestamps are SDL_GetTicks() values,
// i.e. milliseconds since SDL was initialized; kept 64-bit so the modulo
// arithmetic the animation code relies on never overflows during a session.
namespace Timer
{
void reset();
int tick(); // advances the clock, returns delta in ms
std::int64_t get_timestamp();
extern int delta; // ms elapsed during the last tick()
}

// Thin wrapper over a single global PRNG, mirroring the C# RNG helper.
namespace RNG
{
int next( int n ); // uniform in [0, n)
}

}
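The header comments mention modulo arithmetic on 64-bit timestamps, and `AnimTexture::bind` takes its frame modulo the frame count. A minimal sketch of how a frame index could be derived from a timestamp follows. The helper name `frame_for_timestamp` and the fixed per-frame duration `frame_ms` are assumptions for illustration and are not part of this diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not in the diff): pick the animation frame for a
// millisecond timestamp, assuming every frame is shown for `frame_ms`.
inline int frame_for_timestamp( std::int64_t timestamp_ms, int frame_ms, int frame_count )
{
    if( frame_ms <= 0 || frame_count <= 0 ) return 0;
    if( timestamp_ms < 0 ) timestamp_ms = 0;
    // Divide and take the remainder in 64 bits; only the final result,
    // which is always below frame_count, is narrowed to int.
    return static_cast<int>( ( timestamp_ms / frame_ms ) % frame_count );
}
```

Keeping the division in 64 bits means the intermediate frame counter cannot wrap, which is the property the `Timer` comment is after.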
40
examples/dyna/src/world.cpp
Normal file
@@ -0,0 +1,40 @@
#include "world.hpp"

#include "map.hpp"
#include "player.hpp"

namespace dyna
{

World::World( const std::string& level_fn, bool with_player )
    : map_( std::make_unique<Map>( level_fn ) )
    , name_( level_fn.substr( level_fn.rfind( '/' ) + 1 ) )
{
    if( with_player )
    {
        player_ = map_->create_player();
        crates_left = map_->get_crates();
    }
    else
    {
        crates_left = -1; // the menu never opens an exit portal
    }
}

World::~World() = default;

void World::tick()
{
    map_->tick( *this );
    if( player_ )
        player_->tick( *this );
}

void World::draw()
{
    map_->draw();
    if( player_ )
        player_->draw();
}

}
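The `name_` initializer in the constructor strips the directory part of `level_fn` with `rfind` and `substr`, and it also handles a path that has no `/`. The sketch below isolates that expression in a free function to show why. The function name `level_name` and the sample paths are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical free-function version of the name_ initializer in World's constructor.
inline std::string level_name( const std::string& level_fn )
{
    // rfind() yields std::string::npos when there is no '/'. npos is the
    // largest size_t value, so npos + 1 wraps around to 0 and substr( 0 )
    // returns the whole string unchanged.
    return level_fn.substr( level_fn.rfind( '/' ) + 1 );
}
```

A path ending in `/` produces an empty name, because `substr` accepts a start position equal to the string length.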
46
examples/dyna/src/world.hpp
Normal file
@@ -0,0 +1,46 @@
#pragma once

#include <memory>
#include <string>

namespace dyna
{

class Map;
class Player;

// Owns the state for one running level: the map, the player (absent on the
// menu screen), and the flags the gameplay code used to reach through global
// variables. Passing a World& into the tick path replaces the old Game::p /
// Game::current_map / Game::killed globals, so there are no non-owning pointers
// to outlive the objects they point at.
class World
{
public:
    // Loads `level_fn`; spawns a player from the map's '@' marker when
    // with_player is set (gameplay) and leaves it null otherwise (menu).
    World( const std::string& level_fn, bool with_player );
    ~World();

    World( const World& ) = delete;
    World& operator=( const World& ) = delete;

    Map& map() { return *map_; }
    const Map& map() const { return *map_; }
    Player* player() { return player_.get(); } // null on the menu screen
    const std::string& name() const { return name_; }

    void tick();
    void draw();

    bool killed = false;
    bool next_level = false;
    int crates_left = 0;

private:
    std::unique_ptr<Map> map_;
    std::unique_ptr<Player> player_;
    std::string name_;
};

}
83
examples/opengl/triangle/CMakeLists.txt
Normal file
@@ -0,0 +1,83 @@
# CMakeLists.txt - OpenGL spinning triangle demo
#
# macOS:
#   cmake -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -B build/ninja .
#   cmake --build build/ninja
#
# Linux (requires libsdl3-dev libgl1-mesa-dev):
#   cmake -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -B build/ninja .
#   cmake --build build/ninja
#
# Windows:
#   cmake -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -B build/ninja .
#   cmake --build build/ninja

cmake_minimum_required(VERSION 3.16)
project(gl_spinning_triangle LANGUAGES C CXX)

# ---------------------------------------------------------------------------
# Tracy root - defaults to three directories above this CMakeLists.txt.
# ---------------------------------------------------------------------------
set(TRACY_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../../..")
option(TRACY_ENABLE "Enable Tracy profiling" ON)

# ---------------------------------------------------------------------------
# Platform - SDL3 (cross-platform windowing, must be installed on the system)
# ---------------------------------------------------------------------------
find_package(SDL3 REQUIRED)

# ---------------------------------------------------------------------------
# GL extension loader - GLEW (Windows + Linux, fetched automatically)
# ---------------------------------------------------------------------------
if(NOT APPLE)
    include(FetchContent)
    set(glew-cmake_BUILD_SHARED OFF CACHE BOOL "" FORCE)
    set(ONLY_LIBS ON CACHE BOOL "" FORCE)
    FetchContent_Declare(glew
        GIT_REPOSITORY https://github.com/Perlmint/glew-cmake.git
        GIT_TAG master # pin to a specific commit for reproducible builds
        GIT_SHALLOW TRUE
    )
    FetchContent_MakeAvailable(glew)
endif()

set(PLATFORM_SOURCES platform/platform_sdl3.cpp)

if(APPLE)
    set(PLATFORM_LIBS SDL3::SDL3 "-framework OpenGL")
elseif(WIN32)
    set(PLATFORM_LIBS SDL3::SDL3 opengl32 libglew_static)
else()
    set(PLATFORM_LIBS SDL3::SDL3 GL libglew_static)
endif()

# ---------------------------------------------------------------------------
# Target
# ---------------------------------------------------------------------------
add_executable(gl_spinning_triangle
    spinning_triangle.cpp
    "${TRACY_DIR}/public/TracyClient.cpp"
    ${PLATFORM_SOURCES}
)

# Suppress upstream warnings from TracyClient.cpp
if(MSVC)
    set_source_files_properties("${TRACY_DIR}/public/TracyClient.cpp"
        PROPERTIES COMPILE_FLAGS "/w"
    )
else()
    set_source_files_properties("${TRACY_DIR}/public/TracyClient.cpp"
        PROPERTIES COMPILE_FLAGS "-w"
    )
endif()

target_compile_features(gl_spinning_triangle PRIVATE cxx_std_17)

if(TRACY_ENABLE)
    target_compile_definitions(gl_spinning_triangle PRIVATE TRACY_ENABLE)
endif()

target_include_directories(gl_spinning_triangle PRIVATE
    "${TRACY_DIR}/public"
)
target_link_libraries(gl_spinning_triangle PRIVATE ${PLATFORM_LIBS})
37
examples/opengl/triangle/platform/platform.h
Normal file
@@ -0,0 +1,37 @@
// platform.h - interface between platform-agnostic code and platform backends
//
// Each platform_*.mm / platform_*.cpp file implements these six functions.
// Exactly one backend must be linked into the final binary.

#pragma once

#ifdef __APPLE__
#  include <OpenGL/gl3.h>
#else
#  include <GL/glew.h>
#endif

// Initialize the windowing system, create a window, and make an OpenGL 3.3
// Core Profile context current on the calling thread.
// Returns true on success.
bool platformInit(int width, int height, const char* title);

// Load OpenGL function pointers (no-op on macOS where the framework exports them directly).
// Must be called after platformInit() while the GL context is current.
// Returns true on success.
bool platformInitGL();

// Elapsed wall-clock time in seconds since platformInit().
double platformGetTime();

// Swap front and back buffers (present the rendered frame).
void platformSwapBuffers();

// Pixel scaling factor relative to the logical window size (1.0 on non-HiDPI displays).
// Must be called after platformInit().
void platformGetPixelDensityScale(float* x, float* y);

// Enter the platform event/render loop.
// Calls render() once per frame (paced by vsync, typically ~60 fps).
// Calls shutdown() exactly once before returning.
void platformRunLoop(void (*render)(), void (*shutdown)());
85
examples/opengl/triangle/platform/platform_sdl3.cpp
Normal file
@@ -0,0 +1,85 @@
// platform_sdl3.cpp - SDL3 windowing backend (cross-platform)
#include "platform.h" // GL headers first (gl3.h / glew.h) so SDL sees guards set

#define SDL_MAIN_HANDLED // we don't want SDL_main
#include <SDL3/SDL.h>

#include <chrono>
#include <cstdio>

static SDL_Window* sWin = nullptr;
static SDL_GLContext sCtx = nullptr;
static std::chrono::steady_clock::time_point sStartTime;

bool platformInit(int width, int height, const char* title) {
    if (!SDL_Init(SDL_INIT_VIDEO)) {
        fprintf(stderr, "ERROR: SDL_Init failed: %s\n", SDL_GetError());
        return false;
    }

    SDL_GL_SetAttribute(SDL_GL_CONTEXT_MAJOR_VERSION, 3);
    SDL_GL_SetAttribute(SDL_GL_CONTEXT_MINOR_VERSION, 3);
    SDL_GL_SetAttribute(SDL_GL_CONTEXT_PROFILE_MASK, SDL_GL_CONTEXT_PROFILE_CORE);

    sWin = SDL_CreateWindow(title, width, height, SDL_WINDOW_OPENGL);
    if (!sWin) {
        fprintf(stderr, "ERROR: SDL_CreateWindow failed: %s\n", SDL_GetError());
        SDL_Quit();
        return false;
    }
    SDL_SetWindowPosition(sWin, SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED);

    sCtx = SDL_GL_CreateContext(sWin);
    if (!sCtx) {
        fprintf(stderr, "ERROR: SDL_GL_CreateContext failed: %s\n", SDL_GetError());
        SDL_DestroyWindow(sWin);
        SDL_Quit();
        return false;
    }

    SDL_GL_SetSwapInterval(1);
    sStartTime = std::chrono::steady_clock::now();
    return true;
}

bool platformInitGL() {
#ifndef __APPLE__
    glewExperimental = GL_TRUE;
    if (glewInit() != GLEW_OK) {
        fprintf(stderr, "Failed to initialize GLEW\n");
        return false;
    }
#endif
    return true;
}

double platformGetTime() {
    return std::chrono::duration<double>(
        std::chrono::steady_clock::now() - sStartTime).count();
}

void platformSwapBuffers() { SDL_GL_SwapWindow(sWin); }

void platformGetPixelDensityScale(float* x, float* y) {
    int pw, ph, ww, wh;
    SDL_GetWindowSizeInPixels(sWin, &pw, &ph);
    SDL_GetWindowSize(sWin, &ww, &wh);
    *x = (ww > 0) ? (float)pw / (float)ww : 1.0f;
    *y = (wh > 0) ? (float)ph / (float)wh : 1.0f;
}

void platformRunLoop(void (*render)(), void (*shutdown)()) {
    bool running = true;
    while (running) {
        SDL_Event e;
        while (SDL_PollEvent(&e)) {
            if (e.type == SDL_EVENT_QUIT) running = false;
            if (e.type == SDL_EVENT_KEY_DOWN && e.key.key == SDLK_ESCAPE) running = false;
        }
        if (running) render();
    }
    shutdown();
    SDL_GL_DestroyContext(sCtx);
    SDL_DestroyWindow(sWin);
    SDL_Quit();
}
145
examples/opengl/triangle/spinning_triangle.cpp
Normal file
@@ -0,0 +1,145 @@
// spinning_triangle.cpp - OpenGL spinning triangle demo with Tracy GPU profiling.

#ifdef __APPLE__
// NOTE: OpenGL is only available on macOS (no iOS support)
// Including and using anything related to OpenGL on Apple (like <OpenGL/gl3.h>)
// will emit deprecation warnings, unless GL_SILENCE_DEPRECATION is defined
#define GL_SILENCE_DEPRECATION
// NOTE: TracyOpenGL.hpp will not work as expected even on Apple devices that
// support OpenGL, because the OpenGL drivers do not implement ARB_timer_query
// properly (querying GL_TIMESTAMP always resolves to 0). TracyOpenGL.hpp will
// emit a compiler warning, and a Tracy message to the trace/profiler, but the
// program will still run.
#endif

#include "platform/platform.h" // also includes OpenGL headers

#include <cstdio> // fprintf

#include <tracy/Tracy.hpp>

// NOTE: opt-in toggle for periodic recalibrations during Collect()
#define TRACY_OPENGL_AUTO_CALIBRATION
#include <tracy/TracyOpenGL.hpp>

static const int kWidth = 800;
static const int kHeight = 600;

static GLuint gProgram = 0;
static GLuint gVao = 0;
static GLint gAngleLoc = -1;

// Vertex colors and positions are baked in; rotation is driven by a uniform.
static const char* kVertSrc = R"(
#version 150 core
uniform float uAngle;
const vec2 kPos[3] = vec2[3](
    vec2( 0.0, 0.5 ),
    vec2(-0.433, -0.25 ),
    vec2( 0.433, -0.25 )
);
const vec3 kCol[3] = vec3[3](
    vec3(1.0, 0.0, 0.0),
    vec3(0.0, 1.0, 0.0),
    vec3(0.0, 0.0, 1.0)
);
out vec3 vColor;
void main() {
    float c = cos(uAngle);
    float s = sin(uAngle);
    vec2 p = kPos[gl_VertexID];
    gl_Position = vec4(p.x*c - p.y*s, p.x*s + p.y*c, 0.0, 1.0);
    vColor = kCol[gl_VertexID];
}
)";

static const char* kFragSrc = R"(
#version 150 core
in vec3 vColor;
out vec4 fragColor;
void main() { fragColor = vec4(vColor, 1.0); }
)";

static GLuint compileShader(GLenum type, const char* src) {
    GLuint s = glCreateShader(type);
    glShaderSource(s, 1, &src, nullptr);
    glCompileShader(s);
    GLint ok = 0;
    glGetShaderiv(s, GL_COMPILE_STATUS, &ok);
    if (!ok) {
        char log[512];
        glGetShaderInfoLog(s, sizeof(log), nullptr, log);
        fprintf(stderr, "Shader compile error: %s\n", log);
        glDeleteShader(s);
        return 0;
    }
    return s;
}

static int initGL() {
    if (!platformInitGL()) return 1;

    TracyGpuContext;
    TracyGpuContextName("OpenGL", 6);

    GLuint vert = compileShader(GL_VERTEX_SHADER, kVertSrc);
    GLuint frag = compileShader(GL_FRAGMENT_SHADER, kFragSrc);
    if (!vert || !frag) return 1;

    gProgram = glCreateProgram();
    glAttachShader(gProgram, vert);
    glAttachShader(gProgram, frag);
    glLinkProgram(gProgram);
    glDeleteShader(vert);
    glDeleteShader(frag);

    GLint ok = 0;
    glGetProgramiv(gProgram, GL_LINK_STATUS, &ok);
    if (!ok) {
        char log[512];
        glGetProgramInfoLog(gProgram, sizeof(log), nullptr, log);
        fprintf(stderr, "Program link error: %s\n", log);
        return 1;
    }

    gAngleLoc = glGetUniformLocation(gProgram, "uAngle");

    // Core profile requires a bound VAO even with no vertex attributes.
    glGenVertexArrays(1, &gVao);
    glBindVertexArray(gVao);

    glClearColor(0.05f, 0.05f, 0.08f, 1.0f);
    float scaleX, scaleY;
    platformGetPixelDensityScale(&scaleX, &scaleY);
    glViewport(0, 0, (int)(kWidth * scaleX), (int)(kHeight * scaleY));
    return 0;
}

static void renderFrame() {
    ZoneScoped;

    glClear(GL_COLOR_BUFFER_BIT);
    glUseProgram(gProgram);

    {
        TracyGpuZone("triangle draw");
        glUniform1f(gAngleLoc, (float)platformGetTime());
        glDrawArrays(GL_TRIANGLES, 0, 3);
    }

    platformSwapBuffers();
    TracyGpuCollect;
}

static void shutdown() {
    fprintf(stderr, "application is shutting down...\n");
    glDeleteVertexArrays(1, &gVao);
    glDeleteProgram(gProgram);
}

int main() {
    if (!platformInit(kWidth, kHeight, "OpenGL Spinning Triangle"))
        return 1;
    if (initGL() != 0)
        return 2;
    platformRunLoop(renderFrame, shutdown);
    return 0;
}
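The vertex shader rotates each baked-in position by `uAngle` using the standard 2D rotation formula. As a rough CPU-side check of that arithmetic, the sketch below applies the same formula in C++. The `Vec2` type and the `rotate` and `length` helpers are illustrative names and do not appear in the demo.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Same 2D rotation the vertex shader applies: counter-clockwise by `angle`
// radians in a y-up coordinate system such as normalized device coordinates.
inline Vec2 rotate(Vec2 p, float angle) {
    const float c = std::cos(angle);
    const float s = std::sin(angle);
    return { p.x * c - p.y * s, p.x * s + p.y * c };
}

// Distance from the origin; a pure rotation leaves it unchanged.
inline float length(Vec2 p) { return std::sqrt(p.x * p.x + p.y * p.y); }
```

Rotating the top vertex `(0.0, 0.5)` by a quarter turn moves it to roughly `(-0.5, 0.0)`. All three vertices sit about 0.5 from the origin, so the triangle spins about its own center.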
157
examples/webgpu/triangle/CMakeLists.txt
Normal file
@@ -0,0 +1,157 @@
# CMakeLists.txt - WebGPU spinning triangle demo
#
# macOS:
#   clang++ -std=c++17 -ObjC++ spinning_triangle.cpp platform/platform_macos.mm \
#       -I/path/to/wgpu/include -L/path/to/wgpu/lib -lwgpu_native \
#       -Wl,-rpath,@executable_path \
#       -framework Cocoa -framework Metal -framework QuartzCore \
#       -framework Foundation -framework IOKit -framework IOSurface \
#       -o spinning_triangle
#
# Windows (MSVC):
#   cl /std:c++17 spinning_triangle.cpp platform/platform_windows.cpp \
#       /I\path\to\wgpu\include \path\to\wgpu\lib\wgpu_native.lib \
#       user32.lib gdi32.lib /Fe:spinning_triangle.exe
#
# Linux (requires libsdl3-dev):
#   g++ -std=c++17 spinning_triangle.cpp platform/platform_wayland.cpp \
#       xdg-shell-protocol.c \
#       -I/path/to/wgpu/include -L/path/to/wgpu/lib -lwgpu_native \
#       -lwayland-client -o spinning_triangle

cmake_minimum_required(VERSION 3.16)
project(spinning_triangle LANGUAGES C CXX)

# ---------------------------------------------------------------------------
# WebGPU backend - set WGPU_PATH to your wgpu-native or Dawn installation.
# The library name differs between backends:
#   wgpu-native → wgpu_native
#   Dawn → webgpu_dawn
# ---------------------------------------------------------------------------
set(WGPU_PATH "" CACHE PATH "Root of the WebGPU native installation (contains include/ and lib/)")
set(WGPU_LIB "" CACHE STRING "WebGPU library name (wgpu_native or webgpu_dawn); auto-detected if empty")

if(NOT WGPU_PATH)
    message(FATAL_ERROR "Set WGPU_PATH to the root of your WebGPU native installation.")
endif()

# When WGPU_PATH changes, discard any previously auto-detected WGPU_LIB so
# detection re-runs against the new path.
if(NOT "${WGPU_PATH}" STREQUAL "${_WGPU_PATH_LAST}")
    unset(WGPU_LIB CACHE)
    set(WGPU_LIB "" CACHE STRING "WebGPU library name (wgpu_native or webgpu_dawn); auto-detected if empty")
endif()
set(_WGPU_PATH_LAST "${WGPU_PATH}" CACHE INTERNAL "")

if(NOT WGPU_LIB)
    unset(_WGPU_NATIVE_LIB CACHE)
    unset(_WEBGPU_DAWN_LIB CACHE)
    find_library(_WGPU_NATIVE_LIB NAMES wgpu_native wgpu_native.dll PATHS "${WGPU_PATH}/lib" NO_DEFAULT_PATH)
    find_library(_WEBGPU_DAWN_LIB NAMES webgpu_dawn PATHS "${WGPU_PATH}/lib" NO_DEFAULT_PATH)
    if(_WGPU_NATIVE_LIB)
        set(WGPU_LIB "wgpu_native" CACHE STRING "WebGPU library name (wgpu_native or webgpu_dawn); auto-detected if empty" FORCE)
    elseif(_WEBGPU_DAWN_LIB)
        set(WGPU_LIB "webgpu_dawn" CACHE STRING "WebGPU library name (wgpu_native or webgpu_dawn); auto-detected if empty" FORCE)
    else()
        message(FATAL_ERROR "Could not detect a WebGPU library in ${WGPU_PATH}/lib. Set WGPU_LIB explicitly (wgpu_native or webgpu_dawn).")
    endif()
    message(STATUS "WebGPU library auto-detected: ${WGPU_LIB}")
endif()

# ---------------------------------------------------------------------------
# Tracy root - defaults to three directories above this CMakeLists.txt.
# ---------------------------------------------------------------------------
set(TRACY_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../../..")
option(TRACY_ENABLE "Enable Tracy profiling" ON)

# ---------------------------------------------------------------------------
# macOS quarantine - pre-built WebGPU binaries downloaded from the internet
# carry a com.apple.quarantine extended attribute that prevents dyld from
# loading them ("damaged or incomplete" / Gatekeeper block). Strip it once
# at configure time so the linker and the runtime loader can both access the
# library directory without further user intervention.
# ---------------------------------------------------------------------------
if(APPLE)
    execute_process(
        COMMAND xattr -dr com.apple.quarantine "${WGPU_PATH}/lib"
    )
endif()

# ---------------------------------------------------------------------------
# Platform - SDL3 (cross-platform windowing, must be installed on the system)
# ---------------------------------------------------------------------------
find_package(SDL3 REQUIRED)

set(PLATFORM_SOURCES platform/platform_sdl3.cpp)

if(APPLE)
    set(PLATFORM_LIBS
        SDL3::SDL3
        "-framework Cocoa"
        "-framework Metal"
        "-framework QuartzCore"
        "-framework Foundation"
        "-framework IOKit"
        "-framework IOSurface"
    )
elseif(WIN32)
    # wgpu-native (Rust stdlib) pull-ins: NtReadFile, GetUserProfileDirectoryW, ...
    set(WGPU_NATIVE_WIN32_LIBS ntdll userenv)
    # Dawn pull-ins: WKPDID_D3DDebugObjectName GUID, CompareObjectHandles, ...
    set(WEBGPU_DAWN_WIN32_LIBS dxguid onecore)
    set(PLATFORM_LIBS SDL3::SDL3 ${WGPU_NATIVE_WIN32_LIBS} ${WEBGPU_DAWN_WIN32_LIBS})
else()
    set(PLATFORM_LIBS SDL3::SDL3)
endif()

# ---------------------------------------------------------------------------
# Target
# ---------------------------------------------------------------------------
add_executable(spinning_triangle
    spinning_triangle.cpp
    "${TRACY_DIR}/public/TracyClient.cpp"
    ${PLATFORM_SOURCES}
)

# Treat TracyClient.cpp as third-party code: suppress all warnings so that
# upstream changes don't pollute our build output.
if(MSVC)
    set_source_files_properties("${TRACY_DIR}/public/TracyClient.cpp"
        PROPERTIES COMPILE_FLAGS "/w"
    )
else()
    set_source_files_properties("${TRACY_DIR}/public/TracyClient.cpp"
        PROPERTIES COMPILE_FLAGS "-w"
    )
endif()

target_compile_features(spinning_triangle PRIVATE cxx_std_17)

if(TRACY_ENABLE)
    target_compile_definitions(spinning_triangle PRIVATE TRACY_ENABLE)
endif()

target_include_directories(spinning_triangle PRIVATE
    "${WGPU_PATH}/include"
    "${TRACY_DIR}/public"
)

target_link_directories(spinning_triangle PRIVATE "${WGPU_PATH}/lib")

target_link_libraries(spinning_triangle PRIVATE
    ${WGPU_LIB}
    ${PLATFORM_LIBS}
)

# Embed the rpath so the binary finds the WebGPU dylib/so next to itself.
if(APPLE)
    set_target_properties(spinning_triangle PROPERTIES
        BUILD_RPATH "${WGPU_PATH}/lib"
        INSTALL_RPATH "@executable_path"
    )
elseif(UNIX)
    set_target_properties(spinning_triangle PROPERTIES
        BUILD_RPATH "${WGPU_PATH}/lib"
        INSTALL_RPATH "$ORIGIN"
    )
endif()
23
examples/webgpu/triangle/platform/platform.h
Normal file
@@ -0,0 +1,23 @@
// platform.h - interface between platform-agnostic code and platform backends
//
// Each platform_*.mm / platform_*.cpp file implements these four functions.
// Exactly one backend must be linked into the final binary.

#pragma once
#include <webgpu/webgpu.h>

// Initialize the windowing system and create a window of the given dimensions.
// Returns true on success.
bool platformInit(int width, int height, const char* title);

// Create a WebGPU surface backed by the platform window.
// Must be called after wgpuCreateInstance() and platformInit().
WGPUSurface platformCreateSurface(WGPUInstance instance);

// Elapsed wall-clock time in seconds since platformInit().
double platformGetTime();

// Enter the platform event/render loop.
// Calls render() each frame at ~60 fps.
// Calls shutdown() exactly once before returning.
void platformRunLoop(void (*render)(), void (*shutdown)());
95
examples/webgpu/triangle/platform/platform_sdl3.cpp
Normal file
@@ -0,0 +1,95 @@
// platform_sdl3.cpp - SDL3 windowing backend for the WebGPU example
#include "platform.h" // webgpu/webgpu.h first

#define SDL_MAIN_HANDLED // we don't want SDL_main
#include <SDL3/SDL.h>

#ifdef __APPLE__
#  include <SDL3/SDL_metal.h>
#endif

#include <chrono>
#include <cstdio>

static SDL_Window* sWin = nullptr;
static std::chrono::steady_clock::time_point sStartTime;
#ifdef __APPLE__
static SDL_MetalView sMetalView = nullptr;
#endif

bool platformInit(int width, int height, const char* title) {
    if (!SDL_Init(SDL_INIT_VIDEO)) {
        fprintf(stderr, "ERROR: SDL_Init failed: %s\n", SDL_GetError());
        return false;
    }

    SDL_WindowFlags flags = 0;
#ifdef __APPLE__
    flags |= SDL_WINDOW_METAL;
#endif

    sWin = SDL_CreateWindow(title, width, height, flags);
    if (!sWin) {
        fprintf(stderr, "ERROR: SDL_CreateWindow failed: %s\n", SDL_GetError());
        SDL_Quit();
        return false;
    }
    SDL_SetWindowPosition(sWin, SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED);

    sStartTime = std::chrono::steady_clock::now();
    return true;
}

WGPUSurface platformCreateSurface(WGPUInstance instance) {
    WGPUSurfaceDescriptor desc = {};
    SDL_PropertiesID props = SDL_GetWindowProperties(sWin);

#if defined(__APPLE__)
    sMetalView = SDL_Metal_CreateView(sWin);
    if (!sMetalView) {
        fprintf(stderr, "ERROR: SDL_Metal_CreateView failed\n");
        return nullptr;
    }
    WGPUSurfaceSourceMetalLayer metalDesc = {};
    metalDesc.chain.sType = WGPUSType_SurfaceSourceMetalLayer;
    metalDesc.layer = SDL_Metal_GetLayer(sMetalView);
    desc.nextInChain = &metalDesc.chain;
#elif defined(_WIN32)
    WGPUSurfaceSourceWindowsHWND hwndDesc = {};
    hwndDesc.chain.sType = WGPUSType_SurfaceSourceWindowsHWND;
    hwndDesc.hinstance = SDL_GetPointerProperty(props, SDL_PROP_WINDOW_WIN32_INSTANCE_POINTER, nullptr);
    hwndDesc.hwnd = SDL_GetPointerProperty(props, SDL_PROP_WINDOW_WIN32_HWND_POINTER, nullptr);
    desc.nextInChain = &hwndDesc.chain;
#else // Linux / X11
    WGPUSurfaceSourceXlibWindow x11Desc = {};
    x11Desc.chain.sType = WGPUSType_SurfaceSourceXlibWindow;
    x11Desc.display = SDL_GetPointerProperty(props, SDL_PROP_WINDOW_X11_DISPLAY_POINTER, nullptr);
    x11Desc.window = (uint32_t)SDL_GetNumberProperty(props, SDL_PROP_WINDOW_X11_WINDOW_NUMBER, 0);
    desc.nextInChain = &x11Desc.chain;
#endif

    return wgpuInstanceCreateSurface(instance, &desc);
}

double platformGetTime() {
    return std::chrono::duration<double>(
        std::chrono::steady_clock::now() - sStartTime).count();
}

void platformRunLoop(void (*render)(), void (*shutdown)()) {
    bool running = true;
    while (running) {
        SDL_Event e;
        while (SDL_PollEvent(&e)) {
            if (e.type == SDL_EVENT_QUIT) running = false;
            if (e.type == SDL_EVENT_KEY_DOWN && e.key.key == SDLK_ESCAPE) running = false;
        }
        if (running) render();
    }
    shutdown();
#ifdef __APPLE__
    SDL_Metal_DestroyView(sMetalView);
#endif
    SDL_DestroyWindow(sWin);
    SDL_Quit();
}
352
examples/webgpu/triangle/spinning_triangle.cpp
Normal file
@@ -0,0 +1,352 @@
// spinning_triangle.cpp - platform-agnostic WebGPU spinning triangle demo.

#include "platform/platform.h"
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <webgpu/webgpu.h>

#include <tracy/Tracy.hpp>
#include <tracy/TracyWebGPU.hpp>

// ---------------------------------------------------------------------------
// Globals
// ---------------------------------------------------------------------------

static const int kWidth = 800;
static const int kHeight = 600;

static WGPUInstance gInstance = nullptr;
static WGPUSurface gSurface = nullptr;
static WGPUAdapter gAdapter = nullptr;
static WGPUDevice gDevice = nullptr;
static WGPUQueue gQueue = nullptr;
static WGPURenderPipeline gPipeline = nullptr;
static WGPUBuffer gUniformBuf = nullptr;
static WGPUBindGroup gBindGroup = nullptr;

static TracyWebGPUCtx gTracyCtx = nullptr;

static WGPUTextureFormat gSurfaceFormat = WGPUTextureFormat_BGRA8Unorm;

// TODO: this can become platformError() instead
int error(int code, const char* message) {
    fprintf(stderr, "ERROR: %s (code: %d)\n", message, code);
    return code;
}

// ---------------------------------------------------------------------------
// WGSL shader - vertex colours baked in, rotation via a uniform float.
// ---------------------------------------------------------------------------

static const char* kShaderSource = R"(
struct Uniforms {
    angle: f32,
};
@group(0) @binding(0) var<uniform> u: Uniforms;

struct VSOut {
    @builtin(position) pos: vec4f,
    @location(0) color: vec3f,
};

@vertex
fn vs_main(@builtin(vertex_index) vi: u32) -> VSOut {
    var positions = array<vec2f, 3>(
        vec2f( 0.0, 0.5),
        vec2f(-0.433, -0.25),
        vec2f( 0.433, -0.25),
    );
    var colors = array<vec3f, 3>(
        vec3f(1.0, 0.0, 0.0),
        vec3f(0.0, 1.0, 0.0),
        vec3f(0.0, 0.0, 1.0),
    );

    let c = cos(u.angle);
    let s = sin(u.angle);
    let p = positions[vi];
    let rotated = vec2f(p.x * c - p.y * s, p.x * s + p.y * c);

    var out: VSOut;
    out.pos = vec4f(rotated, 0.0, 1.0);
    out.color = colors[vi];
    return out;
}

@fragment
fn fs_main(@location(0) color: vec3f) -> @location(0) vec4f {
    return vec4f(color, 1.0);
}
)";

// ---------------------------------------------------------------------------
// Adapter / Device request callbacks (current wgpu-native API)
// ---------------------------------------------------------------------------

static void onAdapterReady(WGPURequestAdapterStatus status,
                           WGPUAdapter adapter,
                           WGPUStringView message,
                           void* userdata1, void* /*userdata2*/) {
    if (status == WGPURequestAdapterStatus_Success) {
        *(WGPUAdapter*)userdata1 = adapter;
    } else {
        fprintf(stderr, "Adapter request failed: %.*s\n",
                (int)message.length, message.data);
    }
}

static void onDeviceReady(WGPURequestDeviceStatus status,
                          WGPUDevice device,
                          WGPUStringView message,
                          void* userdata1, void* /*userdata2*/) {
    if (status == WGPURequestDeviceStatus_Success) {
        *(WGPUDevice*)userdata1 = device;
    } else {
        fprintf(stderr, "Device request failed: %.*s\n",
                (int)message.length, message.data);
    }
}
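The two callbacks above store a handle only on success, so a caller that spins until the handle is non-null has no way to observe a failed request. One common alternative is to record completion separately from the result. The sketch below shows that pattern with stand-in types so it runs without a WebGPU library. `RequestState`, `FakeInstance`, and `wait_for` are hypothetical names, and `FakeInstance::process_events` only imitates the role `wgpuInstanceProcessEvents` plays in the demo.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one async request: the outcome arrives later,
// from inside an event pump, and may be a failure.
struct RequestState {
    bool  done   = false;    // set by the callback on success AND on failure
    void* handle = nullptr;  // stays null when the request failed
};

// Toy event queue standing in for the instance that delivers callbacks.
struct FakeInstance {
    std::vector<std::function<void()>> pending;

    void process_events() {
        auto ready = std::move(pending);
        pending.clear();  // leave the moved-from vector in a known empty state
        for (auto& cb : ready) cb();
    }
};

// Spin on `done`, not on the handle, so a failed request still ends the loop.
// Assumes a callback for `st` has already been queued on `inst`.
inline void* wait_for(FakeInstance& inst, RequestState& st) {
    while (!st.done) inst.process_events();
    return st.handle;
}
```

With this shape the null check after the wait becomes reachable: a failed request returns `nullptr` instead of looping forever.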

// ---------------------------------------------------------------------------
// WebGPU init
// ---------------------------------------------------------------------------

static int initWebGPU() {
    // Adapter
    WGPURequestAdapterOptions adapterOpts = {};
    adapterOpts.compatibleSurface = gSurface;

    WGPURequestAdapterCallbackInfo adapterCB = {};
    adapterCB.mode = WGPUCallbackMode_AllowProcessEvents;
    adapterCB.callback = onAdapterReady;
    adapterCB.userdata1 = &gAdapter;
    wgpuInstanceRequestAdapter(gInstance, &adapterOpts, adapterCB);
    // NOTE: this spins until the callback stores an adapter. If the request
    // fails, gAdapter stays null and the loop never exits, so the check
    // below is only reachable on success.
    while (!gAdapter) { wgpuInstanceProcessEvents(gInstance); }
    if (!gAdapter) return error(11, "No adapter");

    WGPUUncapturedErrorCallbackInfo errorCB = {};
    errorCB.callback = [](WGPUDevice const*, WGPUErrorType type,
                          WGPUStringView message, void*, void*) {
        fprintf(stderr, "[WGPU ERROR] type=%d %.*s\n",
                (int)type, (int)message.length, message.data);
    };

    WGPUDeviceDescriptor deviceDesc = {};
    deviceDesc.uncapturedErrorCallbackInfo = errorCB;

    TracyWebGPUSetupDeviceDescriptor(deviceDesc);

    WGPURequestDeviceCallbackInfo deviceCB = {};
    deviceCB.mode = WGPUCallbackMode_AllowProcessEvents;
    deviceCB.callback = onDeviceReady;
    deviceCB.userdata1 = &gDevice;
    wgpuAdapterRequestDevice(gAdapter, &deviceDesc, deviceCB);
    // NOTE: same caveat as the adapter wait: a failed device request leaves
    // gDevice null and this loop spinning.
    while (!gDevice) { wgpuInstanceProcessEvents(gInstance); }
    if (!gDevice) return error(12, "No device");

    gQueue = wgpuDeviceGetQueue(gDevice);
    gTracyCtx = TracyWebGPUContext(gInstance, gDevice, gQueue);
    TracyWebGPUContextName(gTracyCtx, "WebGPU", 6);

    // Configure surface
    WGPUSurfaceConfiguration config = {};
    config.device = gDevice;
    config.format = gSurfaceFormat;
    config.usage = WGPUTextureUsage_RenderAttachment;
    config.alphaMode = WGPUCompositeAlphaMode_Opaque;
    config.width = kWidth;
    config.height = kHeight;
    config.presentMode = WGPUPresentMode_Fifo;
    wgpuSurfaceConfigure(gSurface, &config);

    // Shader module
    WGPUShaderSourceWGSL wgslSrc = {};
    wgslSrc.chain.sType = WGPUSType_ShaderSourceWGSL;
    wgslSrc.code = { kShaderSource, WGPU_STRLEN };
|
||||
|
||||
WGPUShaderModuleDescriptor smDesc = {};
|
||||
smDesc.nextInChain = (WGPUChainedStruct*)&wgslSrc;
|
||||
WGPUShaderModule shaderMod = wgpuDeviceCreateShaderModule(gDevice, &smDesc);
|
||||
|
||||
// Uniform buffer (one f32 for rotation angle)
|
||||
WGPUBufferDescriptor bufDesc = {};
|
||||
bufDesc.usage = WGPUBufferUsage_Uniform | WGPUBufferUsage_CopyDst;
|
||||
bufDesc.size = sizeof(float);
|
||||
gUniformBuf = wgpuDeviceCreateBuffer(gDevice, &bufDesc);
|
||||
|
||||
// Bind group layout + bind group
|
||||
WGPUBindGroupLayoutEntry bglEntry = {};
|
||||
bglEntry.binding = 0;
|
||||
bglEntry.visibility = WGPUShaderStage_Vertex;
|
||||
bglEntry.buffer.type = WGPUBufferBindingType_Uniform;
|
||||
bglEntry.buffer.minBindingSize = sizeof(float);
|
||||
|
||||
WGPUBindGroupLayoutDescriptor bglDesc = {};
|
||||
bglDesc.entryCount = 1;
|
||||
bglDesc.entries = &bglEntry;
|
||||
WGPUBindGroupLayout bgl = wgpuDeviceCreateBindGroupLayout(gDevice, &bglDesc);
|
||||
|
||||
WGPUBindGroupEntry bgEntry = {};
|
||||
bgEntry.binding = 0;
|
||||
bgEntry.buffer = gUniformBuf;
|
||||
bgEntry.size = sizeof(float);
|
||||
|
||||
WGPUBindGroupDescriptor bgDesc = {};
|
||||
bgDesc.layout = bgl;
|
||||
bgDesc.entryCount = 1;
|
||||
bgDesc.entries = &bgEntry;
|
||||
gBindGroup = wgpuDeviceCreateBindGroup(gDevice, &bgDesc);
|
||||
|
||||
// Pipeline layout
|
||||
WGPUPipelineLayoutDescriptor plDesc = {};
|
||||
plDesc.bindGroupLayoutCount = 1;
|
||||
plDesc.bindGroupLayouts = &bgl;
|
||||
WGPUPipelineLayout pipelineLayout = wgpuDeviceCreatePipelineLayout(gDevice, &plDesc);
|
||||
|
||||
// Render pipeline
|
||||
WGPUColorTargetState colorTarget = {};
|
||||
colorTarget.format = gSurfaceFormat;
|
||||
colorTarget.writeMask = WGPUColorWriteMask_All;
|
||||
|
||||
WGPUFragmentState fragState = {};
|
||||
fragState.module = shaderMod;
|
||||
fragState.entryPoint = { "fs_main", WGPU_STRLEN };
|
||||
fragState.targetCount = 1;
|
||||
fragState.targets = &colorTarget;
|
||||
|
||||
WGPURenderPipelineDescriptor rpDesc = {};
|
||||
rpDesc.layout = pipelineLayout;
|
||||
rpDesc.vertex.module = shaderMod;
|
||||
rpDesc.vertex.entryPoint = { "vs_main", WGPU_STRLEN };
|
||||
rpDesc.primitive.topology = WGPUPrimitiveTopology_TriangleList;
|
||||
rpDesc.multisample.count = 1;
|
||||
rpDesc.multisample.mask = 0xFFFFFFFF;
|
||||
rpDesc.fragment = &fragState;
|
||||
|
||||
gPipeline = wgpuDeviceCreateRenderPipeline(gDevice, &rpDesc);
|
||||
|
||||
// Cleanup intermediates
|
||||
wgpuShaderModuleRelease(shaderMod);
|
||||
wgpuPipelineLayoutRelease(pipelineLayout);
|
||||
wgpuBindGroupLayoutRelease(bgl);
|
||||
return 0;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Frame rendering
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
// Returns the surface texture for the current frame, or {.texture=nullptr} on
|
||||
// a skippable condition (timeout, occlusion) or an error.
|
||||
static WGPUSurfaceTexture getWindowSurface() {
|
||||
WGPUSurfaceTexture surfTex = {};
|
||||
wgpuSurfaceGetCurrentTexture(gSurface, &surfTex);
|
||||
if (surfTex.status == WGPUSurfaceGetCurrentTextureStatus_SuccessOptimal ||
|
||||
surfTex.status == WGPUSurfaceGetCurrentTextureStatus_SuccessSuboptimal)
|
||||
return surfTex;
|
||||
|
||||
// Timeout and Occluded are normal OS events (window covered / on a different Space).
|
||||
bool silent = surfTex.status == WGPUSurfaceGetCurrentTextureStatus_Timeout;
|
||||
#ifdef WGPU_H_
|
||||
silent = silent || surfTex.status == (WGPUSurfaceGetCurrentTextureStatus)WGPUSurfaceGetCurrentTextureStatus_Occluded;
|
||||
#endif
|
||||
if (!silent)
|
||||
fprintf(stderr, "Failed to get surface texture (status %d)\n", surfTex.status);
|
||||
if (surfTex.texture) wgpuTextureRelease(surfTex.texture);
|
||||
surfTex.texture = nullptr;
|
||||
return surfTex;
|
||||
}
|
||||
|
||||
static void renderFrame() {
|
||||
ZoneScoped;
|
||||
|
||||
// Update rotation angle
|
||||
float angle = (float)platformGetTime();
|
||||
wgpuQueueWriteBuffer(gQueue, gUniformBuf, 0, &angle, sizeof(float));
|
||||
|
||||
WGPUSurfaceTexture surfTex = getWindowSurface();
|
||||
if (!surfTex.texture) return;
|
||||
|
||||
WGPUTextureView view = wgpuTextureCreateView(surfTex.texture, nullptr);
|
||||
|
||||
// Command encoder
|
||||
WGPUCommandEncoder encoder = wgpuDeviceCreateCommandEncoder(gDevice, nullptr);
|
||||
|
||||
// Render pass
|
||||
WGPURenderPassColorAttachment colorAtt = {};
|
||||
colorAtt.view = view;
|
||||
colorAtt.loadOp = WGPULoadOp_Clear;
|
||||
colorAtt.storeOp = WGPUStoreOp_Store;
|
||||
colorAtt.clearValue = { 0.05, 0.05, 0.08, 1.0 };
|
||||
colorAtt.depthSlice = WGPU_DEPTH_SLICE_UNDEFINED;
|
||||
|
||||
WGPURenderPassDescriptor passDesc = {};
|
||||
passDesc.colorAttachmentCount = 1;
|
||||
passDesc.colorAttachments = &colorAtt;
|
||||
|
||||
{
|
||||
ZoneScopedN("render-pass");
|
||||
TracyWebGPUNamedZone(gTracyCtx, tracyZone, encoder, passDesc, "triangle draw", true);
|
||||
WGPURenderPassEncoder pass = wgpuCommandEncoderBeginRenderPass(encoder, &passDesc);
|
||||
wgpuRenderPassEncoderSetPipeline(pass, gPipeline);
|
||||
wgpuRenderPassEncoderSetBindGroup(pass, 0, gBindGroup, 0, nullptr);
|
||||
wgpuRenderPassEncoderDraw(pass, 3, 1, 0, 0);
|
||||
wgpuRenderPassEncoderEnd(pass);
|
||||
wgpuRenderPassEncoderRelease(pass);
|
||||
}
|
||||
|
||||
// Submit
|
||||
WGPUCommandBuffer cmdBuf = wgpuCommandEncoderFinish(encoder, nullptr);
|
||||
wgpuQueueSubmit(gQueue, 1, &cmdBuf);
|
||||
|
||||
// Present
|
||||
wgpuSurfacePresent(gSurface);
|
||||
|
||||
// Process Events
|
||||
wgpuInstanceProcessEvents(gInstance);
|
||||
TracyWebGPUCollect(gTracyCtx);
|
||||
|
||||
// Cleanup
|
||||
wgpuCommandBufferRelease(cmdBuf);
|
||||
wgpuCommandEncoderRelease(encoder);
|
||||
wgpuTextureViewRelease(view);
|
||||
wgpuTextureRelease(surfTex.texture);
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Shutdown
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
static void shutdown() {
|
||||
fprintf(stderr, "application is shutting down...\n");
|
||||
TracyWebGPUDestroy(gTracyCtx);
|
||||
if (gBindGroup) wgpuBindGroupRelease(gBindGroup);
|
||||
if (gUniformBuf) wgpuBufferRelease(gUniformBuf);
|
||||
if (gPipeline) wgpuRenderPipelineRelease(gPipeline);
|
||||
if (gQueue) wgpuQueueRelease(gQueue);
|
||||
if (gDevice) wgpuDeviceRelease(gDevice);
|
||||
if (gAdapter) wgpuAdapterRelease(gAdapter);
|
||||
if (gSurface) wgpuSurfaceRelease(gSurface);
|
||||
if (gInstance) wgpuInstanceRelease(gInstance);
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// main
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
int main(int argc, char* argv[]) {
|
||||
if (!platformInit(kWidth, kHeight, "WebGPU Spinning Triangle"))
|
||||
return 1;
|
||||
|
||||
gInstance = wgpuCreateInstance(nullptr);
|
||||
if (!gInstance) return error(2, "Failed to create WebGPU instance.");
|
||||
|
||||
gSurface = platformCreateSurface(gInstance);
|
||||
if (!gSurface) return error(3, "Failed to create surface.");
|
||||
|
||||
if (initWebGPU() != 0) return 4;
|
||||
|
||||
platformRunLoop(renderFrame, shutdown);
|
||||
return 0;
|
||||
}
|
||||
@@ -1,4 +1,4 @@
|
||||
// g++ identify.cpp -lpthread ../public/common/tracy_lz4.cpp ../zstd/common/*.c ../zstd/decompress/*.c ../zstd/decompress/huf_decompress_amd64.S
|
||||
// g++ identify.cpp -lpthread ../public/common/tracy_lz4.cpp -lzstd
|
||||
|
||||
#include <memory>
|
||||
#include <stdint.h>
|
||||
|
||||
@@ -1,26 +0,0 @@
|
||||
#!/bin/sh
|
||||
|
||||
rm -rf tracy-build
|
||||
mkdir tracy-build
|
||||
|
||||
./update-meson-version.sh
|
||||
|
||||
if [ ! -f vswhere.exe ]; then
|
||||
wget https://github.com/microsoft/vswhere/releases/download/2.8.4/vswhere.exe
|
||||
fi
|
||||
|
||||
MSVC=`./vswhere.exe -property installationPath -version '[17.0,17.999]' | head -n 1`
|
||||
MSVC=`wslpath "$MSVC" | tr -d '\r'`
|
||||
MSBUILD=$MSVC/MSBuild/Current/Bin/MSBuild.exe
|
||||
|
||||
for i in capture csvexport import-chrome update; do
|
||||
echo $i...
|
||||
"$MSBUILD" ../$i/build/win32/$i.sln /t:Clean /p:Configuration=Release /p:Platform=x64 /noconsolelogger /nologo -m
|
||||
"$MSBUILD" ../$i/build/win32/$i.sln /t:Build /p:Configuration=Release /p:Platform=x64 /noconsolelogger /nologo -m
|
||||
cp ../$i/build/win32/x64/Release/$i.exe tracy-build/
|
||||
done
|
||||
|
||||
echo profiler...
|
||||
"$MSBUILD" ../profiler/build/win32/Tracy.sln /t:Clean /p:Configuration=Release /p:Platform=x64 /noconsolelogger /nologo -m
|
||||
"$MSBUILD" ../profiler/build/win32/Tracy.sln /t:Build /p:Configuration=Release /p:Platform=x64 /noconsolelogger /nologo -m
|
||||
cp ../profiler/build/win32/x64/Release/Tracy.exe tracy-build/
|
||||
@@ -1,7 +1,7 @@
|
||||
# Tracy MCP eval guide
|
||||
|
||||
This document covers the bindings-layer detail that the curated catalog
|
||||
(`tracy://catalog`) and analysis guidance (`tracy://prompt`) do not.
|
||||
This document covers the bindings-layer detail that the analysis
|
||||
guidance (`tracy://prompt`) does not.
|
||||
|
||||
## ctx
|
||||
|
||||
@@ -21,6 +21,9 @@ data surface. Common entry points:
|
||||
- Threads: `get_threads()`, `get_thread_name(tid)`, `get_thread_context_switches(tid)`
|
||||
- Messages / plots / locks / memory / callstacks: `get_messages()`, `get_plots()`,
|
||||
`get_locks()`, `get_memory_events()`, `get_callstack_frames(...)`
|
||||
- Sections: `get_sections()` — timed code sections from
|
||||
`TracySectionEnter`/`TracySectionLeave` instrumentation. Returns a list of
|
||||
`{start, end, text}` dicts (start/end in ns).
|
||||
- Capture metadata: `get_capture_name()`, `get_capture_program()`,
|
||||
`get_first_time()`, `get_last_time()`, `get_resolution()`, `get_host_info()`
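
As a rough illustration of working with the `{start, end, text}` dicts described for `get_sections()`, the sketch below sums the time spent per section name. It does not call `ctx`; the `sample` list is a hypothetical stand-in for the returned data, and `total_ms_by_section` is an illustrative helper, not part of the bindings.

```python
from collections import defaultdict


def total_ms_by_section(sections):
    """Sum (end - start) per section text, converting ns to ms."""
    totals = defaultdict(float)
    for s in sections:
        totals[s["text"]] += (s["end"] - s["start"]) / 1e6
    return dict(totals)


# Hypothetical sample standing in for the list get_sections() is said to return.
sample = [
    {"start": 0, "end": 2_000_000, "text": "load"},
    {"start": 5_000_000, "end": 6_500_000, "text": "render"},
    {"start": 7_000_000, "end": 8_000_000, "text": "load"},
]

for name, ms in total_ms_by_section(sample).items():
    print(f"{ms:.2f}ms {name}")
```

In a real session the same helper could presumably be fed the result of `ctx.get_sections()` directly, assuming the dict keys match the description above.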
|
||||
|
||||
@@ -41,24 +44,23 @@ Run `print([m for m in dir(ctx) if not m.startswith('_')])` for the full list.
|
||||
- Source-location IDs from `get_all_zone_source_locations()` are the join key
|
||||
between zone-name lookups and per-callsite queries.
|
||||
|
||||
## Translating catalog entries to ctx Python
|
||||
## Common query patterns
|
||||
|
||||
The catalog (`tracy://catalog`) lists curated queries. Each maps to a small
|
||||
Python snippet:
|
||||
Small Python snippets for the queries you'll reach for most often:
|
||||
|
||||
```python
|
||||
# zone_list — top 10 hottest zones by total time
|
||||
# top 10 hottest zones by total time
|
||||
top = sorted(ctx.get_all_zone_stats().items(),
|
||||
key=lambda kv: kv[1].total, reverse=True)[:10]
|
||||
for k, v in top:
|
||||
print(f"{v.total/1e6:.2f}ms count={v.count} {k}")
|
||||
|
||||
# frame_list — primary frame set timing
|
||||
# primary frame set timing
|
||||
times = ctx.get_frame_times() # ns per frame
|
||||
print(f"frames={len(times)} avg={sum(times)/len(times)/1e6:.2f}ms "
|
||||
f"p99={sorted(times)[int(len(times)*0.99)]/1e6:.2f}ms")
|
||||
|
||||
# zone_stats for a named zone — find the srcloc id, then drill in
|
||||
# stats for a named zone — find the srcloc id, then drill in
|
||||
import re
|
||||
matches = [k for k in ctx.get_all_zone_stats() if k.startswith("MyFunc ")]
|
||||
sid = int(re.search(r"<(\d+)>$", matches[0]).group(1))
|
||||
|
||||
@@ -30,6 +30,7 @@ _HERE = os.path.dirname(os.path.abspath(__file__))
|
||||
_PORT_FILE = os.path.join(_HERE, "tracy_mcp.port")
|
||||
_PID_FILE = os.path.join(_HERE, "tracy_mcp.pid")
|
||||
_PREFERRED_PORT = int(os.environ.get("TRACY_MCP_PORT", "47380"))
|
||||
_TRANSPORT = os.environ.get("TRACY_MCP_TRANSPORT", "streamable-http").strip().lower()
|
||||
|
||||
# Shared documentation surfaces. system.prompt.md is Tracy Assist's source
|
||||
# system prompt; exposing it as an MCP resource keeps analysis guidance in
|
||||
@@ -258,8 +259,7 @@ def _prompt_resource() -> str:
|
||||
@mcp_server.resource("tracy://eval-guide")
|
||||
def _eval_guide_resource() -> str:
|
||||
"""Bindings-layer guide for the eval tool: ctx object model, time units,
|
||||
source-location ID semantics, and worked examples translating catalog
|
||||
entries into ctx Python."""
|
||||
source-location ID semantics, and worked examples of common ctx queries."""
|
||||
return _read_text(_EVAL_GUIDE_PATH)
|
||||
|
||||
|
||||
@@ -677,6 +677,13 @@ async def shutdown_server() -> str:
|
||||
if __name__ == "__main__":
|
||||
atexit.register(_cleanup_pid_files)
|
||||
|
||||
if _TRANSPORT not in ("sse", "streamable-http"):
|
||||
print(
|
||||
"TRACY_MCP_TRANSPORT must be 'sse' or 'streamable-http'.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
sys.exit(1)
|
||||
|
||||
running, existing_port = _is_our_server_running()
|
||||
if running:
|
||||
print(
|
||||
@@ -689,12 +696,17 @@ if __name__ == "__main__":
|
||||
port = _find_free_port()
|
||||
_write_pid_and_port(port)
|
||||
|
||||
print(f"Tracy MCP listening on http://127.0.0.1:{port}/sse", file=sys.stderr)
|
||||
path = (
|
||||
mcp_server.settings.sse_path
|
||||
if _TRANSPORT == "sse"
|
||||
else mcp_server.settings.streamable_http_path
|
||||
)
|
||||
print(f"Tracy MCP listening on http://127.0.0.1:{port}{path}", file=sys.stderr)
|
||||
|
||||
mcp_server.settings.host = "127.0.0.1"
|
||||
mcp_server.settings.port = port
|
||||
try:
|
||||
mcp_server.run(transport="sse")
|
||||
mcp_server.run(transport=_TRANSPORT)
|
||||
except KeyboardInterrupt:
|
||||
print("\nTracy MCP server stopped.", file=sys.stderr)
|
||||
sys.exit(0)
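
The transport selection in this hunk boils down to reading `TRACY_MCP_TRANSPORT`, normalizing it, and rejecting anything other than the two supported values. A minimal standalone sketch of that check follows; `resolve_transport` is a hypothetical helper that raises instead of calling `sys.exit`, so it can be exercised without starting the server.

```python
import os

VALID_TRANSPORTS = ("sse", "streamable-http")


def resolve_transport(environ=os.environ):
    """Return the normalized transport name, defaulting to streamable-http."""
    value = environ.get("TRACY_MCP_TRANSPORT", "streamable-http").strip().lower()
    if value not in VALID_TRANSPORTS:
        raise ValueError("TRACY_MCP_TRANSPORT must be 'sse' or 'streamable-http'.")
    return value


print(resolve_transport({}))                                 # default value
print(resolve_transport({"TRACY_MCP_TRANSPORT": " SSE "}))   # normalized
```

Passing a plain dict in place of `os.environ` keeps the check easy to test.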
|
||||
|
||||
1
manual/README
Normal file
@@ -0,0 +1 @@
|
||||
The LaTeX source file (tracy.tex) and the resulting PDF file (tracy.pdf) are the only authoritative versions of the user manual. Do NOT modify the Markdown user manual (tracy.md) by hand. It is only meant to be updated via the latex2md.sh script.
|
||||
35
manual/bclogo2quote.awk
Normal file
@@ -0,0 +1,35 @@
|
||||
/\\begin\{bclogo\}\[/ {
|
||||
in_bclogo = 1
|
||||
bclogo_type = ""
|
||||
next
|
||||
}
|
||||
in_bclogo && /logo=/ {
|
||||
if (/\\bcbombe/) bclogo_type = "bcbombe"
|
||||
else if (/\\bcattention/) bclogo_type = "bcattention"
|
||||
else if (/\\bclampe/) bclogo_type = "bclampe"
|
||||
else if (/\\bcquestion/) bclogo_type = "bcquestion"
|
||||
next
|
||||
}
|
||||
in_bclogo && /noborder|couleur/ {
|
||||
next
|
||||
}
|
||||
in_bclogo {
|
||||
line = $0
|
||||
sub(/^[ \t]*\]?\{/, "", line)
|
||||
sub(/\}.*$/, "", line)
|
||||
bclogo_title = line
|
||||
|
||||
if (bclogo_type == "bcbombe") prefix = "IMPORTANT"
|
||||
else if (bclogo_type == "bcattention") prefix = "CAUTION"
|
||||
else if (bclogo_type == "bclampe") prefix = "TIP"
|
||||
else prefix = "NOTE"
|
||||
|
||||
printf "\\begin{quote}\\textbf{%s:%s}\\par\n", prefix, bclogo_title
|
||||
in_bclogo = 0
|
||||
next
|
||||
}
|
||||
/\\end\{bclogo\}/ {
|
||||
printf "\\end{quote}\n"
|
||||
next
|
||||
}
|
||||
{ print }
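
To make the state machine in this awk script easier to follow, here is a line-for-line Python sketch of the same logic. The sample `bclogo` block is hypothetical and only assumes the option layout the awk patterns look for (one option per line, title on the closing `]{...}` line).

```python
import re

PREFIX = {"bcbombe": "IMPORTANT", "bcattention": "CAUTION", "bclampe": "TIP"}
LOGOS = ("bcbombe", "bcattention", "bclampe", "bcquestion")


def bclogo_to_quote(lines):
    out, in_bclogo, logo = [], False, ""
    for line in lines:
        if re.search(r"\\begin\{bclogo\}\[", line):
            in_bclogo, logo = True, ""        # start of the option list
            continue
        if in_bclogo and "logo=" in line:
            logo = next((n for n in LOGOS if "\\" + n in line), logo)
            continue
        if in_bclogo and re.search(r"noborder|couleur", line):
            continue                           # purely visual options are dropped
        if in_bclogo:
            # First remaining line carries the title: "]{Title}".
            title = re.sub(r"^[ \t]*\]?\{", "", line, count=1)
            title = re.sub(r"\}.*$", "", title, count=1)
            out.append("\\begin{quote}\\textbf{%s:%s}\\par"
                       % (PREFIX.get(logo, "NOTE"), title))
            in_bclogo = False
            continue
        if re.search(r"\\end\{bclogo\}", line):
            out.append("\\end{quote}")
            continue
        out.append(line)
    return out


sample = [
    r"\begin{bclogo}[",
    r"noborder=true,",
    r"couleur=black!5,",
    r"logo=\bclampe",
    r"]{Hint}",
    r"Body text.",
    r"\end{bclogo}",
]
print("\n".join(bclogo_to_quote(sample)))
```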
|
||||
64
manual/fa-icons.py
Normal file
@@ -0,0 +1,64 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Replace Font Awesome icon macros in LaTeX with Unicode codepoints."""
|
||||
|
||||
import re
|
||||
import sys
|
||||
|
||||
def pascal_to_snake(name):
|
||||
"""Convert PascalCase to UPPER_SNAKE_CASE."""
|
||||
result = name[0]
|
||||
for i in range(1, len(name)):
|
||||
if name[i].isupper() and name[i - 1].islower():
|
||||
result += '_'
|
||||
result += name[i]
|
||||
return result.upper()
|
||||
|
||||
def main():
|
||||
if len(sys.argv) < 3:
|
||||
print(f"Usage: {sys.argv[0]} <header_path> <tex_path>", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
header_path = sys.argv[1]
|
||||
tex_path = sys.argv[2]
|
||||
|
||||
# Parse header: ICON_FA_SNAKE_CASE -> Unicode char
|
||||
icons = {}
|
||||
with open(header_path) as f:
|
||||
for line in f:
|
||||
m = re.match(
|
||||
r'#define\s+ICON_FA_(\w+)\s+.*?//\s*(U\+([0-9a-fA-F]+))', line
|
||||
)
|
||||
if m:
|
||||
snake = m.group(1)
|
||||
parts = snake.split('_')
|
||||
pascal = ''.join(p.capitalize() for p in parts)
|
||||
codepoint = int(m.group(3), 16)
|
||||
icons[pascal] = chr(codepoint)
|
||||
|
||||
# Read tex file
|
||||
with open(tex_path) as f:
|
||||
text = f.read()
|
||||
|
||||
# Find all \faXxx used in the text (uppercase first letter excludes \fancyhead etc.)
|
||||
used = set()
|
||||
for m in re.finditer(r'\\fa([A-Z][a-zA-Z0-9]*)', text):
|
||||
used.add(m.group(1))
|
||||
|
||||
# Replace each used icon, longest names first to avoid prefix conflicts
|
||||
for name in sorted(used, key=lambda n: (-len(n), n)):
|
||||
if name not in icons:
|
||||
print(f"Warning: \\fa{name} not found in header", file=sys.stderr)
|
||||
continue
|
||||
char = icons[name]
|
||||
# Order matters: more specific patterns first
|
||||
text = text.replace(f'\\fa{name}{{}}~', f'{char} ')
|
||||
text = text.replace(f'\\fa{name}{{}}', char)
|
||||
text = text.replace(f'\\fa{name}~', f'{char} ')
|
||||
text = text.replace(f'\\fa{name}', char)
|
||||
|
||||
# Write back
|
||||
with open(tex_path, 'w') as f:
|
||||
f.write(text)
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
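
A small in-memory sketch of what this script does may help when reviewing it: parse `#define ICON_FA_...` lines into a PascalCase-to-character map, then substitute `\faXxx` macros. The two header lines below are hypothetical samples written in the layout the regular expression expects; `parse_icons` and `replace_icons` are illustrative names that operate on strings instead of files.

```python
import re

DEFINE_RE = re.compile(r'#define\s+ICON_FA_(\w+)\s+.*?//\s*(U\+([0-9a-fA-F]+))')


def parse_icons(header_lines):
    """Map PascalCase icon names to their Unicode characters."""
    icons = {}
    for line in header_lines:
        m = DEFINE_RE.match(line)
        if m:
            pascal = ''.join(p.capitalize() for p in m.group(1).split('_'))
            icons[pascal] = chr(int(m.group(3), 16))
    return icons


def replace_icons(text, icons):
    """Replace \\faXxx macros, longest names first, most specific form first."""
    used = set(re.findall(r'\\fa([A-Z][a-zA-Z0-9]*)', text))
    for name in sorted(used, key=lambda n: (-len(n), n)):
        if name not in icons:
            continue
        char = icons[name]
        for pattern, repl in ((f'\\fa{name}{{}}~', f'{char} '),
                              (f'\\fa{name}{{}}', char),
                              (f'\\fa{name}~', f'{char} '),
                              (f'\\fa{name}', char)):
            text = text.replace(pattern, repl)
    return text


# Hypothetical header lines in the expected "#define ... // U+xxxx" layout.
header = [
    r'#define ICON_FA_GEAR "\xef\x80\x93" // U+f013',
    r'#define ICON_FA_ADDRESS_BOOK "\xef\x8a\xb9" // U+f2b9',
]
icons = parse_icons(header)
print(sorted(icons))
print(replace_icons(r'Click \faGear{}~Settings, see \fancyhead.', icons))
```

Because the macro pattern requires an uppercase letter after `\fa`, `\fancyhead` is left untouched.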
|
||||
@@ -3,3 +3,151 @@ function Link(el)
|
||||
el.attributes['reference'] = nil
|
||||
return el
|
||||
end
|
||||
|
||||
-- Drop Div wrappers (e.g. table/titlepage containers), keeping their content.
|
||||
function Div(el)
|
||||
return el.content
|
||||
end
|
||||
|
||||
-- ---------------------------------------------------------------------------
|
||||
-- LaTeX math -> plain-text approximation.
|
||||
--
|
||||
-- The target Markdown renderer has no math support, so a raw "$\frac{1}{2}$"
|
||||
-- would show verbatim. We turn each math node into the closest Unicode/ASCII
|
||||
-- equivalent: fractions become "a/b", \times becomes "x", super/subscripts use
|
||||
-- Unicode digits, and the one multi-line display equation becomes a fenced
|
||||
-- code block (Markdown collapses plain newlines, a code block keeps them).
|
||||
-- ---------------------------------------------------------------------------
|
||||
|
||||
local sup = {['0']='⁰',['1']='¹',['2']='²',['3']='³',['4']='⁴',['5']='⁵',
|
||||
['6']='⁶',['7']='⁷',['8']='⁸',['9']='⁹',['+']='⁺',['-']='⁻',
|
||||
['=']='⁼',['(']='⁽',[')']='⁾'}
|
||||
local sub = {['0']='₀',['1']='₁',['2']='₂',['3']='₃',['4']='₄',['5']='₅',
|
||||
['6']='₆',['7']='₇',['8']='₈',['9']='₉',['+']='₊',['-']='₋',
|
||||
['=']='₌',['(']='₍',[')']='₎'}
|
||||
|
||||
-- Symbol replacements, applied as literal substitutions. Longer commands must
|
||||
-- precede those that are a prefix of them (e.g. \rightarrow before \right).
|
||||
local symbols = {
|
||||
{'\\leftrightarrow','↔'}, {'\\rightarrow','→'}, {'\\leftarrow','←'},
|
||||
{'\\Rightarrow','⇒'}, {'\\Leftarrow','⇐'}, {'\\to','→'}, {'\\mapsto','↦'},
|
||||
{'\\times','×'}, {'\\cdot','·'}, {'\\div','÷'}, {'\\ast','*'}, {'\\star','*'},
|
||||
{'\\leq','≤'}, {'\\geq','≥'}, {'\\neq','≠'}, {'\\approx','≈'}, {'\\equiv','≡'},
|
||||
{'\\ll','«'}, {'\\gg','»'}, {'\\le','≤'}, {'\\ge','≥'},
|
||||
{'\\ldots','…'}, {'\\cdots','…'}, {'\\dots','…'}, {'\\infty','∞'},
|
||||
{'\\pm','±'}, {'\\mp','∓'}, {'\\propto','∝'}, {'\\sum','Σ'}, {'\\prod','Π'},
|
||||
{'\\alpha','α'}, {'\\beta','β'}, {'\\gamma','γ'}, {'\\delta','δ'}, {'\\Delta','Δ'},
|
||||
{'\\mu','µ'}, {'\\sigma','σ'}, {'\\pi','π'}, {'\\lambda','λ'}, {'\\theta','θ'},
|
||||
{'\\left',''}, {'\\right',''},
|
||||
{'\\qquad',' '}, {'\\quad',' '}, {'\\,',' '}, {'\\;',' '}, {'\\:',' '},
|
||||
{'\\ ',' '}, {'\\!',''},
|
||||
{'\\%','%'}, {'\\#','#'}, {'\\&','&'}, {'\\_','_'}, {'\\{','{'}, {'\\}','}'},
|
||||
{'\\$','$'},
|
||||
}
|
||||
|
||||
-- Literal (non-pattern) string replacement; avoids Lua pattern magic in keys.
|
||||
local function lit_replace(s, a, b)
|
||||
local out, i = {}, 1
|
||||
while true do
|
||||
local p = s:find(a, i, true)
|
||||
if not p then out[#out + 1] = s:sub(i); break end
|
||||
out[#out + 1] = s:sub(i, p - 1)
|
||||
out[#out + 1] = b
|
||||
i = p + #a
|
||||
end
|
||||
return table.concat(out)
|
||||
end
|
||||
|
||||
-- Strip the outer braces of a "%b{}" capture.
|
||||
local function grp(b) return b:sub(2, #b - 1) end
|
||||
|
||||
-- Map a string to Unicode super/subscript, or nil if any char is unsupported.
|
||||
local function map_script(txt, map)
|
||||
local res = {}
|
||||
for i = 1, #txt do
|
||||
local c = txt:sub(i, i)
|
||||
if not map[c] then return nil end
|
||||
res[#res + 1] = map[c]
|
||||
end
|
||||
return table.concat(res)
|
||||
end
|
||||
|
||||
local function convert(s)
|
||||
-- Text/font wrappers: keep the content, recurse to handle nesting.
|
||||
for _, cmd in ipairs({'text', 'mathrm', 'mathit', 'mathbf', 'mathbb',
|
||||
'mathsf', 'mathtt', 'mathcal', 'operatorname',
|
||||
'textbf', 'textit', 'textrm'}) do
|
||||
s = s:gsub('\\' .. cmd .. '(%b{})', function(b) return convert(grp(b)) end)
|
||||
end
|
||||
-- Fractions -> "num/den" (spaced when either side has spaces).
|
||||
local function frac(a, b)
|
||||
local n, d = convert(grp(a)), convert(grp(b))
|
||||
local sep = (n:find(' ', 1, true) or d:find(' ', 1, true)) and ' / ' or '/'
|
||||
return n .. sep .. d
|
||||
end
|
||||
s = s:gsub('\\frac(%b{})(%b{})', frac)
|
||||
s = s:gsub('\\dfrac(%b{})(%b{})', frac)
|
||||
s = s:gsub('\\tfrac(%b{})(%b{})', frac)
|
||||
s = s:gsub('\\sfrac(%b{})(%b{})', frac)
|
||||
-- Roots.
|
||||
s = s:gsub('\\sqrt(%b{})', function(b) return '√(' .. convert(grp(b)) .. ')' end)
|
||||
-- Single-char scripts first, so the braced fallback (e.g. "_native") below
|
||||
-- is not re-scanned and mangled into Unicode subscripts.
|
||||
s = s:gsub('%^([%w])', function(c) return sup[c] or ('^' .. c) end)
|
||||
s = s:gsub('_([%w])', function(c) return sub[c] or ('_' .. c) end)
|
||||
-- Braced scripts: Unicode when the content is all digits/signs, else keep
|
||||
-- a readable "^(...)" / "_..." form.
|
||||
s = s:gsub('%^(%b{})', function(b)
|
||||
local inner = convert(grp(b))
|
||||
return map_script(inner, sup) or ('^(' .. inner .. ')')
|
||||
end)
|
||||
s = s:gsub('_(%b{})', function(b)
|
||||
local inner = convert(grp(b))
|
||||
return map_script(inner, sub) or ('_' .. inner)
|
||||
end)
|
||||
-- Remaining symbols.
|
||||
for _, pair in ipairs(symbols) do s = lit_replace(s, pair[1], pair[2]) end
|
||||
return s
|
||||
end
|
||||
|
||||
-- Convert a display equation, preserving its line structure for a code block.
|
||||
local function convert_display(s)
|
||||
s = convert(s)
|
||||
for _, env in ipairs({'cases', 'aligned', 'align', 'array', 'matrix',
|
||||
'gathered', 'split'}) do
|
||||
s = lit_replace(s, '\\begin{' .. env .. '}', '')
|
||||
s = lit_replace(s, '\\end{' .. env .. '}', '')
|
||||
end
|
||||
s = lit_replace(s, '\\\\', '\n') -- row break
|
||||
s = s:gsub('%s*&%s*', ' ') -- column separator -> spacing
|
||||
local lines = {}
|
||||
for line in (s .. '\n'):gmatch('(.-)\n') do
|
||||
line = line:gsub('^%s+', ''):gsub('%s+$', '')
|
||||
if line ~= '' then lines[#lines + 1] = line end
|
||||
end
|
||||
for i = 2, #lines do lines[i] = ' ' .. lines[i] end -- indent continuations
|
||||
return table.concat(lines, '\n')
|
||||
end
|
||||
|
||||
function Math(el)
|
||||
if el.mathtype == 'DisplayMath' then
|
||||
return el -- handled at block level by Para, to emit a code block
|
||||
end
|
||||
return pandoc.Str(convert(el.text))
|
||||
end
|
||||
|
||||
-- A paragraph that is solely a display equation becomes a fenced code block.
|
||||
function Para(el)
|
||||
local maths, only_math = {}, true
|
||||
for _, x in ipairs(el.content) do
|
||||
if x.t == 'Math' and x.mathtype == 'DisplayMath' then
|
||||
maths[#maths + 1] = x
|
||||
elseif x.t ~= 'Space' and x.t ~= 'SoftBreak' and x.t ~= 'LineBreak' then
|
||||
only_math = false
|
||||
end
|
||||
end
|
||||
if #maths == 0 or not only_math then return nil end
|
||||
local parts = {}
|
||||
for _, m in ipairs(maths) do parts[#parts + 1] = convert_display(m.text) end
|
||||
return pandoc.CodeBlock(table.concat(parts, '\n\n'))
|
||||
end
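
The braced-script rule in `convert` (Unicode when every character maps, a readable fallback otherwise) can be sketched in Python as follows. This mirrors only the superscript branch; `superscript` is an illustrative name, and the table copies the `sup` map from the filter.

```python
SUP = dict(zip("0123456789+-=()", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾"))


def map_script(txt, table):
    """Map every character through table, or return None if any is unsupported."""
    out = []
    for ch in txt:
        if ch not in table:
            return None
        out.append(table[ch])
    return "".join(out)


def superscript(inner):
    """Unicode superscript when possible, else a readable "^(...)" form."""
    mapped = map_script(inner, SUP)
    return mapped if mapped is not None else "^(" + inner + ")"


print(superscript("10"))
print(superscript("-3"))
print(superscript("2n"))
```

The all-or-nothing mapping avoids mixed output such as a Unicode digit followed by an ASCII letter.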
|
||||
|
||||
77
manual/icon-explain.py
Normal file
@@ -0,0 +1,77 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Append icon legend blocks to each markdown section containing Font Awesome icons."""
|
||||
|
||||
import re
|
||||
import sys
|
||||
|
||||
|
||||
def _extract_icons(lines):
|
||||
"""Return deduplicated icon chars from lines, in order of first appearance."""
|
||||
seen = set()
|
||||
icons = []
|
||||
for line in lines:
|
||||
for ch in line:
|
||||
cp = ord(ch)
|
||||
if 0xE000 <= cp <= 0xF8FF and ch not in seen:
|
||||
seen.add(ch)
|
||||
icons.append(ch)
|
||||
return icons
|
||||
|
||||
|
||||
def _append_legend(result_lines, icons, icon_names):
|
||||
"""Append a legend block for the given icons."""
|
||||
result_lines.append('')
|
||||
result_lines.append('-----')
|
||||
result_lines.append('')
|
||||
for ch in icons:
|
||||
name = icon_names.get(ch, f'Unknown(U+{ord(ch):04X})')
|
||||
result_lines.append(f'{ch} - {name} icon')
|
||||
result_lines.append('')
|
||||
|
||||
|
||||
def main():
|
||||
if len(sys.argv) < 3:
|
||||
print(f"Usage: {sys.argv[0]} <header_path> <md_path>", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
header_path = sys.argv[1]
|
||||
md_path = sys.argv[2]
|
||||
|
||||
# Build char -> name mapping from header
|
||||
icon_names = {}
|
||||
with open(header_path) as f:
|
||||
for line in f:
|
||||
m = re.match(
|
||||
r'#define\s+ICON_FA_(\w+)\s+.*?//\s*(U\+([0-9a-fA-F]+))', line
|
||||
)
|
||||
if m:
|
||||
snake = m.group(1)
|
||||
parts = snake.split('_')
|
||||
pascal = ' '.join(p.capitalize() for p in parts)
|
||||
codepoint = int(m.group(3), 16)
|
||||
icon_names[chr(codepoint)] = pascal
|
||||
|
||||
with open(md_path, encoding='utf-8') as f:
|
||||
lines = f.read().split('\n')
|
||||
|
||||
# Build chunk boundaries: header lines and EOF
|
||||
chunk_starts = [i for i, line in enumerate(lines) if line.startswith('#')]
|
||||
|
||||
# Also add index 0 as a chunk start if there's pre-header content or no header at all
|
||||
if not chunk_starts or chunk_starts[0] > 0:
|
||||
chunk_starts.insert(0, 0)
|
||||
|
||||
result_lines = []
|
||||
for ci, start in enumerate(chunk_starts):
|
||||
end = chunk_starts[ci + 1] if ci + 1 < len(chunk_starts) else len(lines)
|
||||
icons = _extract_icons(lines[start:end])
|
||||
result_lines.extend(lines[start:end])
|
||||
if icons:
|
||||
_append_legend(result_lines, icons, icon_names)
|
||||
|
||||
with open(md_path, 'w', encoding='utf-8') as f:
|
||||
f.write('\n'.join(result_lines))
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
@@ -7,20 +7,45 @@ sed -i -e 's@\\ctrl@Ctrl@g' _tmp.tex
|
||||
sed -i -e 's@\\shift@Shift@g' _tmp.tex
|
||||
sed -i -e 's@\\Alt@Alt@g' _tmp.tex
|
||||
sed -i -e 's@\\del@Delete@g' _tmp.tex
|
||||
sed -i -e 's@\\fa\([a-zA-Z]*\)@(\1~icon)@g' _tmp.tex
|
||||
python3 fa-icons.py ../profiler/src/profiler/IconsFontAwesome7.h _tmp.tex
|
||||
sed -i -e 's@\\LMB{}~@@g' _tmp.tex
|
||||
sed -i -e 's@\\MMB{}~@@g' _tmp.tex
|
||||
sed -i -e 's@\\RMB{}~@@g' _tmp.tex
|
||||
sed -i -e 's@\\Scroll{}~@@g' _tmp.tex
|
||||
|
||||
# Resolve \circled{} markers and lstlisting escapeinside (@...@) snippets, which
|
||||
# pandoc would otherwise emit verbatim or drop, to their plain-text equivalents.
|
||||
sed -i -e 's|@\\circled{a}@|(a)|g' -e 's|@\\circled{b}@|(b)|g' -e 's|@\\circled{c}@|(c)|g' _tmp.tex
|
||||
sed -i -e 's|\\circled{a}|(a)|g' -e 's|\\circled{b}|(b)|g' -e 's|\\circled{c}|(c)|g' _tmp.tex
|
||||
sed -i -e 's|@\\ldots@|…|g' _tmp.tex
|
||||
|
||||
sed -i -e 's@\\nameref{quicklook}@A quick look at Tracy Profiler@g' _tmp.tex
|
||||
sed -i -e 's@\\nameref{firststeps}@First steps@g' _tmp.tex
|
||||
sed -i -e 's@\\nameref{client}@Client markup@g' _tmp.tex
|
||||
sed -i -e 's@\\nameref{capturing}@Capturing the data@g' _tmp.tex
|
||||
sed -i -e 's@\\nameref{analyzingdata}@Analyzing captured data@g' _tmp.tex
|
||||
sed -i -e 's@\\nameref{tracyassist}@Tracy Assist@g' _tmp.tex
|
||||
sed -i -e 's@\\nameref{csvexport}@Exporting zone statistics to CSV@g' _tmp.tex
|
||||
sed -i -e 's@\\nameref{importingdata}@Importing external profiling data@g' _tmp.tex
|
||||
sed -i -e 's@\\nameref{configurationfiles}@Configuration files@g' _tmp.tex
|
||||
|
||||
pandoc --wrap=none --reference-location=block --number-sections -L filter.lua -s _tmp.tex -o tracy.md
|
||||
awk -f bclogo2quote.awk _tmp.tex > _tmp_quoted.tex
|
||||
mv _tmp_quoted.tex _tmp.tex
|
||||
|
||||
pandoc --wrap=none --reference-location=block --number-sections -L filter.lua -t 'markdown-simple_tables-multiline_tables-grid_tables+pipe_tables' -s _tmp.tex -o tracy.md
|
||||
|
||||
awk -f tablecaption.awk tracy.md > _tmp_caption.md
|
||||
mv _tmp_caption.md tracy.md
|
||||
|
||||
sed -i -e 's/^> \*\*IMPORTANT:\([^*]*\)\*\*/> [!IMPORTANT]\
|
||||
> **\1**/' tracy.md
|
||||
sed -i -e 's/^> \*\*TIP:\([^*]*\)\*\*/> [!TIP]\
|
||||
> **\1**/' tracy.md
|
||||
sed -i -e 's/^> \*\*CAUTION:\([^*]*\)\*\*/> [!CAUTION]\
|
||||
> **\1**/' tracy.md
|
||||
sed -i -e 's/^> \*\*NOTE:\([^*]*\)\*\*/> [!NOTE]\
|
||||
> **\1**/' tracy.md
|
||||
|
||||
python3 icon-explain.py ../profiler/src/profiler/IconsFontAwesome7.h tracy.md
|
||||
|
||||
rm -f _tmp.tex
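
The four `sed` callout rewrites above share one shape: a quote line starting with a bold `PREFIX:title` becomes a GitHub alert marker followed by the bold title. A hedged Python sketch of the same transformation, using a single regular expression (`to_github_alerts` is an illustrative name, not part of the script):

```python
import re

CALLOUT_RE = re.compile(
    r"^> \*\*(IMPORTANT|TIP|CAUTION|NOTE):([^*]*)\*\*", re.MULTILINE
)


def to_github_alerts(markdown):
    """Turn '> **TIP:Title**' into '> [!TIP]' plus '> **Title**' on the next line."""
    return CALLOUT_RE.sub(r"> [!\1]\n> **\2**", markdown)


print(to_github_alerts("> **TIP:Use filters**\n> Body text."))
```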
|
||||
|
||||
16
manual/tablecaption.awk
Normal file
@@ -0,0 +1,16 @@
|
||||
# Pandoc emits table captions as a line beginning with ": ", which GitHub
|
||||
# renders literally instead of as a caption. Strip the marker and italicize
|
||||
# the caption instead. Captions may span several physical lines when they
|
||||
# contain a hard line break (a trailing backslash). Underscores are used for
|
||||
# the emphasis so captions that already contain "*...*" markup are left intact.
|
||||
!incap && /^: / {
|
||||
incap = 1
|
||||
$0 = "_" substr($0, 3)
|
||||
}
|
||||
incap && !/\\$/ {
|
||||
print $0 "_"
|
||||
incap = 0
|
||||
next
|
||||
}
|
||||
incap { print; next }
|
||||
{ print }
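
For readers less familiar with awk, the caption handling above can be sketched in Python. The function below follows the same three rules: open a caption on a line starting with `: `, keep passing lines through while they end in a backslash, and close the emphasis on the first line that does not. `italicize_captions` is an illustrative name.

```python
def italicize_captions(lines):
    """Replace pandoc ': caption' markers with _caption_ emphasis."""
    out = []
    in_caption = False
    for line in lines:
        if not in_caption and line.startswith(": "):
            in_caption = True
            line = "_" + line[2:]          # strip the ": " marker
        if in_caption and not line.endswith("\\"):
            out.append(line + "_")         # last physical line of the caption
            in_caption = False
            continue
        out.append(line)                   # continuation line or ordinary text
    return out


print(italicize_captions([": Single-line caption", "| a | b |"]))
print(italicize_captions([": first part\\", "second part"]))
```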
|
||||
2483
manual/tracy.md
430
manual/tracy.tex
@@ -14,7 +14,7 @@
|
||||
\usepackage{verbatim}
|
||||
\usepackage[hyphens]{url}
|
||||
\usepackage{hyperref} % For hyperlinks in the PDF
|
||||
\usepackage{fontawesome6}
|
||||
\usepackage{fontawesome7}
|
||||
\usepackage[os=win]{menukeys}
|
||||
\usepackage{xfrac}
|
||||
\usepackage[euler]{textgreek}
|
||||
@@ -106,6 +106,7 @@ Hello and welcome to the Tracy Profiler user manual! Here you will find all the
|
||||
\item Chapter~\ref{client}, \emph{\nameref{client}}, provides information on how to instrument your application, in order to retrieve useful profiling data. This includes a description of the C API (section~\ref{capi}), which enables usage of Tracy in any programming language.
|
||||
\item Chapter~\ref{capturing}, \emph{\nameref{capturing}}, goes into more detail on how the profiling information can be captured and stored on disk.
|
||||
\item Chapter~\ref{analyzingdata}, \emph{\nameref{analyzingdata}}, guides you through the graphical user interface of the profiler.
|
||||
\item Chapter~\ref{tracyassist}, \emph{\nameref{tracyassist}}, describes how to use the built-in AI assistant.
|
||||
\item Chapter~\ref{csvexport}, \emph{\nameref{csvexport}}, explains how to export some zone timing statistics into a CSV format.
|
||||
\item Chapter~\ref{importingdata}, \emph{\nameref{importingdata}}, documents how to import data from other profilers.
|
||||
\item Chapter~\ref{configurationfiles}, \emph{\nameref{configurationfiles}}, gives information on the profiler settings.
|
||||
@@ -140,7 +141,7 @@ There's much more Tracy can do, which can be explored by carefully reading this
|
||||
\section{A quick look at Tracy Profiler}
|
||||
\label{quicklook}
|
||||
|
||||
Tracy is a real-time, nanosecond resolution \emph{hybrid frame and sampling profiler} that you can use for remote or embedded telemetry of games and other applications. It can profile CPU\footnote{Direct support is provided for C, C++, Lua, Python and Fortran integration. At the same time, third-party bindings to many other languages exist on the internet, such as Rust, Zig, C\#, OCaml, Odin, etc.}, GPU\footnote{All major graphic APIs: OpenGL, Vulkan, Direct3D 11/12, Metal, OpenCL.}, memory allocations, locks, context switches, automatically attribute screenshots to captured frames, and much more.
|
||||
Tracy is a real-time, nanosecond resolution \emph{hybrid frame and sampling profiler} that you can use for remote or embedded telemetry of games and other applications. It can profile CPU\footnote{Direct support is provided for C, C++, Lua, Python and Fortran integration. At the same time, third-party bindings to many other languages exist on the internet, such as Rust, Zig, C\#, OCaml, Odin, etc.}, GPU\footnote{All major graphics/compute APIs: OpenGL, Vulkan, Direct3D 11/12, Metal, OpenCL, CUDA, WebGPU.}, memory allocations, locks, context switches, automatically attribute screenshots to captured frames, and much more.
|
||||
|
||||
While Tracy can perform statistical analysis of sampled call stack data, just like other \emph{statistical profilers} (such as VTune, perf, or Very Sleepy), it mainly focuses on manual markup of the source code. Such markup allows frame-by-frame inspection of the program execution. For example, you will be able to see exactly which functions are called, how much time they require, and how they interact with each other in a multi-threaded environment. In contrast, the statistical analysis may show you the hot spots in your code, but it cannot accurately pinpoint the underlying cause for semi-random frame stutter that may occur every couple of seconds.
|
||||
|
||||
@@ -227,7 +228,7 @@ Tracy aims to give you an understanding of the inner workings of a tight loop of
|
||||
|
||||
\subsection{Sampling profiler}
|
||||
|
||||
Tracy can periodically sample what the profiled application is doing, which provides detailed performance information at the source line/assembly instruction level. This can give you a deep understanding of how the processor executes the program. Using this information, you can get a coarse view at the call stacks, fine-tune your algorithms, or even 'steal' an optimization performed by one compiler and make it available for the others.
|
||||
Tracy can periodically sample what the profiled application is doing, which provides detailed performance information at the source line/assembly instruction level. This can give you a deep understanding of how the processor executes the program. Using this information, you can get a coarse view of the call stacks, fine-tune your algorithms, or even \enquote{steal} an optimization performed by one compiler and make it available for the others.
|
||||
|
||||
On some platforms, it is possible to sample the hardware performance counters, which will give you information not only \emph{where} your program is running slowly, but also \emph{why}.
|
||||
|
||||
@@ -368,7 +369,7 @@ Note that these binary releases require AVX2 instruction set support on the proc
|
||||
Tracy Profiler supports MSVC, GCC, and clang. You will need to use a reasonably recent version of the compiler due to the C++11 requirement. The following platforms are confirmed to be working (this is not a complete list):
|
||||
|
||||
\begin{itemize}
|
||||
\item Windows (x86, x64, ARM64\footnote{Requires \textbf{"OpenCL, OpenGL, and Vulkan Compatibility Pack"} from Microsoft Store.})
|
||||
\item Windows (x86, x64, ARM64\footnote{Requires \textbf{\enquote{OpenCL, OpenGL, and Vulkan Compatibility Pack}} from Microsoft Store.})
|
||||
\item Linux (x86, x64, ARM, ARM64)
|
||||
\item Android (ARM, ARM64, x86)
|
||||
\item FreeBSD (x64)
|
||||
@@ -593,7 +594,7 @@ In the case of some programming environments, you may need to take extra steps t
|
||||
|
||||
If you are using MSVC, you will need to disable the \emph{Edit And Continue} feature, as it makes the compiler non-conformant to some aspects of the C++ standard. In order to do so, open the project properties and go to \menu[,]{C/C++,General,Debug Information Format} and make sure \emph{Program Database for Edit And Continue (/ZI)} is \emph{not} selected.
|
||||
|
||||
For context, if you experience errors like "error C2131: expression did not evaluate to a constant", "failure was caused by non-constant arguments or reference to a non-constant symbol", and "see usage of '\texttt{\_\_LINE\_\_Var}'", chances are that your project has the \emph{Edit And Continue} feature enabled.
|
||||
For context, if you experience errors like \enquote{error C2131: expression did not evaluate to a constant}, \enquote{failure was caused by non-constant arguments or reference to a non-constant symbol}, and \enquote{see usage of \enquote{\texttt{\_\_LINE\_\_Var}}}, chances are that your project has the \emph{Edit And Continue} feature enabled.
|
||||
|
||||
\paragraph{Universal Windows Platform}
|
||||
|
||||
@@ -669,6 +670,40 @@ Although the basic features will work without them, you'll have to grant elevate
|
||||
\item \texttt{-{}-pid=host}
|
||||
\end{itemize}
|
||||
|
||||
\subsubsection{Porting to unsupported platforms}
|
||||
\label{customplatform}
|
||||
|
||||
When Tracy is built for a platform that is not among the supported set, some of the platform-specific code paths it relies on may fail to compile or have undesired behavior. Rather than patching the \texttt{\#if} chains in Tracy itself, you can point Tracy at a \emph{platform header} that provides your own implementations of these primitives.
|
||||
|
||||
Define \texttt{TRACY\_PLATFORM\_HEADER} at build time to the path of a header that Tracy will include from its internal translation units:
|
||||
|
||||
\begin{lstlisting}
|
||||
-DTRACY_PLATFORM_HEADER="\"my_platform.h\""
|
||||
\end{lstlisting}
|
||||
|
||||
Inside that header, enable any subset of the hooks you need by defining the corresponding \texttt{TRACY\_HAS\_CUSTOM\_*} macro and declaring the matching \texttt{tracy::Platform*} function. Provide the implementations in a separate translation unit that is linked into your final binary.
|
||||
|
||||
The available hooks are:
|
||||
|
||||
\begin{itemize}
|
||||
\item \texttt{TRACY\_HAS\_CUSTOM\_THREAD\_ID} $\rightarrow$ \texttt{tracy::PlatformGetThreadId()}. Required.
|
||||
\item \texttt{TRACY\_HAS\_CUSTOM\_USER\_INFO} $\rightarrow$ \texttt{tracy::PlatformGetHostname()}, \texttt{tracy::PlatformGetUserLogin()}, \texttt{tracy::PlatformGetUserFullName()}.
|
||||
\item \texttt{TRACY\_HAS\_CUSTOM\_SAFE\_COPY} $\rightarrow$ \texttt{tracy::PlatformSafeMemcpy()}.
|
||||
\item \texttt{TRACY\_HAS\_CUSTOM\_ALLOCATOR} $\rightarrow$ \texttt{tracy::PlatformMalloc()}, \texttt{tracy::PlatformFree()}, \texttt{tracy::PlatformRealloc()}, \texttt{tracy::PlatformAllocatorInit()}, \texttt{tracy::PlatformAllocatorThreadInit()}, \texttt{tracy::PlatformAllocatorFinalize()}, \texttt{tracy::PlatformAllocatorThreadFinalize()}.
|
||||
\end{itemize}
|
||||
|
||||
Template files are provided in the repository (\texttt{examples/CustomPlatform/CustomPlatform(.h|.cpp)}). See \texttt{CustomPlatform.h} for the contract each \texttt{Platform*} function must satisfy (return values, threading guarantees, and footguns to avoid). Copy these files into your project, fill in the bodies for the hooks you enable, and point Tracy at the header.
|
||||
|
||||
These are the only categories currently exposed through the custom-platform mechanism. Other platform-specific subsystems (call stack collection, context switch capture, crash handling, system tracing, and so on) are not pluggable this way. If your platform cannot support one of them, disable it at build time using the corresponding \texttt{TRACY\_NO\_*} macros rather than trying to stub it out via the platform header.
|
||||
|
||||
\begin{bclogo}[
|
||||
noborder=true,
|
||||
couleur=black!5,
|
||||
logo=\bcbombe
|
||||
]{Important}
|
||||
The platform header is intended only for the \texttt{TRACY\_HAS\_CUSTOM\_*} hooks and their matching function declarations. Do not use it to set unrelated \texttt{TRACY\_*} options (such as \texttt{TRACY\_ENABLE} or \texttt{TRACY\_ON\_DEMAND}). Some of those are checked in \texttt{Tracy.hpp} before the platform header is included, so the results would be inconsistent depending on which translation unit consults them. Set those options at the build system level instead, as described in section~\ref{initialsetup}.
|
||||
\end{bclogo}
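As an illustration of the mechanism described above, the following is a minimal sketch of a platform header that enables only the required thread ID hook, together with its implementation. The \texttt{uint32\_t} return type and the counter-based implementation are assumptions made for this example; \texttt{CustomPlatform.h} in the repository remains the authoritative description of the contract. Both files are shown in one listing for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// ---- my_platform.h (the file named by TRACY_PLATFORM_HEADER) ----
// Enable the thread ID hook and declare the matching function.
#define TRACY_HAS_CUSTOM_THREAD_ID 1

namespace tracy
{
// Assumed signature; check CustomPlatform.h for the real contract.
uint32_t PlatformGetThreadId();
}

// ---- my_platform.cpp (separate translation unit, linked into the binary) ----
namespace tracy
{
uint32_t PlatformGetThreadId()
{
    // Hypothetical implementation: give each thread a small unique number
    // the first time it asks. Prefer the platform's native thread
    // identifier if one is available.
    static std::atomic<uint32_t> nextId{ 1 };
    thread_local const uint32_t id = nextId.fetch_add( 1 );
    return id;
}
}
```

The header would then be passed to the build as shown earlier, with \texttt{-DTRACY\_PLATFORM\_HEADER} pointing at \texttt{my\_platform.h}.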
|
||||
|
||||
\subsubsection{Troubleshooting}
|
||||
|
||||
By default, Tracy's diagnostics will be sent as Message logs (section~\ref{messagelog}) to the server.
|
||||
@@ -743,7 +778,7 @@ Nevertheless, let's look at how we can try to stabilize the profiling data.
|
||||
|
||||
Also known as: the \emph{spectre} thing we have to deal with now.
|
||||
|
||||
You must be aware that most processors available on the market\footnote{Except low-cost ARM CPUs.} \emph{do not} execute machine code linearly, as laid out in the source code. This can lead to counterintuitive timing results reported by Tracy. Trying to get more 'reliable' readings\footnote{And by saying 'reliable,' you do in reality mean: behaving in a way you expect it.} would require a change in the behavior of the code, and this is not a thing a profiler should do. So instead, Tracy shows you what the hardware is \emph{really} doing.
|
||||
You must be aware that most processors available on the market\footnote{Except low-cost ARM CPUs.} \emph{do not} execute machine code linearly, as laid out in the source code. This can lead to counterintuitive timing results reported by Tracy. Trying to get more \enquote{reliable} readings\footnote{And by saying \enquote{reliable,} you do in reality mean: behaving in a way you expect it.} would require a change in the behavior of the code, and this is not a thing a profiler should do. So instead, Tracy shows you what the hardware is \emph{really} doing.
|
||||
|
||||
This is a complex subject, and the details vary from one CPU to another. You can read a brief rundown of the topic at the following address: \url{https://travisdowns.github.io/blog/2019/06/11/speed-limits.html}.
|
||||
|
||||
@@ -770,7 +805,7 @@ While the CPU is more-or-less designed always to be able to work at the advertis
|
||||
\item Do you have complete control over the power profile? Spoiler alert: no. The operating system may run anything at any time on any of the other cores, which will impact the turbo frequency you're able to achieve.
|
||||
\end{itemize}
|
||||
|
||||
As you can see, this feature basically screams 'unreliable results!' Best keep it disabled and run at the base frequency. Otherwise, your timings won't make much sense. A true example: branchless compression function executing multiple times with the same input data was measured executing at \emph{four} different speeds.
|
||||
As you can see, this feature basically screams \enquote{unreliable results!} Best keep it disabled and run at the base frequency. Otherwise, your timings won't make much sense. A true example: a branchless compression function executing multiple times with the same input data was measured executing at \emph{four} different speeds.
|
||||
|
||||
Keep in mind that even at the base frequency, you may hit the thermal limits of the silicon and be down throttled.
|
||||
|
||||
@@ -905,7 +940,7 @@ Please don't ask about window decorations in Gnome. The current behavior is the
|
||||
|
||||
Special considerations must be taken to run the Tracy server/profiler GUI on Windows on ARM.
|
||||
|
||||
Ensure that the \textbf{"OpenCL, OpenGL, and Vulkan Compatibility Pack"} is installed (from the Microsoft Store), otherwise the GUI will fail to open.
|
||||
Ensure that the \textbf{\enquote{OpenCL, OpenGL, and Vulkan Compatibility Pack}} is installed (from the Microsoft Store), otherwise the GUI will fail to open.
|
||||
|
||||
\subsubsection{Using an IDE}
|
||||
|
||||
@@ -920,7 +955,7 @@ The CMake build configuration will begin immediately. It is likely that you will
|
||||
After the build configuration phase is over, you may want to make some further adjustments to what is being built. The primary place to do this is in the \emph{Project Status} section of the CMake side panel. The two key settings there are also available in the status bar at the bottom of the window:
|
||||
|
||||
\begin{itemize}
|
||||
\item The \emph{Folder} setting allows you to choose which Tracy utility you want to work with. Select "profiler" for the profiler's GUI.
|
||||
\item The \emph{Folder} setting allows you to choose which Tracy utility you want to work with. Select \enquote{profiler} for the profiler's GUI.
|
||||
\item The \emph{Build variant} setting is used to toggle between the debug and release build configurations.
|
||||
\end{itemize}
|
||||
|
||||
@@ -981,7 +1016,7 @@ void Graphics::Render()
|
||||
\subsection{Crash handling}
|
||||
\label{crashhandling}
|
||||
|
||||
On selected platforms (see section~\ref{featurematrix}) Tracy will intercept application crashes\footnote{For example, invalid memory accesses ('segmentation faults', 'null pointer exceptions'), divisions by zero, etc.}. This serves two purposes. First, the client application will be able to send the remaining profiling data to the server. Second, the server will receive a crash report with the crash reason, call stack at the time of the crash, etc.
|
||||
On selected platforms (see section~\ref{featurematrix}) Tracy will intercept application crashes\footnote{For example, invalid memory accesses (\enquote{segmentation faults}, \enquote{null pointer exceptions}), divisions by zero, etc.}. This serves two purposes. First, the client application will be able to send the remaining profiling data to the server. Second, the server will receive a crash report with the crash reason, call stack at the time of the crash, etc.
|
||||
|
||||
This is an automatic process, and it doesn't require user interaction. If you are experiencing issues with crash handling you may want to try defining the \texttt{TRACY\_NO\_CRASH\_HANDLER} macro to disable the built in crash handling.
|
||||
|
||||
@@ -1015,6 +1050,8 @@ Memory & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck & \faXm
|
||||
GPU zones (OpenGL) & \faCheck & \faCheck & \faCheck & \faPoo & \faPoo & & \faXmark \\
|
||||
GPU zones (Vulkan) & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck & & \faXmark \\
|
||||
GPU zones (Metal) & \faXmark & \faXmark & \faXmark & \faCheck\textsuperscript{\emph{b}} & \faCheck\textsuperscript{\emph{b}} & \faXmark & \faXmark \\
|
||||
GPU zones (CUDA) & \faCheck & \faCheck & \faXmark & \faXmark & \faXmark & \faQuestion & \faXmark \\
|
||||
GPU zones (WebGPU) & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck & \faQuestion & \faQuestion \\
|
||||
Call stacks & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck & \faXmark \\
|
||||
Symbol resolution & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck & \faCheck \\
|
||||
Crash handling & \faCheck & \faCheck & \faCheck & \faXmark & \faXmark & \faXmark & \faXmark \\
|
||||
@@ -1073,7 +1110,7 @@ FrameMarkStart("Audio processing");
|
||||
FrameMarkEnd("Audio processing");
|
||||
\end{lstlisting}
|
||||
|
||||
Here, we pass two string literals with identical contents to two different macros. It is entirely up to the compiler to decide if it will pool these two strings into one pointer or if there will be two instances present in the executable image\footnote{\cite{ISO:2012:III} \S 2.14.5.12: "Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined."}. For example, on MSVC, this is controlled by \menu[,]{Configuration Properties,C/C++,Code Generation,Enable String Pooling} option in the project properties (optimized builds enable it automatically). Note that even if string pooling is used on the compilation unit level, it is still up to the linker to implement pooling across object files.
|
||||
Here, we pass two string literals with identical contents to two different macros. It is entirely up to the compiler to decide if it will pool these two strings into one pointer or if there will be two instances present in the executable image\footnote{\cite{ISO:2012:III} \S 2.14.5.12: \enquote{Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined.}}. For example, on MSVC, this is controlled by \menu[,]{Configuration Properties,C/C++,Code Generation,Enable String Pooling} option in the project properties (optimized builds enable it automatically). Note that even if string pooling is used on the compilation unit level, it is still up to the linker to implement pooling across object files.
|
||||
|
||||
As you can see, making sure that string literals are properly pooled can be surprisingly tricky. To work around this problem, you may employ the following technique. In \emph{one} source file create the unique pointer for a string literal, for example:
|
||||
|
||||
@@ -1371,7 +1408,7 @@ It is valid to set the \texttt{Zone1} text or name \emph{only} in places \circle
|
||||
\subsubsection{Filtering zones}
|
||||
\label{filteringzones}
|
||||
|
||||
Zone logging can be disabled on a per-zone basis by making use of the \texttt{ZoneNamed} macros. Each of the macros takes an \texttt{active} argument ('\texttt{true}' in the example in section~\ref{multizone}), which will determine whether the zone should be logged.
|
||||
Zone logging can be disabled on a per-zone basis by making use of the \texttt{ZoneNamed} macros. Each of the macros takes an \texttt{active} argument (\enquote{\texttt{true}} in the example in section~\ref{multizone}), which will determine whether the zone should be logged.
|
||||
|
||||
Note that this parameter may be a run-time variable, such as a user-controlled switch to enable profiling of a specific part of code only when required.
|
||||
|
||||
@@ -1523,14 +1560,24 @@ Fast navigation in large data sets and correlating zones with what was happening
|
||||
|
||||
If you want to include color coding of the messages (for example to make critical messages easily visible), you can use \texttt{TracyMessageC(text, size, color)} or \texttt{TracyMessageLC(text, color)} macros.
|
||||
|
||||
Messages can also have different severity levels: \texttt{Trace}, \texttt{Debug}, \texttt{Info}, \texttt{Warning}, \texttt{Error} or \texttt{Fatal}.
|
||||
Messages can also have different severity levels:
|
||||
|
||||
\begin{itemize}
|
||||
\item \emph{Trace} -- Broadly tracks variable states and events in the software program.
|
||||
\item \emph{Debug} -- Describes variable states and details about specific internal events in the software that are useful for investigations.
|
||||
\item \emph{Info} -- Describes normal events, which inform on the expected progress and state of your software.
|
||||
\item \emph{Warning} -- Describes potentially dangerous situations caused by unexpected events and states.
|
||||
\item \emph{Error} -- Describes the occurrence of unexpected behavior. Does not interrupt the execution of the software.
|
||||
\item \emph{Fatal} -- Describes a critical event that will lead to a software failure/crash.
|
||||
\end{itemize}
|
||||
|
||||
The \texttt{TracyMessage} macros will log messages with the severity \texttt{Info}. To log a message with a different severity, you may use the \texttt{TracyLogString} macro that regroups all the functionalities from the previous macros. We recommend writing your own macros, wrapping the different severities for easier use. You may provide a color of 0 if you do not want to set a color for this message.
|
||||
|
||||
Examples:
|
||||
\begin{lstlisting}
|
||||
std::string dynStr = "Trace using a dynamic string, blue color, no callstack";
|
||||
TracyLogString( tracy::MessageSeverity::Trace, 0xFF, 0, dynStr.size(), dynStr.c_str() );
|
||||
TracyLogString( tracy::MessageSeverity::Warning, 0, TRACY_CALLSTACK, "Warning using a string litteral, no color, capturing the callstack to a depth of TRACY_CALLSTACK" );
|
||||
TracyLogString( tracy::MessageSeverity::Warning, 0, TRACY_CALLSTACK, "Warning using a string literal, no color, capturing the callstack to a depth of TRACY_CALLSTACK" );
|
||||
\end{lstlisting}
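The recommendation to wrap the severities in your own macros could look like the sketch below. The \texttt{APP\_LOG\_*} names are hypothetical and not part of Tracy. The \texttt{\#ifndef} block at the top only provides stand-in definitions so that the snippet compiles on its own; in a real project you would include \texttt{tracy/Tracy.hpp} instead.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in definitions so that the sketch compiles without Tracy. In a real
// project, include "tracy/Tracy.hpp" instead and drop this block.
#ifndef TracyLogString
namespace tracy
{
enum class MessageSeverity : uint8_t { Trace, Debug, Info, Warning, Error, Fatal };
}
// Remembers the severity of the last message, for demonstration only.
static tracy::MessageSeverity g_lastSeverity = tracy::MessageSeverity::Info;
#  define TracyLogString( severity, color, depth, ... ) ( g_lastSeverity = ( severity ) )
#endif
#ifndef TRACY_CALLSTACK
#  define TRACY_CALLSTACK 0
#endif

// Hypothetical project-side wrappers. Each one fixes the severity and
// forwards a string literal; a color of 0 means "no color".
#define APP_LOG_TRACE( text )   TracyLogString( tracy::MessageSeverity::Trace,   0, 0, text )
#define APP_LOG_DEBUG( text )   TracyLogString( tracy::MessageSeverity::Debug,   0, 0, text )
// Warnings and errors also capture the call stack.
#define APP_LOG_WARNING( text ) TracyLogString( tracy::MessageSeverity::Warning, 0, TRACY_CALLSTACK, text )
#define APP_LOG_ERROR( text )   TracyLogString( tracy::MessageSeverity::Error,   0, TRACY_CALLSTACK, text )
```

Call sites then shrink to \texttt{APP\_LOG\_WARNING("Texture cache is almost full");}, and the decision about colors and call stack depth per severity lives in one place.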
|
||||
|
||||
|
||||
@@ -1572,8 +1619,6 @@ void operator delete(void* ptr) noexcept
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
In some rare cases (e.g., destruction of TLS block), events may be reported after the profiler is no longer available, which would lead to a crash. To work around this issue, you may use \texttt{TracySecureAlloc} and \texttt{TracySecureFree} variants of the macros.
|
||||
|
||||
\begin{bclogo}[
|
||||
noborder=true,
|
||||
couleur=black!5,
|
||||
@@ -1607,10 +1652,12 @@ Sometimes an application will use more than one memory pool. For example, in add
|
||||
|
||||
To mark that a separate memory pool is to be tracked you should use the named version of memory macros, for example \texttt{TracyAllocN(ptr, size, name)} and \texttt{TracyFreeN(ptr, name)}, where \texttt{name} is a unique pointer to a string literal (section~\ref{uniquepointers}) identifying the memory pool.
|
||||
|
||||
Certain memory allocator designs (\enquote{arena allocators}) use an always-incrementing pointer to track the next region to allocate and do not support deallocation of individual objects. The only way to free memory with such an allocator is to simultaneously release all the objects that were allocated (reset the allocator state). You can mark such a mass-deallocation event in a memory pool with the \texttt{TracyMemoryDiscard(name)} macro.
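A minimal sketch of how such an allocator might be instrumented is shown below. The \texttt{Arena} class is hypothetical and ignores alignment for brevity. The no-op macro definitions at the top exist only so that the snippet compiles without Tracy; in a real project you would include \texttt{tracy/Tracy.hpp} instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// No-op fallbacks so that the sketch compiles without Tracy. In a real
// project, include "tracy/Tracy.hpp" instead and drop these definitions.
#ifndef TracyAllocN
#  define TracyAllocN( ptr, size, name ) ( (void)( ptr ), (void)( size ), (void)( name ) )
#endif
#ifndef TracyMemoryDiscard
#  define TracyMemoryDiscard( name ) ( (void)( name ) )
#endif

// Hypothetical bump allocator working on a caller-provided buffer.
class Arena
{
public:
    // 'name' must be a unique pointer to a string literal identifying the pool.
    Arena( void* buffer, size_t capacity, const char* name )
        : m_base( static_cast<uint8_t*>( buffer ) ), m_capacity( capacity ), m_offset( 0 ), m_name( name ) {}

    void* Alloc( size_t size )
    {
        if( size > m_capacity - m_offset ) return nullptr;   // out of space
        void* ptr = m_base + m_offset;
        m_offset += size;                                     // always-incrementing pointer
        TracyAllocN( ptr, size, m_name );                     // report to the named pool
        return ptr;
    }

    // Releases every object at once; there is no per-object free.
    void Reset()
    {
        m_offset = 0;
        TracyMemoryDiscard( m_name );                         // mark the mass deallocation
    }

    size_t Used() const { return m_offset; }

private:
    uint8_t* m_base;
    size_t m_capacity;
    size_t m_offset;
    const char* m_name;
};
```

Individual allocations are reported as usual, while \texttt{Reset()} emits a single discard event for the whole pool instead of one free event per object.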
|
||||
|
||||
\subsection{GPU profiling}
|
||||
\label{gpuprofiling}
|
||||
|
||||
Tracy provides bindings for profiling OpenGL, Vulkan, Direct3D 11, Direct3D 12, Metal, OpenCL and CUDA execution time on GPU.
|
||||
Tracy provides bindings for profiling OpenGL, Vulkan, Direct3D 11, Direct3D 12, Metal, OpenCL, CUDA and WebGPU execution time on GPU.
|
||||
|
||||
Note that the CPU and GPU timers may be unsynchronized unless you create a calibrated context, but the availability of calibrated contexts is limited. You can try to correct the desynchronization of uncalibrated contexts in the profiler's options (section~\ref{options}).
|
||||
|
||||
@@ -1666,6 +1713,12 @@ logo=\bcattention
|
||||
\end{itemize}
|
||||
\end{bclogo}
|
||||
|
||||
\subparagraph{Calibrated context}
|
||||
|
||||
By default, the OpenGL context is uncalibrated: the CPU and GPU clocks are aligned only once, when the context is created, so over long captures the two time domains may drift apart (section~\ref{options} describes correcting this drift manually). Defining \texttt{TRACY\_OPENGL\_AUTO\_CALIBRATION} before including \texttt{TracyOpenGL.hpp} enables periodic recalibration instead: roughly once per second Tracy samples the GPU and CPU clocks together and emits a calibration event, allowing the profiler to track and remove the drift automatically.
|
||||
|
||||
This is opt-in because OpenGL exposes no atomic CPU+GPU timestamp query (unlike Vulkan's \texttt{VK\_EXT\_calibrated\_timestamps} or Direct3D~12, whose contexts are always calibrated). Recalibration therefore reads the GPU clock with \texttt{glGetInteger64v(GL\_TIMESTAMP)}, which forces a CPU/GPU synchronization (a pipeline stall) each time it runs. Enable it only when the improved long-capture alignment is worth the periodic stall.
|
||||
|
||||
\subsubsection{Vulkan}
|
||||
|
||||
Similarly, for Vulkan support you should include the \texttt{public/tracy/TracyVulkan.hpp} header file. Tracing Vulkan devices and queues is a bit more involved, and the Vulkan initialization macro \texttt{TracyVkContext(physdev, device, queue, cmdbuf)} returns an instance of \texttt{TracyVkCtx} object, which tracks an associated Vulkan queue. Cleanup is performed using the \texttt{TracyVkDestroy(ctx)} macro. You may create multiple Vulkan contexts. To set a custom name for the context, use the \texttt{TracyVkContextName(ctx, name, size)} macro.
|
||||
@@ -1738,7 +1791,7 @@ Similar to Vulkan and OpenGL, you also need to periodically collect the OpenCL e
|
||||
|
||||
\subsubsection{CUDA}
|
||||
|
||||
CUDA support is enabled by including the \texttt{public/tracy/TracyCUDA.hpp} header file. To use it, the NVIDIA CUPTI library is required. This library comes with the NVIDIA CUDA Toolkit and is located at \texttt{CUDA\_INSTALLATION\_PATH/extras/CUPTI}.
|
||||
CUDA support is enabled by including the \texttt{public/tracy/TracyCUDA.hpp} header file. To use it, make sure you have the NVIDIA CUDA Toolkit v12.4 (or later) installed, and that the NVIDIA CUPTI library is available in the toolkit (located at \texttt{CUDA\_INSTALLATION\_PATH/extras/CUPTI}).
|
||||
|
||||
Tracing CUDA requires the creation of a Tracy CUDA context using the macro \texttt{TracyCUDAContext()}, which returns an instance of a \texttt{tracy::CUDACtx} object. TracyCUDA allows only a single \texttt{tracy::CUDACtx} object at any given time. Subsequent calls to \texttt{TracyCUDAContext()} will return the same reference-counted object. There is no need for clients to instantiate multiple \texttt{tracy::CUDACtx} objects, as a single context is capable of instrumenting all CUDA contexts and streams.
|
||||
|
||||
@@ -1750,6 +1803,16 @@ Unlike other GPU backends in Tracy, there is no need to call \texttt{TracyCUDACo
|
||||
|
||||
To stop profiling, call the \texttt{TracyCUDAStopProfiling(ctx)} macro.
|
||||
|
||||
\subsubsection{WebGPU}
|
||||
|
||||
WebGPU support is enabled by including the \texttt{public/tracy/TracyWebGPU.hpp} header file. Both major implementations of WebGPU (Dawn and wgpu-native) are supported.
|
||||
|
||||
Before creating the WebGPU device, make sure to call \texttt{TracyWebGPUSetupDeviceDescriptor()} to let Tracy request the device features and extensions necessary for profiling. After the device is created, use the \texttt{TracyWebGPUContext()} macro to instantiate the \texttt{WebGPUQueueCtx} object required for GPU instrumentation. The object should later be cleaned up with the \texttt{TracyWebGPUDestroy()} macro. To set a custom name for the context, use the \texttt{TracyWebGPUContextName()} macro.
|
||||
|
||||
To instrument a GPU zone, use the various \texttt{TracyWebGPU*Zone*()} macros. Note that WebGPU only offers command instrumentation at the \enquote{pass} level. While command-level granularity is possible through implementation-specific WebGPU extensions, Tracy does not support it at the moment. Supply the corresponding WebGPU pass descriptor to the instrumentation macro \emph{before} creating the WebGPU pass encoder.
|
||||
|
||||
You are required to periodically collect the GPU events using the \texttt{TracyWebGPUCollect()} macro. Good places for collection are: after synchronous waits, after event processing (\texttt{wgpuInstanceProcessEvents}), after present drawable calls (\texttt{wgpuSurfacePresent}), and inside the completion callback of command queues (\texttt{wgpuQueueOnSubmittedWorkDone}).
|
||||
|
||||
\subsubsection{ROCm}
|
||||
|
||||
On Linux, if rocprofiler-sdk is installed, Tracy can automatically trace GPU dispatches and collect
|
||||
@@ -1783,13 +1846,13 @@ sudo amd-smi set -g 0 -l stable_std
|
||||
|
||||
Putting more than one GPU zone macro in a single scope features the same issue as with the \texttt{ZoneScoped} macros, described in section~\ref{multizone} (but this time the variable name is \texttt{\_\_\_tracy\_gpu\_zone}).
|
||||
|
||||
To solve this problem, in case of OpenGL use the \texttt{TracyGpuNamedZone} macro in place of \texttt{TracyGpuZone} (or the color variant). The same applies to Vulkan, Direct3D 11/12 and Metal -- replace \texttt{TracyVkZone} with \texttt{TracyVkNamedZone}, \texttt{TracyD3D11Zone}/\texttt{TracyD3D12Zone} with \texttt{TracyD3D11NamedZone}/\texttt{TracyD3D12NamedZone}, and \texttt{TracyMetalZone} with \texttt{TracyMetalNamedZone}.
|
||||
To solve this problem, in case of OpenGL use the \texttt{TracyGpuNamedZone} macro in place of \texttt{TracyGpuZone} (or the color variant). The same applies to Vulkan, Direct3D 11/12, Metal and WebGPU -- replace \texttt{TracyVkZone} with \texttt{TracyVkNamedZone}, \texttt{TracyD3D11Zone}/\texttt{TracyD3D12Zone} with \texttt{TracyD3D11NamedZone}/\texttt{TracyD3D12NamedZone}, \texttt{TracyMetalZone} with \texttt{TracyMetalNamedZone}, and \texttt{TracyWebGPUZone} with \texttt{TracyWebGPUNamedZone}.
|
||||
|
||||
Remember to provide your name for the created stack variable as the first parameter to the macros.
|
||||
|
||||
\subsubsection{Transient GPU zones}
|
||||
|
||||
Transient zones (see section~\ref{transientzones} for details) are available in OpenGL, Vulkan, and Direct3D 11/12 macros. Transient zones are not available for Metal at this moment.
|
||||
Transient zones (see section~\ref{transientzones} for details) are available in OpenGL, Vulkan, Direct3D 11/12 and WebGPU macros. Transient zones are not available for Metal at this moment.
|
||||
|
||||
\subsection{Fibers}
|
||||
\label{fibers}
|
||||
@@ -1836,7 +1899,7 @@ As you can see, there are two threads, \texttt{t1} and \texttt{t2}, which are si
|
||||
\subsection{Collecting call stacks}
|
||||
\label{collectingcallstacks}
|
||||
|
||||
Capture of true calls stacks can be performed by using macros with the \texttt{S} postfix, which require an additional parameter, specifying the depth of call stack to be captured. The greater the depth, the longer it will take to perform capture. Currently you can use the following macros: \texttt{ZoneScopedS}, \texttt{ZoneScopedNS}, \texttt{ZoneScopedCS}, \texttt{ZoneScopedNCS}, \texttt{TracyAllocS}, \texttt{TracyFreeS}, \texttt{TracySecureAllocS}, \texttt{TracySecureFreeS}, \texttt{TracyMessageS}, \texttt{TracyMessageLS}, \texttt{TracyMessageCS}, \texttt{TracyMessageLCS}, \texttt{TracyGpuZoneS}, \texttt{TracyGpuZoneCS}, \texttt{TracyVkZoneS}, \texttt{TracyVkZoneCS}, and the named and transient variants.
|
||||
Capture of true call stacks can be performed by using macros with the \texttt{S} postfix, which require an additional parameter, specifying the depth of call stack to be captured. The greater the depth, the longer it will take to perform capture. Currently you can use the following macros: \texttt{ZoneScopedS}, \texttt{ZoneScopedNS}, \texttt{ZoneScopedCS}, \texttt{ZoneScopedNCS}, \texttt{TracyAllocS}, \texttt{TracyFreeS}, \texttt{TracyMessageS}, \texttt{TracyMessageLS}, \texttt{TracyMessageCS}, \texttt{TracyMessageLCS}, \texttt{TracyGpuZoneS}, \texttt{TracyGpuZoneCS}, \texttt{TracyVkZoneS}, \texttt{TracyVkZoneCS}, and the named and transient variants.
|
||||
|
||||
Be aware that call stack collection is a relatively slow operation. Table~\ref{CallstackTimes} and figure~\ref{CallstackPlot} show how long it took to perform a single capture of varying depth on multiple CPU architectures.
|
||||
|
||||
@@ -1982,7 +2045,7 @@ void DbgHelpUnlock() { ReleaseMutex(dbgHelpLock); }
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
At initilization time, tracy will attempt to preload symbols for device drivers and process modules. As this process can be slow when a lot of pdbs are involved, you can set the \texttt{TRACY\_NO\_DBGHELP\_INIT\_LOAD} environment variable to "1" to disable this behavior and rely on-demand symbol loading.
|
||||
At initialization time, Tracy will attempt to preload symbols for device drivers and process modules. As this process can be slow when a lot of PDBs are involved, you can set the \texttt{TRACY\_NO\_DBGHELP\_INIT\_LOAD} environment variable to \enquote{1} to disable this behavior and rely on on-demand symbol loading.
|
||||
|
||||
\paragraph{Disabling resolution of inline frames}
|
||||
|
||||
@@ -2006,6 +2069,20 @@ filesystem setup as the one used to run the tracy instrumented application).
|
||||
You can do path substitution with the \texttt{-p} option to perform any number of path
|
||||
substitutions in order to use symbols located elsewhere.
|
||||
|
||||
By default symbol resolution is performed with the platform's native facility: the DbgHelp
|
||||
library on Windows, and the \texttt{addr2line} tool found in \texttt{PATH} elsewhere. You can
|
||||
override this with the \texttt{-a} option, passing the path to a custom
|
||||
\texttt{addr2line}-compatible tool (for instance an \texttt{addr2line} from a cross-compilation
|
||||
toolchain, or \texttt{llvm-addr2line}). The \texttt{-a} option works on all platforms, including
|
||||
Windows, and takes precedence over the platform default.
|
||||
|
||||
Extra arguments can be passed verbatim to the resolution tool with the \texttt{-A} option. Tracy
|
||||
records callstack frame offsets relative to the image base, but \texttt{addr2line}-compatible
|
||||
tools expect a full virtual address for images that have a non-zero preferred image base (such as
|
||||
PE on Windows or Mach-O on Apple). For these, pass \texttt{-A "--relative-address"} so that
|
||||
\texttt{llvm-addr2line} or \texttt{llvm-symbolizer} adds the image base back. ELF images need no
|
||||
such adjustment.
|
||||
|
||||
\begin{bclogo}[
|
||||
noborder=true,
|
||||
couleur=black!5,
|
||||
@@ -2218,6 +2295,31 @@ TracyCLockAfterUnlock(tracy_lock_ctx);
|
||||
|
||||
You can optionally mark the location where the lock is held by using the \texttt{TracyCLockMark} macro. This should be done after acquiring the lock.
|
||||
|
||||
Similarly, you can use the following macros to mark a shared lock using the C API:
|
||||
\begin{itemize}
|
||||
\item \texttt{TracyCSharedLockAnnounce(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockTerminate(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockBeforeLock(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockAfterLock(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockAfterUnlock(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockAfterTryLock(lock\_ctx, acquired)}
|
||||
\item \texttt{TracyCSharedLockBeforeSharedLock(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockAfterSharedLock(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockAfterSharedUnlock(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockAfterTrySharedLock(lock\_ctx, acquired)}
|
||||
\item \texttt{TracyCSharedLockMark(lock\_ctx)}
|
||||
\item \texttt{TracyCSharedLockCustomName(lock\_ctx, name, size)}
|
||||
\end{itemize}
|
||||
|
||||
A shared lock context has to be defined next to the shared lock that it will be marking:
|
||||
\begin{lstlisting}
|
||||
TracyCSharedLockCtx tracy_shared_lock_ctx;
|
||||
HANDLE shared_lock;
|
||||
\end{lstlisting}
|
||||
|
||||
The same rules apply to shared locks as to regular locks, but you need to use the shared lock macros instead.
|
||||
Lock implementations in classes \texttt{Lockable} and \texttt{SharedLockable} show how to properly perform context handling.
|
||||
|
||||
\subsubsection{Memory profiling}
|
||||
\label{cmemoryprofiling}
|
||||
|
||||
@@ -2226,8 +2328,6 @@ Use the following macros in your implementations of \texttt{malloc} and \texttt{
|
||||
\begin{itemize}
|
||||
\item \texttt{TracyCAlloc(ptr, size)}
|
||||
\item \texttt{TracyCFree(ptr)}
|
||||
\item \texttt{TracyCSecureAlloc(ptr, size)}
|
||||
\item \texttt{TracyCSecureFree(ptr)}
|
||||
\end{itemize}
|
||||
|
||||
Correctly using this functionality can be pretty tricky. You also will need to handle all the memory allocations made by external libraries (which typically allow usage of custom memory allocation functions) and the allocations made by system functions. If you can't track such an allocation, you will need to make sure freeing is not reported\footnote{It's not uncommon to see a pattern where a system function returns some allocated memory, which you then need to release.}.
|
||||
@@ -2289,7 +2389,7 @@ To see how you should use this API, you should look at the reference implementat
|
||||
couleur=black!5,
|
||||
logo=\bcbombe
|
||||
]{Important}
|
||||
A common mistake is to skip the zone "\texttt{isActive}" check. When using \texttt{TRACY\_ON\_DEMAND}, you need to read the value of \texttt{TracyCIsConnected} once, and check the same value for both \newline \texttt{\_\_\_tracy\_emit\_gpu\_zone\_begin\_alloc} and \texttt{\_\_\_tracy\_emit\_gpu\_zone\_end}. Tracy may otherwise receive a zone end without a zone begin.
|
||||
A common mistake is to skip the zone \enquote{\texttt{isActive}} check. When using \texttt{TRACY\_ON\_DEMAND}, you need to read the value of \texttt{TracyCIsConnected} once, and check the same value for both \newline \texttt{\_\_\_tracy\_emit\_gpu\_zone\_begin\_alloc} and \texttt{\_\_\_tracy\_emit\_gpu\_zone\_end}. Tracy may otherwise receive a zone end without a zone begin.
|
||||
\end{bclogo}
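
The pattern described in the note above can be sketched as follows. This is only an illustration: \texttt{is\_connected}, \texttt{emit\_zone\_begin} and \texttt{emit\_zone\_end} are hypothetical stand-ins for \texttt{TracyCIsConnected} and the \texttt{\_\_\_tracy\_emit\_gpu\_zone\_*} functions, used here so that the sketch compiles without Tracy. The connection state is read once when the zone is opened, and the cached value decides whether the zone end is emitted.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real Tracy calls, so that the sketch compiles
// without Tracy. They are NOT part of the Tracy API.
static bool g_connected = false;   // stands in for TracyCIsConnected
static int g_begins = 0;           // number of emitted zone-begin events
static int g_ends = 0;             // number of emitted zone-end events

static bool is_connected() { return g_connected; }
static void emit_zone_begin() { ++g_begins; }
static void emit_zone_end() { ++g_ends; }

// A scoped GPU zone that samples the connection state exactly once.
struct GpuZone
{
    bool active;   // cached result of the single connection check

    GpuZone() : active( is_connected() )
    {
        if( active ) emit_zone_begin();
    }

    ~GpuZone()
    {
        // Reuse the cached value; do not query the connection state again.
        if( active ) emit_zone_end();
    }
};

// Simulates a profiler connecting while a zone is already open.
static void run_zone_with_late_connect()
{
    g_connected = false;
    GpuZone zone;          // begin is skipped, active == false
    g_connected = true;    // profiler connects in the middle of the zone
}                          // end is skipped too, so begin and end stay paired
```

Had the destructor called \texttt{is\_connected()} again, the late connection in \texttt{run\_zone\_with\_late\_connect} would have produced a zone end without a matching zone begin.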
|
||||
|
||||
\subsubsection{Fibers}
|
||||
@@ -2320,10 +2420,7 @@ Tracy C API exposes functions with the \texttt{\_\_\_tracy} prefix that you may
|
||||
\item \texttt{\_\_\_tracy\_alloc\_srcloc\_name(uint32\_t line, const char* source, size\_t sourceSz, const char* function, size\_t functionSz, const char* name, size\_t nameSz)}
|
||||
\end{itemize}
|
||||
|
||||
Here \texttt{line} is line number in the \texttt{source} source file and \texttt{function} is the
|
||||
name of a function in which the zone is created. \texttt{sourceSz} and \texttt{functionSz} are the
|
||||
size of the corresponding string arguments in bytes. You may additionally specify an optional zone
|
||||
name, by providing it in the \texttt{name} variable, and specifying its size in \texttt{nameSz}.
|
||||
Here \texttt{line} is the line number in the \texttt{source} source file and \texttt{function} is the name of the function in which the zone is created. \texttt{sourceSz} and \texttt{functionSz} are the sizes of the corresponding string arguments in bytes. You may additionally specify an optional zone name by providing it in the \texttt{name} variable and specifying its size in \texttt{nameSz}. If the passed strings are null-terminated, the terminating characters must be excluded from the provided sizes.
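
As a minimal sketch of the size rule, the snippet below passes string sizes without the terminating null character. The function \texttt{srcloc\_name\_stub} is a hypothetical stand-in that only mirrors the parameter list quoted above and sums the sizes it receives; the file, function, and zone names are made up for illustration.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in with the parameter list quoted above. It only sums the
// sizes it was given; the real function is provided by Tracy.
static size_t srcloc_name_stub( uint32_t line, const char* source, size_t sourceSz,
                                const char* function, size_t functionSz,
                                const char* name, size_t nameSz )
{
    (void)line; (void)source; (void)function; (void)name;
    return sourceSz + functionSz + nameSz;
}

static size_t make_srcloc_for_update()
{
    static const char source[] = "game.c";   // sizeof( source ) == 7, includes '\0'
    const char* function = "Update";
    const char* name = "Physics";

    // strlen() does not count the terminator; sizeof() on a char array does,
    // so one byte has to be subtracted in that case.
    return srcloc_name_stub( 42, source, sizeof( source ) - 1,
                             function, strlen( function ),
                             name, strlen( name ) );
}
```

Passing \texttt{sizeof( source )} unchanged would include the terminator in the reported size, which is exactly what the rule above forbids.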
|
||||
|
||||
The \texttt{\_\_\_tracy\_alloc\_srcloc} and \texttt{\_\_\_tracy\_alloc\_srcloc\_name} functions
|
||||
return an \texttt{uint64\_t} source location identifier corresponding to an \emph{allocated source
|
||||
@@ -2502,16 +2599,17 @@ Set the following environment variables before launching (or export them in your
|
||||
PYTHONPATH=/path/to/tracy/build/python/Release
|
||||
TRACY_CAPTURES_DIR=/path/to/captures # enables list_captures
|
||||
TRACY_MCP_PORT=47380 # optional; default 47380
|
||||
TRACY_MCP_TRANSPORT=streamable-http # optional; streamable-http or sse
|
||||
\end{lstlisting}
|
||||
|
||||
\subsubsection{Integrating with an AI assistant}
|
||||
|
||||
The server runs as a singleton on SSE transport (port 47380 by default). Only one process loads \texttt{TracyServerBindings} regardless of how many editor windows are open; subsequent launches detect the port is taken and exit immediately.
|
||||
The server runs as a singleton on Streamable HTTP transport (port 47380 by default). Only one process loads \texttt{TracyServerBindings} regardless of how many editor windows are open; subsequent launches detect the port is taken and exit immediately. Set \texttt{TRACY\_MCP\_TRANSPORT=sse} before launching to use the legacy SSE transport instead.
|
||||
|
||||
The server prints its URL on startup and writes it to \texttt{extra/mcp/tracy\_mcp.port}:
|
||||
|
||||
\begin{lstlisting}
|
||||
Tracy MCP listening on http://127.0.0.1:47380/sse
|
||||
Tracy MCP listening on http://127.0.0.1:47380/mcp
|
||||
\end{lstlisting}
|
||||
|
||||
Configure your AI assistant using that URL. For example, for a JSON-based MCP configuration:
|
||||
@@ -2520,7 +2618,7 @@ Configure your AI assistant using that URL. For example, for a JSON-based MCP co
|
||||
{
|
||||
"mcpServers": {
|
||||
"tracy": {
|
||||
"url": "http://127.0.0.1:47380/sse"
|
||||
"url": "http://127.0.0.1:47380/mcp"
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -2787,8 +2885,8 @@ logo=\bclampe
|
||||
Use the following calls in your implementations of allocator/deallocator:
|
||||
|
||||
\begin{itemize}
|
||||
\item \texttt{tracy\_memory\_alloc(ptr, size, name, depth, secure)}
|
||||
\item \texttt{tracy\_memory\_free(ptr, name, depth, secure)}
|
||||
\item \texttt{tracy\_memory\_alloc(ptr, size, name, depth)}
|
||||
\item \texttt{tracy\_memory\_free(ptr, name, depth)}
|
||||
\end{itemize}
|
||||
|
||||
Correctly using this functionality can be pretty tricky, especially in Fortran.
|
||||
@@ -2844,7 +2942,7 @@ Tracy will perform an automatic collection of system data without user intervent
|
||||
|
||||
Some profiling data can only be retrieved using the kernel facilities, which are not available to users with normal privilege level. To collect such data, you will need to elevate your rights to the administrator level. You can do so either by running the profiled program from the \texttt{root} account on Unix or through the \emph{Run as administrator} option on Windows\footnote{To make this easier, you can run MSVC with admin privileges, which will be inherited by your program when you start it from within the IDE.}. On Android, you will need to have a rooted device (see section~\ref{androidlunacy} for additional information).
|
||||
|
||||
As this system-level tracing functionality is part of the automated collection process, no user intervention is necessary to enable it (assuming that the program was granted the rights needed). However, if, for some reason, you would want to prevent your application from trying to access kernel data, you may recompile your program with the \texttt{TRACY\_NO\_SYSTEM\_TRACING} define. If you want to disable this functionality dynamically at runtime instead, you can set the \texttt{TRACY\_NO\_SYSTEM\_TRACING} environment variable to "1".
|
||||
As this system-level tracing functionality is part of the automated collection process, no user intervention is necessary to enable it (assuming that the program was granted the rights needed). However, if, for some reason, you would want to prevent your application from trying to access kernel data, you may recompile your program with the \texttt{TRACY\_NO\_SYSTEM\_TRACING} define. If you want to disable this functionality dynamically at runtime instead, you can set the \texttt{TRACY\_NO\_SYSTEM\_TRACING} environment variable to \enquote{1}.
|
||||
|
||||
\begin{bclogo}[
|
||||
noborder=true,
|
||||
@@ -2909,7 +3007,7 @@ By default, sampling is performed at 8 kHz frequency on Windows (the maximum pos
|
||||
|
||||
Call stack sampling may be disabled by using the \texttt{TRACY\_NO\_SAMPLING} define.
|
||||
|
||||
When enabled, by default, sampling starts at the beginning of the application and ends with it. You can instead have programmatic (manual) control over when sampling should begin and end by defining \texttt{TRACY\_SAMPLING\_PROFILER\_MANUAL\_START} when compiling \texttt{TracyClient.cpp}. Use \texttt{tracy::BeginSamplingProfiling()} and \texttt{tracy::EndSamplingProfiling()} to control it. There are C interfaces for it as well: \texttt{TracyCBeginSamplingProfiling()} and \texttt{TracyCEndSamplingProfiling()}.
|
||||
When enabled, by default, sampling starts at the beginning of the application and ends with it. You can instead have programmatic (manual) control over when sampling should begin and end by defining \texttt{TRACY\_SAMPLING\_PROFILER\_MANUAL\_START} when compiling \texttt{TracyClient.cpp}. You can then use \texttt{tracy::BeginSamplingProfiling()} and \texttt{tracy::EndSamplingProfiling()} to control it. There are C interfaces for it as well: \texttt{TracyCBeginSamplingProfiling()} and \texttt{TracyCEndSamplingProfiling()}.
|
||||
|
||||
\begin{bclogo}[
|
||||
noborder=true,
|
||||
@@ -2996,7 +3094,7 @@ On Linux, Tracy will override the \texttt{dlclose} function call to prevent shar
|
||||
|
||||
\subsubsection{Vertical synchronization}
|
||||
|
||||
On Windows and Linux, Tracy will automatically capture hardware Vsync events, provided that the application has access to the kernel data (privilege elevation may be needed, see section~\ref{privilegeelevation}). These events will be reported as '\texttt{[x] Vsync}' frame sets, where \texttt{x} is the identifier of a specific monitor. Note that hardware vertical synchronization might not correspond to the one seen by your application due to desktop composition, command queue buffering, and so on. Also, in some instances, when there is nothing to update on the screen, the graphic driver may choose to stop issuing screen refresh. As a result, there may be periods where no vertical synchronization events are reported.
|
||||
On Windows and Linux, Tracy will automatically capture hardware Vsync events, provided that the application has access to the kernel data (privilege elevation may be needed, see section~\ref{privilegeelevation}). These events will be reported as \enquote{\texttt{[x] Vsync}} frame sets, where \texttt{x} is the identifier of a specific monitor. Note that hardware vertical synchronization might not correspond to the one seen by your application due to desktop composition, command queue buffering, and so on. Also, in some instances, when there is nothing to update on the screen, the graphics driver may choose to stop issuing screen refreshes. As a result, there may be periods where no vertical synchronization events are reported.
|
||||
|
||||
Use the \texttt{TRACY\_NO\_VSYNC\_CAPTURE} macro to disable capture of Vsync events.
|
||||
|
||||
@@ -3009,9 +3107,15 @@ Sometimes it is desired to change how the profiled application behaves during th
|
||||
void Callback(void* data, uint32_t idx, int32_t val)
|
||||
\end{lstlisting}
|
||||
|
||||
The \texttt{data} parameter will have the same value as was specified in the macro. The \texttt{idx} argument is an user-defined parameter index and \texttt{val} is the value set in the profiler user interface.
|
||||
The \texttt{data} parameter will have the same value as was specified in the macro. The \texttt{idx} argument is a user-defined parameter index and \texttt{val} is the value set in the profiler user interface (in the connection information popup, see section~\ref{connectionpopup}).
|
||||
|
||||
To specify individual parameters, use the \texttt{TracyParameterSetup(idx, name, isBool, val)} macro. The \texttt{idx} value will be passed to the callback function for identification purposes (Tracy doesn't care what it's set to). \texttt{Name} is the parameter label, displayed on the list of parameters. Finally, \texttt{isBool} determines if \texttt{val} should be interpreted as a boolean value, or as an integer number.
|
||||
To specify individual parameters, use the \texttt{TracyParameterSetup(idx, name, type, val)} macro. The \texttt{idx} value will be passed to the callback function for identification purposes (Tracy doesn't care what it's set to), \texttt{name} is the parameter label, displayed on the list of parameters, and \texttt{val} is the initial value. Finally, \texttt{type} determines how to interpret the \texttt{val} value, and can be selected from:
|
||||
|
||||
\begin{itemize}
|
||||
\item \texttt{TracyParamTypeInt} -- \texttt{val} is an integer value, with the profiler UI displaying a numerical entry field.
|
||||
\item \texttt{TracyParamTypeBool} -- \texttt{val} is a boolean value, with a checkbox in the user interface.
|
||||
\item \texttt{TracyParamTypeTrigger} -- a~\emph{\faCircleDot{}~Trigger} button is displayed, for cases when you just want some action to happen. Repeats the initial \texttt{val} value provided to the setup function.
|
||||
\end{itemize}
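
A callback handling the three parameter types could look roughly like the sketch below. The parameter indices, the \texttt{Settings} structure, and the function name are hypothetical. Only the callback signature is taken from the text above, and the registration and setup macros are left out so that the snippet compiles without Tracy.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical parameter indices, chosen by the application.
enum : uint32_t { ParamLodBias = 0, ParamWireframe = 1, ParamReloadShaders = 2 };

// Hypothetical application state that the parameters control.
struct Settings
{
    int32_t lodBias = 0;
    bool wireframe = false;
    int reloads = 0;
};

// Callback with the signature shown above. The data pointer is whatever the
// application passed at registration time; here it is assumed to be a Settings.
void OnTraceParameter( void* data, uint32_t idx, int32_t val )
{
    auto* s = static_cast<Settings*>( data );
    switch( idx )
    {
    case ParamLodBias:          // integer parameter: use the value as is
        s->lodBias = val;
        break;
    case ParamWireframe:        // boolean parameter: treat non-zero as true
        s->wireframe = ( val != 0 );
        break;
    case ParamReloadShaders:    // trigger parameter: the value carries no new information
        ++s->reloads;
        break;
    default:
        break;
    }
}
```

Dispatching on \texttt{idx} in a single callback keeps all profiler-driven state changes in one place. The trigger case ignores \texttt{val}, since it only repeats the initial value.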
|
||||
|
||||
\begin{bclogo}[
|
||||
noborder=true,
|
||||
@@ -3150,7 +3254,7 @@ If you want to look at the profile data in real-time (or load a saved trace file
|
||||
|
||||
The \emph{\faWrench{}~Wrench} button opens the about dialog, which also contains a number of global settings you may want to tweak (section~\ref{aboutwindow}).
|
||||
|
||||
The client \emph{address entry} field and the \faWifi{}~\emph{Connect} button are used to connect to a running client\footnote{Note that a custom port may be provided here, for example by entering '127.0.0.1:1234'.}. You can use the connection history button~\faCaretDown{} to display a list of commonly used targets, from which you can quickly select an address. You can remove entries from this list by hovering the \faArrowPointer{}~mouse cursor over an entry and pressing the \keys{\del} button on the keyboard.
|
||||
The client \emph{address entry} field and the \faWifi{}~\emph{Connect} button are used to connect to a running client\footnote{Note that a custom port may be provided here, for example by entering \enquote{127.0.0.1:1234}.}. You can use the connection history button~\faCaretDown{} to display a list of commonly used targets, from which you can quickly select an address. You can remove entries from this list by hovering the \faArrowPointer{}~mouse cursor over an entry and pressing the \keys{\del} button on the keyboard.
|
||||
|
||||
If you want to open a trace that you have stored on the disk, you can do so by pressing the \faFolderOpen{}~\emph{Open saved trace} button.
|
||||
|
||||
@@ -3235,7 +3339,7 @@ You can use the \faFloppyDisk{}~\emph{Save trace} button to save the current pro
|
||||
|
||||
If frame image capture has been implemented (chapter~\ref{frameimages}), a thumbnail of the last received frame image will be provided for reference.
|
||||
|
||||
Suppose the profiled application opted to provide trace parameters (see section~\ref{traceparameters}) and the connection is still active. In that case, this pop-up will also contain a \emph{trace parameters} section, listing all the provided options. A callback function will be executed on the client when you change any value here.
|
||||
Suppose the profiled application opted to provide trace parameters (see section~\ref{traceparameters}) and the connection is still active. In that case, this pop-up will also contain a \emph{Trace parameters} section, listing all the provided options. A callback function will be executed on the client when you change any value here.
|
||||
|
||||
\subsubsection{Automatic loading or connecting}
|
||||
|
||||
@@ -3550,7 +3654,7 @@ The following three items show the \emph{\faEye{}~view time range}, the \emph{\f
|
||||
|
||||
\paragraph{Notification area}
|
||||
|
||||
The notification area displays informational notices, for example, how long it took to load a trace from the disk. The three pulsing dots indicator shows that some background tasks are being performed that may need to be completed before full capabilities of the profiler are available. If a crash was captured during profiling (section~\ref{crashhandling}), a \emph{\faSkull{}~crash} icon will be displayed. The red \faSatelliteDish{}~icon indicates that queries are currently being backlogged, while the same yellow icon indicates that some queries are currently in-flight (see chapter~\ref{connectionpopup} for more information).
|
||||
The notification area displays informational notices, for example, how long it took to load a trace from the disk. The three pulsing dots indicator shows that some background tasks are being performed that may need to be completed before full capabilities of the profiler are available. If a crash was captured during profiling (section~\ref{crashhandling}), a \emph{\faSkull{}~crash} icon will be displayed. You can click this icon to see the crash call stack. The red \faSatelliteDish{}~icon indicates that queries are currently being backlogged, while the same yellow icon indicates that some queries are currently in-flight (see chapter~\ref{connectionpopup} for more information).
|
||||
|
||||
If the drawing of timeline elements was disabled in the options menu (section~\ref{options}), the profiler will use the following orange icons to remind you about that fact. Click on the icons to enable drawing of the selected elements. Note that collapsed labels (section~\ref{zoneslocksplots}) are not taken into account here.
|
||||
|
||||
@@ -3797,10 +3901,10 @@ You will find the zones with locks and their associated threads on this combined
|
||||
The left-hand side \emph{index area} of the timeline view displays various labels (threads, locks), which can be categorized in the following way:
|
||||
|
||||
\begin{itemize}
|
||||
\item \emph{Light blue label} -- GPU context. Multi-threaded Vulkan, OpenCL, Direct3D 12 and Metal contexts are additionally split into separate threads.
|
||||
\item \emph{Light blue label} -- GPU context. Multi-threaded Vulkan, OpenCL, Direct3D 12, Metal and WebGPU contexts are additionally split into separate threads.
|
||||
\item \emph{Pink label} -- CPU data graph.
|
||||
\item \emph{White label} -- A CPU thread. It will be replaced by a bright red label in a thread that has crashed (section~\ref{crashhandling}). If automated sampling was performed, clicking the~\LMB{}~left mouse button on the \emph{\faGhost{}~ghost zones} button will switch zone display mode between 'instrumented' and 'ghost.'
|
||||
\item \emph{Green label} -- Fiber, coroutine, or any other sort of cooperative multitasking 'green thread.'
|
||||
\item \emph{White label} -- A CPU thread. It will be replaced by a bright red label in a thread that has crashed (section~\ref{crashhandling}). If automated sampling was performed, clicking the~\LMB{}~left mouse button on the \emph{\faGhost{}~ghost zones} button will switch zone display mode between \enquote{instrumented} and \enquote{ghost.}
|
||||
\item \emph{Green label} -- Fiber, coroutine, or any other sort of cooperative multitasking \enquote{green thread.}
|
||||
\item \emph{Light red label} -- Indicates a lock.
|
||||
\item \emph{Yellow label} -- Plot.
|
||||
\end{itemize}
|
||||
@@ -3819,7 +3923,7 @@ In an example in figure~\ref{zoneslocks} you can see that there are two threads:
|
||||
|
||||
Meanwhile, the \emph{Streaming thread} is performing some \emph{Streaming jobs}. The first \emph{Streaming job} sent a message (section~\ref{messagelog}). In addition to being listed in the message log, it is indicated by a triangle over the thread separator. When multiple messages are in one place, the triangle outline shape changes to a filled triangle.
|
||||
|
||||
The GPU zones are displayed just like CPU zones, with an OpenGL/Vulkan/Direct3D/Metal/OpenCL context in place of a thread name.
|
||||
The GPU zones are displayed just like CPU zones, with an OpenGL/Vulkan/Direct3D/Metal/OpenCL/CUDA/WebGPU context in place of a thread name.
|
||||
|
||||
Hovering the \faArrowPointer{} mouse pointer over a zone will highlight all other zones that have the exact source location with a white outline. Clicking the \LMB{}~left mouse button on a zone will open the zone information window (section~\ref{zoneinfo}). Holding the \keys{\ctrl} key and clicking the \LMB{}~left mouse button on a zone will open the zone statistics window (section~\ref{findzone}). Clicking the \MMB{}~middle mouse button on a zone will zoom the view to the extent of the zone.
|
||||
|
||||
@@ -3975,7 +4079,7 @@ To define a time range, drag the \LMB{}~left mouse button over the timeline view
|
||||
\item \emph{\faMagnifyingGlass{}~Limit find zone time range} -- this will limit find zone results. See chapter~\ref{findzone} for more details.
|
||||
\item \emph{\faArrowUpWideShort{}~Limit statistics time range} -- selecting this option will limit statistics results. See chapter~\ref{statistics} for more details.
|
||||
\item \emph{\faFire{}~Limit flame graph time range} -- limits flame graph results. Refer to chapter~\ref{flamegraph}.
|
||||
\item \emph{\faHourglassHalf{}~Limit wait stacks time range} -- limits wait stacks results. Refer to chapter~\ref{waitstackswindow}.
|
||||
\item \emph{\faHourglassHalf{}~Limit wait stacks time range} -- limits wait stacks results. Refer to chapter~\ref{stackwindows}.
|
||||
\item \emph{\faMemory{}~Limit memory time range} -- limits memory results. Read more about this in chapter~\ref{memorywindow}.
|
||||
\item \emph{\faNoteSticky{}~Add annotation} -- use to annotate regions of interest, as described in chapter~\ref{annotatingtrace}.
|
||||
\end{itemize}
|
||||
@@ -3991,7 +4095,7 @@ You can freely adjust each time range on the timeline by clicking the \LMB{}~lef
|
||||
|
||||
Tracy allows adding custom notes to the trace. For example, you may want to mark a region to ignore because the application was out-of-focus or a region where a new user was connecting to the game, which resulted in a frame drop that needs to be investigated.
|
||||
|
||||
Methods of specifying the annotation region are described in section~\ref{timeranges}. When a new annotation is added, a settings window is displayed (section~\ref{annotationsettings}), allowing you to enter a description.
|
||||
Methods of specifying the annotation region are described in section~\ref{timeranges}. When a new annotation is added, it is assigned a semi-unique random name to make it distinguishable. The settings window is also opened (section~\ref{annotationsettings}), allowing you to enter your own description of the annotation.
|
||||
|
||||
Annotations are displayed on the timeline, as presented in figure~\ref{annotation}. Clicking on the circle next to the text description will open the annotation settings window, in which you can modify or remove the region. A list of all annotations in the trace is available in the annotations list window described in section~\ref{annotationlist}, which is accessible through the \emph{\faScrewdriverWrench{} Tools} button on the control menu.
|
||||
|
||||
@@ -4007,7 +4111,7 @@ Annotations are displayed on the timeline, as presented in figure~\ref{annotatio
|
||||
\label{annotation}
|
||||
\end{figure}
|
||||
|
||||
Please note that while the annotations persist between profiling sessions, they are not saved in the trace but in the user data files, as described in section~\ref{tracespecific}.
|
||||
Please note that while the annotations persist between profiling sessions, they are not saved in the trace but in the trace sidecar file, as described in section~\ref{tracespecific}.
|
||||
|
||||
\subsection{Options menu}
|
||||
\label{options}
|
||||
@@ -4028,7 +4132,7 @@ In this window, you can set various trace-related options. For example, the time
|
||||
\begin{itemize}
|
||||
\item \emph{\faSignature{} Draw CPU usage graph} -- You can disable drawing of the CPU usage graph here.
|
||||
\end{itemize}
|
||||
\item \emph{\faEye{} Draw GPU zones} -- Allows disabling display of OpenGL/Vulkan/Metal/Direct3D/OpenCL zones. The \emph{GPU zones} drop-down allows disabling individual GPU contexts and setting CPU/GPU drift offsets of uncalibrated contexts (see section~\ref{gpuprofiling} for more information). The \emph{\faRobot~Auto} button automatically measures the GPU drift value\footnote{There is an assumption that drift is linear. Automated measurement calculates and removes change over time in delay-to-execution of GPU zones. Resulting value may still be incorrect.}.
|
||||
\item \emph{\faEye{} Draw GPU zones} -- Allows disabling display of OpenGL/Vulkan/Metal/Direct3D/OpenCL/CUDA/WebGPU zones. The \emph{GPU zones} drop-down allows disabling individual GPU contexts and setting CPU/GPU drift offsets of uncalibrated contexts (see section~\ref{gpuprofiling} for more information). The \emph{\faRobot~Auto} button automatically measures the GPU drift value\footnote{There is an assumption that drift is linear. Automated measurement calculates and removes change over time in delay-to-execution of GPU zones. Resulting value may still be incorrect.}.
|
||||
\item \emph{\faMicrochip{} Draw CPU zones} -- Determines whether CPU zones are displayed.
|
||||
\begin{itemize}
|
||||
\item \emph{\faGhost{} Draw ghost zones} -- Controls if ghost zones should be displayed in threads which don't have any instrumented zones available.
|
||||
@@ -4078,9 +4182,9 @@ You can filter the message list in the following ways:
|
||||
|
||||
\begin{itemize}
|
||||
\item By the originating thread in the \emph{\faShuffle{} Visible threads} drop-down.
|
||||
\item By matching the message text to the expression in the \emph{\faFilter{}~Filter messages} entry field. Multiple filter expressions can be comma-separated (e.g. 'warn, info' will match messages containing strings 'warn' \emph{or} 'info'). You can exclude matches by preceding the term with a minus character (e.g., '-debug' will hide all messages containing the string 'debug').
|
||||
\item By message source, distinguishing between user messages and internal Tracy diagnostics.
|
||||
\item By severity level: \emph{Trace}, \emph{Debug}, \emph{Info}, \emph{Warning}, \emph{Error}, or \emph{Fatal}.
|
||||
\item By matching the message text to the expression in the \emph{\faFilter{}~Filter messages} entry field. Multiple filter expressions can be comma-separated (e.g. \enquote{warn, info} will match messages containing strings \enquote{warn} \emph{or} \enquote{info}). You can exclude matches by preceding the term with a minus character (e.g., \enquote{-debug} will hide all messages containing the string \enquote{debug}).
|
||||
\item By message source, distinguishing between \emph{\faUser{}~User} messages and internal \emph{\faMicroscope{}~Tracy} diagnostics.
|
||||
\item By severity level: \emph{\faShoePrints{}~Trace}, \emph{\faBug{}~Debug}, \emph{\faInfo{}~Info}, \emph{\faTriangleExclamation{}~Warning}, \emph{\faCircleXmark{}~Error}, or \emph{\faSkullCrossbones{}~Fatal}.
|
||||
\end{itemize}
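
The text filter semantics described in the list above can be modeled as in the following sketch. It is not the profiler's implementation. It assumes plain, case-sensitive substring matching and applies exclusions regardless of term order, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits "warn, -debug" into trimmed, non-empty terms: {"warn", "-debug"}.
static std::vector<std::string> SplitTerms( const std::string& filter )
{
    std::vector<std::string> terms;
    std::size_t pos = 0;
    while( pos <= filter.size() )
    {
        std::size_t comma = filter.find( ',', pos );
        if( comma == std::string::npos ) comma = filter.size();
        const std::string term = filter.substr( pos, comma - pos );
        const std::size_t first = term.find_first_not_of( " \t" );
        const std::size_t last = term.find_last_not_of( " \t" );
        if( first != std::string::npos ) terms.push_back( term.substr( first, last - first + 1 ) );
        pos = comma + 1;
    }
    return terms;
}

// Returns true if the message should stay visible under the given filter.
static bool PassesFilter( const std::string& message, const std::string& filter )
{
    bool hasInclude = false;       // at least one term without a leading minus
    bool matchedInclude = false;   // at least one such term was found in the message
    for( const auto& term : SplitTerms( filter ) )
    {
        if( term[0] == '-' )
        {
            // Exclusion term: hide the message if the rest of the term is found.
            if( term.size() > 1 && message.find( term.substr( 1 ) ) != std::string::npos ) return false;
        }
        else
        {
            hasInclude = true;
            if( message.find( term ) != std::string::npos ) matchedInclude = true;
        }
    }
    // With only exclusion terms (or an empty filter), everything not excluded passes.
    return !hasInclude || matchedInclude;
}
```

Under these assumptions, \texttt{PassesFilter( "warning: low memory", "warn, info" )} keeps the message, while \texttt{PassesFilter( "debug: tick", "-debug" )} hides it.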
|
||||
|
||||
\subsection{Statistics window}
|
||||
@@ -4111,9 +4215,9 @@ Data displayed in this mode is, in essence, very similar to the instrumentation
|
||||
|
||||
First and foremost, the presented information is constructed from many call stack samples, which represent real addresses in the application's binary code, mapped to the line numbers in the source files. This reverse mapping may not always be possible or could be erroneous. Furthermore, due to the nature of the sampling process, it is impossible to obtain exact time measurements. Instead, time values are guesstimated by multiplying the number of samples by the mean time between two samples.
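
The estimate boils down to a single multiplication, sketched below. For simplicity the mean interval is derived here from a nominal sampling frequency, which is an assumption of this example. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Estimated time attributed to a symbol: the number of samples multiplied by
// the mean interval between samples. Frequency in Hz, result in nanoseconds.
static std::int64_t EstimateSampledTimeNs( std::int64_t sampleCount, std::int64_t samplingFrequencyHz )
{
    assert( samplingFrequencyHz > 0 );
    const std::int64_t meanIntervalNs = 1000000000ll / samplingFrequencyHz;
    return sampleCount * meanIntervalNs;
}
```

For instance, at a sampling frequency of 8~kHz the mean interval is 125~\textmu{}s, so a symbol hit by 400 samples would be reported as taking roughly 50~ms, even though none of the individual samples measured a duration.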
|
||||
|
||||
The sample statistics list symbols, not functions. These terms are similar, but not exactly the same. A symbol always has a base function that gives it its name. In most cases, a symbol will also contain a number of inlined functions. In some cases, the same function may be inlined more than once within the same symbol.
|
||||
The sample statistics list symbols, not functions. These terms are similar but not exactly the same. A symbol always has a base function that gives it its name. In most cases, a symbol will also contain a number of inlined functions. In some cases, the same function may be inlined more than once within the same symbol. Inspecting the local call stacks displayed in tooltips will show the specific paths by which these inlines were called within the symbol. See section~\ref{assemblymode} for more detail.
|
||||
|
||||
The \emph{Name} column contains name of the symbol in which the sampling was done. Kernel-mode symbol samples are distinguished with the red color. Symbols containing inlined functions are listed with the number of inlined functions in parentheses and can be expanded to show all inlined functions (some functions may be hidden if the \emph{\faPuzzlePiece{}~Show all} option is disabled due to lack of sampling data). Clicking the \LMB{}~left mouse button on a function name will open a popup with options to select: you can either open the symbol view window (section~\ref{symbolview}), or the sample entry stacks window (see chapter~\ref{sampleparents})\footnote{Note that if inclusive times are displayed, listed functions will be partially or completely coming from mid-stack frames, preventing, or limiting the capability to display the data.}.
|
||||
The \emph{Name} column contains the name of the symbol in which the sampling was done. Kernel-mode symbol samples are shown in red. Symbols containing inlined functions are listed with the number of inlined functions in parentheses and can be expanded to show all inlined functions (some functions may be hidden if the \emph{\faPuzzlePiece{}~Show all} option is disabled due to lack of sampling data). Clicking the \LMB{}~left mouse button on a function name will open a popup with options to select: you can either open the symbol view window (section~\ref{symbolview}), or the sample entry stacks window (see chapter~\ref{stackwindows})\footnote{Note that if inclusive times are displayed, listed functions will be partially or completely coming from mid-stack frames, preventing or limiting the ability to display the data.}.
|
||||
|
||||
By default, each inlining of a function is listed separately. If you prefer to combine the measurements for functions that are inlined multiple times within a function, you can do so by enabling the \emph{\faLayerGroup{}~Aggregate} option. You cannot view sample entry stacks of inlined functions when this grouping method is enabled.
|
||||
|
||||
@@ -4429,6 +4533,8 @@ The default sorting order of the zones on a flame graph \emph{approximates} the
|
||||
|
||||
You can use an alternative sorting method by enabling the \emph{Sort by time} option. This will place the most time-consuming zones first (to the left) on the graph.
|
||||
|
||||
You can navigate the flame graph using the mouse. Use the mouse wheel to zoom in and out around the mouse pointer. Pressing the \keys{\ctrl} key makes zooming more precise, while pressing the \keys{\shift} key makes it faster. Dragging with the \RMB{}~right mouse button pans the graph horizontally and vertically. The bar below the time ruler shows the current horizontal view position and can be dragged to pan the graph. The \emph{Reset view} button resets the view to show the whole graph.
|
||||
|
||||
Similar to the statistics window (section~\ref{statistics}), the flame graph can operate in two modes: \emph{\faSyringe{}~Instrumentation} and \emph{\faEyeDropper{}~Sampling}. In the instrumentation mode, the graph represents the zones you put in your program. In the sampling mode, the graph is constructed from the automatically captured call stack data (section~\ref{sampling}).
|
||||
|
||||
In the sampling mode, external frames from system libraries are hidden by default. These typically include internal implementation details of starting threads, handling smart pointers, and other such things that are quick to execute and not really interesting. Enabling the \emph{\faShieldHalved{}~External} option will show these frames. One exception is \emph{external tails}, or calls that your code makes that do not eventually land in your application down the call chain. Think of functions that write to a file or send data on the network. These can be time-consuming, and you may want to see them. There is a separate option to disable these.
|
||||
@@ -4467,13 +4573,13 @@ This view may help assess the general memory behavior of the application or in d
|
||||
\subsubsection{Bottom-up call stack tree}
|
||||
\label{callstacktree}
|
||||
|
||||
The \emph{\faTree{}~Bottom-up call stack tree} pane is only available, if the memory events were collecting the call stack data (section~\ref{collectingcallstacks}). In this view, you are presented with a tree of memory allocations, starting at the call stack entry point and going up to the allocation's pinpointed place. Each tree level is sorted according to the number of bytes allocated in the given branch.
|
||||
The \emph{\faTree\faArrowUp{}~Bottom-up call stack tree} pane is only available if call stack data was collected for the memory events (section~\ref{collectingcallstacks}). In this view, you are presented with a tree of memory allocations, starting at the call stack entry point and going up to the allocation's pinpointed place. Each tree level is sorted according to the number of bytes allocated in the given branch.
|
||||
|
||||
Each tree node consists of the function name, the source file location, and the memory allocation data. The memory allocation data is either yellow \emph{inclusive} events count (allocations performed by children) or the cyan \emph{exclusive} events count (allocations that took place in the node)\footnote{Due to the way call stacks work, there is no possibility for an entry to have both inclusive and exclusive counts, in an adequately instrumented program.}. Two values are counted: total memory size and number of allocations.
|
||||
|
||||
The \emph{\faLayerGroup{}~Group by function name} option controls how tree nodes are grouped. If it is disabled, the grouping is performed at a machine instruction-level granularity. This may result in a very verbose output, but the displayed source locations are precise. To make the tree more readable, you may opt to perform grouping at the function name level, which will result in less valid source file locations, as multiple entries are collapsed into one.
|
||||
See chapter~\ref{stackwindows} for a description of the \emph{\faLayerGroup{}~Group by function name} option.
|
||||
|
||||
Enabling the \emph{Only active allocations} option will limit the call stack tree only to display active allocations. Enabling \emph{Only inactive allocations} option will have similar effect for inactive allocations. Both are mutually exclusive, enabling one disables the other. Displaing inactive allocations, when combined with \emph{Limit range}, will show short lived allocatios highlighting potentially unwanted behavior in the code.
|
||||
Enabling the \emph{Only active allocations} option will limit the call stack tree to active allocations only. Enabling the \emph{Only inactive allocations} option has a similar effect for inactive allocations. The two options are mutually exclusive: enabling one disables the other. Displaying inactive allocations, when combined with \emph{Limit range}, will show short-lived allocations, highlighting potentially unwanted behavior in the code.
|
||||
|
||||
Clicking the \RMB{}~right mouse button on the function name will open the allocations list window (see section \ref{alloclist}), which lists all the allocations included at the current call stack tree level. Likewise, clicking the \RMB{}~right mouse button on the source file location will open the source file view window (if applicable, see section~\ref{sourceview}).
|
||||
|
||||
@@ -4481,7 +4587,7 @@ Some function names may be too long to correctly display, with the events count
|
||||
|
||||
\subsubsection{Top-down call stack tree}
|
||||
|
||||
This pane is identical in functionality to the \emph{Bottom-up call stack tree}, but the call stack order is reversed when the tree is built. This means that the tree starts at the memory allocation functions and goes down to the call stack entry point.
|
||||
The \emph{\faTree\faArrowDown{}~Top-down call stack tree} pane is identical in functionality to the \emph{Bottom-up call stack tree}, but the call stack order is reversed when the tree is built. This means that the tree starts at the memory allocation functions and goes down to the call stack entry point.
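The relation between the two trees can be sketched with a small data structure. This is a simplified illustration under stated assumptions, not the profiler's code: the names \texttt{build\_tree} and \texttt{sorted\_children} are hypothetical, each call stack is assumed to be listed from the entry point to the allocation site, and a single byte counter per node stands in for the separate inclusive and exclusive counts.

```python
def build_tree(stacks, reverse=False):
    """Merge (frames, size) pairs into a nested dict keyed by function name.

    With reverse=False the tree starts at the call stack entry point;
    with reverse=True the frame order is flipped, so it starts at the
    allocation site instead.
    """
    root = {}
    for frames, size in stacks:
        ordered = list(reversed(frames)) if reverse else list(frames)
        level = root
        for name in ordered:
            node = level.setdefault(name, {"bytes": 0, "count": 0, "children": {}})
            node["bytes"] += size   # bytes allocated in this branch
            node["count"] += 1      # number of allocations in this branch
            level = node["children"]
    return root


def sorted_children(level):
    """Order one tree level by allocated bytes, largest first."""
    return sorted(level.items(), key=lambda kv: kv[1]["bytes"], reverse=True)


if __name__ == "__main__":
    # Hypothetical call stacks, entry point first, allocation site last.
    stacks = [
        (["main", "load", "alloc_buffer"], 64),
        (["main", "parse", "alloc_node"], 16),
        (["main", "load", "alloc_buffer"], 32),
    ]
    from_entry = build_tree(stacks)
    from_alloc = build_tree(stacks, reverse=True)
    print([name for name, _ in sorted_children(from_entry["main"]["children"])])
    print([name for name, _ in sorted_children(from_alloc)])
```

Both trees are built from the same call stacks; only the order in which frames are walked differs, which is why the two panes offer the same options.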
|
||||
|
||||
\subsubsection{Looking back at the memory history}
|
||||
|
||||
@@ -4500,7 +4606,11 @@ The information about the selected memory allocation is displayed in this window
|
||||
\subsection{Trace information window}
|
||||
\label{traceinfo}
|
||||
|
||||
This window contains information about the current trace: captured program name, time of the capture, profiler version which performed the capture, and a custom trace description, which you can fill in.
|
||||
This window contains information about the current trace: captured program name, time of the capture, profiler version which performed the capture.
|
||||
|
||||
There's a text entry field for an optional custom description of the trace for you to fill in. This description will appear on the profiler window title bar, or when comparing two traces (section~\ref{compare}), enabling you to quickly recognize what the trace contains. For some people it's fine to just have \emph{any} semi-unique description to be able to identify a specific trace. For such purposes there's an \emph{\faDice{}~Generate name} button, which will set the trace description to an abstract meaningless identifier.
|
||||
|
||||
If the \emph{\faUserGear{}~Public sidecar} option is selected, the file containing trace-specific user settings (see section~\ref{tracespecific}) will be saved on disk next to the trace file.
|
||||
|
||||
Open the \emph{Trace statistics} section to see information about the trace, such as achieved timer resolution, number of captured zones, lock events, plot data points, memory allocations, etc.
|
||||
|
||||
@@ -4523,7 +4633,7 @@ Let's say we have an Unix-based operating system with program sources in \texttt
|
||||
\end{itemize}
|
||||
\end{bclogo}
|
||||
|
||||
By default, all source file modification times need to be older than the cature time of the trace. This can be disabled using the \emph{Enforce source file modification time older than trace capture time} check box, i.e. when the source files are under source control and the file modification time is not relevant.
|
||||
By default, all source file modification times need to be older than the capture time of the trace. This check can be disabled using the \emph{Enforce source file modification time older than trace capture time} check box, e.g., when the source files are under source control and the file modification time is not relevant.
|
||||
|
||||
In this window, you can view the information about the machine on which the profiled application was running. This includes the operating system, used compiler, CPU name, total available RAM, etc. In addition, if application information was provided (see section~\ref{appinfo}), it will also be displayed here.
|
||||
|
||||
@@ -4537,7 +4647,7 @@ The zone information window displays detailed information about a single zone. T
|
||||
\begin{itemize}
|
||||
\item Basic source location information: function name, source file location, and the thread name.
|
||||
\item Timing information.
|
||||
\item If the profiler performed context switch capture (section~\ref{contextswitches}) and a thread was suspended during zone execution, a list of wait regions will be displayed, with complete information about the timing, CPU migrations, and wait reasons. If CPU topology data is available (section~\ref{cputopology}), the profiler will mark zone migrations across cores with 'C' and migrations across packages -- with 'P.' In some cases, context switch data might be incomplete\footnote{For example, when capture is ongoing and context switch information has not yet been received.}, in which case a warning message will be displayed.
|
||||
\item If the profiler performed context switch capture (section~\ref{contextswitches}) and a thread was suspended during zone execution, a list of wait regions will be displayed, with complete information about the timing, CPU migrations, and wait reasons. If CPU topology data is available (section~\ref{cputopology}), the profiler will mark zone migrations across cores with \enquote{C} and migrations across packages -- with \enquote{P.} In some cases, context switch data might be incomplete\footnote{For example, when capture is ongoing and context switch information has not yet been received.}, in which case a warning message will be displayed.
|
||||
\item Memory events list, both summarized and a list of individual allocation/free events (see section~\ref{memorywindow} for more information on the memory events list).
|
||||
\item List of messages that the profiler logged in the zone's scope. If the \emph{exclude children} option is disabled, messages emitted in child zones will also be included.
|
||||
\item Parent zones list, showing the hierarchy of parent zones that contain the current zone. Hovering the \faArrowPointer{}~mouse pointer over a parent zone will highlight it on the timeline view with a red outline. Clicking the \LMB{}~left mouse button on a zone will switch the zone info window to that zone. Clicking the \MMB{}~middle mouse button on a zone will zoom the timeline view to the zone's extent. Clicking the \RMB{}~right mouse button on a source file location will open the source file view window (if applicable, see section~\ref{sourceview}).
|
||||
@@ -4572,9 +4682,13 @@ Clicking on the \emph{\faClipboard{}~Copy to clipboard} buttons will copy the ap
|
||||
\subsection{Call stack window}
|
||||
\label{callstackwindow}
|
||||
|
||||
This window shows the frames contained in the selected call stack. Each frame is described by a function name, source file location, and originating image\footnote{Executable images are called \emph{modules} by Microsoft.} name. Function frames originating from the kernel are marked with a red color. Clicking the \LMB{}~left mouse button on either the function name of source file location will copy the name to the clipboard. Clicking the \RMB{}~right mouse button on the source file location will open the source file view window (if applicable, see section~\ref{sourceview}).
|
||||
This window shows the frames contained in the selected call stack. Information about the originating thread is included. Each frame is described by a function name, source file location, and originating image\footnote{Executable images are called \emph{modules} by Microsoft.} name. Function frames originating from the kernel are marked with a red color. Clicking the \LMB{}~left mouse button on either the function name or source file location will copy the name to the clipboard. Clicking the \RMB{}~right mouse button on the source file location will open the source file view window (if applicable, see section~\ref{sourceview}).
|
||||
|
||||
A single stack frame may have multiple function call places associated with it. This happens in the case of inlined function calls. Such entries will be displayed in the call stack window, with \emph{inline} in place of frame number\footnote{Or '\faCaretRight{}'~icon in case of call stack tooltips.}.
|
||||
A single stack frame may have multiple function call places associated with it. This happens in the case of inlined function calls. Such entries will be displayed in the call stack window, with \emph{inline} in place of frame number\footnote{Or \enquote{\faCaretRight{}}~icon in case of call stack tooltips.}.
|
||||
|
||||
If the call stack shows a crash (see section~\ref{crashhandling}), a red \emph{\faSkull{}~Crash} label will be displayed. Clicking it will center the timeline on the crash. Note that the crash stack may contain OS or Tracy frames where the crash was intercepted and processed.
|
||||
|
||||
If the call stack shows a wait stack (see section~\ref{waitstacks}), a blue \emph{\faHourglassHalf{}~Wait stack} label will be displayed. Hovering the \faArrowPointer{}~mouse pointer over it will show a tooltip with the time spent waiting in the stack, the wait reason, and the status.
|
||||
|
||||
Stack frame location may be displayed in one of the following ways, depending on the \emph{Frame~at} option selection:
|
||||
|
||||
@@ -4585,13 +4699,13 @@ Stack frame location may be displayed in the following number of ways, depending
|
||||
\item \emph{Symbol address} -- displays begin address of the function containing the frame address.
|
||||
\end{itemize}
|
||||
|
||||
In some cases, it may not be possible to decode stack frame addresses correctly. Such frames will be presented with a dimmed '\texttt{[ntdll.dll]}' name of the image containing the frame address, or simply '\texttt{[unknown]}' if the profiler cannot retrieve even this information. Additionally, '\texttt{[kernel]}' is used to indicate unknown stack frames within the operating system's internal routines.
|
||||
In some cases, it may not be possible to decode stack frame addresses correctly. Such frames will be presented with a dimmed \enquote{\texttt{[ntdll.dll]}} name of the image containing the frame address, or simply \enquote{\texttt{[unknown]}} if the profiler cannot retrieve even this information. Additionally, \enquote{\texttt{[kernel]}} is used to indicate unknown stack frames within the operating system's internal routines.
|
||||
|
||||
External frames from system libraries are hidden by default. Enabling the \emph{\faShieldHalved{}~External} option will show these frames, which can be useful for debugging issues in external code. When external frames are displayed, they are dimmed out.
|
||||
|
||||
The \emph{\faScissors{}~Short images} option shortens the displayed executable image name to only the file name. The full path is available in the tooltip.
|
||||
|
||||
If the displayed call stack is a sampled call stack (chapter~\ref{sampling}), an additional button will be available, \emph{\faDoorOpen{}~Entry stacks}. Clicking it will open the sample entry stacks window (chapter~\ref{sampleparents}) for the current call stack.
|
||||
If the displayed call stack is a sampled call stack (chapter~\ref{sampling}), an additional button may be available, \emph{\faDoorOpen{}~Entry stacks}. Clicking it will open the sample entry stacks window (chapter~\ref{stackwindows}) for the current call stack.
|
||||
|
||||
Clicking on the \emph{\faClipboard{}~Copy to clipboard} button will copy call stack to the clipboard.
|
||||
|
||||
@@ -4627,13 +4741,6 @@ At the first glance it may look like \texttt{unique\_ptr::reset} was the \emph{c
|
||||
|
||||
Moreover, the linker may determine in some rare cases that any two functions in your program are identical\footnote{For example, if all they do is zero-initialize a region of memory. As some constructors would do.}. As a result, only one copy of the binary code will be provided in the executable for both functions to share. While this optimization produces more compact programs, it also means that there's no way to distinguish the two functions apart in the resulting machine code. In effect, some call stacks may look nonsensical until you perform a small investigation.
|
||||
|
||||
\subsection{Sample entry stacks window}
|
||||
\label{sampleparents}
|
||||
|
||||
This window displays statistical information about the selected symbol. All sampled call stacks (chapter~\ref{sampling}) leading to the symbol are counted and displayed in descending order. You can choose the displayed call stack using the \emph{entry call stack} controls, which also display time spent in the selected call stack. Alternatively, sample counts may be shown by disabling the \emph{\faStopwatch{}~Show time} option, which is described in more detail in chapter~\ref{statisticssampling}.
|
||||
|
||||
The layout of frame list and the \emph{\faAt{}~Frame location} option selection is similar to the call stack window, described in chapter~\ref{callstackwindow}.
|
||||
|
||||
\subsection{Source view window}
|
||||
\label{sourceview}
|
||||
|
||||
@@ -4666,7 +4773,7 @@ Nevertheless, \textbf{the displayed source files might still not reflect the cod
|
||||
|
||||
A much more capable symbol view mode is available if the inspected source location has an associated symbol context (i.e., if it comes from a call stack capture, from call stack sampling, etc.). A symbol is a unit of machine code, basically a callable function. It may be generated using multiple source files and may consist of numerous inlined functions. A list of all captured symbols is available in the statistics window, as described in chapter~\ref{statisticssampling}.
|
||||
|
||||
The header of symbol view window contains a name of the selected \emph{\faPuzzlePiece{}~symbol}, a list of \emph{\faSitemap{}~functions} that contribute to the symbol, and information such as count of probed \emph{\faEyeDropper{}~Samples}. The entry stacks (section~\ref{sampleparents}) of the symbol can be viewed by clicking on the \emph{Entry stacks} button.
|
||||
The header of symbol view window contains a name of the selected \emph{\faPuzzlePiece{}~symbol}, a list of \emph{\faSitemap{}~functions} that contribute to the symbol, and information such as count of probed \emph{\faEyeDropper{}~Samples}. The entry stacks (section~\ref{stackwindows}) of the symbol can be viewed by clicking on the \emph{Entry stacks} button.
|
||||
|
||||
Additionally, you may use the \emph{Mode} selector to decide what content should be displayed in the panels below:
|
||||
|
||||
@@ -4680,11 +4787,12 @@ Some modes may be unavailable in some circumstances (missing or outdated source
|
||||
|
||||
\paragraph{Source mode}
|
||||
|
||||
This is pretty much the source file view window, but with the ability to select one of the source files that the compiler used to build the symbol. Additionally, each source file line that produced machine code in the symbol will show a count of associated assembly instructions, displayed with an '\texttt{@}' prefix, and will be marked with grey color on the scroll bar. Due to how optimizing compilers work, some lines may seemingly not produce any machine code, for example, because iterating a loop counter index might have been reduced to advancing a data pointer. Some other lines may have a disproportionate amount of associated instructions, e.g., when the compiler applied a loop unrolling optimization. This varies from case to case and from compiler to compiler.
|
||||
This is pretty much the source file view window, but with the ability to select one of the source files that the compiler used to build the symbol. Additionally, each source file line that produced machine code in the symbol will show a count of associated assembly instructions, displayed with an \enquote{\texttt{@}} prefix, and will be marked with grey color on the scroll bar. Due to how optimizing compilers work, some lines may seemingly not produce any machine code, for example, because iterating a loop counter index might have been reduced to advancing a data pointer. Some other lines may have a disproportionate amount of associated instructions, e.g., when the compiler applied a loop unrolling optimization. This varies from case to case and from compiler to compiler.
|
||||
|
||||
The \emph{Propagate inlines} option, available when sample data is present, will enable propagation of the instruction costs down the local call stack. For example, suppose a base function in the symbol issues a call to an inlined function (which may not be readily visible due to being contained in another source file). In that case, any cost attributed to the inlined function will be visible in the base function. Because the cost information is added to all the entries in the local call stacks, it is possible to see seemingly nonsense total cost values when this feature is enabled. To quickly toggle this on or off, you may also press the \keys{X} key.
|
||||
|
||||
\paragraph{Assembly mode}
|
||||
\label{assemblymode}
|
||||
|
||||
This mode shows the disassembly of the symbol machine code. If only one inline function is selected through the \emph{\faSitemap{}~Function} selector, assembly instructions outside of this function will be dimmed out. Each assembly instruction is displayed listed with its location in the program memory during execution. If the \emph{\faMagnifyingGlassLocation{}~Relative address} option is selected, the profiler will print an offset from the symbol beginning instead. Clicking the \LMB{}~left mouse button on the address/offset will switch to counting line numbers, using the selected one as the origin (i.e., zero value). Line numbers are displayed inside \texttt{[]} brackets. This display mode can be useful to correlate lines with the output of external tools, such as \texttt{llvm-mca}. To disable line numbering click the \RMB{}~right mouse button on a line number.
|
||||
|
||||
@@ -4697,7 +4805,7 @@ logo=\bclampe
|
||||
]{Local call stack}
|
||||
In some cases, it may be challenging to understand what is being displayed in the disassembly. For example, calling the \texttt{std::lower\_bound} function may generate multiple levels of inlined functions: first, we enter the search algorithm, then the comparison functions, which in turn may be lambdas that call even more external code, and so on. In such an event, you will most likely see that some external code is taking a long time to execute, and you will be none the wiser on improving things.
|
||||
|
||||
The local call stack for an assembly instruction represents all the inline function calls \emph{within the symbol} (hence the 'local' part), which were made to reach the instruction. Deeper inspection of the local call stack, including navigation to the source call site of each participating inline function, can be performed through the context menu accessible by pressing the \RMB{}~right mouse button on the source location.
|
||||
The local call stack for an assembly instruction represents all the inline function calls \emph{within the symbol} (hence the \enquote{local} part), which were made to reach the instruction. Deeper inspection of the local call stack, including navigation to the source call site of each participating inline function, can be performed through the context menu accessible by pressing the \RMB{}~right mouse button on the source location.
|
||||
\end{bclogo}
|
||||
|
||||
Selecting the \emph{\faGears{}~Raw code} option will enable the display of raw machine code bytes for each line. Individual bytes are displayed with interwoven colors to make reading easier.
|
||||
@@ -4748,13 +4856,15 @@ In this mode, the source and assembly panes will be displayed together, providin
|
||||
|
||||
\paragraph{Instruction pointer cost statistics}
|
||||
|
||||
If automated call stack sampling (see chapter~\ref{sampling}) was performed, additional profiling information will be available. The first column of source and assembly views will contain percentage counts of collected instruction pointer samples for each displayed line, both in numerical and graphical bar form. You can use this information to determine which function line takes the most time. The displayed percentage values are heat map color-coded, with the lowest values mapped to dark red and the highest to bright yellow. The color code will appear next to the percentage value and on the scroll bar so that you can identify 'hot' places in the code at a glance.
|
||||
If automated call stack sampling (see chapter~\ref{sampling}) was performed, additional profiling information will be available. The first column of source and assembly views will contain percentage counts of collected instruction pointer samples for each displayed line, both in numerical and graphical bar form. You can use this information to determine which function line takes the most time. The displayed percentage values are heat map color-coded, with the lowest values mapped to dark red and the highest to bright yellow. The color code will appear next to the percentage value and on the scroll bar so that you can identify \enquote{hot} places in the code at a glance.
|
||||
|
||||
By default, samples are displayed only within the selected symbol, in isolation. In some cases, you may, however, want to include samples from functions that the selected symbol called. To do so, enable the \emph{\faRightFromBracket{}~Child calls} option, which you may also temporarily toggle by holding the \keys{Z} key. You can also click the~\faCaretDown{}~drop down control to display a child call distribution list, which shows each known function\footnote{You should remember that these are results of random sampling. Some function calls may be missing here.} that the symbol called. Make sure to familiarize yourself with section~\ref{readingcallstacks} to be able to read the results correctly.
|
||||
By default, samples are displayed only within the selected symbol, in isolation. In some cases, you may, however, want to include samples from functions that the selected symbol called. To do so, enable the \emph{\faRightFromBracket{}~Child calls} option, which you may also temporarily toggle by holding the \keys{Z} key. You can also click the~\faCaretDown{}~drop down control to display a child call distribution list\footnote{The height of the list can be changed by dragging the separator bar.}, which shows each known function\footnote{You should remember that these are results of random sampling. Some function calls may be missing here.} that the symbol called. Make sure to familiarize yourself with section~\ref{readingcallstacks} to be able to read the results correctly. Each child call on the list has an attributed time cost, which is also displayed as a percentage of the child calls (\enquote{\%~Calls}) and the percentage of the total symbol time (\enquote{\%~Total}).
|
||||
|
||||
The total number of collected samples is displayed in the UI under the~\emph{\faEyeDropper~Samples} label and converted to a time approximation at the~\emph{\faStopwatch~Time} label. The displayed values show the local count if child calls are disabled and the total count if the option is enabled. In either case, the number of samples attributed only to the child calls is displayed in parentheses with the + or - symbol and as a percentage of the total symbol time.
|
||||
|
||||
Instruction timings can be viewed as a group. To begin constructing such a group, click the \LMB{}~left mouse button on the percentage value. Additional instructions can be added using the \keys{\ctrl}~key while holding the \keys{\shift}~key will allow selection of a range. To cancel the selection, click the \RMB{}~right mouse button on a percentage value. Group statistics can be seen at the bottom of the pane.
|
||||
|
||||
Clicking the \MMB{}~middle mouse button on the percentage value of an assembly instruction will display entry call stacks of the selected sample (see chapter~\ref{sampleparents}). This functionality is only available for instructions that have collected sampling data and only in the assembly view, as the source code may be inlined multiple times, which would result in ambiguous location data. Note that number of entry call stacks is displayed in a tooltip for a quick reference.
|
||||
Clicking the \MMB{}~middle mouse button on the percentage value of an assembly instruction will display entry call stacks of the selected sample (see chapter~\ref{stackwindows}). This functionality is only available for instructions that have collected sampling data and only in the assembly view, as the source code may be inlined multiple times, which would result in ambiguous location data. Note that number of entry call stacks is displayed in a tooltip for a quick reference.
|
||||
|
||||
The sample data source is controlled by the \emph{\faSitemap{}~Function} control in the window header. If this option is disabled, sample data will represent the whole symbol. If it is enabled, then the sample data will only include the selected function. You can change the currently selected function by opening the drop-down box, which includes time statistics. The time percentage values of each contributing function are calculated relative to the total number of samples collected within the symbol.
|
||||
|
||||
@@ -4792,18 +4902,25 @@ logo=\bcattention
|
||||
When the \emph{\faCarBurst{}~Impact} option is not selected, the percentage values do not take into account the relative count of events. For example, you may see a 100\% cache miss rate when some instruction missed 10 out of 10 cache accesses. While not ideal, this is less important than an instruction with a seemingly better 50\% cache miss rate that actually missed 1000 out of 2000 accesses. Therefore, you should always cross-check the presented information with the respective event counts. To help with this, Tracy will dim statistically unimportant values.
|
||||
\end{bclogo}
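To make the caveat above concrete, the following Python sketch compares the per-instruction miss rate with the absolute miss count for the two instructions from the example. The \texttt{EventCount} record and its field names are hypothetical and are not part of Tracy; the sketch only illustrates why a rate should be cross-checked against the event count.

```python
from dataclasses import dataclass


@dataclass
class EventCount:
    """Hypothetical per-instruction counter pair (not a Tracy data structure)."""
    name: str
    misses: int
    accesses: int

    @property
    def miss_rate(self) -> float:
        # Rate relative to this instruction's own accesses only.
        return self.misses / self.accesses


def rank_by_misses(counts):
    """Order instructions by absolute miss count, highest first."""
    return sorted(counts, key=lambda c: c.misses, reverse=True)


# The two instructions from the example in the text.
rare = EventCount("rare_instruction", misses=10, accesses=10)
hot = EventCount("hot_instruction", misses=1000, accesses=2000)

for c in rank_by_misses([rare, hot]):
    print(f"{c.name}: {c.miss_rate:.0%} miss rate, {c.misses} misses")
```

Ranking by the absolute count puts the 50\% instruction first, which matches the reasoning in the note above.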
|
||||
|
||||
\subsection{Wait stacks window}
|
||||
\label{waitstackswindow}
|
||||
\subsection{Stacks windows}
|
||||
\label{stackwindows}
|
||||
|
||||
If wait stack information has been captured (chapter~\ref{waitstacks}), here you will be able to inspect the collected data. There are three different views available:
|
||||
The profiler can group call stacks leading to certain events and display the resulting information in a variety of ways. In essence, this shows the code paths that lead to these events and the distribution of these paths. At this moment, these events include:
|
||||
|
||||
\begin{itemize}
|
||||
\item \emph{\faTable{}~List} -- shows all unique wait stacks, sorted by the number of times they were observed.
|
||||
\item \emph{\faTree{}~Bottom-up tree} -- displays wait stacks in the form of a collapsible tree, which starts at the bottom of the call stack.
|
||||
\item \emph{\faTree{}~Top-down tree} -- displays wait stacks in the form of a collapsible tree, which starts at the top of the call stack.
|
||||
\item \textbf{Sample entry stacks} -- this window shows all the paths that lead to execution of the selected symbol. Requires sampling (chapter~\ref{sampling}) to be active.
|
||||
\item \textbf{Wait stacks} -- this window shows all the places where the application was sleeping. See chapter~\ref{waitstacks} for more information.
|
||||
\end{itemize}
|
||||
|
||||
Displayed data may be narrowed down to a specific time range or to include only selected threads.
|
||||
The call stack paths may be displayed in the following ways:
|
||||
|
||||
\begin{itemize}
|
||||
\item \emph{\faTable{}~List} -- shows all unique stacks, sorted by the number of times they were observed. The frame list is similar to the call stack window, described in chapter~\ref{callstackwindow}.
|
||||
\item \emph{\faTree\faArrowUp{}~Bottom-up tree} -- displays stacks in the form of a collapsible tree, which starts at the bottom of the call stack.
|
||||
\item \emph{\faTree\faArrowDown{}~Top-down tree} -- displays stacks in the form of a collapsible tree, which starts at the top of the call stack.
|
||||
\end{itemize}
|
||||
|
||||
The \emph{\faLayerGroup{}~Group by function name} option controls how tree nodes are grouped. If it is disabled, the grouping is performed at machine-instruction-level granularity. This may result in very verbose output, but the displayed source locations are precise. To make the tree more readable, you may opt to group at the function-name level, which will result in fewer valid source file locations, as multiple entries are collapsed into one. The number of aggregated entries is displayed next to function names.
|
||||
|
||||
\subsection{Lock information window}
|
||||
\label{lockwindow}
|
||||
@@ -4833,7 +4950,7 @@ The profiled program is highlighted using green color. Furthermore, the yellow h
|
||||
\subsection{Annotation settings window}
|
||||
\label{annotationsettings}
|
||||
|
||||
In this window, you may modify how a timeline annotation (section~\ref{annotatingtrace}) is presented by setting its text description or selecting region highlight color. If the note is no longer needed, you may also remove it here.
|
||||
In this window, you may modify how a timeline annotation (section~\ref{annotatingtrace}) is presented by setting its text description or selecting the region highlight color. A random annotation description can be set with the \emph{\faDice{}~Generate name} button. If the note is no longer needed, you may also remove it here.
|
||||
|
||||
\subsection{Annotation list window}
|
||||
\label{annotationlist}
|
||||
@@ -4865,7 +4982,7 @@ A new view-sized annotation can be added in this window by pressing the \emph{\f
|
||||
\subsection{Time range limits}
|
||||
\label{timerangelimits}
|
||||
|
||||
This window displays information about time range limits (section~\ref{timeranges}) for find zone (section~\ref{findzone}), statistics (section~\ref{statistics}), flame graph (section~\ref{flamegraph}), memory (section~\ref{memorywindow}) and wait stacks (section~\ref{waitstackswindow}) results. Each limit can be enabled or disabled and adjusted through the following options:
|
||||
This window displays information about time range limits (section~\ref{timeranges}) for find zone (section~\ref{findzone}), statistics (section~\ref{statistics}), flame graph (section~\ref{flamegraph}), memory (section~\ref{memorywindow}) and wait stacks (section~\ref{stackwindows}) results. Each limit can be enabled or disabled and adjusted through the following options:
|
||||
|
||||
\begin{itemize}
|
||||
\item \emph{Limit to view} -- Set the time range limit to the current view.
|
||||
@@ -4880,12 +4997,12 @@ This window displays information about time range limits (section~\ref{timerange
|
||||
|
||||
Note that ranges displayed in the window have color hints that match the color of the striped regions on the timeline.
|
||||
|
||||
\subsection{Tracy Assist}
|
||||
\section{Tracy Assist}
|
||||
\label{tracyassist}
|
||||
|
||||
With Tracy Profiler, you can use GenAI features to get help using the profiler or analyzing the code you're profiling.
|
||||
|
||||
The automated assistant can search the user manual to answer your questions about the profiler. It can also read the source code when you ask about program performance or algorithms. It has the capacity for access to Wikipedia, the ability to search the web, and the capability to access web pages in response to general questions.
|
||||
The automated assistant can search the user manual to answer your questions about the profiler. It can also read the source code or analyze captured profile data when you ask about program performance or algorithms. In response to general questions, it can access Wikipedia, search the web, and retrieve web pages.
|
||||
|
||||
This feature can be completely disabled in the \emph{Global settings}, as described in section~\ref{aboutwindow}.
|
||||
|
||||
@@ -4907,7 +5024,43 @@ You do not. Tracy is not a money funnel for Silicon Valley tech bros to get rich
|
||||
The only way to access the assistant is to run everything locally on your system. This ensures that everything you do stays private and that you won't be subject to forced changes in features or terms and conditions. You should own the tools you work with instead of renting them from someone else.
|
||||
\end{bclogo}
|
||||
|
||||
\subsubsection{Service provider}
|
||||
If you just want to get things running and have reasonably powerful hardware, follow the steps below.
|
||||
|
||||
\begin{enumerate}
|
||||
\item Go to \url{https://llama.app/} and follow instructions to install llama.cpp.
|
||||
\item Create the following \texttt{llama.ini} configuration file:
|
||||
\begin{lstlisting}
|
||||
; Launch with: llama-server --models-preset llama.ini
|
||||
|
||||
[*]
|
||||
version = 1
|
||||
cache-type-k = q8_0
|
||||
cache-type-v = q8_0
|
||||
|
||||
[unsloth/Qwen3.6-35B-A3B-MTP-GGUF:UD-Q4_K_M]
|
||||
hf = unsloth/Qwen3.6-35B-A3B-MTP-GGUF:UD-Q4_K_M
|
||||
parallel = 1
|
||||
spec-default = true
|
||||
spec-type = draft-mtp
|
||||
chat-template-kwargs = {"preserve_thinking": true}
|
||||
ctx-size = 100000
|
||||
|
||||
[nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M]
|
||||
hf = nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M
|
||||
embedding = true
|
||||
parallel = 4
|
||||
ctx-size = 8192
|
||||
cache-ram = 0
|
||||
\end{lstlisting}
|
||||
\item Run \texttt{llama serve -{}-models-preset llama.ini}
|
||||
\item Connect to llama.cpp inside the profiler!
|
||||
\end{enumerate}
|
||||
|
||||
The models will be automatically downloaded when trying to access them for the first time. It may take some time.
|
||||
|
||||
If you have the resources available, you may try replacing \texttt{unsloth/Qwen3.6-35B-A3B-MTP-GGUF:UD-Q4\_K\_M} with \texttt{unsloth/Qwen3.6-27B-MTP-GGUF:Q4\_K\_M} in the configuration file to get a more capable model.
|
||||
|
||||
\subsection{Service provider}
|
||||
|
||||
To get started, you will need to install an LLM\footnote{Large Language Model.} provider on your system. Any service that's compatible with the standard API should work, but some may work better than others. The LLM field is advancing quickly, with new models frequently being released that often require specific support from provider services to deliver the best experience.
|
||||
|
||||
@@ -4916,31 +5069,29 @@ The ideal LLM provider should be a system service that loads and unloads models
|
||||
There are no ideal LLM providers, but here are some options:
|
||||
|
||||
\begin{itemize}
|
||||
\item \emph{llama.cpp} (\url{https://github.com/ggml-org/llama.cpp}) -- Recommended as the easiest to use. Clone from git and build it yourself. By default it fits the model automatically to available memory. It is rapidly advancing with new features and model support. Most other providers use it to do the actual work, and they typically use an outdated release.
|
||||
\item \emph{llama.cpp} (\url{https://github.com/ggml-org/llama.cpp}) -- Recommended as the easiest to use. Clone from git and build it yourself. By default it fits the model automatically to available memory. It is rapidly advancing with new features and model support. Most other providers use it to do the actual work, and they typically use an outdated release. The \url{https://llama.app/} site may provide an easy way to install llama.cpp.
|
||||
\item \emph{llama-swap} (\url{https://github.com/mostlygeek/llama-swap}) -- Wrapper for llama.cpp that allows model selection. Recommended to augment the above.
|
||||
\item \emph{LM Studio} (\url{https://lmstudio.ai/}) -- It is easy to install on all platforms and has a GUI. But it is overwhelming when it comes to the number of options it offers. Some people may question the licensing. Its features lag behind llama.cpp. Manual configuration of each model is required. To get it to work properly, go to its settings (using the gear icon in the bottom right corner of the program window), then select the Developer tab and enable "When applicable, separate \texttt{reasoning\_content} and \texttt{content} in API responses".
|
||||
\item \emph{LM Studio} (\url{https://lmstudio.ai/}) -- It is easy to install on all platforms and has a GUI. But it is overwhelming when it comes to the number of options it offers. Some people may question the licensing. Its features lag behind llama.cpp. Manual configuration of each model is required. To get it to work properly, go to its settings (using the gear icon in the bottom right corner of the program window), then select the Developer tab and enable \enquote{When applicable, separate \texttt{reasoning\_content} and \texttt{content} in API responses}.
|
||||
\end{itemize}
|
||||
|
||||
\subsubsection{Model selection}
|
||||
\subsection{Model selection}
|
||||
|
||||
Once you have installed the service provider, you will need to download the model files. The exact process depends on the provider you chose. LM Studio, for example, has a built-in downloader with an easy-to-use UI. For llama.cpp, you can follow their documentation or download the model file via your web browser. Tracy will not issue commands to download any model on its own.
|
||||
Once you have installed the service provider, you will need to download the model files. The exact process depends on the provider you chose. LM Studio, for example, has a built-in downloader with an easy-to-use UI. For llama.cpp, you can follow their documentation (e.g., the \texttt{-hf} parameter) or download the model file via your web browser. Tracy will not issue commands to download any model on its own.
|
||||
|
||||
There are three different model types that Tracy expects to have available. Ideally all three models would be loaded and ready to go at the same time.
|
||||
|
||||
\paragraph{Chat model}
|
||||
\subsubsection{Chat model}
|
||||
|
||||
This is the model used for conversation purposes. You should strive to maximize its capabilities and context size. This model should support reasoning and tool usage.
|
||||
|
||||
A good starting point that will work fairly well on almost any hardware is \textbf{Qwen3 4B Thinking 2507}.
|
||||
A good \emph{starting} point that will work fairly well on almost any hardware is the \textbf{most recent} 4B model from the \textbf{Qwen} family. For real use, you will want to choose a larger model that fits your hardware, though.
|
||||
|
||||
\begin{bclogo}[
|
||||
noborder=true,
|
||||
couleur=black!5,
|
||||
logo=\bclampe
|
||||
]{Model quantization}
|
||||
Running a model with full 32-bit floating-point weights is not feasible due to memory requirements. Instead, the model parameters are quantized, for which 4 bits is typically the sweet spot. In general, the lower the parameter precision, the more "dumbed down" the model becomes. However, the loss of model coherence due to quantization is less than the benefit of being able to run a larger model.
|
||||
|
||||
There are different ways of doing quantization that give the same bit size. It's best to follow the recommendations provided by LM Studio, for example.
|
||||
Running a model with full 32-bit floating-point weights is not feasible due to memory requirements. Instead, the model parameters are quantized, for which 4 bits is typically the sweet spot. In general, the lower the parameter precision, the more \enquote{dumbed down} the model becomes. However, the loss of model coherence due to quantization is less than the benefit of being able to run a larger model.
|
||||
\end{bclogo}
|
||||
|
||||
\begin{bclogo}[
|
||||
@@ -4948,7 +5099,9 @@ noborder=true,
|
||||
couleur=black!5,
|
||||
logo=\bclampe
|
||||
]{Model size}
|
||||
Another thing to consider when selecting a model is its size, which is typically measured in billions of parameters (weights) and written as 4B, for example. The model size determines how much memory, computation, and time are required to run it. Generally, the larger the model, the "smarter" its responses will be.
|
||||
Another thing to consider when selecting a model is its size, which is typically measured in billions of parameters (weights) and written as 4B, for example. The model size determines how much memory, computation, and time are required to run it. Generally, the larger the model, the \enquote{smarter} its responses will be.
|
||||
|
||||
Most modern models will be \enquote{Mixture of Experts}, or \enquote{MoE}, and their size will be denoted, for example, 35B-A3B. This means that the model size is 35B, but only 3B parameters are active and used to compute the next token. In practice, this means that the model has knowledge closer to the full, dense 35B model but speed and GPU memory requirements closer to the fast 3B model.
|
||||
\end{bclogo}
|
||||
|
||||
\begin{bclogo}[
|
||||
@@ -4956,42 +5109,26 @@ noborder=true,
|
||||
couleur=black!5,
|
||||
logo=\bclampe
|
||||
]{Context size}
|
||||
The model size only indicates the minimum memory requirement. For the model to operate properly, you also need to set the context size, which determines how much information from the conversation the model can "remember". This size is measured in tokens, and a very rough approximation is that each token is a combination of three or four letters.
|
||||
The model size only indicates the minimum memory requirement. For the model to operate properly, you also need to set the context size, which determines how much information from the conversation the model can \enquote{remember}. This size is measured in tokens, and a very rough approximation is that each token is a combination of three or four letters.
|
||||
|
||||
Each token present in the context window may require a fairly large amount of memory, and that can quickly add up to gigabytes. Some modern models use solutions that greatly reduce context memory requirements, but that varies from model to model. If needed, the KV cache used for context can be quantized, just like model parameters. In this case, the recommended size per weight is 8 bits.
|
||||
|
||||
The bare minimum required context size for Tracy to run the assistant is 8K, but don't expect things to run smoothly. Using 16K provides more room to operate, but it's still tight. To get things working well you should not go less than 32K or 64K for the context size.
|
||||
The realistic minimum required context size for Tracy to run the assistant is 100K tokens, but feel free to experiment.
|
||||
\end{bclogo}
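As a rough illustration of the token approximation above, this Python sketch estimates how many tokens a piece of text might occupy, assuming three to four characters per token. The function names and the sample input are illustrative only; real tokenizers vary by model, so treat the result as an order-of-magnitude guide rather than a measurement.

```python
def estimate_tokens(text: str) -> tuple[int, int]:
    """Return a (low, high) token estimate, assuming 3 to 4 characters per token."""
    n = len(text)
    low = -(-n // 4)   # ceil(n / 4): four characters per token
    high = -(-n // 3)  # ceil(n / 3): three characters per token
    return low, high


def fits_in_context(text: str, ctx_size: int) -> bool:
    """Conservative check: compare the high estimate against the context size."""
    return estimate_tokens(text)[1] <= ctx_size


sample = "x" * 120_000  # stand-in for 120,000 characters of source code
print(estimate_tokens(sample))
print(fits_in_context(sample, 100_000))
```

The check deliberately uses the higher estimate, since running out of context degrades responses more than leaving some of it unused.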
|
||||
|
||||
\paragraph{Fast model}
|
||||
\subsubsection{Fast model}
|
||||
|
||||
Sometimes Tracy needs to do some language processing where speed is more important than the smarts. For this kind of model, choose a small amount of parameters (that still work well), and no reasoning (also referred to as "thinking").
|
||||
Sometimes Tracy needs to do some language processing where speed is more important than the smarts. The default setting is to use the chat model with the reasoning disabled, which is fine for most applications.
|
||||
|
||||
A good starting point here is \textbf{Qwen3 4B Instruct 2507}. Using a 16K context should be enough for most applications.
|
||||
It may be more convenient to use a small, quick model instead, in which case enable the \emph{Fast model} checkbox and choose the second model. To save precious GPU resources for the chat model, you may want to keep this model entirely in system RAM (set \texttt{-ngl 0} for llama.cpp or set \enquote{GPU offload} to 0 in LM Studio) and disable the KV cache offload to GPU (set \texttt{-nkvo} for llama.cpp or disable \enquote{Offload KV Cache to GPU Memory} in LM Studio).
|
||||
|
||||
To save the precious GPU resources for the chat model, you may want to keep this model entirely in system RAM (set \texttt{-ngl 0} for llama.cpp, or set "GPU offload" to 0 in LM Studio) and disable the KV cache offload to GPU (set \texttt{-nkvo} for llama.cpp, or disable "Offload KV Cache to GPU Memory" in LM Studio). The slowdown is not significant.
|
||||
|
||||
\paragraph{Embedding model}
|
||||
\subsubsection{Embedding model}
|
||||
|
||||
This is a small model used for semantic search in the user manual. This should be \textbf{nomic-embed-text-1.5}, which is provided by default by LM Studio, or which you can download on your own for llama.cpp.
|
||||
|
||||
LM Studio properly labels the model's capabilities. This is not the case with the llama.cpp/llama-swap setup. To make it work, your embedding model's name must contain the word \texttt{embed}.
|
||||
|
||||
\paragraph{Hardware resources}
|
||||
|
||||
Ideally, you want to keep both the model and the context cache in your GPU's VRAM. This will provide the fastest possible speed. However, this won't be possible in many configurations.
|
||||
|
||||
LLM providers solve this problem by storing part of the model on the GPU and running the rest on the CPU. The more you can run on the GPU, the faster it goes.
|
||||
|
||||
If you use llama.cpp, it will automatically fit the model into the available memory. A short report will be displayed when the program is started, with information about memory use. If there's a deficit, the model will still run, but at a severely reduced speed. Use a smaller context or quantization in that case. If there's a memory surplus, it will be used to make the model run faster.
|
||||
|
||||
Older versions of llama.cpp, typically still provided by the GUI wrappers, require determining how much of the model can be run on the GPU by experimentation. Other programs running on the system may affect or be affected by this setting. Generally, GPU offload capability is measured by the number of neural network layers.
|
||||
|
||||
Another option is to disable KV cache offload to GPU, as was already mentioned earlier. The KV cache is a configurable parameter that typically requires a lot of memory, and it may be better to keep in the system RAM than in limited VRAM.
|
||||
|
||||
Yet another option is to use a "Mixture of Experts" model, where the active portion of the model is small compared to its overall size. For example, you may see notation such as 30B-A3B. This means that the model size is 30B, but only 3B are actively used in computations. You can use the \texttt{-{}-cpu-moe} option in llama.cpp or the "Force Model Expert Weights onto CPU" option in LM Studio to keep the model in RAM, and the active portion in VRAM, which largely reduces the resource requirements of such models, while still being reasonably fast. Alternatively, there's llama.cpp \texttt{-{}-n-cpu-moe} option, similar to the \texttt{-ngl} GPU offload option. You may experiment with it to see what works best for you.
|
||||
|
||||
\paragraph{In practice}
|
||||
\subsubsection{In practice}
|
||||
|
||||
So, which model should you run, and what hardware do you need to do so? Let's take a look at some example systems.
|
||||
|
||||
@@ -5004,7 +5141,7 @@ As a rule of thumb, the specified number of parameters is how much total memory
|
||||
|
||||
To make this practical, the 35B-A3B model at 2-bit quantization requires $35 \times 2 / 8 = 8.75$~GB, which fits into the 4 + 16~GB budget in the example above. The 3B active parameters similarly come to 0.75~GB, with an additional 1~GB or so needed for the computation buffer and another 1~GB for the 50K context. The total is less than the 4~GB of VRAM available, so everything fits.
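The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch of the rule of thumb only: the helper name \texttt{weights\_gb} is made up for illustration, and the 1~GB figures for the computation buffer and the 50K context are the rough values quoted in the text, not measured numbers.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB: parameters (in billions) * bits / 8."""
    return params_billion * bits_per_weight / 8


total = weights_gb(35, 2)   # whole 35B-A3B model at 2-bit quantization
active = weights_gb(3, 2)   # the 3B active parameters

# Rough VRAM need: active weights + ~1 GB compute buffer + ~1 GB for a 50K context.
vram_needed = active + 1.0 + 1.0

print(f"total weights: {total} GB (budget: 4 + 16 GB)")
print(f"active weights: {active} GB")
print(f"VRAM needed: {vram_needed} GB (available: 4 GB)")
```

Swapping in a different parameter count or bit width gives a quick first check of whether a model is worth trying on a given system.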
|
||||
|
||||
\subsubsection{Usage}
|
||||
\subsection{Usage}
|
||||
\label{llmusage}
|
||||
|
||||
The automated assistant can be accessed via the various \emph{\faRobot{}~Tracy Assist} buttons in the UI. The button in the control menu (section~\ref{controlmenu}) gives quick access to the chat. Buttons in other profiler windows open the chat window and add context related to the program you are profiling.
|
||||
@@ -5021,16 +5158,18 @@ The control section allows you to clear the chat contents, reconnect to the LLM
|
||||
|
||||
\begin{itemize}
|
||||
\item \emph{API} -- Enter the endpoint URL of the LLM provider here. A drop-down list is provided as a convenient way to select the default configuration of various providers. Note that the drop-down list is only used to fill in the endpoint URL. While Tracy does adapt to different ways each provider behaves, the feature detection is performed based on the endpoint conversation, not the drop-down selection.
|
||||
\item \emph{Chat model} -- Here you can select one of the models you have configured in the LLM provider for chat.
|
||||
\item \emph{Fast model} -- Select the fast model.
|
||||
\item \emph{Embeddings model} -- Select the vector embeddings model.
|
||||
\item \emph{Internet access} -- Determines whether the model can access network resources such as Wikipedia queries, web searches, and web page retrievals.
|
||||
\item \emph{Annotate call stacks} -- Enables automatic annotation of call stacks (see section~\ref{callstackwindow}). Disabled by default, as it requires proper configuration of the fast model.
|
||||
\item \emph{Tool reply size limit} -- Configurable maximum size for tool responses.
|
||||
\item \emph{\faComments{}~Chat model} -- Here you can select one of the models you have configured in the LLM provider for chat.
|
||||
\item \emph{\faBoltLightning{}~Fast model} -- Select the fast model.
|
||||
\item \emph{\faBookBookmark{}~Embeddings model} -- Select the vector embeddings model.
|
||||
\item \emph{\faEarthAmericas{}~Internet access} -- Determines whether the model can access network resources such as Wikipedia queries, web searches, and web page retrievals.
|
||||
\item \emph{\faTag{}~Annotate call stacks} -- Enables automatic annotation of call stacks (see section~\ref{callstackwindow}). Disabled by default, as it requires proper configuration of the fast model.
|
||||
\item \emph{\faHandPointRight{}~Show summary} -- Shows a short conversation topic after the initial question is asked.
|
||||
\item \emph{\faCommentDots{}~Chat suggestions} -- Suggests the next question the user may want to ask.
|
||||
\item \emph{Advanced} -- More advanced options are hidden here.
|
||||
\begin{itemize}
|
||||
\item \emph{Temperature} -- Allows changing default model temperature setting.
|
||||
\item \emph{Show all thinking regions} -- Always shows all reasoning sections and all tool calls made by model.
|
||||
\item \emph{\faTemperatureHalf{}~Temperature} -- Allows changing the default model temperature setting.
|
||||
\item \emph{\faLightbulb{}~Show all thinking regions} -- Always shows all reasoning sections and all tool calls made by the model.
|
||||
\item \emph{Tool reply size limit} -- Configurable maximum size for tool responses.
|
||||
\item \emph{User agent} -- Allows changing the user agent parameter in web queries.
|
||||
\item \emph{Google Search Engine} and \emph{API Key} -- Enables use of Google search. If this is not set, searches will fall back to Brave search, and then to DuckDuckGo, which is very rate limited.
|
||||
\end{itemize}
|
||||
@@ -5040,11 +5179,15 @@ The \emph{\faBook{}~Learn manual} button is used to build the search index for t
|
||||
|
||||
The horizontal meter directly below shows how much of the context size has been used. Tracy uses various techniques to manage context size, such as limiting the amount of data provided to the model or removing older data. However, the context will eventually be fully utilized during an extended conversation, resulting in a significant degradation of the quality of model responses.
|
||||
|
||||
The chat section contains the conversation with the automated assistant.
|
||||
The chat section contains the conversation with the automated assistant with alternating user and assistant turns. Clicking on the~\emph{\faUser{}~User} role icon removes the chat content up to the selected question. Similarly, clicking on the~\emph{\faRobot{}~Assistant} role icon removes the conversation content up to this point and generates another response from the assistant.
|
||||
|
||||
Clicking on the~\emph{\faUser{}~User} role icon removes the chat content up to the selected question. Similarly, clicking on the~\emph{\faRobot{}~Assistant} role icon removes the conversation content up to this point and generates another response from the assistant.
|
||||
The assistant may give preliminary replies to the user, for example, \enquote{I will now check the source of function foobar}, followed by performing the actual check, then a continuation of the reply, such as \enquote{Now I can see that...}. To make reading these tiered replies easier, only the most recent reply is printed in normal text, while the preliminary responses are dimmed out.
|
||||
|
||||
\subsubsection{Tools}
|
||||
Each assistant reply contains a note about the language model that was used and the time it took to generate the text.
|
||||
|
||||
The chat entry at the bottom is composed of the text input box and the \emph{\faPaperPlane{}~Send} button. When the assistant is writing a reply, this section is replaced with the~\emph{\faStop{}~Stop} button. If the~\emph{\faCommentDots{}~Chat suggestions} option is enabled, the writing prompt for the subsequent questions will be provided by a proposed question prepended with the~\faCommentDots{}~icon. This suggestion can be simply accepted by pressing \keys{Enter}.
|
||||
|
||||
\subsection{Tools}
|
||||
|
||||
The automated assistant has access to a set of tools that allow it to gather information. These tools are used automatically when needed to answer your questions. The following tools are available:
|
||||
|
||||
@@ -5056,11 +5199,14 @@ The automated assistant has access to a set of tools that allow it to gather inf
|
||||
\item \emph{Manual search} -- Perform semantic search in this user manual. Requires an embeddings model to be selected and the \emph{Learn manual} button to be clicked.
|
||||
\item \emph{Source file} -- Retrieve source file contents from the captured trace.
|
||||
\item \emph{Source search} -- Search within the captured source files using regular expressions.
|
||||
\item \emph{Symbol disassembly} -- Retrieve the disassembly and the captured profiling data of the symbol.
|
||||
\item \emph{Symbol parents} -- Get the entry call stacks for the symbol.
|
||||
\item \emph{Sampling statistics} -- List the functions that took the most program execution time.
|
||||
\end{itemize}
|
||||
|
||||
Note that Wikipedia, dictionary, web search, and webpage retrieval tools require the \emph{Internet access} option to be enabled.
|
||||
|
||||
\subsubsection{Attachments}
|
||||
\subsection{Attachments}
|
||||
|
||||
You can provide context to the assistant by attaching relevant data from the profiler. The following types of attachments are available:
|
||||
|
||||
@@ -5075,6 +5221,8 @@ You can provide context to the assistant by attaching relevant data from the pro
|
||||
|
||||
Attachments can be added through the \emph{\faRobot{}~Tracy Assist} buttons available in various profiler windows, such as the call stack window or the symbol view.
|
||||
|
||||
Contents of some attachments can be viewed by clicking the \emph{\faEye{}~View} button next to the attachment.
|
||||
|
||||
\section{Exporting zone statistics to CSV}
|
||||
\label{csvexport}
|
||||
|
||||
@@ -5101,8 +5249,8 @@ You can customize the output with the following command line options:
|
||||
\item \texttt{-h, -\hspace{-1.25ex} -help} -- Display a help message
|
||||
\item \texttt{-f, -\hspace{-1.25ex} -filter <name>} -- Filter the zone names
|
||||
\item \texttt{-c, -\hspace{-1.25ex} -case} -- Make the name filtering case sensitive
|
||||
\item \texttt{-s, -\hspace{-1.25ex} -sep <separator>} -- Customize the CSV separator (default is ``\texttt{,}'')
|
||||
\item \texttt{-e, -\hspace{-1.25ex} -self} -- Use self time (equivalent to the ``Self time'' toggle in the profiler GUI)
|
||||
\item \texttt{-s, -\hspace{-1.25ex} -sep <separator>} -- Customize the CSV separator (default is \enquote{\texttt{,}})
|
||||
\item \texttt{-e, -\hspace{-1.25ex} -self} -- Use self time (equivalent to the \enquote{Self time} toggle in the profiler GUI)
|
||||
\item \texttt{-u, -\hspace{-1.25ex} -unwrap} -- Report each zone individually; this will discard the statistics columns and instead report the timestamp and duration for each zone entry
|
||||
\item \texttt{-g, -\hspace{-1.25ex} -gpu} -- Report each GPU zone event
|
||||
\item \texttt{-m, -\hspace{-1.25ex} -messages} -- Report only messages
|
||||
@@ -5181,9 +5329,9 @@ Various files at the root configuration directory store common profiler state su

Trace files saved on disk are immutable and can't be changed. Still, it may be desirable to store additional per-trace information to be used by the profiler, for example, a custom description of the trace or the timeline view position used in the previous profiling session.

This external data is stored in the \texttt{user/[letter]/[program]/[week]/[epoch]} directory, relative to the configuration's root directory. The \texttt{program} part is the name of the profiled application (for example \texttt{program.exe}). The \texttt{letter} part is the first letter of the profiled application's name. The \texttt{week} part is a count of weeks since the Unix epoch, and the \texttt{epoch} part is a count of seconds since the Unix epoch. This rather unusual convention prevents the creation of directories with hundreds of entries.
This external sidecar data is stored by default in the \texttt{sidecar/[program]/[date].json} file, relative to the configuration root directory. The \texttt{program} part is the name of the profiled application (for example \texttt{program.exe}). The \texttt{date} part is in year-month-day-\emph{dash}-hour-minutes-seconds format.

The profiler never prunes user settings.
The sidecar file can be made public (see section~\ref{traceinfo}), in which case it will be placed next to the trace file with the \texttt{.json} extension, allowing both files to be easily moved or copied.

\subsection{Cache files}

@@ -21,6 +21,10 @@ if get_option('callstack') > 0
  tracy_common_args += ['-DTRACY_CALLSTACK='+get_option('callstack').to_string()]
endif

if get_option('platform_header') != ''
  tracy_common_args += ['-DTRACY_PLATFORM_HEADER="'+get_option('platform_header')+'"']
endif

if get_option('no_callstack')
  tracy_common_args += ['-DTRACY_NO_CALLSTACK']
endif
@@ -131,6 +135,10 @@ if get_option('ignore_memory_faults')
  tracy_common_args += ['-DTRACY_IGNORE_MEMORY_FAULTS']
endif

if get_option('opengl_auto_calibration')
  tracy_common_args += ['-DTRACY_OPENGL_AUTO_CALIBRATION']
endif

tracy_shared_libs = get_option('default_library') == 'shared'

if tracy_shared_libs

@@ -1,6 +1,7 @@
option('tracy_enable', type : 'boolean', value : false, description : 'Enable profiling', yield: true)
option('on_demand', type : 'boolean', value : false, description : 'On-demand profiling')
option('callstack', type : 'integer', value : 0, description : 'Enforce callstack collection for tracy zones x frames deep')
option('platform_header', type : 'string', value : '', description : 'Path to a header providing TRACY_HAS_CUSTOM_* hooks for an unsupported platform')
option('no_callstack', type : 'boolean', value : false, description : 'Disable all callstack related functionality')
option('no_callstack_inlines', type : 'boolean', value : false, description : 'Disables the inline functions in callstacks')
option('only_localhost', type : 'boolean', value : false, description : 'Only listen on the localhost interface')
@@ -28,3 +29,4 @@ option('verbose', type : 'boolean', value : false, description : 'Enable verbose
option('no_internal_message', type : 'boolean', value : false, description : 'Prevent the profiler from logging messages')
option('debuginfod', type : 'boolean', value : false, description : 'Enable debuginfod support')
option('ignore_memory_faults', type : 'boolean', value : false, description : 'Ignore instrumentation errors from memory free events that do not have a matching allocation')
option('opengl_auto_calibration', type : 'boolean', value : false, description : 'Periodically recalibrate OpenGL GPU/CPU clock drift (forces a CPU/GPU sync each time)')

@@ -5,10 +5,11 @@ include(${CMAKE_CURRENT_LIST_DIR}/../cmake/options.cmake)
set_option(NO_FILESELECTOR "Disable the file selector" OFF)
set_option(GTK_FILESELECTOR "Use the GTK file selector on Linux instead of the xdg-portal one" OFF)
set_option(LEGACY "Instead of Wayland, use the legacy X11 backend on Linux" OFF)
set_option(NO_STATISTICS "Disable calculation of statistics" OFF)
set_option(SELF_PROFILE "Enable self-profiling" OFF)
set_option_value(SANITIZE "Sanitizer parameters" "")

set(NO_STATISTICS OFF)

include(${CMAKE_CURRENT_LIST_DIR}/../cmake/version.cmake)

set(CMAKE_CXX_STANDARD 20)
@@ -43,10 +44,16 @@ ExternalProject_Add(embed
)

function(Embed LIST NAME FILE)
    cmake_parse_arguments(EMBED "TEXT" "" "" ${ARGN})
    if(EMBED_TEXT)
        set(EMBED_FLAGS -t)
    else()
        set(EMBED_FLAGS)
    endif()
    add_custom_command(
        OUTPUT data/${NAME}.cpp data/${NAME}.hpp
        COMMAND ${CMAKE_COMMAND} -E make_directory data
        COMMAND ${CMAKE_CURRENT_BINARY_DIR}/embed ${NAME} ${CMAKE_CURRENT_LIST_DIR}/${FILE} data/${NAME}
        COMMAND ${CMAKE_CURRENT_BINARY_DIR}/embed ${EMBED_FLAGS} ${NAME} ${CMAKE_CURRENT_LIST_DIR}/${FILE} data/${NAME}
        DEPENDS embed ${CMAKE_CURRENT_LIST_DIR}/${FILE}
    )
    list(APPEND ${LIST} data/${NAME}.cpp)
@@ -69,6 +76,7 @@ set(SERVER_FILES
    TracyMarkdown.cpp
    TracyMicroArchitecture.cpp
    TracyMouse.cpp
    TracyNameGen.cpp
    TracyProtoHistory.cpp
    TracySourceContents.cpp
    TracySourceTokenizer.cpp
@@ -144,18 +152,33 @@ set(PROFILER_FILES
    src/winmainArchDiscovery.cpp
)

Embed(PROFILER_FILES SystemPrompt src/llm/system.prompt.md)
Embed(PROFILER_FILES SkillCallstack src/llm/skill.callstack.md)
Embed(PROFILER_FILES SkillOptimization src/llm/skill.optimization.md)
Embed(PROFILER_FILES ToolsJson src/llm/tools.json)
Embed(PROFILER_FILES SystemPrompt src/llm/system.prompt.md TEXT)
Embed(PROFILER_FILES SkillCallstack src/llm/skill.callstack.md TEXT)
Embed(PROFILER_FILES SkillOptimization src/llm/skill.optimization.md TEXT)
Embed(PROFILER_FILES ToolsJson src/llm/tools.json TEXT)

Embed(PROFILER_FILES FontFixed src/font/FiraCode-Retina.ttf)
Embed(PROFILER_FILES FontIcons src/font/Font\ Awesome\ 6\ Free-Solid-900.otf)
Embed(PROFILER_FILES FontIcons src/font/Font\ Awesome\ 7\ Free-Solid-900.otf)
Embed(PROFILER_FILES FontNormal src/font/Roboto-Regular.ttf)
Embed(PROFILER_FILES FontBold src/font/Roboto-Bold.ttf)
Embed(PROFILER_FILES FontItalic src/font/Roboto-Italic.ttf)
Embed(PROFILER_FILES FontBoldItalic src/font/Roboto-BoldItalic.ttf)
Embed(PROFILER_FILES FontEmoji src/font/NotoEmoji-Regular.ttf)
Embed(PROFILER_FILES Manual ../manual/tracy.md)

Embed(PROFILER_FILES Manual ../manual/tracy.md TEXT)

Embed(PROFILER_FILES Text100Million src/achievements/100Million.md TEXT)
Embed(PROFILER_FILES TextConnectToClient src/achievements/ConnectToClient.md TEXT)
Embed(PROFILER_FILES TextFindZone src/achievements/FindZone.md TEXT)
Embed(PROFILER_FILES TextFrameImages src/achievements/FrameImages.md TEXT)
Embed(PROFILER_FILES TextGlobalSettings src/achievements/GlobalSettings.md TEXT)
Embed(PROFILER_FILES TextInstrumentationIntro src/achievements/InstrumentationIntro.md TEXT)
Embed(PROFILER_FILES TextInstrumentationStatistics src/achievements/InstrumentationStatistics.md TEXT)
Embed(PROFILER_FILES TextInstrumentFrames src/achievements/InstrumentFrames.md TEXT)
Embed(PROFILER_FILES TextIntro src/achievements/Intro.md TEXT)
Embed(PROFILER_FILES TextLoadTrace src/achievements/LoadTrace.md TEXT)
Embed(PROFILER_FILES TextSamplingIntro src/achievements/SamplingIntro.md TEXT)
Embed(PROFILER_FILES TextSaveTrace src/achievements/SaveTrace.md TEXT)

set(INCLUDES "${CMAKE_CURRENT_BINARY_DIR}")
set(LIBS "")
@@ -277,7 +300,19 @@ if(NOT EMSCRIPTEN)
endif()

if(EMSCRIPTEN)
    target_link_options(${PROJECT_NAME} PRIVATE -pthread -sASSERTIONS=0 -sINITIAL_MEMORY=384mb -sALLOW_MEMORY_GROWTH=1 -sMAXIMUM_MEMORY=4gb -sSTACK_SIZE=1048576 -sWASM_BIGINT=1 -sPTHREAD_POOL_SIZE=8 -sEXPORTED_FUNCTIONS=_main,_nativeOpenFile,_tracy_paste_clipboard -sEXPORTED_RUNTIME_METHODS=ccall -sENVIRONMENT=web,worker --preload-file embed.tracy)
    target_link_options(${PROJECT_NAME} PRIVATE
        -pthread
        -sASSERTIONS=0
        -sINITIAL_MEMORY=384mb
        -sALLOW_MEMORY_GROWTH=1
        -sMAXIMUM_MEMORY=4gb
        -sSTACK_SIZE=1048576
        -sPTHREAD_POOL_SIZE=8
        -sEXPORTED_FUNCTIONS=_main,_nativeOpenFile,_tracy_paste_clipboard
        -sEXPORTED_RUNTIME_METHODS=ccall
        -sENVIRONMENT=web,worker
        --preload-file embed.tracy
    )

    file(DOWNLOAD https://share.nereid.pl/i/embed.tracy ${CMAKE_CURRENT_BINARY_DIR}/embed.tracy EXPECTED_MD5 ca0fa4f01e7b8ca5581daa16b16c768d)
    file(COPY ${CMAKE_CURRENT_LIST_DIR}/wasm/index.html DESTINATION ${CMAKE_CURRENT_BINARY_DIR})

@@ -1,17 +1,27 @@
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <string>

#include "../../public/common/tracy_lz4hc.hpp"

static void Usage()
{
    fprintf( stderr, "Usage: embed <objectName> <source> <destination>\n" );
    fprintf( stderr, "Usage: embed [-t] <objectName> <source> <destination>\n" );
    fprintf( stderr, " destination should be without extension, will create cpp, hpp pair\n" );
    fprintf( stderr, " -t: treat source as text, convert line endings to unix\n" );
}

int main( int argc, char** argv )
{
    bool text = false;
    if( argc >= 2 && strcmp( argv[1], "-t" ) == 0 )
    {
        text = true;
        argc--;
        argv++;
    }

    if( argc < 4 )
    {
        Usage();
@@ -38,6 +48,16 @@ int main( int argc, char** argv )
    fread( data, 1, sz, src );
    fclose( src );

    if( text )
    {
        size_t pos = 0;
        for( size_t i=0; i<sz; i++ )
        {
            if( data[i] != '\r' ) data[pos++] = data[i];
        }
        sz = pos;
    }

    const auto lz4szMax = tracy::LZ4_compressBound( sz );
    auto lz4data = new uint8_t[lz4szMax];
    const auto lz4sz = tracy::LZ4_compress_HC( (const char*)data, (char*)lz4data, sz, lz4szMax, 6 );

@@ -162,6 +162,15 @@ static ImGuiKey TranslateKeyCode( const char* code )
    return ImGuiKey_None;
}

static void UpdateKeyModifiers( const EmscriptenKeyboardEvent* e )
{
    ImGuiIO& io = ImGui::GetIO();
    io.AddKeyEvent( ImGuiMod_Ctrl, e->ctrlKey );
    io.AddKeyEvent( ImGuiMod_Shift, e->shiftKey );
    io.AddKeyEvent( ImGuiMod_Alt, e->altKey );
    io.AddKeyEvent( ImGuiMod_Super, e->metaKey );
}

Backend::Backend( const char* title, const std::function<void()>& redraw, const std::function<void(float)>& scaleChanged, const std::function<int(void)>& isBusy, RunQueue* mainThreadTasks )
{
    constexpr EGLint eglConfigAttrib[] = {
@@ -243,6 +252,7 @@ Backend::Backend( const char* title, const std::function<void()>& redraw, const
        return EM_TRUE;
    } );
    emscripten_set_keydown_callback( EMSCRIPTEN_EVENT_TARGET_WINDOW, nullptr, EM_TRUE, [] ( int, const EmscriptenKeyboardEvent* e, void* ) -> EM_BOOL {
        UpdateKeyModifiers( e );
        const auto code = TranslateKeyCode( e->code );
        if( code == ImGuiKey_None ) return EM_FALSE;
        ImGui::GetIO().AddKeyEvent( code, true );
@@ -250,6 +260,7 @@ Backend::Backend( const char* title, const std::function<void()>& redraw, const
        return EM_TRUE;
    } );
    emscripten_set_keyup_callback( EMSCRIPTEN_EVENT_TARGET_WINDOW, nullptr, EM_TRUE, [] ( int, const EmscriptenKeyboardEvent* e, void* ) -> EM_BOOL {
        UpdateKeyModifiers( e );
        const auto code = TranslateKeyCode( e->code );
        if( code == ImGuiKey_None ) return EM_FALSE;
        ImGui::GetIO().AddKeyEvent( code, false );

@@ -4,7 +4,6 @@
#include <misc/freetype/imgui_freetype.h>

#include "Fonts.hpp"
#include "profiler/IconsFontAwesome6.h"
#include "profiler/TracyEmbed.hpp"

#include "data/FontFixed.hpp"

12
profiler/src/achievements/100Million.md
Normal file
@@ -0,0 +1,12 @@
# It's over 100 million!

Tracy can handle a lot of data. How about 100 million zones in a single trace? Add a lot of zones to your program and see how it handles it!

Capturing a long-running profile trace is easy. Need to profile an hour of your program execution? You can do it.

Note that it doesn't make much sense to instrument every little function you might have. The cost of the instrumentation itself will be higher than the cost of the function in such a case.

> [!TIP]
> Keep in mind that the more zones you have, the more memory and CPU time the profiler will use. Be careful not to run out of memory.
>
> To capture 100 million zones, you will need approximately 4 GB of RAM.
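
As a rough back-of-the-envelope check of the figure in the tip, the sketch below divides the approximate 4 GB by 100 million zones. The names `BytesPerZone`, `kApproxRam`, `kZoneCount` and `PrintEstimate` are hypothetical, and reading "4 GB" as 4 GiB is an assumption. The result is only the average implied by the tip, not a measured Tracy constant.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper: average bytes of profiler memory per zone,
// derived only from the rough figures quoted in the tip.
constexpr double BytesPerZone( uint64_t totalBytes, uint64_t zoneCount )
{
    return double( totalBytes ) / double( zoneCount );
}

// Assumption: "4 GB" is read as 4 GiB; the tip itself only says "approximately".
constexpr uint64_t kApproxRam = 4ull * 1024 * 1024 * 1024;
constexpr uint64_t kZoneCount = 100000000ull;

// Prints the implied average, which works out to roughly 43 bytes per zone.
void PrintEstimate()
{
    printf( "~%.0f bytes per zone\n", BytesPerZone( kApproxRam, kZoneCount ) );
}
```

Scaling the same ratio gives a quick estimate for other zone counts, for example about 430 MB for 10 million zones under the same assumptions.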
10
profiler/src/achievements/ConnectToClient.md
Normal file
@@ -0,0 +1,10 @@
# First profiling session

Let's start our adventure by instrumenting your application and connecting it to the profiler. Here's a quick refresher:

1. Integrate Tracy Profiler into your application. This can be done using CMake, Meson, or simply by adding the source files to your project.
2. Make sure that `TracyClient.cpp` (or the Tracy library) is included in your build.
3. Define `TRACY_ENABLE` in your build configuration, for the whole application. Do not do it in a single source file because it won't work.
4. Start your application, and *Connect* to it with the profiler.

Please refer to the [user manual](https://github.com/wolfpld/tracy/releases) for more details.
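
The steps above can be sketched in C++ roughly as follows. `ZoneScoped` and `FrameMark` are Tracy's instrumentation macros from `tracy/Tracy.hpp`. The functions `SimulateStep` and `RunFrames` are hypothetical stand-ins for application code, and the no-op fallback macros are an assumption added only so the sketch compiles when Tracy is not on the include path.

```cpp
#include <cstdint>

// Use the real Tracy header when it is available. The fallback definitions
// below are NOT part of Tracy; they only keep this sketch self-contained.
#if defined( __has_include )
#  if __has_include( "tracy/Tracy.hpp" )
#    include "tracy/Tracy.hpp"
#  endif
#endif
#ifndef ZoneScoped
#  define ZoneScoped
#endif
#ifndef FrameMark
#  define FrameMark
#endif

// Hypothetical workload: the whole function body becomes one zone.
uint64_t SimulateStep( uint64_t n )
{
    ZoneScoped;
    uint64_t sum = 0;
    for( uint64_t i = 0; i < n; i++ ) sum += i;
    return sum;
}

// Hypothetical frame loop: FrameMark marks the end of each frame.
uint64_t RunFrames( int frames )
{
    uint64_t total = 0;
    for( int f = 0; f < frames; f++ )
    {
        total += SimulateStep( 1000 );
        FrameMark;
    }
    return total;
}
```

For the zones to reach the profiler, `TRACY_ENABLE` has to be defined for every translation unit (step 3) and `TracyClient.cpp` or the Tracy library has to be linked in (step 2). Without `TRACY_ENABLE`, the macros expand to nothing.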
11
profiler/src/achievements/FindZone.md
Normal file
@@ -0,0 +1,11 @@
# Find some zones

You can search for zones in the trace by opening the search window with the *Find zone* button on the top bar. It will ask you for the zone name, which in most cases will be the function name in the code.

The search may find more than one zone with the same name. A list of all the zones found is displayed, and you can select any of them.

Alternatively, you can open the Statistics window and click an entry there. This will open the Find zone window as if you had searched for that zone.

When a zone is selected, a number of statistics are displayed to help you understand the performance of your application. In addition, a histogram of the zone execution times is displayed to make it easier for you to determine the performance of the profiled code. Be sure to select a zone with a large number of calls to make the histogram look interesting!

Note that you can draw a range on the histogram to limit the number of entries displayed in the zone list below. This list allows you to examine each zone individually. There are also a number of zone groupings that you can select. Each group can be selected and the time associated with the selected group will be highlighted on the histogram.