@@ -28,7 +28,7 @@ For example, starting with the following base configuration:
 [omnitrace] Outputting 'omnitrace-example-output/wall-clock.txt' ...
 [omnitrace] Outputting 'omnitrace-example-output/wall-clock.json' ...
 
-If the ``OMNITRACE_USE_PID`` option is enabled, then running a non-MPI executable
+If the ``OMNITRACE_USE_PID`` option is enabled, then running a non-MPI executable
 with a PID of ``63453`` results in the following output:
 
 .. code-block:: shell
@@ -58,7 +58,7 @@ Metadata
 ========================================
 
 Omnitrace outputs a ``metadata.json`` file. This metadata file contains
-information about the settings, environment variables, output files, and info
+information about the settings, environment variables, output files, and info
 about the system and the run, as follows:
 
 * Hardware cache sizes
@@ -240,14 +240,14 @@ Metadata JSON Sample
 Configuring the Omnitrace output
 ========================================
 
-Omnitrace includes a core set of options for controlling the format
+Omnitrace includes a core set of options for controlling the format
 and contents of the output files. For additional information, see the guide on
 :doc:`configuring runtime options <./configuring-runtime-options>`.
 
 Core configuration settings
 -----------------------------------
 
-.. csv-table::
+.. csv-table::
    :header: "Setting", "Value", "Description"
    :widths: 30, 30, 100
 
@@ -261,20 +261,20 @@ Core configuration settings
 Output prefix keys
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Output prefix keys have many uses but are most helpful when dealing with multiple
+Output prefix keys have many uses but are most helpful when dealing with multiple
 profiling runs or large MPI jobs.
-They are included in Omnitrace because they were introduced into Timemory
+They are included in Omnitrace because they were introduced into Timemory
 for `compile-time-perf <https://github.com/jrmadsen/compile-time-perf>`_.
-They are needed to create different output files for a generic wrapper around
+They are needed to create different output files for a generic wrapper around
 compilation commands while still
 overwriting the output from the last time a file was compiled.
 
-When doing scaling studies and specifying options via the command line,
+When doing scaling studies and specifying options via the command line,
 the recommended process is to
 use a common ``OMNITRACE_OUTPUT_PATH``, disable ``OMNITRACE_TIME_OUTPUT``,
 set ``OMNITRACE_OUTPUT_PREFIX="%argt%-"``, and let Omnitrace cleanly organize the output.
 
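As a minimal sketch of the recommended process for scaling studies, the three settings could be exported once before launching the runs. The directory name ``omnitrace-scaling-output`` is an assumption chosen for illustration; the setting names and the ``%argt%-`` prefix are the ones named in the text.

.. code-block:: shell

   # Hypothetical shared directory: every run in the study writes here.
   export OMNITRACE_OUTPUT_PATH="omnitrace-scaling-output"
   # Turn off OMNITRACE_TIME_OUTPUT, as recommended for scaling studies.
   export OMNITRACE_TIME_OUTPUT=OFF
   # Let the %argt% prefix key distinguish the output files of each run.
   export OMNITRACE_OUTPUT_PREFIX="%argt%-"
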
-.. csv-table::
+.. csv-table::
    :header: "String", "Encoding"
    :widths: 20, 120
 
@@ -311,16 +311,22 @@ set ``OMNITRACE_OUTPUT_PREFIX="%argt%-"``, and let Omnitrace cleanly organize th
 .. note::
 
    In any output prefix key which contains a ``/`` character, the ``/`` characters
-   are replaced with ``_`` and any leading underscores are stripped. For example,
-   an ``%arg0%`` of ``/usr/bin/foo`` translates to ``usr_bin_foo``. Additionally, any ``%arg<N>%`` keys which
+   are replaced with ``_`` and any leading underscores are stripped. For example,
+   an ``%arg0%`` of ``/usr/bin/foo`` translates to ``usr_bin_foo``. Additionally, any ``%arg<N>%`` keys which
    do not have a command line argument at position ``<N>`` are ignored.
 
 Perfetto output
 ========================================
 
-Use the ``OMNITRACE_OUTPUT_FILE`` to specify a specific location. If this is an
+Use the ``OMNITRACE_OUTPUT_FILE`` to specify a specific location. If this is an
 absolute path, then all ``OMNITRACE_OUTPUT_PATH`` and similar
-settings are ignored. Visit `ui.perfetto.dev <https://ui.perfetto.dev>`_ and open this file.
+settings are ignored. Visit `ui.perfetto.dev <https://ui.perfetto.dev>`_ and open
+this file.
+
+.. important::
+   Perfetto validation is done with trace_processor v46.0 as there is a known issue with v47.0.
+   If you are experiencing problems viewing your trace in the latest version of `Perfetto <http://ui.perfetto.dev>`_,
+   then try using `Perfetto UI v46.0 <https://ui.perfetto.dev/v46.0-35b3d9845/#!/>`_.
 
 .. image:: ../data/omnitrace-perfetto.png
    :alt: Visualization of a performance graph in Perfetto
@@ -349,20 +355,20 @@ Use ``omnitrace-avail --components --filename`` to view the base filename for ea
 | sampling_wall_clock             | true          | sampling_wall_clock    |
 |---------------------------------|---------------|------------------------|
 
-The ``OMNITRACE_COLLAPSE_THREADS`` and ``OMNITRACE_COLLAPSE_PROCESSES`` settings are
-only valid when full `MPI support is enabled <../install/install.html#mpi-support-within-omnitrace>`_.
-When they are set, Timemory combines the per-thread and per-rank data (respectively) of
+The ``OMNITRACE_COLLAPSE_THREADS`` and ``OMNITRACE_COLLAPSE_PROCESSES`` settings are
+only valid when full `MPI support is enabled <../install/install.html#mpi-support-within-omnitrace>`_.
+When they are set, Timemory combines the per-thread and per-rank data (respectively) of
 identical call stacks.
 
-The ``OMNITRACE_FLAT_PROFILE`` setting removes all call stack hierarchy.
+The ``OMNITRACE_FLAT_PROFILE`` setting removes all call stack hierarchy.
 Using ``OMNITRACE_FLAT_PROFILE=ON`` in combination
-with ``OMNITRACE_COLLAPSE_THREADS=ON`` is a useful configuration for identifying
+with ``OMNITRACE_COLLAPSE_THREADS=ON`` is a useful configuration for identifying
 min/max measurements regardless of the calling context.
-The ``OMNITRACE_TIMELINE_PROFILE`` setting (with ``OMNITRACE_FLAT_PROFILE=OFF``) effectively
+The ``OMNITRACE_TIMELINE_PROFILE`` setting (with ``OMNITRACE_FLAT_PROFILE=OFF``) effectively
 generates similar data to that found
-in Perfetto. Enabling timeline and flat profiling effectively generates
+in Perfetto. Enabling timeline and flat profiling effectively generates
 similar data to ``strace``. However, while Timemory generally
-requires significantly less memory than Perfetto, this is not the case in timeline
+requires significantly less memory than Perfetto, this is not the case in timeline
 mode, so use this setting with caution.
 
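A configuration along the following lines would request the flat, thread-collapsed profile used for identifying min/max measurements. It is a sketch only: it assumes a build with full MPI support, because the collapse settings are otherwise not valid, and it leaves timeline profiling off because of the memory cost of timeline mode.

.. code-block:: shell

   # Remove all call stack hierarchy from the profile.
   export OMNITRACE_FLAT_PROFILE=ON
   # Combine per-thread data of identical call stacks (needs full MPI support).
   export OMNITRACE_COLLAPSE_THREADS=ON
   # Keep timeline mode off to avoid its higher memory use.
   export OMNITRACE_TIMELINE_PROFILE=OFF
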
 Timemory text output
@@ -381,11 +387,11 @@ The truncation settings be changed through the ``OMNITRACE_MAX_WIDTH`` setting.
 Timemory text output example
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-In the following example, the ``NN`` field in ``|NN>>>`` is the thread ID. If MPI support is enabled,
+In the following example, the ``NN`` field in ``|NN>>>`` is the thread ID. If MPI support is enabled,
 this becomes ``|MM|NN>>>`` where ``MM`` is the rank.
-If ``OMNITRACE_COLLAPSE_THREADS=ON`` and ``OMNITRACE_COLLAPSE_PROCESSES=ON`` are configured,
+If ``OMNITRACE_COLLAPSE_THREADS=ON`` and ``OMNITRACE_COLLAPSE_PROCESSES=ON`` are configured,
 neither the ``MM`` nor the ``NN`` are present unless the
-component explicitly sets type traits. Type traits specify that the data is only
+component explicitly sets type traits. Type traits specify that the data is only
 relevant per-thread or per-process, such as the ``thread_cpu_clock`` clock component.
 
 .. code-block:: shell
@@ -573,15 +579,15 @@ relevant per-thread or per-process, such as the ``thread_cpu_clock`` clock compo
 Timemory JSON output
 -------------------------------------------------------------------------
 
-Timemory represents the data within the JSON output in two forms:
+Timemory represents the data within the JSON output in two forms:
 a flat structure and a hierarchical structure.
 The flat JSON data represents the data similar to the text files, where the hierarchical information
 is represented by the indentation of the ``prefix`` field and the ``depth`` field.
-The hierarchical JSON contains additional information with respect
+The hierarchical JSON contains additional information with respect
 to inclusive and exclusive values. However,
 its structure must be processed using recursion. This section of the JSON output supports analysis
 by `hatchet <https://github.com/hatchet/hatchet>`_.
-All the data entries for the flat structure are in a single JSON array. It is easier to
+All the data entries for the flat structure are in a single JSON array. It is easier to
 write a simple Python script for post-processing using this format than with the hierarchical structure.
 
 .. note::
@@ -929,7 +935,7 @@ Timemory JSON output Python post-processing example
 )
 )
 
-The result of applying this script to the corresponding JSON output from the :ref:`text-output-example-label`
+The result of applying this script to the corresponding JSON output from the :ref:`text-output-example-label`
 section is as follows:
 
 .. code-block:: shell