diff --git a/Doc/howto/perf_profiling.rst b/Doc/howto/perf_profiling.rst
index ad2eb7b4d58aa5..6af5536166f58a 100644
--- a/Doc/howto/perf_profiling.rst
+++ b/Doc/howto/perf_profiling.rst
@@ -15,9 +15,9 @@ information about the performance of your application.
 that aid with the analysis of the data that it produces.
 
 The main problem with using the ``perf`` profiler with Python applications is that
-``perf`` only allows to get information about native symbols, this is, the names of
-the functions and procedures written in C. This means that the names and file names
-of the Python functions in your code will not appear in the output of the ``perf``.
+``perf`` only gets information about native symbols, that is, the names of
+functions and procedures written in C. This means that the names and file names
+of Python functions in your code will not appear in the output of ``perf``.
 
 Since Python 3.12, the interpreter can run in a special mode that allows Python
 functions to appear in the output of the ``perf`` profiler. When this mode is
@@ -28,8 +28,8 @@ relationship between this piece of code and the associated Python function using
 
 .. note::
 
-    Support for the ``perf`` profiler is only currently available for Linux on
-    selected architectures. Check the output of the configure build step or
+    Support for the ``perf`` profiler is currently only available for Linux on
+    select architectures. Check the output of the ``configure`` build step or
     check the output of ``python -m sysconfig | grep HAVE_PERF_TRAMPOLINE``
     to see if your system is supported.
 
@@ -52,11 +52,11 @@ For example, consider the following script:
     if __name__ == "__main__":
         baz(1000000)
 
-We can run ``perf`` to sample CPU stack traces at 9999 Hertz::
+We can run ``perf`` to sample CPU stack traces at 9999 hertz::
 
     $ perf record -F 9999 -g -o perf.data python my_script.py
 
-Then we can use ``perf`` report to analyze the data:
+Then we can use ``perf report`` to analyze the data:
 
 .. code-block:: shell-session
 
@@ -97,7 +97,7 @@ Then we can use ``perf`` report to analyze the data:
     | | | | | |--2.97%--_PyObject_Malloc
     ...
 
-As you can see here, the Python functions are not shown in the output, only ``_Py_Eval_EvalFrameDefault`` appears
+As you can see, the Python functions are not shown in the output, only ``_Py_Eval_EvalFrameDefault``
 (the function that evaluates the Python bytecode) shows up. Unfortunately that's not very useful because all Python
 functions use the same C function to evaluate bytecode so we cannot know which Python function corresponds to which
 bytecode-evaluating function.
@@ -151,7 +151,7 @@ Instead, if we run the same experiment with ``perf`` support enabled we get:
 How to enable ``perf`` profiling support
 ----------------------------------------
 
-``perf`` profiling support can either be enabled from the start using
+``perf`` profiling support can be enabled either from the start using
 the environment variable :envvar:`PYTHONPERFSUPPORT` or the
 :option:`-X perf <-X>` option,
 or dynamically using :func:`sys.activate_stack_trampoline` and
@@ -192,7 +192,7 @@ Example, using the :mod:`sys` APIs in file :file:`example.py`:
 How to obtain the best results
 ------------------------------
 
-For the best results, Python should be compiled with
+For best results, Python should be compiled with
 ``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
 profilers to unwind using only the frame pointer and not on DWARF debug
 information. This is because as the code that is interposed to allow ``perf``