Commit a41238f

Optimize doc builds (#14856)
cudf docs are generally very slow to build. This problem was exacerbated by the recent addition of libcudf C++ API documentation to the Sphinx build. This PR aims to ameliorate this issue for both local and CI builds by making the following changes:

- The XML parsing logic used to clean up doxygen XML now avoids rewriting files unless they are actually modified. This prevents Sphinx from doing extra work during a second (text) build after the first (HTML) build.
- toctrees on the generated API pages are removed (see https://pydata-sphinx-theme.readthedocs.io/en/stable/user_guide/performance.html#selectively-remove-pages-from-your-sidebar).
- Text docs are disabled in PRs and only occur in nightly/branch builds.

The net result is roughly a halving of the CI run time for the builds (~40 min to ~20 min).

Further potential optimizations:

- Reenabling parallel builds. We cannot fully revert #14796 until the theme is fixed, but if we can put in a warning filter we could reenable parallelism and have it work on just the reading steps of the build and not the writes. That would still improve performance.
- Better caching of notebooks. [nbsphinx supports caching](https://myst-nb.readthedocs.io/en/latest/computation/execute.html#execute-cache), but there are various caveats w.r.t. 1) local vs CI builds, 2) proper cache invalidation, e.g. when notebook source does not change but underlying libraries do, and 3) forcing rebuilds. Alternatively, we could enable some environment variable that allows devs to turn off notebook execution locally. Making it opt-in would make the default behavior safe while providing an escape hatch for power users who want the builds to be fast.

Authors:
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- AJ Schmidt (https://github.com/ajschmidt8)
- Lawrence Mitchell (https://github.com/wence-)

URL: #14856
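The opt-in escape hatch for notebook execution suggested in the commit message could look roughly like the following in `conf.py`. This is a minimal sketch and not part of this commit: the environment variable name `CUDF_DOCS_SKIP_NB_EXECUTION` and the helper `notebook_execution_mode` are hypothetical, the `"force"` default is an assumption about the desired behavior, and the sketch relies on MyST-NB's `nb_execution_mode` option.

```python
import os


def notebook_execution_mode(environ=os.environ):
    """Return the value to assign to MyST-NB's ``nb_execution_mode``.

    CUDF_DOCS_SKIP_NB_EXECUTION is a hypothetical variable name used only
    for illustration; it is not defined anywhere in this commit.
    """
    skip = environ.get("CUDF_DOCS_SKIP_NB_EXECUTION", "").strip().lower()
    # Opt-in: notebooks are executed unless a developer explicitly asks to
    # skip them, so CI and default local builds keep the safe behavior.
    # "force" as the default is an assumption, not taken from the commit.
    return "off" if skip in {"1", "true", "yes"} else "force"


# In conf.py this would be a module-level assignment read by MyST-NB.
nb_execution_mode = notebook_execution_mode()
```

With a setup like this, a developer would set the variable in their shell before running `make dirhtml`, and builds without the variable would behave as before.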
1 parent a0c637f commit a41238f

5 files changed: 27 additions & 7 deletions

ci/build_docs.sh

Lines changed: 12 additions & 6 deletions
@@ -41,19 +41,25 @@ popd
 rapids-logger "Build Python docs"
 pushd docs/cudf
 make dirhtml
-make text
-mkdir -p "${RAPIDS_DOCS_DIR}/cudf/"{html,txt}
+mkdir -p "${RAPIDS_DOCS_DIR}/cudf/html"
 mv build/dirhtml/* "${RAPIDS_DOCS_DIR}/cudf/html"
-mv build/text/* "${RAPIDS_DOCS_DIR}/cudf/txt"
+if [[ "${RAPIDS_BUILD_TYPE}" != "pull-request" ]]; then
+  make text
+  mkdir -p "${RAPIDS_DOCS_DIR}/cudf/txt"
+  mv build/text/* "${RAPIDS_DOCS_DIR}/cudf/txt"
+fi
 popd
 
 rapids-logger "Build dask-cuDF Sphinx docs"
 pushd docs/dask_cudf
 make dirhtml
-make text
-mkdir -p "${RAPIDS_DOCS_DIR}/dask-cudf/"{html,txt}
+mkdir -p "${RAPIDS_DOCS_DIR}/dask-cudf/html"
 mv build/dirhtml/* "${RAPIDS_DOCS_DIR}/dask-cudf/html"
-mv build/text/* "${RAPIDS_DOCS_DIR}/dask-cudf/txt"
+if [[ "${RAPIDS_BUILD_TYPE}" != "pull-request" ]]; then
+  make text
+  mkdir -p "${RAPIDS_DOCS_DIR}/dask-cudf/txt"
+  mv build/text/* "${RAPIDS_DOCS_DIR}/dask-cudf/txt"
+fi
 popd
 
 rapids-upload-docs

conda/environments/all_cuda-118_arch-x86_64.yaml

Lines changed: 1 addition & 0 deletions
@@ -93,6 +93,7 @@ dependencies:
 - sphinx-autobuild
 - sphinx-copybutton
 - sphinx-markdown-tables
+- sphinx-remove-toctrees
 - sphinxcontrib-websupport
 - streamz
 - sysroot_linux-64==2.17

conda/environments/all_cuda-120_arch-x86_64.yaml

Lines changed: 1 addition & 0 deletions
@@ -91,6 +91,7 @@ dependencies:
 - sphinx-autobuild
 - sphinx-copybutton
 - sphinx-markdown-tables
+- sphinx-remove-toctrees
 - sphinxcontrib-websupport
 - streamz
 - sysroot_linux-64==2.17

dependencies.yaml

Lines changed: 1 addition & 0 deletions
@@ -472,6 +472,7 @@ dependencies:
           - sphinx-autobuild
           - sphinx-copybutton
           - sphinx-markdown-tables
+          - sphinx-remove-toctrees
           - sphinxcontrib-websupport
   notebooks:
     common:

docs/cudf/source/conf.py

Lines changed: 12 additions & 1 deletion
@@ -16,10 +16,12 @@
 # add these directories to sys.path here. If the directory is relative to the
 # documentation root, use os.path.abspath to make it absolute, like shown here.
 #
+import filecmp
 import glob
 import os
 import re
 import sys
+import tempfile
 import xml.etree.ElementTree as ET
 
 from docutils.nodes import Text
@@ -62,13 +64,16 @@ class PseudoLexer(RegexLexer):
     "sphinx.ext.autodoc",
     "sphinx.ext.autosummary",
     "sphinx_copybutton",
+    "sphinx_remove_toctrees",
     "numpydoc",
     "IPython.sphinxext.ipython_console_highlighting",
     "IPython.sphinxext.ipython_directive",
     "PandasCompat",
     "myst_nb",
 ]
 
+remove_from_toctrees = ["user_guide/api_docs/api/*"]
+
 
 # Preprocess doxygen xml for compatibility with latest Breathe
 def clean_definitions(root):
@@ -126,7 +131,13 @@ def clean_all_xml_files(path):
     for fn in glob.glob(os.path.join(path, "*.xml")):
         tree = ET.parse(fn)
         clean_definitions(tree.getroot())
-        tree.write(fn)
+        with tempfile.NamedTemporaryFile() as tmp_fn:
+            tree.write(tmp_fn.name)
+            # Only write files that have actually changed.
+            if not filecmp.cmp(tmp_fn.name, fn):
+                tree.write(fn)
+
+
 
 
 # Breathe Configuration
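
The compare-before-write step added to `clean_all_xml_files` can also be expressed as a standalone helper. The following is a minimal sketch rather than the code in this commit: `write_xml_if_changed` is a hypothetical name, it writes the candidate output into a `tempfile.TemporaryDirectory` instead of a `NamedTemporaryFile`, and it passes `shallow=False` to `filecmp.cmp` so that the byte-by-byte comparison is explicit.

```python
import filecmp
import os
import shutil
import tempfile
import xml.etree.ElementTree as ET


def write_xml_if_changed(tree, path):
    """Serialize ``tree`` to ``path`` only if the output would differ.

    Returns True if ``path`` was (re)written and False if it was left
    untouched. Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Serialize to a scratch file first so the target is not modified
        # unless its contents would actually change.
        tmp_path = os.path.join(tmp_dir, "candidate.xml")
        tree.write(tmp_path)
        # shallow=False compares file contents instead of os.stat() data.
        if os.path.exists(path) and filecmp.cmp(tmp_path, path, shallow=False):
            return False
        shutil.copyfile(tmp_path, path)
        return True


def clean_all_xml_files_sketch(path, clean):
    """Apply ``clean`` to every XML file in ``path``; rewrite only on change."""
    rewritten = []
    for fn in sorted(os.listdir(path)):
        if fn.endswith(".xml"):
            full = os.path.join(path, fn)
            tree = ET.parse(full)
            clean(tree.getroot())
            if write_xml_if_changed(tree, full):
                rewritten.append(fn)
    return rewritten
```

A file that is left untouched keeps its modification time, which, per the commit message, is what spares Sphinx the extra work in a second build.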
