Rewrite the "Docutils markup API" page #12505

Merged
4 changes: 3 additions & 1 deletion doc/conf.py
@@ -185,8 +185,10 @@
('py:class', 'Node'), # sphinx.domains.Domain
('py:class', 'NullTranslations'), # gettext.NullTranslations
('py:class', 'RoleFunction'), # sphinx.domains.Domain
('py:class', 'RSTState'), # sphinx.utils.parsing.nested_parse_to_nodes
('py:class', 'Theme'), # sphinx.application.TemplateBridge
('py:class', 'system_message'), # sphinx.utils.docutils
('py:class', 'StringList'), # sphinx.utils.parsing.nested_parse_to_nodes
('py:class', 'system_message'), # sphinx.utils.docutils.SphinxDirective
('py:class', 'TitleGetter'), # sphinx.domains.Domain
('py:class', 'XRefRole'), # sphinx.domains.Domain
('py:class', 'docutils.nodes.Element'),
161 changes: 120 additions & 41 deletions doc/extdev/markupapi.rst
@@ -1,12 +1,53 @@
Docutils markup API
===================

This section describes the API for adding ReST markup elements (roles and
directives).
This section describes the API for adding reStructuredText markup elements
(roles and directives).


Roles
-----

Roles follow the interface described below.
They have to be registered by an extension using
:meth:`.Sphinx.add_role` or :meth:`.Sphinx.add_role_to_domain`.


.. code-block:: python

def role_function(
    role_name: str, raw_source: str, text: str,
    lineno: int, inliner: Inliner,
    options: dict = {}, content: list = [],
) -> tuple[list[Node], list[system_message]]:
    elements = []
    messages = []
    return elements, messages

The *options* and *content* parameters are only used for custom roles
created via the :dudir:`role` directive.
The return value is a tuple of two lists,
the first containing the text nodes and elements from the role,
and the second containing any system messages generated.
For more information, see the `custom role overview`_ from Docutils.

.. _custom role overview: https://docutils.sourceforge.io/docs/howto/rst-roles.html


Creating custom roles
^^^^^^^^^^^^^^^^^^^^^

Sphinx provides two base classes for creating custom roles,
:class:`~sphinx.util.docutils.SphinxRole` and :class:`~sphinx.util.docutils.ReferenceRole`.

These provide a class-based interface for creating roles,
where the main logic must be implemented in your ``run()`` method.
The classes provide a number of useful methods and attributes,
such as ``self.text``, ``self.config``, and ``self.env``.
The ``ReferenceRole`` class implements Sphinx's ``title <target>`` logic,
exposing ``self.target`` and ``self.title`` attributes.
This is useful for creating cross-reference roles.
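
The ``title <target>`` convention can be illustrated with a small standalone sketch. This is only an approximation of the idea, not Sphinx's implementation: the helper name ``split_title_and_target`` and the regular expression are assumptions made for illustration, and the real ``ReferenceRole`` may handle escaping and edge cases differently.

```python
import re

# Hypothetical pattern for "title <target>"; an approximation, not Sphinx's own.
_EXPLICIT_TITLE = re.compile(r'^(.+?)\s*<(.*?)>$', re.DOTALL)


def split_title_and_target(text: str) -> tuple[bool, str, str]:
    """Return (has_explicit_title, title, target) for a role's text."""
    match = _EXPLICIT_TITLE.match(text)
    if match:
        # "title <target>": the two parts are given separately.
        return True, match.group(1), match.group(2)
    # No explicit title: the text serves as both title and target.
    return False, text, text


explicit = split_title_and_target('The docs <https://example.org>')
implicit = split_title_and_target('target')
```

In a ``ReferenceRole`` subclass, the corresponding values are already available as ``self.title`` and ``self.target`` by the time ``run()`` is called, so a role normally does not need to do this split itself.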


Directives
----------
@@ -85,68 +126,106 @@ using :meth:`.Sphinx.add_directive` or :meth:`.Sphinx.add_directive_to_domain`.
The state and state machine which controls the parsing. Used for
``nested_parse``.

.. seealso::

`Creating directives`_ HOWTO of the Docutils documentation

.. _Creating directives: https://docutils.sourceforge.io/docs/howto/rst-directives.html


.. _parsing-directive-content-as-rest:

Parsing directive content as reStructuredText
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

ViewLists
^^^^^^^^^
Many directives will contain more markup that must be parsed.
To do this, use one of the following APIs from the :meth:`~Directive.run` method:

Docutils represents document source lines in a class
``docutils.statemachine.ViewList``. This is a list with extended functionality
-- for one, slicing creates views of the original list, and also the list
contains information about the source line numbers.
* :py:meth:`.SphinxDirective.parse_content_to_nodes()`
* :py:meth:`.SphinxDirective.parse_text_to_nodes()`

The :attr:`Directive.content` attribute is a ViewList. If you generate content
to be parsed as ReST, you have to create a ViewList yourself. Important for
content generation are the following points:
The first method parses all the directive's content as markup,
whilst the second only parses the given *text* string.
Both methods return the parsed Docutils nodes in a list.

* The constructor takes a list of strings (lines) and a source (document) name.
The methods are used as follows:

* The ``.append()`` method takes a line and a source name as well.
.. code-block:: python

def run(self) -> list[Node]:
    # either
    parsed = self.parse_content_to_nodes()
    # or
    parsed = self.parse_text_to_nodes('spam spam spam')
    return parsed

Parsing directive content as ReST
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. note::

The above utility methods were added in Sphinx 7.4.
Prior to Sphinx 7.4, the following methods should be used to parse content:

Many directives will contain more markup that must be parsed. To do this, use
one of the following APIs from the :meth:`Directive.run` method:
* ``self.state.nested_parse``
* :func:`sphinx.util.nodes.nested_parse_with_titles` -- this allows titles in
the parsed content.

* ``self.state.nested_parse``
* :func:`sphinx.util.nodes.nested_parse_with_titles` -- this allows titles in
the parsed content.
.. code-block:: python

Both APIs parse the content into a given node. They are used like this::
def run(self) -> list[Node]:
    container = docutils.nodes.Element()
    # either
    nested_parse_with_titles(self.state, self.result, container)
    # or
    self.state.nested_parse(self.result, 0, container)
    parsed = container.children
    return parsed

node = docutils.nodes.paragraph()
# either
nested_parse_with_titles(self.state, self.result, node)
# or
self.state.nested_parse(self.result, 0, node)
To parse inline markup,
use :py:meth:`~sphinx.util.docutils.SphinxDirective.parse_inline()`.
This must only be used for text which is a single line or paragraph,
and does not contain any structural elements
(headings, transitions, directives, etc).

.. note::

``sphinx.util.docutils.switch_source_input()`` allows to change a target file
during nested_parse. It is useful to mixed contents.
For example, ``sphinx.ext.autodoc`` uses it to parse docstrings::
``sphinx.util.docutils.switch_source_input()`` allows changing
the source (input) file during parsing content in a directive.
It is useful to parse mixed content, such as in ``sphinx.ext.autodoc``,
where it is used to parse docstrings.

.. code-block:: python

from sphinx.util.docutils import switch_source_input
from sphinx.util.docutils import switch_source_input
from sphinx.util.parsing import nested_parse_to_nodes

# Switch source_input between parsing content.
# Inside this context, all parsing errors and warnings are reported as
# happened in new source_input (in this case, ``self.result``).
with switch_source_input(self.state, self.result):
    node = docutils.nodes.paragraph()
    self.state.nested_parse(self.result, 0, node)
# Switch source_input between parsing content.
# Inside this context, all parsing errors and warnings are reported as
# happened in new source_input (in this case, ``self.result``).
with switch_source_input(self.state, self.result):
    parsed = nested_parse_to_nodes(self.state, self.result)

.. deprecated:: 1.7

Until Sphinx 1.6, ``sphinx.ext.autodoc.AutodocReporter`` was used for this
purpose. It is replaced by ``switch_source_input()``.

If you don't need the wrapping node, you can use any concrete node type and
return ``node.children`` from the Directive.

.. _ViewLists:

.. seealso::
ViewLists and StringLists
^^^^^^^^^^^^^^^^^^^^^^^^^

`Creating directives`_ HOWTO of the Docutils documentation
Docutils represents document source lines in a ``StringList`` class,
which inherits from ``ViewList``, both in the ``docutils.statemachine`` module.
This is a list with extended functionality,
including that slicing creates views of the original list and
that the list contains information about source line numbers.

The :attr:`Directive.content` attribute is a ``StringList``.
If you generate content to be parsed as reStructuredText,
you have to create a ``StringList`` for the Docutils APIs.
The utility functions provided by Sphinx handle this automatically.
Important for content generation are the following points:

.. _Creating directives: https://docutils.sourceforge.io/docs/howto/rst-directives.html
* The ``ViewList`` constructor takes a list of strings (lines)
and a source (document) name.
* The ``ViewList.append()`` method takes a line and a source name as well.
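
The points above can be sketched as follows. This is a minimal example that assumes Docutils is installed; the source name ``generated.rst`` and the line contents are placeholders chosen for illustration.

```python
from docutils.statemachine import StringList

# Build a StringList from a list of lines and a source (document) name.
lines = StringList(['First line.', 'Second line.'], source='generated.rst')

# append() takes the line, the source name, and the line offset in that source.
lines.append('Third line.', 'generated.rst', 2)

# Slicing creates a view that keeps the source information of the original.
view = lines[1:]

# info() returns the (source, offset) pair recorded for a given index.
source, offset = lines.info(2)
view_source, view_offset = view.info(0)
```

When using the Sphinx utility methods such as ``parse_text_to_nodes()``, this bookkeeping is done for you, so building a ``StringList`` by hand is mainly needed when calling the Docutils APIs directly.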
9 changes: 9 additions & 0 deletions doc/extdev/utils.rst
@@ -3,6 +3,7 @@ Utilities

Sphinx provides utility classes and functions to develop extensions.


Base classes for components
---------------------------

@@ -30,12 +31,20 @@ components (e.g. :class:`.Config`, :class:`.BuildEnvironment` and so on) easily.
.. autoclass:: sphinx.transforms.post_transforms.images.ImageConverter
:members:


Utility components
------------------

.. autoclass:: sphinx.events.EventManager
:members:


Utility functions
-----------------

.. autofunction:: sphinx.util.parsing.nested_parse_to_nodes


Utility types
-------------

6 changes: 6 additions & 0 deletions sphinx/util/docutils.py
@@ -444,6 +444,8 @@ def parse_content_to_nodes(self, allow_section_headings: bool = False) -> list[N
an incoherent doctree. In Docutils, section nodes should
only be children of ``Structural`` nodes, which includes
``document``, ``section``, and ``sidebar`` nodes.

.. versionadded:: 7.4
"""
return nested_parse_to_nodes(
self.state,
@@ -468,6 +470,8 @@ def parse_text_to_nodes(
``document``, ``section``, and ``sidebar`` nodes.
:param offset:
The offset of the content.

.. versionadded:: 7.4
"""
if offset == -1:
offset = self.content_offset
@@ -491,6 +495,8 @@ def parse_inline(
The line number where the interpreted text begins.
:returns:
A list of nodes (text and inline elements) and a list of system_messages.

.. versionadded:: 7.4
"""
if lineno == -1:
lineno = self.lineno
10 changes: 6 additions & 4 deletions sphinx/util/parsing.py
@@ -5,7 +5,7 @@
import contextlib
from typing import TYPE_CHECKING

from docutils import nodes
from docutils.nodes import Element, Node
from docutils.statemachine import StringList, string2lines

if TYPE_CHECKING:
@@ -22,7 +22,7 @@ def nested_parse_to_nodes(
offset: int = 0,
allow_section_headings: bool = True,
keep_title_context: bool = False,
) -> list[nodes.Node]: # Element | nodes.Text
) -> list[Node]: # Element | nodes.Text
"""Parse *text* into nodes.

:param state:
@@ -47,13 +47,15 @@ def nested_parse_to_nodes(
This is useful when the parsed content comes from
a completely different context, such as docstrings.
If this is True, then title underlines must match those in
the surrounding document, otherwise errors will occur. TODO: check!
the surrounding document, otherwise the behaviour is undefined.

.. versionadded:: 7.4
"""
document = state.document
content = _text_to_string_list(
text, source=source, tab_width=document.settings.tab_width,
)
node = nodes.Element() # Anonymous container for parsing
node = Element() # Anonymous container for parsing
node.document = document

if keep_title_context: