xdoctest.parser module

The XDoctest Parser

This module parses a docstring into one or more “doctest parts” after the docstrings have been extracted from the source code by either static or dynamic means.

Terms and definitions:

logical block:
a snippet of code that can be executed by itself if given the correct global / local variable context.

PS1:
The original meaning is “Prompt String 1”. For details see: [SE32096] [BashPS1] [CustomPrompt] [GeekPrompt]. In the context of xdoctest, instead of referring to the prompt prefix, we use PS1 to refer to a line that starts a “logical block” of code. In the original doctest module these all had to be prefixed with “>>>”. In xdoctest the prefix simply denotes that the code is part of a doctest; it does not necessarily mean that a new “logical block” is starting.

PS2:
The original meaning is “Prompt String 2”. In the context of xdoctest, instead of referring to the prompt prefix, we use PS2 to refer to a line that continues a “logical block” of code. In the original doctest module these all had to be prefixed with “...”. However, xdoctest uses parsing to determine this automatically.

want statement:
Lines directly after a logical block of code in a doctest indicating the desired result of executing the previous block.
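
To make the relationship between a logical block and its want statement concrete, here is a minimal sketch that uses only the standard library. The helper name run_block is hypothetical and is not part of xdoctest; it simply executes one block in a shared namespace and captures what the block prints, so the output can be compared with a want string.

```python
import contextlib
import io


def run_block(source: str, namespace: dict) -> str:
    """Execute one logical block and return the text it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(compile(source, "<doctest>", "exec"), namespace)
    return buffer.getvalue()


# Three logical blocks that share one namespace (the variable context).
namespace = {}
run_block("j = 0", namespace)
run_block("for i in range(10):\n    j += 1", namespace)
got = run_block("print(j)", namespace)

# The want statement is the text expected after the preceding block.
want = "10"
matches = got.rstrip("\n") == want
```

Each block only runs correctly because the earlier blocks already populated the namespace, which is what “the correct global / local variable context” refers to.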

While I do believe this AST-based code is a significant improvement over the RE-based builtin doctest parser, I acknowledge that I’m not an AST expert and there is room for improvement here.

References

[SE32096] https://unix.stackexchange.com/questions/32096/why-is-bashs-prompt-variable-called-ps1

[BashPS1] https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#index-PS1

[CustomPrompt] https://wiki.archlinux.org/title/Bash/Prompt_customization

[GeekPrompt] https://web.archive.org/web/20230824025647/https://www.thegeekstuff.com/2008/09/bash-shell-take-control-of-ps1-ps2-ps3-ps4-and-prompt_command/

class xdoctest.parser.DoctestParser(simulate_repl: bool = False)

Bases: object

Breaks docstrings into parts using the parse method.

Example

>>> from xdoctest.parser import *  # NOQA
>>> parser = DoctestParser()
>>> doctest_parts = parser.parse(
>>>     '''
>>>     >>> j = 0
>>>     >>> for i in range(10):
>>>     >>>     j += 1
>>>     >>> print(j)
>>>     10
>>>     '''.lstrip('\n'))
>>> print('\n'.join(list(map(str, doctest_parts))))
<DoctestPart(ln 0, src="j = 0...", want=None)>
<DoctestPart(ln 3, src="print(j)...", want="10...")>

Example

>>> # Having multiline strings in doctests can be nice
>>> string = utils.codeblock(
        '''
        >>> name = 'name'
        'anything'
        ''')
>>> self = DoctestParser()
>>> doctest_parts = self.parse(string)
>>> print('\n'.join(list(map(str, doctest_parts))))

Parameters: simulate_repl (bool) – If True, each line is treated as its own doctest, which more closely mimics the original doctest module. Defaults to False.

parse(string: str, info: dict | None = None) → list[DoctestPart | str]

Divide the given string into examples and interleaving text.

Parameters:

string (str) – The docstring that may contain one or more doctests.
info (dict | None) – info about where the string came from in case of an error

Returns:

a list of DoctestPart objects and intervening text in the input docstring.

Return type:

List[xdoctest.doctest_part.DoctestPart | str]

CommandLine

python -m xdoctest.parser DoctestParser.parse

Example

>>> docstr = '''
>>>     A simple docstring contains text followed by an example.
>>>     >>> numbers = [1, 2, 3, 4]
>>>     >>> thirds = [x / 3 for x in numbers]
>>>     >>> print(thirds)
>>>     [0.33  0.66  1  1.33]
>>> '''
>>> from xdoctest import parser
>>> self = parser.DoctestParser()
>>> results = self.parse(docstr)
>>> assert len(results) == 3
>>> for index, result in enumerate(results):
>>>     print(f'results[{index}] = {result!r}')
results[0] = '\nA simple docstring contains text followed by an example.'
results[1] = <DoctestPart(ln 2, src="numbers ...", want=None) at ...>
results[2] = <DoctestPart(ln 4, src="print(th...", want="[0.33  0...") at ...>

Example

>>> s = 'I am a dummy example with two parts'
>>> x = 10
>>> print(s)
I am a dummy example with two parts
>>> s = 'My purpose is to demonstrate how wants work here'
>>> print('The new want applies ONLY to stdout')
>>> print('given before the last want')
>>> '''
    this wont hurt the test at all
    even though its multiline '''
>>> y = 20
The new want applies ONLY to stdout
given before the last want
>>> # Parts from previous examples are executed in the same context
>>> print(x + y)
30

This is simply text and doesn't apply to the previous doctest; the <BLANKLINE> directive is still in effect.

Example

>>> from xdoctest.parser import *  # NOQA
>>> from xdoctest import parser
>>> from xdoctest.docstr import docscrape_google
>>> from xdoctest import core
>>> self = parser.DoctestParser()
>>> docstr = self.parse.__doc__
>>> blocks = docscrape_google.split_google_docblocks(docstr)
>>> doclineno = self.parse.__func__.__code__.co_firstlineno
>>> key, (string, offset) = blocks[-2]
>>> self._label_docsrc_lines(string)
>>> doctest_parts = self.parse(string)
>>> # each part with a want-string needs to be broken in two
>>> assert len(doctest_parts) == 6
>>> len(doctest_parts)

_package_groups(grouped_lines)

_package_chunk(raw_source_lines, raw_want_lines, lineno=0)

If self.simulate_repl is True, each statement is broken into its own part. Otherwise, statements are grouped by the closest want statement.
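
The difference between the two packaging strategies can be illustrated with a small standard-library sketch. The function split_statements is a hypothetical helper, not xdoctest's implementation; it assumes Python 3.8+ (for end_lineno) and at most one statement per line.

```python
import ast


def split_statements(source: str) -> list[str]:
    """Return the source text of each top-level statement, in order."""
    lines = source.splitlines()
    tree = ast.parse(source)
    return [
        "\n".join(lines[node.lineno - 1 : node.end_lineno])
        for node in tree.body
    ]


source = "x = 1\ny = [x,\n     2]\nprint(y)"

# REPL-like packaging: every statement becomes its own part.
per_statement = split_statements(source)

# Grouped packaging: all statements up to the next want stay together.
grouped = [source]
```

Here per_statement holds three entries (the multi-line list assignment stays whole), while grouped holds a single entry that would be paired with whatever want follows it.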

Todo

[ ] EXCEPT IN CASES OF EXPLICIT CONTINUATION

Example

>>> from xdoctest.parser import *
>>> raw_source_lines = ['>>> "string"']
>>> raw_want_lines = ['string']
>>> self = DoctestParser()
>>> part, = self._package_chunk(raw_source_lines, raw_want_lines)
>>> part.source
'"string"'
>>> part.want
'string'

_group_labeled_lines(labeled_lines) → list[list | tuple | str]

Group labeled lines into logical parts to be executed together.

Returns: A list of parts. Text parts are just returned as a list of lines. Executable parts are returned as a tuple of source lines and an optional “want” statement.
Return type: List[List[str] | Tuple[List[str], str]]
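
As a rough illustration of this grouping, the sketch below walks over (label, line) pairs and emits text parts as lists and executable parts as (source_lines, want) tuples. The function group_labeled is hypothetical, and representing a missing want as None is an assumption made for this sketch; the label names are taken from the _label_docsrc_lines example on this page.

```python
def group_labeled(labeled_lines):
    """Group (label, line) pairs into text parts and (source, want) parts."""
    groups = []
    index = 0
    total = len(labeled_lines)

    def take(labels):
        # Collect consecutive lines whose label is in `labels`.
        nonlocal index
        taken = []
        while index < total and labeled_lines[index][0] in labels:
            taken.append(labeled_lines[index][1])
            index += 1
        return taken

    while index < total:
        text = take({"text"})
        if text:
            groups.append(text)
            continue
        source = take({"dsrc", "dcnt"})
        want = take({"want"})
        if not source and not want:
            raise ValueError(f"unknown label: {labeled_lines[index][0]!r}")
        # Assumption: a part without a want carries None in that slot.
        groups.append((source, "\n".join(want) if want else None))
    return groups


labeled = [
    ("text", "intro"),
    ("dsrc", ">>> x = 1"),
    ("dsrc", ">>> print(x)"),
    ("want", "1"),
    ("text", "outro"),
]
groups = group_labeled(labeled)
```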

_locate_ps1_linenos(source_lines: list[str]) → tuple[list[int], str]

Determines which lines in the source begin a “logical block” of code.

Parameters: source_lines (List[str]) – lines belonging only to the doctest source. These will be unindented, prefixed, and without any want.
Returns: a tuple (linenos, mode_hint). linenos is a list of indices indicating which lines are considered “PS1”. mode_hint is a string ('eval', 'exec', or 'single' in the examples for this method) indicating whether the final line should be considered for a got/want assertion.
Return type: Tuple[List[int], str]
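
The general idea of locating block starts with the ast module can be sketched as follows. This is a simplified stand-in rather than the method's actual logic: locate_block_starts is a hypothetical name, it assumes every line carries a 4-character prefix, it ignores comment-only lines, and it only distinguishes 'eval' from 'exec'.

```python
import ast


def locate_block_starts(source_lines):
    """Return (linenos, mode_hint) for prefixed doctest source lines."""
    # Drop the 4-character '>>> ' or '... ' prefix from every line.
    code = "\n".join(line[4:] for line in source_lines)
    tree = ast.parse(code)
    linenos = []
    for node in tree.body:
        # A decorated definition starts at its first decorator.
        decorators = getattr(node, "decorator_list", [])
        start = min([node.lineno] + [d.lineno for d in decorators])
        linenos.append(start - 1)  # ast line numbers are 1-based
    # Assumption: a trailing expression is worth comparing against a want.
    ends_with_expr = bool(tree.body) and isinstance(tree.body[-1], ast.Expr)
    mode_hint = "eval" if ends_with_expr else "exec"
    return linenos, mode_hint


source_lines = [">>> def foo():", ">>>     return 0", ">>> 3"]
linenos, mode_hint = locate_block_starts(source_lines)
```

Because the syntax tree knows where each top-level statement begins, the “return 0” line is recognised as a continuation of the function definition without needing a “...” prefix.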

Example

>>> self = DoctestParser()
>>> source_lines = ['>>> def foo():', '>>>     return 0', '>>> 3']
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> assert linenos == [0, 2]
>>> assert mode_hint == 'eval'

Example

>>> from xdoctest.parser import *  # NOQA
>>> self = DoctestParser()
>>> source_lines = ['>>> x = [1, 2, ', '>>> 3, 4]', '>>> print(len(x))']
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> assert linenos == [0, 2]
>>> assert mode_hint == 'eval'

Example

>>> from xdoctest.parser import *  # NOQA
>>> self = DoctestParser()
>>> source_lines = [
>>>    '>>> x = 1',
>>>    '>>> try: raise Exception',
>>>    '>>> except Exception: pass',
>>>    '...',
>>> ]
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> assert linenos == [0, 1]
>>> assert mode_hint == 'exec'

Example

>>> from xdoctest.parser import *  # NOQA
>>> self = DoctestParser()
>>> source_lines = [
>>>    '>>> import os; print(os)',
>>>    '...',
>>> ]
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> assert linenos == [0]
>>> assert mode_hint == 'single'

Example

>>> # We should ensure that decorators are PS1 lines
>>> from xdoctest.parser import *  # NOQA
>>> self = DoctestParser()
>>> source_lines = [
>>>    '>>> # foo',
>>>    '>>> @foo',
>>>    '... def bar():',
>>>    '...     ...',
>>> ]
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> print(f'linenos={linenos}')
>>> assert linenos == [0, 1]
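
The mode_hint values asserted in the examples above ('eval', 'exec', and 'single') share their names with the modes accepted by Python's built-in compile(). The sketch below only demonstrates what those three modes mean in plain Python; how xdoctest consumes the hint internally is not shown here.

```python
# 'eval' compiles a single expression; eval() returns its value.
value = eval(compile("3", "<doctest>", "eval"))

# 'exec' compiles any sequence of statements; nothing is returned.
exec_ns = {}
exec_source = "x = 1\ntry: raise Exception\nexcept Exception: pass"
exec(compile(exec_source, "<doctest>", "exec"), exec_ns)

# 'single' compiles one interactive statement, which may chain simple
# statements with semicolons on a single line.
single_ns = {}
exec(compile("a = 1; b = a + 1", "<doctest>", "single"), single_ns)
```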

_label_docsrc_lines(string: str) → list[tuple[str, str]]

Give each line in the docstring a label so we can distinguish which parts are text, which parts are code, and which parts are “want” strings.

Parameters:

string (str) – doctest source

Returns:

labeled_lines: the above source broken up by lines, each with a label indicating its type for later use in parsing.

Return type:

List[Tuple[str, str]]
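
A deliberately naive version of this labeling step is sketched below, using the label names that appear in the example for this method ('text', 'dsrc', 'dcnt', 'want'). The function label_lines is hypothetical and relies purely on prefixes and the previous label, so unlike the real method it does not handle unprefixed continuation lines such as multi-line strings.

```python
def label_lines(docstring: str) -> list[tuple[str, str]]:
    """Label each line as 'text', 'dsrc', 'dcnt', or 'want'."""
    labeled = []
    previous = "text"
    for line in docstring.splitlines():
        stripped = line.strip()
        if stripped.startswith(">>>"):
            label = "dsrc"  # doctest source
        elif stripped.startswith("...") and previous in ("dsrc", "dcnt"):
            label = "dcnt"  # explicit continuation of source
        elif stripped and previous in ("dsrc", "dcnt", "want"):
            label = "want"  # non-blank text right after code
        else:
            label = "text"  # everything else, including blank lines
        labeled.append((label, line))
        previous = label
    return labeled


labeled = label_lines("text\n>>> x = 1\n... print(x)\n1\n\nmore text")
```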

Todo

[ ] Sphinx does not parse this doctest properly

Example

>>> from xdoctest.parser import *
>>> # Having multiline strings in doctests can be nice
>>> string = utils.codeblock(
        '''
        text
        >>> items = ['also', 'nice', 'to', 'not', 'worry',
        >>>          'about', '...', 'vs', '>>>']
        ... print('but its still allowed')
        but its still allowed

        more text
        ''')
>>> self = DoctestParser()
>>> labeled = self._label_docsrc_lines(string)
>>> expected = [
>>>     ('text', 'text'),
>>>     ('dsrc', ">>> items = ['also', 'nice', 'to', 'not', 'worry',"),
>>>     ('dsrc', ">>>          'about', '...', 'vs', '>>>']"),
>>>     ('dcnt', "... print('but its still allowed')"),
>>>     ('want', 'but its still allowed'),
>>>     ('text', ''),
>>>     ('text', 'more text')
>>> ]
>>> assert labeled == expected

xdoctest.parser._min_indentation(s)

Return the minimum indentation of any non-blank line in s.
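
One straightforward way to compute such a value is sketched here. The name min_indentation and the choice to return 0 when there are no non-blank lines are assumptions of this sketch, not documented behavior of the private helper.

```python
def min_indentation(s: str) -> int:
    """Return the smallest leading-whitespace width among non-blank lines."""
    indents = [
        len(line) - len(line.lstrip())
        for line in s.splitlines()
        if line.strip()
    ]
    # Assumption: a string with no non-blank lines has indentation 0.
    return min(indents) if indents else 0


block = "    >>> x = 1\n\n      >>> y = 2\n  text"
smallest = min_indentation(block)
```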

xdoctest.parser._complete_source(line, state_indent, line_iter)

Helper that removes lines from the iterator if they are needed to complete the source.

This uses static.is_balanced_statement() to do the heavy lifting.
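
The underlying idea, consuming lines from an iterator until the collected source forms a complete statement, can be sketched with the standard library alone. Both function names below are hypothetical, prefixes and indentation state are left out, and ast.parse stands in for the balanced-statement check, so a genuinely invalid line would also be treated as “incomplete” here.

```python
import ast


def is_parseable(lines):
    """Return True if the joined lines form syntactically complete code."""
    try:
        ast.parse("\n".join(lines))
    except SyntaxError:
        return False
    return True


def complete_statement(first_line, line_iter):
    """Pull lines from line_iter until the collected source parses."""
    collected = [first_line]
    while not is_parseable(collected):
        try:
            collected.append(next(line_iter))
        except StopIteration:
            break  # ran out of lines; return what was collected
    return collected


remaining = iter(["1:2,", "3:4,", "5:6}", "y = 7"])
block = complete_statement("x = {  # The line is not finished", remaining)
leftover = list(remaining)
```

Only the lines needed to close the dictionary literal are consumed; the following statement stays in the iterator for the caller.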

Example

>>> from xdoctest.parser import *  # NOQA
>>> from xdoctest.parser import _complete_source
>>> state_indent = 0
>>> line = '>>> x = { # The line is not finished'
>>> remain_lines = ['>>> 1:2,', '>>> 3:4,', '>>> 5:6}', '>>> y = 7']
>>> line_iter = enumerate(remain_lines, start=1)
>>> finished = list(_complete_source(line, state_indent, line_iter))
>>> final = chr(10).join([t[1] for t in finished])
>>> print(final)

xdoctest.parser._iterthree(items, pad_value=None)

Iterate over a sliding window of size 3, padded on both sides with pad_value (None by default).
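
A compact way to build such a padded window is shown below. This is an illustrative sketch under the assumption that exactly one pad value is added on each side; iter_three is a hypothetical name, and the outputs of the private helper itself are not reproduced here.

```python
def iter_three(items, pad_value=None):
    """Yield (previous, current, next) triples, padding both ends."""
    padded = [pad_value, *items, pad_value]
    # Zipping three staggered views produces the sliding window.
    return zip(padded, padded[1:], padded[2:])


windows = list(iter_three([1, 2, 3]))
```

Each item appears once in the middle position, so a caller can inspect its neighbours without special-casing the first and last elements.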

Example

>>> from xdoctest.parser import *
>>> print(list(_iterthree([])))
>>> print(list(_iterthree(range(1))))
>>> print(list(_iterthree([1, 2])))
>>> print(list(_iterthree([1, 2, 3])))
>>> print(list(_iterthree(range(4))))
>>> print(list(_iterthree(range(7))))

xdoctest.parser._hasprefix(line, prefixes) → bool

Helper prefix test.
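
The description above does not spell out the matching rule, so the following is only one plausible sketch. It assumes that a line matches when it is exactly a prefix or begins with a prefix followed by a space; the name has_prefix and that rule are both assumptions.

```python
def has_prefix(line: str, prefixes) -> bool:
    """Return True if line is a bare prefix or starts with 'prefix '."""
    # Assumption: the prefix must be followed by a space unless it is
    # the entire line, so '...more' does not count as a '...' line.
    return any(line == p or line.startswith(p + " ") for p in prefixes)


is_source = has_prefix(">>> x = 1", [">>>", "..."])
is_bare = has_prefix("...", [">>>", "..."])
is_false_positive = has_prefix("...more", [">>>", "..."])
```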