xdoctest.parser module
The XDoctest Parser
This module parses a docstring into one or more “doctest parts” after the docstrings have been extracted from the source code by either static or dynamic means.
Terms and definitions:
- logical block:
a snippet of code that can be executed by itself if given the correct global / local variable context.
- PS1:
The original meaning is “Prompt String 1”. For details see: [SE32096] [BashPS1] [CustomPrompt] [GeekPrompt]. In the context of xdoctest, instead of referring to the prompt prefix, we use PS1 to refer to a line that starts a “logical block” of code. In the original doctest module these all had to be prefixed with “>>>”. In xdoctest the prefix is used to simply denote the code is part of a doctest. It does not necessarily mean a new “logical block” is starting.
- PS2:
The original meaning is “Prompt String 2”. In the context of xdoctest, instead of referring to the prompt prefix, we use PS2 to refer to a line that continues a “logical block” of code. In the original doctest module these all had to be prefixed with “...”. However, xdoctest uses parsing to automatically determine this.
- want statement:
Lines directly after a logical block of code in a doctest indicating the desired result of executing the previous block.
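For comparison, the terms above can be observed in the standard library's prefix-based parser. The sketch below uses only the stdlib `doctest.DocTestParser` (not xdoctest), and the sample docstring is made up for illustration: the `>>>` line plays the PS1 role, the `...` line the PS2 role, and the line after the second block is the want statement.

```python
import doctest

# A made-up docstring: one logical block spanning two lines, followed by a
# second block that has a want statement.
docstring = "\n".join([
    "Some explanatory text.",
    "",
    ">>> x = [1,",
    "...      2]",
    ">>> print(len(x))",
    "2",
    "",
])

examples = doctest.DocTestParser().get_examples(docstring)
for example in examples:
    # source holds one logical block; want holds its expected output.
    print(repr(example.source), repr(example.want))
```

Here the stdlib parser depends on the explicit `...` prefix to know that the second line continues the first block, which is the requirement xdoctest relaxes.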
While I do believe this AST-based code is a significant improvement over the RE-based builtin doctest parser, I acknowledge that I’m not an AST expert and there is room for improvement here.
References
- class xdoctest.parser.DoctestParser(simulate_repl=False)
Bases: object
Breaks docstrings into parts using the parse method.
Example
>>> from xdoctest.parser import * # NOQA
>>> parser = DoctestParser()
>>> doctest_parts = parser.parse(
>>>     '''
>>>     >>> j = 0
>>>     >>> for i in range(10):
>>>     >>>     j += 1
>>>     >>> print(j)
>>>     10
>>>     '''.lstrip('\n'))
>>> print('\n'.join(list(map(str, doctest_parts))))
<DoctestPart(ln 0, src="j = 0...", want=None)>
<DoctestPart(ln 3, src="print(j)...", want="10...")>
Example
>>> # Having multiline strings in doctests can be nice
>>> string = utils.codeblock(
        '''
        >>> name = 'name'
        'anything'
        ''')
>>> self = DoctestParser()
>>> doctest_parts = self.parse(string)
>>> print('\n'.join(list(map(str, doctest_parts))))
- Parameters:
simulate_repl (bool) – if True each line will be treated as its own doctest. This more closely mimics the original doctest module. Defaults to False.
- parse(string, info=None)
Divide the given string into examples and interleaving text.
- Parameters:
string (str) – The docstring that may contain one or more doctests.
info (dict | None) – info about where the string came from in case of an error
- Returns:
a list of DoctestPart objects and intervening text in the input docstring.
- Return type:
List[xdoctest.doctest_part.DoctestPart | str]
CommandLine
python -m xdoctest.parser DoctestParser.parse
Example
>>> docstr = '''
>>>     A simple docstring contains text followed by an example.
>>>     >>> numbers = [1, 2, 3, 4]
>>>     >>> thirds = [x / 3 for x in numbers]
>>>     >>> print(thirds)
>>>     [0.33 0.66 1 1.33]
>>> '''
>>> from xdoctest import parser
>>> self = parser.DoctestParser()
>>> results = self.parse(docstr)
>>> assert len(results) == 3
>>> for index, result in enumerate(results):
>>>     print(f'results[{index}] = {result!r}')
results[0] = '\nA simple docstring contains text followed by an example.'
results[1] = <DoctestPart(ln 2, src="numbers ...", want=None) at ...>
results[2] = <DoctestPart(ln 4, src="print(th...", want="[0.33 0...") at ...>
Example
>>> s = 'I am a dummy example with two parts'
>>> x = 10
>>> print(s)
I am a dummy example with two parts
>>> s = 'My purpose it so demonstrate how wants work here'
>>> print('The new want applies ONLY to stdout')
>>> print('given before the last want')
>>> '''
    this wont hurt the test at all
    even though its multiline '''
>>> y = 20
The new want applies ONLY to stdout
given before the last want
>>> # Parts from previous examples are executed in the same context
>>> print(x + y)
30
this is simply text, and doesnt apply to the previous doctest the <BLANKLINE> directive is still in effect.
Example
>>> from xdoctest.parser import * # NOQA
>>> from xdoctest import parser
>>> from xdoctest.docstr import docscrape_google
>>> from xdoctest import core
>>> self = parser.DoctestParser()
>>> docstr = self.parse.__doc__
>>> blocks = docscrape_google.split_google_docblocks(docstr)
>>> doclineno = self.parse.__func__.__code__.co_firstlineno
>>> key, (string, offset) = blocks[-2]
>>> self._label_docsrc_lines(string)
>>> doctest_parts = self.parse(string)
>>> # each part with a want-string needs to be broken in two
>>> assert len(doctest_parts) == 6
>>> len(doctest_parts)
- _package_chunk(raw_source_lines, raw_want_lines, lineno=0)
If self.simulate_repl is True, then each statement is broken into its own part. Otherwise, statements are grouped by the closest want statement.
Todo
[ ] EXCEPT IN CASES OF EXPLICIT CONTINUATION
Example
>>> from xdoctest.parser import *
>>> raw_source_lines = ['>>> "string"']
>>> raw_want_lines = ['string']
>>> self = DoctestParser()
>>> part, = self._package_chunk(raw_source_lines, raw_want_lines)
>>> part.source
'"string"'
>>> part.want
'string'
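The difference between the two packaging modes can be sketched with the standard `ast` module. This is an illustration only, not xdoctest's implementation: `package_chunk` is a hypothetical helper that works on plain source text rather than prefixed lines, and attaching the want to the final statement in REPL mode is an assumption.

```python
import ast


def package_chunk(source, want, simulate_repl=False):
    """Hypothetical helper: split source into (source, want) pairs."""
    if not simulate_repl:
        # Default mode: all statements are grouped with the closest want.
        return [(source, want)]
    lines = source.splitlines()
    parts = []
    for node in ast.parse(source).body:
        # end_lineno is available on Python 3.8+.
        statement = "\n".join(lines[node.lineno - 1:node.end_lineno])
        parts.append((statement, None))
    if parts:
        # Assumption: only the final statement is compared against the want.
        parts[-1] = (parts[-1][0], want)
    return parts


source = "x = 1\ny = 2\nprint(x + y)"
print(package_chunk(source, "3"))
print(package_chunk(source, "3", simulate_repl=True))
```

This sketch ignores comments and decorators between statements, which a real parser has to account for.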
- _group_labeled_lines(labeled_lines)
Group labeled lines into logical parts to be executed together.
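One way to picture this grouping is shown below, assuming the `text` / `dsrc` / `dcnt` / `want` labels used elsewhere on this page. The function is a hypothetical stand-in written for illustration; the shape of its return value (text as strings, code as `(source_lines, want_lines)` tuples) is an assumption, not the documented behavior of the private method.

```python
from itertools import groupby


def group_labeled_lines(labeled_lines):
    """Hypothetical sketch: merge labeled lines into text and code chunks."""
    def kind(pair):
        label = pair[0]
        # Source lines and their continuations belong to the same run.
        return "src" if label in ("dsrc", "dcnt") else label

    runs = [(k, [line for _, line in run])
            for k, run in groupby(labeled_lines, key=kind)]

    grouped = []
    index = 0
    while index < len(runs):
        k, lines = runs[index]
        if k == "src":
            want_lines = []
            # A want run directly after a source run belongs to that source.
            if index + 1 < len(runs) and runs[index + 1][0] == "want":
                want_lines = runs[index + 1][1]
                index += 1
            grouped.append((lines, want_lines))
        else:
            grouped.append("\n".join(lines))
        index += 1
    return grouped


labeled = [
    ("text", "intro"),
    ("dsrc", ">>> x = 1"),
    ("dsrc", ">>> print(x)"),
    ("want", "1"),
    ("text", ""),
    ("text", "outro"),
]
print(group_labeled_lines(labeled))
```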
- _locate_ps1_linenos(source_lines)
Determines which lines in the source begin a “logical block” of code.
- Parameters:
source_lines (List[str]) – lines belonging only to the doctest source. These will be unindented, prefixed, and without any want.
- Returns:
a pair (linenos, mode_hint). The first value, linenos, is a list of indices indicating which lines are considered “PS1”. The second value, mode_hint, is a flag indicating if the final line should be considered for a got/want assertion.
- Return type:
Example
>>> self = DoctestParser()
>>> source_lines = ['>>> def foo():', '>>>     return 0', '>>> 3']
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> assert linenos == [0, 2]
>>> assert mode_hint == 'eval'
Example
>>> from xdoctest.parser import * # NOQA
>>> self = DoctestParser()
>>> source_lines = ['>>> x = [1, 2, ', '>>> 3, 4]', '>>> print(len(x))']
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> assert linenos == [0, 2]
>>> assert mode_hint == 'eval'
Example
>>> from xdoctest.parser import * # NOQA
>>> self = DoctestParser()
>>> source_lines = [
>>>     '>>> x = 1',
>>>     '>>> try: raise Exception',
>>>     '>>> except Exception: pass',
>>>     '...',
>>> ]
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> assert linenos == [0, 1]
>>> assert mode_hint == 'exec'
Example
>>> from xdoctest.parser import * # NOQA
>>> self = DoctestParser()
>>> source_lines = [
>>>     '>>> import os; print(os)',
>>>     '...',
>>> ]
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> assert linenos == [0]
>>> assert mode_hint == 'single'
Example
>>> # We should ensure that decorators are PS1 lines
>>> from xdoctest.parser import * # NOQA
>>> self = DoctestParser()
>>> source_lines = [
>>>     '>>> # foo',
>>>     '>>> @foo',
>>>     '... def bar():',
>>>     '...     ...',
>>> ]
>>> linenos, mode_hint = self._locate_ps1_linenos(source_lines)
>>> print(f'linenos={linenos}')
>>> assert linenos == [0, 1]
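The general idea of finding PS1 lines with an AST can be sketched with the standard `ast` module. This is a simplified illustration and not the library's code: `locate_ps1_linenos` is a hypothetical function that assumes every line carries a 4-character prefix, does not treat comment lines specially, and only distinguishes the `'eval'` and `'exec'` hints.

```python
import ast


def locate_ps1_linenos(source_lines):
    """Hypothetical sketch: find lines that start a top-level statement."""
    # Drop the 4-character '>>> ' or '... ' prefix from each line.
    code_lines = [line[4:] for line in source_lines]
    tree = ast.parse("\n".join(code_lines))

    linenos = []
    for node in tree.body:
        decorators = getattr(node, "decorator_list", [])
        # A decorated definition starts at its first decorator.
        first = decorators[0].lineno if decorators else node.lineno
        linenos.append(first - 1)  # ast line numbers are 1-based

    # A trailing bare expression can be compared against a want.
    final_is_expr = bool(tree.body) and isinstance(tree.body[-1], ast.Expr)
    mode_hint = "eval" if final_is_expr else "exec"
    return linenos, mode_hint


print(locate_ps1_linenos(['>>> def foo():', '>>>     return 0', '>>> 3']))
# ([0, 2], 'eval')
```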
- _label_docsrc_lines(string)
Give each line in the docstring a label so we can distinguish what parts are text, what parts are code, and what parts are “want” string.
- Parameters:
string (str) – doctest source
- Returns:
labeled_lines: the above source broken up by lines, each with a label indicating its type for later use in parsing.
- Return type:
Todo
[ ] Sphinx does not parse this doctest properly
Example
>>> from xdoctest.parser import *
>>> # Having multiline strings in doctests can be nice
>>> string = utils.codeblock(
        '''
        text
        >>> items = ['also', 'nice', 'to', 'not', 'worry',
        >>>          'about', '...', 'vs', '>>>']
        ... print('but its still allowed')
        but its still allowed

        more text
        ''')
>>> self = DoctestParser()
>>> labeled = self._label_docsrc_lines(string)
>>> expected = [
>>>     ('text', 'text'),
>>>     ('dsrc', ">>> items = ['also', 'nice', 'to', 'not', 'worry',"),
>>>     ('dsrc', ">>>          'about', '...', 'vs', '>>>']"),
>>>     ('dcnt', "... print('but its still allowed')"),
>>>     ('want', 'but its still allowed'),
>>>     ('text', ''),
>>>     ('text', 'more text')
>>> ]
>>> assert labeled == expected
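A much-reduced version of this labeling can be written as a small state machine. The sketch below is an assumption-laden simplification: `label_lines` is a hypothetical function that decides purely from line prefixes, whereas the real method also has to work out when a statement is complete.

```python
def label_lines(string):
    """Hypothetical sketch: label each line as text, dsrc, dcnt, or want."""
    labeled = []
    in_doctest = False
    for line in string.splitlines():
        stripped = line.strip()
        if stripped == ">>>" or stripped.startswith(">>> "):
            label, in_doctest = "dsrc", True
        elif in_doctest and (stripped == "..." or stripped.startswith("... ")):
            label = "dcnt"
        elif in_doctest and stripped:
            # Non-blank lines after code are the expected output.
            label = "want"
        else:
            # A blank line ends the doctest; everything else is text.
            label, in_doctest = "text", False
        labeled.append((label, line))
    return labeled


sample = "\n".join([
    "text",
    ">>> items = ['also',",
    ">>>          'nice']",
    "... print('ok')",
    "ok",
    "",
    "more text",
])
for label, line in label_lines(sample):
    print(label, repr(line))
```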
- xdoctest.parser._min_indentation(s)
Return the minimum indentation of any non-blank line in s
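The computation described above can be sketched in a few lines. `min_indentation` here is a stand-in written for illustration, assuming indentation is made of spaces and that a string with no non-blank lines has an indentation of 0.

```python
def min_indentation(s):
    """Sketch: smallest number of leading spaces over non-blank lines."""
    indents = [
        len(line) - len(line.lstrip(" "))
        for line in s.splitlines()
        if line.strip()  # skip blank lines
    ]
    return min(indents, default=0)


print(min_indentation("    a\n\n  b\n      c"))  # 2
```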
- xdoctest.parser._complete_source(line, state_indent, line_iter)
Helper that removes lines from the iterator if they are needed to complete the source.
This uses static.is_balanced_statement() to do the heavy lifting.
Example
>>> from xdoctest.parser import * # NOQA
>>> from xdoctest.parser import _complete_source
>>> state_indent = 0
>>> line = '>>> x = { # The line is not finished'
>>> remain_lines = ['>>> 1:2,', '>>> 3:4,', '>>> 5:6}', '>>> y = 7']
>>> line_iter = enumerate(remain_lines, start=1)
>>> finished = list(_complete_source(line, state_indent, line_iter))
>>> final = chr(10).join([t[1] for t in finished])
>>> print(final)
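The idea of pulling lines until a statement is complete can be sketched as follows. This is not how the library does it: the sketch swaps static.is_balanced_statement() for a plain `ast.parse` check, and `is_complete` and `complete_source` are hypothetical names that return a simple list of lines.

```python
import ast


def is_complete(code_lines):
    """Stand-in check: does the accumulated code parse on its own?"""
    try:
        ast.parse("\n".join(code_lines))
    except SyntaxError:
        return False
    return True


def complete_source(line, line_iter):
    """Hypothetical sketch: consume lines until the statement is complete."""
    collected = [line]
    # Strip the 4-character prefix before checking the code itself.
    while not is_complete([part[4:] for part in collected]):
        try:
            _, next_line = next(line_iter)
        except StopIteration:
            break  # ran out of lines; the source stays incomplete
        collected.append(next_line)
    return collected


line = '>>> x = { # The line is not finished'
remain_lines = ['>>>     1:2,', '>>>     3:4,', '>>>     5:6}', '>>> y = 7']
line_iter = enumerate(remain_lines, start=1)
finished = complete_source(line, line_iter)
print("\n".join(finished))
```

Only the lines needed to close the brace are consumed, so the final `y = 7` line is still available from `line_iter` afterwards. A parse-based check like this one also consumes every remaining line when the source is genuinely invalid.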
- xdoctest.parser._iterthree(items, pad_value=None)
Iterate over a sliding window of size 3, padded on both sides with pad_value (None by default).
Example
>>> from xdoctest.parser import *
>>> print(list(_iterthree([])))
>>> print(list(_iterthree(range(1))))
>>> print(list(_iterthree([1, 2])))
>>> print(list(_iterthree([1, 2, 3])))
>>> print(list(_iterthree(range(4))))
>>> print(list(_iterthree(range(7))))
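A padded sliding window of this kind can be sketched as below. `iterthree` is an independent illustration written for this page, not the library's code, and it assumes that an empty input yields no windows.

```python
def iterthree(items, pad_value=None):
    """Sketch: yield (previous, current, next) triples with padded ends."""
    padded = [pad_value, *items, pad_value]
    # Each window is centered on one of the original items.
    for index in range(len(padded) - 2):
        yield tuple(padded[index:index + 3])


print(list(iterthree([])))         # []
print(list(iterthree(range(1))))   # [(None, 0, None)]
print(list(iterthree([1, 2])))     # [(None, 1, 2), (1, 2, None)]
```

Centering each window on an item lets a caller inspect the neighbors of every element without special-casing the first and last positions.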