xdoctest.static_analysis module

The core logic that allows xdoctest to parse source code statically.

Bases: object

Variables:

lineno_end (None | int) – the line number the docstring ends on (if known)

Parameters:

callname (str) – the name of the item containing the docstring.
lineno (int) – the line number of the item containing the docstring.
docstr (str) – the docstring itself
doclineno (int) – the line number (1 based) the docstring begins on
doclineno_end (int) – the line number (1 based) the docstring ends on
args (None | ast.arguments) – arguments from static analysis TopLevelVisitor.

class xdoctest.static_analysis.TopLevelVisitor(source: str | None = None)

Bases: NodeVisitor

Parses top-level function names and docstrings

For other visit_<classname> values see [MeetTheNodes].

References

[MeetTheNodes]

http://greentreesnakes.readthedocs.io/en/latest/nodes.html

CommandLine

python -m xdoctest.static_analysis TopLevelVisitor

Variables:

calldefs (OrderedDict)
source (None | str)
sourcelines (None | List[str])
assignments (list)

Example

>>> from xdoctest.static_analysis import *  # NOQA
>>> from xdoctest import utils
>>> source = utils.codeblock(
        '''
        def foo():
            """ my docstring """
            def subfunc():
                pass
        def bar():
            pass
        class Spam:
            def eggs(self):
                pass
            @staticmethod
            def hams():
                pass
            @property
            def jams(self):
                return 3
            @jams.setter
            def jams2(self, x):
                print('ignoring')
            @jams.deleter
            def jams(self, x):
                print('ignoring')
        ''')
>>> self = TopLevelVisitor.parse(source)
>>> callnames = set(self.calldefs.keys())
>>> assert callnames == {
>>>     'foo', 'bar', 'Spam', 'Spam.eggs', 'Spam.hams',
>>>     'Spam.jams'}
>>> assert self.calldefs['foo'].docstr.strip() == 'my docstring'
>>> assert 'subfunc' not in self.calldefs

Parameters: source (None | str)

classmethod parse(source: str) → TopLevelVisitor

Main entry point.

Executes the parsing algorithm and populates self.calldefs.

Parameters: source (str)
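
The idea of collecting top-level callables with an ast.NodeVisitor can be sketched with the standard library alone. This is a minimal illustration, not xdoctest's implementation: the class name TopLevelNames and its docstrings attribute are hypothetical, and only names and raw docstrings are recorded.

```python
import ast


class TopLevelNames(ast.NodeVisitor):
    """Hypothetical visitor: records top-level callables and class methods."""

    def __init__(self):
        self.docstrings = {}  # dotted name -> raw docstring (or None)
        self._prefix = []     # names of the enclosing classes

    def visit_ClassDef(self, node):
        self._record(node)
        self._prefix.append(node.name)
        self.generic_visit(node)  # descend so that methods are recorded
        self._prefix.pop()

    def visit_FunctionDef(self, node):
        # Record the function but do not descend, so nested functions
        # such as ``subfunc`` are skipped.
        self._record(node)

    visit_AsyncFunctionDef = visit_FunctionDef

    def _record(self, node):
        name = '.'.join(self._prefix + [node.name])
        self.docstrings[name] = ast.get_docstring(node, clean=False)


source = '''
def foo():
    """ my docstring """
    def subfunc():
        pass

class Spam:
    def eggs(self):
        pass
'''
visitor = TopLevelNames()
visitor.visit(ast.parse(source))
print(sorted(visitor.docstrings))  # ['Spam', 'Spam.eggs', 'foo']
```

Not calling generic_visit inside visit_FunctionDef is what keeps nested functions out of the result.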

syntax_tree() → AST

Creates the abstract syntax tree.

Return type: ast.Module

process_finished(node: AST | int) → None

Process (get the ending lineno of) everything marked as finished.

Parameters: node (ast.AST)

visit(node: AST) → None

Parameters: node (ast.AST)

_visit_generic_FunctionDef(node: FunctionDef | AsyncFunctionDef) → None

visit_FunctionDef(node: FunctionDef) → None

Parameters: node (ast.FunctionDef)

visit_AsyncFunctionDef(node: AsyncFunctionDef) → None

Parameters: node (ast.AsyncFunctionDef)

visit_ClassDef(node: ClassDef) → None

Parameters: node (ast.ClassDef)

visit_Module(node: Module) → None

Parameters: node (ast.Module)

visit_Assign(node: Assign) → None

Parameters: node (ast.Assign)

visit_If(node: If) → None

Parameters: node (ast.If)

_docnode_line_workaround(docnode: Expr | stmt) → tuple[int, int]

Find the start and ending line numbers of a docstring

Parameters: docnode (ast.Expr)
Returns: Tuple[int, int]

CommandLine

xdoctest -m xdoctest.static_analysis TopLevelVisitor._docnode_line_workaround

Example

>>> from xdoctest.static_analysis import *  # NOQA
>>> sq = chr(39)  # single quote
>>> dq = chr(34)  # double quote
>>> source = utils.codeblock(
    '''
    def func0():
        {ddd} docstr0 {ddd}
    def func1():
        {ddd}
        docstr1 {ddd}
    def func2():
        {ddd} docstr2
        {ddd}
    def func3():
        {ddd}
        docstr3
        {ddd}  # foobar
    def func5():
        {ddd}pathological case
        {sss} # {ddd} # {sss} # {ddd} # {ddd}
    def func6():
        " single quoted docstr "
    def func7():
        r{ddd}
        raw line
        {ddd}
    ''').format(ddd=dq * 3, sss=sq * 3)
>>> self = TopLevelVisitor(source)
>>> func_nodes = self.syntax_tree().body
>>> print(utils.add_line_numbers(utils.highlight_code(source), start=1))
>>> wants = [
>>>     (2, 2),
>>>     (4, 5),
>>>     (7, 8),
>>>     (10, 12),
>>>     (14, 15),
>>>     (17, 17),
>>>     (19, 21),
>>> ]
>>> for i, func_node in enumerate(func_nodes):
>>>     docnode = func_node.body[0]
>>>     got = self._docnode_line_workaround(docnode)
>>>     want = wants[i]
>>>     print('got = {!r}'.format(got))
>>>     print('want = {!r}'.format(want))
>>>     assert got == want
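
For comparison, the same 1-based line ranges can be read directly from the AST when nodes carry an end_lineno attribute. The sketch below assumes CPython 3.8 or newer. The helper name docstring_line_range is hypothetical, and it is not the workaround itself, which exists for interpreters where this information is missing or unreliable.

```python
import ast


def docstring_line_range(funcnode):
    """Return the 1-based (start, end) lines of a docstring, or None."""
    if ast.get_docstring(funcnode, clean=False) is None:
        return None
    docnode = funcnode.body[0]  # the docstring is the first statement
    return docnode.lineno, docnode.end_lineno


source = (
    'def func0():\n'
    '    """ docstr0 """\n'
    'def func1():\n'
    '    """\n'
    '    docstr1 """\n'
)
ranges = [docstring_line_range(node) for node in ast.parse(source).body]
print(ranges)  # [(2, 2), (4, 5)]
```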

classmethod _find_docstr_endpos_workaround(docstr: str, sourcelines: list[str], startpos: int) → tuple[int, int]

Like docstr_line_workaround, but works top-down instead of bottom-up. This is for PyPy.

Given a docstring, its original source lines, and the start position, this function finds the end position of the docstring.

Example

>>> fmtkw = dict(sss=chr(39) * 3, ddd=chr(34) * 3)
>>> source = utils.codeblock(
        '''
        {ddd}
        docstr0
        {ddd}
        '''.format(**fmtkw))
>>> sourcelines = source.splitlines()
>>> docstr = eval(source, {}, {})
>>> startpos = 0
>>> start, stop = TopLevelVisitor._find_docstr_endpos_workaround(docstr, sourcelines, startpos)
>>> assert (start, stop) == (0, 2)
>>> #
>>> source = utils.codeblock(
        '''
        "docstr0"
        '''.format(**fmtkw))
>>> sourcelines = source.splitlines()
>>> docstr = eval(source, {}, {})
>>> startpos = 0
>>> start, stop = TopLevelVisitor._find_docstr_endpos_workaround(docstr, sourcelines, startpos)
>>> assert (start, stop) == (0, 0)

_find_docstr_startpos_workaround(docstr: str, sourcelines: list[str], endpos: int) → tuple[int, int]

Find which sourcelines contain the docstring

Parameters:

docstr (str) – the extracted docstring.
sourcelines (list) – a list of all lines in the file. We assume the docstring exists as a pure string literal in the source. In other words, no postprocessing via split, format, or any other dynamic programmatic modification should be made to the docstrings. Python’s docstring extractor assumes this as well.
endpos (int) – line position (starting at 0) the docstring ends on. Note: positions are 0 based but linenos are 1 based.

Returns:

start, stop:

start: the line position (0 based) the docstring starts on
stop: the line position (0 based) that the docstring stops

such that sourcelines[start:stop] will contain the docstring

Return type:

tuple[int, int]

CommandLine

python -m xdoctest xdoctest.static_analysis TopLevelVisitor._find_docstr_startpos_workaround
python -m xdoctest xdoctest.static_analysis TopLevelVisitor._find_docstr_startpos_workaround --debug

Example

>>> # xdoctest: +REQUIRES(CPython)
>>> # This function is a specific workaround for a CPython bug.
>>> from xdoctest.static_analysis import *
>>> sq = chr(39)  # single quote
>>> dq = chr(34)  # double quote
>>> source = utils.codeblock(
    '''
    def func0():
        {ddd} docstr0 {ddd}
    def func1():
        {ddd}
        docstr1 {ddd}
    def func2():
        {ddd} docstr2
        {ddd}
    def func3():
        {ddd}
        docstr3
        {ddd}  # foobar
    def func5():
        {ddd}pathological case
        {sss} # {ddd} # {sss} # {ddd} # {ddd}
    def func6():
        " single quoted docstr "
    def func7():
        r{ddd}
        raw line
        {ddd}
    ''').format(ddd=dq * 3, sss=sq * 3)
>>> print(utils.add_line_numbers(utils.highlight_code(source), start=0))
>>> targets = [
>>>     (1, 2),
>>>     (3, 5),
>>>     (6, 8),
>>>     (9, 12),
>>>     (13, 15),
>>>     (16, 17),
>>>     (18, 21),
>>> ]
>>> self = TopLevelVisitor.parse(source)
>>> pt = ast.parse(source.encode('utf8'))
>>> sourcelines = source.splitlines()
>>> # PYPY docnode.lineno specify the startpos of a docstring not
>>> # the end.
>>> print('\n\n====\n\n')
>>> #for i in [0, 1]:
>>> for i in range(len(targets)):
>>>     print('----------')
>>>     funcnode = pt.body[i]
>>>     print('funcnode = {!r}'.format(funcnode))
>>>     docnode = funcnode.body[0]
>>>     print('funcnode.__dict__ = {!r}'.format(funcnode.__dict__))
>>>     print('docnode = {!r}'.format(docnode))
>>>     print('docnode.value = {!r}'.format(docnode.value))
>>>     print('docnode.value.__dict__ = {!r}'.format(docnode.value.__dict__))
>>>     if IS_PY_GE_312:
>>>         print('docnode.value.value = {!r}'.format(docnode.value.value))
>>>     else:
>>>         print('docnode.value.s = {!r}'.format(docnode.value.s))
>>>     print('docnode.lineno = {!r}'.format(docnode.lineno))
>>>     print('docnode.col_offset = {!r}'.format(docnode.col_offset))
>>>     print('docnode = {!r}'.format(docnode))
>>>     #import IPython
>>>     #IPython.embed()
>>>     docstr = ast.get_docstring(funcnode, clean=False)
>>>     print('len(docstr) = {}'.format(len(docstr)))
>>>     endpos = docnode.lineno - 1
>>>     if hasattr(docnode, 'end_lineno'):
>>>         endpos = docnode.end_lineno - 1
>>>     print('endpos = {!r}'.format(endpos))
>>>     start, end = self._find_docstr_startpos_workaround(docstr, sourcelines, endpos)
>>>     print('i = {!r}'.format(i))
>>>     print('got  = {}, {}'.format(start, end))
>>>     print('want = {}, {}'.format(*targets[i]))
>>>     if targets[i] != (start, end):
>>>         print('---')
>>>         print(docstr)
>>>         print('---')
>>>         print('sourcelines = [\n{}\n]'.format(', \n'.join(list(map(repr, enumerate(sourcelines))))))
>>>         print('endpos = {!r}'.format(endpos))
>>>         raise AssertionError('docstr workaround is failing')
>>>     print('----------')
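
One way to picture a bottom-up search is to grow a window of lines upward from endpos until the window evaluates to the docstring. The sketch below swaps in ast.literal_eval for that check, so it is an illustration of the idea and not necessarily how xdoctest does it. The name find_docstr_start is hypothetical.

```python
import ast


def find_docstr_start(docstr, sourcelines, endpos):
    """Return (start, stop) so that sourcelines[start:stop] holds docstr."""
    stop = endpos + 1  # exclusive, matching slice semantics
    for start in range(endpos, -1, -1):
        candidate = '\n'.join(sourcelines[start:stop]).strip()
        try:
            value = ast.literal_eval(candidate)
        except (SyntaxError, ValueError):
            continue  # window is not a complete literal yet; grow it upward
        if value == docstr:
            return start, stop
    raise ValueError('docstring not found above endpos')


source = 'def func1():\n    """\n    docstr1 """\n'
sourcelines = source.splitlines()
funcnode = ast.parse(source).body[0]
docstr = ast.get_docstring(funcnode, clean=False)
endpos = funcnode.body[0].end_lineno - 1  # 0-based; needs Python 3.8+
print(find_docstr_start(docstr, sourcelines, endpos))  # (1, 3)
```

As in the parameter description above, this only works when the docstring appears in the source as a pure string literal.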

CommandLine

xdoctest -m xdoctest.static_analysis.py TopLevelVisitor._get_docstring

Example

>>> source = utils.codeblock(
    '''
    def foo():
        'docstr'
    ''')
>>> self = TopLevelVisitor(source)
>>> node = self.syntax_tree().body[0]
>>> self._get_docstring(node)
('docstr', 2, 2)

xdoctest.static_analysis.parse_static_calldefs(source: str | None = None, fpath: str | PathLike | None = None) → dict[str, CallDefNode]

Statically finds top-level callable functions and methods in python source

Parameters:

source (str) – python text
fpath (str) – filepath to read if source is not specified

Returns:

mapping from callnames to CallDefNodes, which contain info about the item with the doctest.

Return type:

Dict[str, CallDefNode]

Example

>>> from xdoctest import static_analysis
>>> fpath = static_analysis.__file__.replace('.pyc', '.py')
>>> calldefs = static_analysis.parse_static_calldefs(fpath=fpath)
>>> assert 'parse_static_calldefs' in calldefs

xdoctest.static_analysis.parse_calldefs(source: str | None = None, fpath: str | PathLike | None = None) → dict[str, CallDefNode]

xdoctest.static_analysis._parse_static_node_value(node: AST) → Any

Extract a constant value from a node if possible

xdoctest.static_analysis.parse_static_value(key: str, source: str | bytes | None = None, fpath: str | None = None) → object

Statically parse a constant variable’s value from python code.

TODO: This does not belong here. Move this to an external static analysis library.

Parameters:

key (str) – name of the variable
source (str) – python text
fpath (str) – filepath to read if source is not specified

Returns:

object

Example

>>> from xdoctest.static_analysis import parse_static_value
>>> key = 'foo'
>>> source = 'foo = 123'
>>> assert parse_static_value(key, source=source) == 123
>>> source = 'foo = "123"'
>>> assert parse_static_value(key, source=source) == '123'
>>> source = 'foo = [1, 2, 3]'
>>> assert parse_static_value(key, source=source) == [1, 2, 3]
>>> source = 'foo = (1, 2, "3")'
>>> assert parse_static_value(key, source=source) == (1, 2, "3")
>>> source = 'foo = {1: 2, 3: 4}'
>>> assert parse_static_value(key, source=source) == {1: 2, 3: 4}
>>> source = 'foo = None'
>>> assert parse_static_value(key, source=source) is None
>>> #parse_static_value('bar', source=source)
>>> #parse_static_value('bar', source='foo=1; bar = [1, foo]')
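
The general technique of reading a constant without executing the code can be sketched with ast.parse and ast.literal_eval. This is a simplified stand-in under stated assumptions: the name static_assignment_value is hypothetical, only plain module-level `name = literal` assignments are handled, and the first match wins.

```python
import ast


def static_assignment_value(key, source):
    """Return the literal assigned to ``key`` at module level."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == key:
                    # literal_eval accepts an AST node and rejects
                    # anything that is not a plain literal.
                    return ast.literal_eval(node.value)
    raise NameError('no static assignment to {!r} found'.format(key))


print(static_assignment_value('foo', 'foo = [1, 2, 3]'))  # [1, 2, 3]
```

A right-hand side that refers to other names, such as `bar = [1, foo]`, makes ast.literal_eval raise ValueError, because the value is not a constant.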

xdoctest.static_analysis.package_modpaths(pkgpath: str, with_pkg: bool = False, with_mod: bool = True, followlinks: bool = True, recursive: bool = True, with_libs: bool = False, check: bool = True) → Iterator[str]

Finds sub-packages and sub-modules belonging to a package.

Parameters:

pkgpath (str) – path to a module or package
with_pkg (bool) – if True includes package __init__ files (default = False)
with_mod (bool) – if True includes module files (default = True)
exclude (list) – ignores any module that matches any of these patterns
recursive (bool) – if False, then only child modules are included
with_libs (bool) – if True then compiled shared libs will be returned as well
check (bool) – if False, then pkgpath is considered a module even if it does not contain an __init__ file.

Yields:

str – paths of the modules belonging to the package

References

http://stackoverflow.com/questions/1707709/list-modules-in-py-package

Example

>>> from xdoctest.static_analysis import *
>>> pkgpath = modname_to_modpath('xdoctest')
>>> paths = list(package_modpaths(pkgpath))
>>> print('\n'.join(paths))
>>> names = list(map(modpath_to_modname, paths))
>>> assert 'xdoctest.core' in names
>>> assert 'xdoctest.__main__' in names
>>> assert 'xdoctest' not in names
>>> print('\n'.join(names))
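
A rough picture of this kind of package walk, using only os.walk, is sketched below. The name iter_module_paths is hypothetical and the sketch covers just one configuration (module files only, recursive, no compiled libraries). It does not reproduce the options of package_modpaths.

```python
import os


def iter_module_paths(pkgpath):
    """Yield paths of .py modules inside a package directory tree."""
    for dirpath, dirnames, filenames in os.walk(pkgpath):
        if '__init__.py' not in filenames:
            # Not a package: prune the walk so nothing below is visited.
            dirnames[:] = []
            continue
        dirnames.sort()  # deterministic traversal order
        for fname in sorted(filenames):
            if fname.endswith('.py') and fname != '__init__.py':
                yield os.path.join(dirpath, fname)
```

Editing dirnames in place is the documented way to steer os.walk when it runs top-down, which is the default.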

xdoctest.static_analysis.is_balanced_statement(lines: list[str], only_tokens: bool = False, reraise: int = 0) → bool

Checks if the lines have balanced braces and quotes.

Parameters: lines (List[str]) – list of strings, one for each line
Returns: True if the statement is balanced, otherwise False
Return type: bool

CommandLine

xdoctest -m xdoctest.static_analysis is_balanced_statement:0

References

https://stackoverflow.com/questions/46061949/parse-until-complete

Example

>>> from xdoctest.static_analysis import *  # NOQA
>>> assert is_balanced_statement(['print(foobar)'])
>>> assert is_balanced_statement(['foo = bar']) is True
>>> assert is_balanced_statement(['foo = (']) is False
>>> assert is_balanced_statement(['foo = (', "')(')"]) is True
>>> assert is_balanced_statement(
...     ['foo = (', "'''", ")]'''", ')']) is True
>>> assert is_balanced_statement(
...     ['foo = ', "'''", ")]'''", ')']) is False
>>> #assert is_balanced_statement(['foo = ']) is False
>>> #assert is_balanced_statement(['== ']) is False
>>> lines = ['def foo():', '', '    x = 1', 'assert True', '']
>>> assert is_balanced_statement(lines)

Example

>>> from xdoctest.static_analysis import *
>>> source_parts = [
>>>     'setup(',
>>>     "    name='extension',",
>>>     '    ext_modules=[',
>>>     '        CppExtension(',
>>>     "            name='extension',",
>>>     "            sources=['extension.cpp'],",
>>>     "            extra_compile_args=['-g'])),",
>>>     '    ],',
>>> ]
>>> print('\n'.join(source_parts))
>>> assert not is_balanced_statement(source_parts)
>>> source_parts = [
>>>     'setup(',
>>>     "    name='extension',",
>>>     '    ext_modules=[',
>>>     '        CppExtension(',
>>>     "            name='extension',",
>>>     "            sources=['extension.cpp'],",
>>>     "            extra_compile_args=['-g']),",
>>>     '    ],',
>>>     '        cmdclass={',
>>>     "            'build_ext': BuildExtension",
>>>     '        })',
>>> ]
>>> print('\n'.join(source_parts))
>>> assert is_balanced_statement(source_parts)

Example

>>> lines = ['try: raise Exception']
>>> is_balanced_statement(lines, only_tokens=1)
True
>>> is_balanced_statement(lines, only_tokens=0)
False

Example

>>> # Cause a failure case on 3.12
>>> from xdoctest.static_analysis import *
>>> lines = ['3, 4]', 'print(len(x))']
>>> is_balanced_statement(lines, only_tokens=1)
False
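
A token-level balance check can be approximated with the standard tokenize module, which raises an error when a bracket or triple-quoted string is still open at the end of the input. The sketch below is an assumption-laden illustration, not the library's code. The name tokens_are_balanced is hypothetical, and tokenizer behaviour on invalid code differs between Python versions.

```python
import io
import tokenize


def tokens_are_balanced(lines):
    """Return True if the lines tokenize without an open bracket or string."""
    source = '\n'.join(lines) + '\n'
    readline = io.StringIO(source).readline
    try:
        for _ in tokenize.generate_tokens(readline):
            pass  # only interested in whether tokenizing completes
    except (tokenize.TokenError, SyntaxError):
        return False
    return True


print(tokens_are_balanced(['print(foobar)']))       # True
print(tokens_are_balanced(['foo = (']))             # False
print(tokens_are_balanced(['foo = (', "')(')"]))    # True
```

This only checks tokens. A line such as `try: raise Exception` tokenizes cleanly even though it is not a complete statement, which is the distinction the only_tokens flag draws in the examples above.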

xdoctest.static_analysis.extract_comments(source: str | list[str]) → Iterator[str]

Returns the text in each comment in a block of python code. Uses tokenize to account for quotations.

Parameters: source (str | List[str])

CommandLine

python -m xdoctest.static_analysis extract_comments

Example

>>> from xdoctest import utils
>>> from xdoctest.static_analysis import extract_comments
>>> source = utils.codeblock(
>>>    '''
       # comment 1
       a = '# not a comment'  # comment 2
       c = 3
       ''')
>>> comments = list(extract_comments(source))
>>> assert comments == ['# comment 1', '# comment 2']
>>> comments = list(extract_comments(source.splitlines()))
>>> assert comments == ['# comment 1', '# comment 2']
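
Why tokenize matters here can be seen in a small standard-library sketch: a hash character inside a string literal is part of a STRING token, so only real comments come out as COMMENT tokens. The name iter_comments is hypothetical and the sketch accepts a single string only.

```python
import io
import tokenize


def iter_comments(source):
    """Yield the text of each comment token in ``source``."""
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.COMMENT:
            yield tok.string


source = "# comment 1\na = '# not a comment'  # comment 2\nc = 3\n"
print(list(iter_comments(source)))  # ['# comment 1', '# comment 2']
```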

xdoctest.static_analysis._strip_hashtag_comments_and_newlines(source: str | list[str]) → str

Removes hashtag comments from underlying source

Parameters: source (str | List[str])

CommandLine

xdoctest -m xdoctest.static_analysis _strip_hashtag_comments_and_newlines

Todo

would be better if this was some sort of configurable minify API

Example

>>> from xdoctest.static_analysis import _strip_hashtag_comments_and_newlines
>>> from xdoctest import utils
>>> fmtkw = dict(sss=chr(39) * 3, ddd=chr(34) * 3)
>>> source = utils.codeblock(
>>>    '''
       # comment 1
       a = '# not a comment'  # comment 2

       multiline_string = {ddd}

       one

       {ddd}
       b = [

           1,  # foo

           # bar
           3,

       ]
       c = 3
       ''').format(**fmtkw)
>>> non_comments = _strip_hashtag_comments_and_newlines(source)
>>> print(non_comments)
>>> assert non_comments.count(chr(10)) == 10
>>> assert non_comments.count('#') == 1