Oleksii Lytvynov

No Country For \0 (Escaping Characters Issues in ReportPortal)

This is another story that happened to me during the integration of tests with ReportPortal. ReportPortal allows uploading a test description stored in the test function’s docstring. This is pretty convenient: test cases are stored along with the scripts and are also published together with the results.
Let’s consider a few cases.

Test with plain docstrings

We can use plain text to describe what the test verifies:

def test_with_usual_docstrings():
    """Verify adding integer positive numbers

    First number is 3
    Second number is 5

    Result is 8
    """
    pass

This is the result we get on the test launch page in ReportPortal:
Results with plain text

Empty lines are not displayed. But if you go to the test details page, the description will match the docstring:
Results with plain text in test details

Test with a docstring that has an indent in the first line and an empty line at the end

Docstrings in Python are described in PEP 257 – Docstring Conventions. According to it, tools that generate documentation should strip any leading spaces in the first line and any blank lines at the beginning and end of a docstring. The PEP includes a trim function that processes docstrings according to these rules. The pytest_reportportal library has the same function under the name trim_docstring, taken from the PEP without modifications.
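To see these rules in action without ReportPortal, here is a minimal sketch that uses inspect.cleandoc from the standard library. It is not the library's trim_docstring; I am assuming it applies the same stripping rules for this input. A plain string stands in for the docstring so that the result does not depend on how a particular Python version stores `__doc__`.

```python
import inspect

# A plain string standing in for a docstring: the first line is indented
# and there is a blank line before the closing quotes.
raw_doc = (
    "     Verify adding integer positive numbers\n"
    "\n"
    "    First number is 4\n"
    "    Second number is 6\n"
    "\n"
    "    Result is 10\n"
    "\n"
    "    "
)

# cleandoc strips the leading spaces of the first line, removes the common
# indent of the remaining lines, and drops blank lines at both ends.
cleaned = inspect.cleandoc(raw_doc)
print(cleaned)
```

The printed text starts directly with "Verify" and ends with "Result is 10", while the blank lines inside the docstring are kept.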

If we use a docstring with an indent in the first line and an empty line at the end, they will not appear in ReportPortal:

def test_with_indent_in_first_line_and_empty_line_at_the_end_of_docstrings():
    """     Verify adding integer positive numbers

    First number is 4
    Second number is 6

    Result is 10

    """
    pass

Docstring with indent in first line and empty line at the end

Results with indented first line in test details

Test with Markdown in docstrings

On the test details page, there are elements for editing the test description. If we look at service-ui, the ReportPortal subproject responsible for rendering results on the web, we'll find the markdownEditor and markdownViewer components.

Let’s create a test with Markdown in the docstring:

def test_with_markdown_in_docstrings():
    """**Verify adding float positive numbers**

    - First number is 1.73
    - Second number is 3.1
    ---
    Result is 4.83
    """

And the result in ReportPortal:
Docstring with Markdown

The first line is bolded, the second and third lines are listed as bullet points, and the result is separated by a horizontal line. On the web page, the formatting looks like this:

<div class="markdownViewer__markdown-viewer--GikqC mode-default">
    <p><strong>Verify adding float positive numbers</strong></p>
    <ul>
        <li>First number is 1.73</li>
        <li>Second number is 3.1</li>
    </ul>
    <hr>
    <p>Result is 4.83</p>
</div>

Test with \t and \n in docstrings

At work, we often use strings with \t and \n as test strings. For example, something like this test:

def test_with_tab_and_newline_in_docstrings():
    """Verify application processes lines with \t and \n

    Add line "123\t456"
    Add line "789\n012"

    Printed lines equal to entered lines
    """
    pass

It doesn’t contain Markdown formatting, but the result is unexpected:
Docstrings with non-escaped \n and \t

If we examine the markup, we find two code blocks (<code>):

<div class="markdownViewer__markdown-viewer--GikqC mode-default">
    <p>Verify application processes lines with          and</p>
    <pre><code>
        Add line "123       456"
        Add line "789
    </code></pre>
    <p>012"</p>
    <pre><code>Printed lines equal to entered lines</code></pre>
</div>

I then used Wireshark to check how the test description was sent in requests to ReportPortal, with and without special characters, and noted when the text was treated as a code block in Markdown. It turns out that Python interprets the escape sequences before the docstring is processed, so what we publish is equivalent to a test with the following docstring:

def test_with_tab_and_newline_in_docstrings():
    """Verify application processes lines with \t and \n

    Add line "123\t456"
    Add line "789
012"

    Printed lines equal to entered lines
    """
    pass

The issue with the indent lies in the trim_docstring function. It has a fragment that calculates the minimum indent in the docstring:

# Determine minimum indentation (first line doesn't count):
indent = sys.maxsize
for line in lines[1:]:
    stripped = line.lstrip()
    if stripped:
        indent = min(indent, len(line) - len(stripped))

Because of the newline character, the line 012" starts at the beginning of the line, with no indent. The minimum indent therefore becomes 0, nothing is stripped from the other lines, and the lines that keep their four-space indent are treated by Markdown as code blocks.
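The effect is easy to reproduce in isolation. The sketch below wraps the same loop in a helper called min_indent (a name I made up for illustration, it does not exist in the library) and compares the docstring with interpreted escapes against a raw-string version where \t and \n stay literal.

```python
import sys


def min_indent(docstring: str) -> int:
    # Same loop as the fragment above: smallest indent among the
    # non-blank lines, ignoring the first line.
    lines = docstring.expandtabs().splitlines()
    indent = sys.maxsize
    for line in lines[1:]:
        stripped = line.lstrip()
        if stripped:
            indent = min(indent, len(line) - len(stripped))
    return indent


# Escapes are interpreted: "\n" really breaks the line, so 012" starts at column 0.
unescaped = """Verify application processes lines with \t and \n

    Add line "123\t456"
    Add line "789\n012"

    Printed lines equal to entered lines
    """

# Raw string: \t and \n remain two visible characters each.
escaped = r"""Verify application processes lines with \t and \n

    Add line "123\t456"
    Add line "789\n012"

    Printed lines equal to entered lines
    """

print(min_indent(unescaped))  # 0: nothing is stripped, 4-space lines stay indented
print(min_indent(escaped))    # 4: the common indent is removed as expected
```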

By the way, this function was the first real production code where I saw the expandtabs string method in use. First, every tab in the docstring is expanded to spaces up to the next tab stop (every 8 columns by default), then the string is split into lines:

lines = docstring.expandtabs().splitlines()
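A quick illustration of how expandtabs pads to tab stops rather than inserting a fixed number of spaces (the variable names are only for this example):

```python
# A lone tab at column 0 expands to a full tab stop of 8 spaces.
one_tab = "\t".expandtabs()

# After "123" the cursor is at column 3, so only 5 spaces are needed
# to reach the next tab stop at column 8.
test_line = "123\t456".expandtabs()

# The tab size can be changed; with 4, "ab" is padded with 2 spaces.
narrow = "ab\tc".expandtabs(4)

print(repr(one_tab))    # '        '
print(repr(test_line))  # '123     456'
print(repr(narrow))     # 'ab  c'
```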

This issue can be fixed by escaping special characters:

def test_with_escaped_tab_and_newline_in_docstrings():
    """Verify application processes lines with \\t and \\n

    Add line "123\\t456"
    Add line "789\\n012"

    Printed lines equal to entered lines
    """
    pass

This gives us the expected result:
Docstrings with escaped \n and \t
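Doubling every backslash is not the only option. PEP 257 itself recommends raw triple-quoted strings for docstrings that contain backslashes, so a sketch of the same test with an r prefix could look like this (the test name is made up for the example, and I have not checked how this variant renders in ReportPortal):

```python
def test_with_raw_docstring():
    r"""Verify application processes lines with \t and \n

    Add line "123\t456"
    Add line "789\n012"

    Printed lines equal to entered lines
    """
    pass


doc = test_with_raw_docstring.__doc__

# No real tab ended up in the docstring, only the visible two-character sequence.
print("\t" in doc)                    # False
print(r'Add line "789\n012"' in doc)  # True
```

The docstring text is the same as in the version with doubled backslashes, but the source stays easier to read.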

Test with \0 in docstrings

We also have test strings containing the \0 character. This caused a separate issue. When publishing results in such cases, the test launch is created, but the results are not published.

def test_with_slash_zero_in_docstrings():
    """Verify application processes lines with \0

    Add line with \0

    Printed lines equal to entered lines
    """
    pass

Again, checking the requests sent to ReportPortal, we see that the description field contains \0 (serialized as \u0000 in JSON).

{
  "codeRef": "test_example.py:test_with_slash_zero_in_docstrings", 
  "description": "Verify application processes lines with \u0000\n\nAdd line with \u0000\n\nPrinted lines equal to entered lines", 
  "hasStats": true, 
  "name": "reportportal_escaping/test_example.py::test_with_slash_zero_in_docstrings", 
  "retry": false, 
  "retryOf": null, 
  "startTime": "1732975747211", 
  "testCaseId": "test_example.py:test_with_slash_zero_in_docstrings", 
  "type": "STEP", 
  "launchUuid": "5d3d8726-a58f-47de-bd85-1e21353a39a7", 
  "attributes": [], 
  "parameters": null
}
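To make the difference between the two forms visible, here is a small sketch with the standard json module. It only shows how a NUL character and an escaped \0 look after JSON serialization; that the client library serializes the payload in the same way is my assumption.

```python
import json

unescaped = "Add line with \0"   # contains a real NUL character
escaped = "Add line with \\0"    # contains a backslash followed by "0"

nul_payload = json.dumps({"description": unescaped})
text_payload = json.dumps({"description": escaped})

print(nul_payload)   # {"description": "Add line with \u0000"}
print(text_payload)  # {"description": "Add line with \\0"}
```

The first payload carries an actual NUL character to the server, the second carries only printable text.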

This issue does not seem to belong to the pytest_reportportal library but to something further down the stack. Perhaps it relates to request parsing or to rendering the description field in the web UI. I reported the problem in the pytest_reportportal issue tracker (though it may need to be moved to another subproject):
Test not published to Report Portal when docstring or parameter contains \0 #38

It can also be resolved by escaping the special character:

def test_with_escaped_slash_zero_in_docstrings():
    """Verify application processes lines with \\0

    Add line with \\0

    Printed lines equal to entered lines
    """
    pass

Docstrings with escaped \0

Test with \0 in a parameter value of parametrized test

A bigger problem arises if a parametrized test contains \0 in its parameter value.

import pytest

@pytest.mark.parametrize("line", ["abc\0", ])
def test_with_slash_zero_in_params(line):
    pass

Now in the request sent to create the test, the name field contains the escaped \0, but the testCaseId field contains the unescaped value.

{
  "name": "reportportal_escaping/test_example.py::test_with_slash_zero_in_params[abc\\x00]", 
  "testCaseId": "test_example.py:test_with_slash_zero_in_params[abc\u0000]"
}

This value cannot be escaped because the test requires it in its original form. As a workaround, we used a placeholder string for the parameter to indicate that the test string contains \0, and in the test itself, we replaced it with the actual value:

import pytest

@pytest.mark.parametrize("line", ["line with slash zero", ])
def test_with_slash_zero_in_params_workaround(line):
    if line == "line with slash zero":
        line = "abc\0"
    # test logic that uses variable line
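If there are several such strings, the if statement can be replaced with a lookup table. This is only a sketch of the idea: PLACEHOLDERS and resolve are hypothetical names, not part of pytest or pytest_reportportal.

```python
# Hypothetical mapping from readable placeholders (safe to publish)
# to the real test strings that contain \0.
PLACEHOLDERS = {
    "line with slash zero": "abc\0",
    "line with two slash zeros": "a\0b\0",
}


def resolve(value: str) -> str:
    """Return the real test string for a placeholder, or the value unchanged."""
    return PLACEHOLDERS.get(value, value)


print(repr(resolve("line with slash zero")))  # 'abc\x00'
print(repr(resolve("plain line")))            # 'plain line'
```

Inside a parametrized test, the first statement would then be line = resolve(line), and only the placeholder text reaches ReportPortal.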

ReportPortal is a widely used system for storing and processing test results. It offers advanced features like AI for result analysis. However, as we can see, even such a system has shortcomings in rare use cases.

======================================================
[Update, 04 Dec 2024]

Contributor @HardNorth replied regarding this issue:

\0 is not a text character, it is strictly binary one. So strictly, that many libraries use it to check if data is binary or text. And RP wasn't supposed to work with binary data. Also in general we do not modify data on client side, except some special cases, so it's your responsibility to sanitize it.

Nevertheless, he fixed the issue with results not being published when a parameter value contains \0 (it is now escaped, but only for parameters). The fix is available starting from pytest-reportportal 5.4.7 👍

Test with \0 in parameter value
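For text that is built at run time and cannot be fixed in the source, the sanitizing mentioned in the reply can be done with a one-line helper. This is my own sketch, not the change made in the library; escape_nul is a hypothetical name.

```python
def escape_nul(text: str) -> str:
    """Replace each NUL character with the visible two-character sequence \\0."""
    return text.replace("\0", "\\0")


sanitized = escape_nul("Add line with \0")
print(sanitized)          # Add line with \0
print("\0" in sanitized)  # False
```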
