Christopher Durham

Conformance Testing in Rust

I'm excited to announce conformance::tests, a new crate I've published as part of building the Tiny-C compiler for the Crafting IDE-Ready-Compilers series.

The idea of this crate is to make it simpler to test any API shaped like a &str -> impl serde::Serialize + serde::Deserialize function. It works with any serde-compatible data format, though I find that serde_yaml or ron give the most readable diffs for a failed test.

Using the Tiny-C lexer and YAML for serialization, here's what a test looks like:

- Identifier: 1
- EqualsSign: 1
- Identifier: 1
- EqualsSign: 1
- Identifier: 1
- EqualsSign: 1
- Integer: 1
- LessThanSign: 1
- Integer: 1
- Semicolon: 1

and to hook it up to the standard Rust test runner, all that has to be written is:

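#[conformance::tests(exact,
    ser = serde_yaml::to_string,
    de = serde_yaml::from_str,
    file = "tests/yaml.test")]
fn lex_tokens(s: &str) -> Vec<Token> {
    // body elided in the source: run the Tiny-C lexer over `s` and return its tokens
}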

A test in the test file consists of three parts: a test name (which must be a valid Rust identifier continuation), terminated by a === on its own line; the test input string, terminated by a --- on its own line; and the serialized expected output, terminated by a ... on its own line. A test file contains any number of these tests. The extension of the file does not matter, but conventionally it is <format>.test, for example yaml.test. The file name (excluding the extension) must also be a valid Rust identifier, except that it may additionally contain a . character.
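As a rough illustration of the layout described above, here is a minimal sketch of slicing a test file into its individual tests. The names TestCase, split_at_marker, and parse_tests are hypothetical and not taken from the crate, and the real macro may well handle whitespace and trailing newlines differently.

```rust
/// One test sliced out of a test file (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
struct TestCase<'a> {
    name: &'a str,
    input: &'a str,
    expected: &'a str,
}

/// Return the text before the first line that is exactly `marker`,
/// plus everything after that marker line.
fn split_at_marker<'a>(src: &'a str, marker: &str) -> Option<(&'a str, &'a str)> {
    let mut offset = 0;
    for line in src.split_inclusive('\n') {
        if line.trim_end_matches(|c: char| c == '\r' || c == '\n') == marker {
            return Some((&src[..offset], &src[offset + line.len()..]));
        }
        offset += line.len();
    }
    None
}

/// Parse `name === input --- expected ...` groups until the file is exhausted.
fn parse_tests(mut src: &str) -> Result<Vec<TestCase<'_>>, String> {
    let mut tests = Vec::new();
    while !src.trim().is_empty() {
        let (name, rest) =
            split_at_marker(src, "===").ok_or("missing `===` after test name")?;
        let (input, rest) =
            split_at_marker(rest, "---").ok_or("missing `---` after test input")?;
        let (expected, rest) =
            split_at_marker(rest, "...").ok_or("missing `...` after expected output")?;
        // Assumption: input and expected keep their trailing newline as written.
        tests.push(TestCase { name: name.trim(), input, expected });
        src = rest;
    }
    Ok(tests)
}
```

Borrowing slices from the source text, rather than copying, mirrors how the input and output can later be embedded as string literals without further processing.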

The procedural macro is applied to the function that you want to test. That function must take a single parameter of type &str and return a type that serde can both serialize and deserialize. The arguments to the macro itself are:

  • exact: the produced value must exactly match the expected value. In the future, a separate mode where the produced value is just required to be a superset of the expected value is planned.
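  • ser: a path to a function of shape fn<T: serde::Serialize>(&T) -> Result<String, impl std::error::Error> (the generated code applies ? to its result, as serde_yaml::to_string requires).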
  • de: a path to a function of shape fn<T: serde::Deserialize>(&str) -> Result<T, impl std::error::Error>.
  • file: a file path string relative to the cargo manifest directory to the intended test file.

Of course, multiple conformance::tests attributes can be applied to the same function to run the tests in multiple files.

If all you want to do is use the library, you now know everything you need. But I find the details of how it works interesting to discuss, so read on if you do too.

The Rust Custom Test Frameworks feature is still experimental and unstable. While #[test_case] might simplify the implementation of conformance, it isn't needed, and by using just #[test] we work on stable.

In order to integrate natively into the standard #[test] test runner, a unique #[test] function is emitted for each test in the test file.

To do the actual testing work, we emit a single function:

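fn #testing_fn(expected: &str, actual: &str) -> Result<(), Box<dyn ::std::error::Error>> {
    // Data dependency on the test file, so that editing it triggers a rebuild.
    const _: &str = include_str!(#filepath);
    // `actual` arrives as the test input; run the tested function and serialize its result.
    let actual = #ser(&#fn_name(actual))?;
    let expected = #ser(&#de::<#tested_type>(expected)?)?; // normalize
    assert_eq!(actual, expected);
    Ok(())
}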

We name the function after the test file, so that multiple files can be tested in the same Rust namespace. The file is pulled in with include_str! to add a data dependency on the test file, so changing it will recompile the crate. (Hopefully Rust doesn't get smart enough to just recompile the const until a real API for reading files in a procedural macro in a principled way is stabilized.)
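To see why the expected text is deserialized and then serialized again, here is a small standard-library-only sketch of the same round trip. The helpers toy_ser, toy_de, digits, and check are hypothetical stand-ins (for #ser, #de, the tested function, and #testing_fn respectively); the toy format is a comma-separated list of integers rather than YAML, and check returns a bool instead of asserting so that it is easy to experiment with.

```rust
use std::error::Error;

type BoxResult<T> = Result<T, Box<dyn Error>>;

/// Stand-in for `#ser`: the canonical form has no whitespace, e.g. `1,2,3`.
fn toy_ser(value: &[u32]) -> BoxResult<String> {
    Ok(value.iter().map(|n| n.to_string()).collect::<Vec<_>>().join(","))
}

/// Stand-in for `#de`: accepts optional whitespace around each item.
fn toy_de(text: &str) -> BoxResult<Vec<u32>> {
    let mut out = Vec::new();
    for item in text.split(',') {
        out.push(item.trim().parse::<u32>()?);
    }
    Ok(out)
}

/// Stand-in for the function under test: the value of every ASCII digit in the input.
fn digits(input: &str) -> Vec<u32> {
    input.chars().filter_map(|c| c.to_digit(10)).collect()
}

/// Same structure as the generated testing function, minus the assert.
fn check(expected: &str, input: &str) -> BoxResult<bool> {
    let actual = toy_ser(&digits(input))?;
    // Round-tripping the expected text erases formatting differences.
    let expected = toy_ser(&toy_de(expected)?)?;
    Ok(actual == expected)
}
```

With this sketch, an expected value written as " 1, 2 ,3" still matches the canonical "1,2,3", which is the point of the normalization step: the test file can be formatted for readability without causing spurious failures.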

For each test in the test file, we emit a #[test] function that delegates to #testing_fn:

#[test]
fn #test_name() -> Result<(), Box<dyn ::std::error::Error>> {
    #testing_fn(#output, #input)
}

The function name is assembled as {filename}_{testname}, thus the requirements on naming for the filename and test name. The output and input to the test are sliced out of the test file and included as string literals in the generated token stream.
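The naming rule can be sketched roughly as follows. This is not the crate's code: test_fn_name and is_ident_continue are hypothetical helpers, the identifier check is an ASCII-only approximation, and mapping a . in the file name to _ is purely my assumption about how such names could be made valid.

```rust
use std::path::Path;

/// True if `s` is non-empty and every character may continue an identifier.
/// ASCII-only approximation of Rust's real identifier rules.
fn is_ident_continue(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Build `{filename}_{testname}`, or None if either part cannot form an identifier.
fn test_fn_name(file_path: &str, test_name: &str) -> Option<String> {
    // "tests/yaml.test" -> "yaml" (the extension is dropped).
    let stem = Path::new(file_path).file_stem()?.to_str()?;
    // Assumption: dots allowed in the file name are replaced with underscores.
    let stem = stem.replace('.', "_");
    // The file name starts the identifier, so it may not begin with a digit.
    let starts_ok = stem
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_alphabetic() || c == '_');
    if starts_ok && is_ident_continue(&stem) && is_ident_continue(test_name) {
        Some(format!("{}_{}", stem, test_name))
    } else {
        None
    }
}
```

Because the test name only ever appears after the file name and an underscore, it needs to be a valid identifier continuation rather than a full identifier, which is why a test can be named just 1.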

Because the comparison is done with a plain assert_eq!, that's the whole macro. Complete with all the formalities, it's just under 200 lines (plus syn{features=[full]})! Test failures are best viewed in IntelliJ Rust, which provides a diff view of assert_eq! failures, or with an assert_eq! replacement such as pretty-assertions that prints an in-terminal diff.

There are a few features I'd like to add to conformance::tests, and I'd love community assistance in designing them.

  • "Includes"/"superset" mode, as opposed to the current "strict" mode. The produced value includes at least the specified data, but is allowed to include more data. This is mostly useful for testing part of a structure, omitting data not relevant to the test.
  • serde = shortcut: give a path to a serde_json/serde_yaml-like module/crate, that exposes a to_string and from_str appropriate for ser and de.
  • A smarter diff algorithm. While comparing the string-serialized versions of data structures is nicely minimal and utilitarian, I can't help but think we can do better and use some sort of "minimizing diff" algorithm (as Longest Common Subsequence is for sequences) that applies generally to the serde data model. This might even be required for the "superset" mode.
  • Remove the Deserialize requirement. It's currently just used to normalize the serialized format; removing it means many more things can be tested through conformance::tests, even if they can't deserialize well.
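To make the "superset" idea more concrete, here is one possible sketch of an inclusion check over a tiny stand-in data model. None of this exists in the crate: the Value enum and the includes function are hypothetical, and the choice that maps may carry extra keys while sequences must match element for element is just one design among several.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a structured data model (hypothetical, not serde's).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Scalar(String),
    Seq(Vec<Value>),
    Map(BTreeMap<String, Value>),
}

/// True if `actual` contains at least everything `expected` specifies.
fn includes(actual: &Value, expected: &Value) -> bool {
    match (actual, expected) {
        (Value::Scalar(a), Value::Scalar(e)) => a == e,
        // Assumption: sequences must have equal length; only maps may carry extras.
        (Value::Seq(a), Value::Seq(e)) => {
            a.len() == e.len() && a.iter().zip(e).all(|(a, e)| includes(a, e))
        }
        // Every expected key must be present with an included value;
        // keys that appear only in `actual` are ignored.
        (Value::Map(a), Value::Map(e)) => e
            .iter()
            .all(|(key, ev)| a.get(key).map_or(false, |av| includes(av, ev))),
        // Differing kinds never match.
        _ => false,
    }
}
```

A check like this works on structured values rather than serialized strings, which is one reason a richer comparison than assert_eq! on text may be needed for this mode.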

If you're interested, check out the repository, and get a bonus sneak peek at the content of next week's post for the Tiny-C lexer.
