Sometimes, when working on a complex piece of software — or even a simple program or library with a non-trivial number of moving parts — you run into an issue and have absolutely no idea how to reduce your code to the minimum necessary in order to reproduce the error.
For the past several months I’ve been blocked by some trait-resolution errors in Symtern, my general-purpose interner crate. I knew the errors were somehow related to my attempts to work around the lack of associated-type constructors — but how? Some of the errors rustc was printing didn’t even have line numbers!
This is exactly the kind of problem that software like C-Reduce is meant to solve: by iteratively removing parts of an input file and calling a user-provided executable to test whether the modified input still triggers the unwanted behavior, C-Reduce is able to reliably reduce test cases. And it works with Rust programs! With a little massaging we can use it for entire crates, even those with external dependencies.
C-Reduce chews your code with many mouths.
C-Reduce works something like this:
1. You invoke creduce with, as command-line arguments, the names of a test-driver executable and a file containing source code that triggers “interesting” behavior.
$ creduce driver.sh crate.rs
2. C-Reduce runs the driver executable with no arguments. If the source file named on creduce’s command line is still “interesting”, the driver executable should return with an exit status of 0; any other exit status ends the current reduction attempt. (Continue to step 3 only when the driver executable returns 0.)
A driver executable might be a simple shell script that compiles the named source and checks that the compiler emits the expected error. For example:
#!/bin/sh
rustc crate.rs > rustc.out 2>&1
# Check for ICEs
grep -q 'internal compiler error' rustc.out
exit $?
We could instead run a generated executable and check for aberrant behavior just as easily.
#!/bin/sh
rustc -o a.out crate.rs && ./a.out > a.out.log
grep -q 'wrong branch' a.out.log
exit $?
3. In parallel, C-Reduce masticates the source file using some set of transforms that reduce the file’s size. After each transform completes, it returns to step two with that transform’s output as the new source file; C-Reduce keeps only the smallest transformed source files that remain “interesting”.
4. Repeat steps two and three until no further reduction occurs. At this point, C-Reduce will print the smallest transformed source file to its standard output and replace the original source file with it.
Note that this explanation is a description of “delta debugging” rather than of C-Reduce in particular, and fails to account for many of the latter’s more salient qualities.
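To make the loop concrete, here is a minimal sketch of a delta-debugging-style reducer in plain shell. It is not how C-Reduce is implemented: it only tries deleting one line at a time, runs serially, and, unlike a C-Reduce driver, passes the candidate file to the test command as an argument. The names reduce_lines and still_interesting are made up for illustration.

```shell
#!/bin/sh
# Naive one-line-at-a-time reducer: a sketch of the loop, not C-Reduce itself.
# Usage: reduce_lines FILE TEST_CMD
# TEST_CMD (hypothetical) is run with the candidate file's path as its only
# argument and must exit 0 when the candidate is still "interesting".
reduce_lines() {
    file=$1
    test_cmd=$2
    candidate=$(mktemp) || return 1
    changed=1
    # Keep making passes until a full pass removes nothing.
    while [ "$changed" -eq 1 ]; do
        changed=0
        total=$(wc -l < "$file" | tr -d ' ')
        i=1
        while [ "$i" -le "$total" ]; do
            # Candidate = current file with line $i deleted.
            sed "${i}d" "$file" > "$candidate"
            if "$test_cmd" "$candidate" >/dev/null 2>&1; then
                # Still interesting: keep the smaller file. Line $i now
                # holds what used to be line $i+1, so do not advance.
                cp "$candidate" "$file"
                total=$((total - 1))
                changed=1
            else
                i=$((i + 1))
            fi
        done
    done
    rm -f "$candidate"
}
```

For instance, with still_interesting() { grep -q 'BUG' "$1"; } defined, calling reduce_lines input.txt still_interesting shrinks input.txt until no single line can be removed without the test failing.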
Square peg, round hole.
Earlier I promised a way to use C-Reduce on an entire crate — yet C-Reduce can only chew on a single input file at a time! We’ll have to squeeze the crate a little to make it work.
Expanding a crate into a single source file
C-Reduce expects us to feed it a single input file to munch on; luckily rustc provides a simple way to “expand” all mod declarations much like #include lines in C or C++ source would be expanded by the C preprocessor:
$ rustc -Z unstable-options --pretty=normal src/lib.rs > crate.rs
(At the time of writing, you’ll get a warning if you do this using a non-nightly compiler; it’s not entirely clear if Rust’s compiler devs plan to explicitly support anything like this on stable in the future.)
Getting the rustc invocation from Cargo
Although we now have our single source file, we still need to worry about passing the right flags to rustc so it can find any extern crates. Why don’t we ask Cargo how to do that?
$ cargo build -v 2>&1 \
    | grep 'Running `rustc' | tail -n 1 \
    | sed -e 's/^.*Running `\(.\+\)`$/\1/g'
This should print the rustc invocation Cargo uses for the current crate. We’ll need to adjust the resulting command a little before we use it:
src/main.rs must be replaced with the name of the single source file we created earlier (crate.rs).
--out-dir /path/to/my-crate/target/debug/deps should probably be removed since C-Reduce will be running our driver script in parallel, using a separate temporary directory for each run.
We expect (or hope) that any extern crate lines will be eliminated by C-Reduce anyway, so going to the extra trouble of supporting external crates lets us skip manually extricating any external dependencies from our code: a big win.
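Both adjustments are plain text substitutions, so they can be sketched as a pair of shell filters. This is an illustration under assumptions rather than the exact commands used later: the function names are hypothetical, the source path is assumed to appear as src/<something>.rs, and paths are assumed to contain no spaces.

```shell
#!/bin/sh
# Read `cargo build -v` output on stdin and print the last rustc
# invocation with the enclosing backticks removed.
extract_rustc_cmd() {
    grep 'Running `rustc' | tail -n 1 \
        | sed -e 's/^.*Running `\(.*\)`$/\1/'
}

# Read a rustc command line on stdin. Point it at the single expanded
# source file (first argument, default crate.rs) and drop the --out-dir
# option together with its argument.
adjust_rustc_cmd() {
    new_source=${1:-crate.rs}
    sed -e "s|src/[^ ]*\.rs|$new_source|" \
        -e 's/ --out-dir [^ ]*//'
}
```

Chained together, cargo build -v 2>&1 | extract_rustc_cmd | adjust_rustc_cmd crate.rs prints a command line that can be pasted into a driver script.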
Tying it all together
While we could do all this manually, typing the commands directly into a terminal, it’s both cumbersome and unnecessary. Enter my favorite underappreciated tool, GNU Make.
Using Make allows us to automate most of the steps involved.
The first rule defined in a makefile is the default goal, used when no goal is specified on Make’s command line. We probably want that target to be the one that calls creduce:
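.PHONY: reduce
SOURCE ?= crate.rs
DRIVER ?= driver.sh
# $^ expands to every prerequisite, so creduce receives both the
# driver script and the source file.
reduce: $(DRIVER) $(SOURCE)
	creduce $^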
For the sake of flexibility, we’ve made it possible to specify an alternate driver script or source file on the Make command line, e.g.
$ make DRIVER=alternate-driver.sh
$ make SOURCE=alternate-source.rs
Specifying both would just be an overly-verbose way of calling C-Reduce directly; specifying only one of them allows us to depend on the Makefile’s machinery for the other.
Next we’ll define the rule for creating our expanded source file on-demand:
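# GNU Make's wildcard function does not recurse into subdirectories,
# so find is used to list sources in nested modules as well.
crate.rs: $(shell find src -name '*.rs')
	cargo rustc -- -Z unstable-options --pretty=normal > $@ \
		|| (rm -f $@; exit 1)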
Defined this way, it will be updated whenever we change one of our crate’s sources. Good.
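The cleanup after || in that recipe matters because the shell creates (or truncates) the redirection target before the command runs, so a failed cargo rustc would otherwise leave an empty crate.rs with a fresh timestamp, and Make would consider it up to date. A minimal sketch of the same idiom as a shell function, with the hypothetical name generate_or_remove:

```shell
#!/bin/sh
# Usage: generate_or_remove OUTPUT COMMAND [ARGS...]
# Run COMMAND with stdout redirected to OUTPUT. If COMMAND fails, delete
# OUTPUT so a later run does not mistake a partial file for a good one,
# and pass COMMAND's exit status back to the caller.
generate_or_remove() {
    output=$1
    shift
    "$@" > "$output" || {
        status=$?
        rm -f "$output"
        return "$status"
    }
}
```

With this helper, generate_or_remove crate.rs cargo rustc -- -Z unstable-options --pretty=normal would behave like the recipe above when run from a shell.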
Now let’s tackle the driver script. Its build rule is this:
driver.sh: Makefile
	$(file > $@,$(DRIVER_SCRIPT))
	chmod +x $@
We’ve used GNU Make’s file builtin instead of an echo shell command in order to bypass the shell quoting we would otherwise need.
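For comparison, here is a sketch of writing a similar script from a plain shell, outside of Make. A here-document with a quoted delimiter is the usual way to avoid escaping every $ and quote character by hand, which is roughly the convenience the file builtin provides inside a makefile. The function name write_driver and the script contents are illustrative only.

```shell
#!/bin/sh
# Usage: write_driver PATH
# Write a small driver script to PATH verbatim and make it executable.
# Quoting the delimiter ('EOF') disables expansion inside the
# here-document, so $? and the single quotes reach the file unchanged.
write_driver() {
    cat > "$1" <<'EOF'
#!/bin/sh
rustc crate.rs > rustc.out 2>&1
grep -q 'internal compiler error' rustc.out
exit $?
EOF
    chmod +x "$1"
}
```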
We’ll define the script itself right in the Makefile (thus the dependency), which enables us to fetch and transform the rustc invocation when DRIVER_SCRIPT is expanded.
# To use shell variables in the driver script, we'll need to escape
# Make's variable expansion using `$$`.
define DRIVER_SCRIPT
#!/bin/sh
# if you need to check that C-Reduce hasn't removed important parts of
# your source, do it here (before we go to the trouble of compiling).
#
#grep -q 'some_required_pattern' $(SOURCE) || exit 1
# Compile the source,
$(RUSTC_CMD) > rustc.out 2>&1
# Check for expected output from rustc.
grep -q 'internal compiler error' rustc.out
exit $$?
endef
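RUSTC_CMD is defined as follows,
RUSTC_CMD = $(patsubst src/%.rs,$(SOURCE),$(shell $(CARGO_BUILD_CMD)))
with CARGO_BUILD_CMD defined statically. The --out-dir option and its argument are stripped in sed rather than with patsubst, since patsubst matches only one whitespace-separated word at a time:
CARGO_BUILD_CMD := cargo build -v 2>&1 | grep 'Running `rustc' | tail -n 1 | sed -e 's/^.*Running `\(.*\)`/\1/g' -e 's/ --out-dir [^ ]*//'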
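Before starting a long reduction, it can be worth confirming that the generated driver actually reports the unreduced source as interesting; if it exits non-zero on the original file, there is nothing to reduce. The sketch below assumes the driver refers to the source by its base name and, like the runs described earlier, executes in a scratch directory. The helper name check_driver is hypothetical.

```shell
#!/bin/sh
# Usage: check_driver DRIVER SOURCE
# Copy SOURCE into a scratch directory, run DRIVER there with no
# arguments, and return the driver's exit status (0 = "interesting").
check_driver() {
    # Resolve DRIVER to an absolute path so it survives the cd below.
    driver=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
    scratch=$(mktemp -d) || return 1
    cp "$2" "$scratch/" || { rm -rf "$scratch"; return 1; }
    result=0
    (cd "$scratch" && "$driver") || result=$?
    rm -rf "$scratch"
    return "$result"
}
```

Running check_driver driver.sh crate.rs && make reduce would then only start C-Reduce when the driver accepts the original source.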