Sometimes, when working on a complex piece of software — or even a simple program or library with a non-trivial number of moving parts — you run into an issue and have absolutely no idea how to reduce your code to the minimum necessary in order to reproduce the error.

For the past several months I’ve been blocked by some trait-resolution errors in Symtern, my general-purpose interner crate. I knew the errors were somehow related to my attempts to work around the lack of associated-type constructors — but how? Some of the errors rustc was printing didn’t even have line numbers!

This is exactly the kind of problem that software like C-Reduce is meant to solve: by iteratively removing parts of an input file and calling a user-provided executable to test whether the modified input still triggers the unwanted behavior, C-Reduce is able to reliably reduce test cases. And it works with Rust programs! With a little massaging we can use it for entire crates, even those with external dependencies.

C-Reduce chews your code with many mouths.

C-Reduce works something like this:

  1. User runs creduce with, as command-line arguments, the names of a test-driver executable and a file containing source code that triggers “interesting” behavior.

     $ creduce driver.sh crate.rs
    
  2. C-Reduce runs the driver executable with no arguments. If the source file named on creduce’s command line is still “interesting”, the driver executable should return with an exit status of 0; any other exit status ends the current reduction attempt. (Continue to step 3 only when the driver executable returns 0.)

    A driver executable might be a simple shell script that compiles the named source and checks that the compiler emits the expected error. For example:

     #!/bin/sh
     rustc crate.rs > rustc.out 2>&1
    
     # Check for ICEs
     grep -q 'internal compiler error' rustc.out
     exit $?
    

    We could instead run a generated executable and check for aberrant behavior just as easily.

     #!/bin/sh
     rustc -o a.out crate.rs && ./a.out > a.out.log
    
     grep -q 'wrong branch' a.out.log
     exit $?
    
  3. In parallel, masticate the source file using some set of transforms that reduce the file’s size. After each transform completes, return to step two with its output as the new source file; C-Reduce keeps only the smallest transformed source files that remain “interesting”.

  4. Repeat steps two and three until no further reduction occurs. At this point, C-Reduce will print the smallest transformed source file to its standard output and replace the original source file with it.

Note that this explanation is a description of “delta debugging” rather than of C-Reduce in particular, and fails to account for many of the latter’s more salient qualities.
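
As a rough illustration of the delta-debugging loop described above (and not of C-Reduce's actual implementation, which applies far more sophisticated transforms and runs them in parallel), here is a minimal sequential sketch in Rust. The names reduce and is_interesting are hypothetical: the closure stands in for the driver executable, and the input is assumed to be “interesting” to begin with.

```rust
/// Greedy, line-based reduction. Repeatedly tries to delete chunks of
/// lines, keeping a deletion whenever `is_interesting` (a stand-in for
/// the driver executable) still returns true for the smaller input.
fn reduce<F>(input: &[String], is_interesting: F) -> Vec<String>
where
    F: Fn(&[String]) -> bool,
{
    let mut current: Vec<String> = input.to_vec();
    // Start with large chunks and halve the chunk size after each pass.
    let mut chunk = std::cmp::max(current.len() / 2, 1);

    loop {
        let mut removed_any = false;
        let mut start = 0;
        while start < current.len() {
            let end = std::cmp::min(start + chunk, current.len());
            // Candidate: the current lines with [start, end) removed.
            let mut candidate = current[..start].to_vec();
            candidate.extend_from_slice(&current[end..]);
            if is_interesting(&candidate) {
                // Keep the smaller input and retry at the same position.
                current = candidate;
                removed_any = true;
            } else {
                // These lines are needed; move on to the next chunk.
                start = end;
            }
        }
        if chunk > 1 {
            chunk /= 2;
        } else if !removed_any {
            // A full single-line pass removed nothing: we are done.
            break;
        }
    }
    current
}
```

Given the four lines "fn a() {}", "fn b() {}", "BUG" and "fn c() {}" and a closure that checks whether any line contains "BUG", this sketch returns just the "BUG" line.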

Square peg, round hole.

Earlier I promised a way to use C-Reduce on an entire crate — yet C-Reduce can only chew on a single input file at a time! We’ll have to squeeze the crate a little to make it work.

Expanding a crate into a single source file

C-Reduce expects us to feed it a single input file to munch on; luckily rustc provides a simple way to “expand” all mod declarations much like #include lines in C or C++ source would be expanded by the C preprocessor:

$ rustc -Z unstable-options --pretty=normal src/lib.rs > crate.rs

(At the time of writing, you’ll get a warning if you do this using a non-nightly compiler; it’s not entirely clear if Rust’s compiler devs plan to explicitly support anything like this on stable in the future.)

Extracting the rustc invocation from cargo

Although we now have our single source file, we still need to worry about passing the right flags to rustc so it can find any extern crates. Why don’t we ask Cargo how to do that?

$ cargo build -v 2>&1 \
  | grep 'Running `rustc' | tail -n 1 \
  | sed -e 's/^.*Running `\(.\+\)`$/\1/g'

This should print the rustc invocation Cargo uses for the current crate. We’ll need to adjust the resulting command a little before we use it:

  • src/lib.rs or src/main.rs must be replaced with the name of the single source file we created earlier (crate.rs).
  • --out-dir /path/to/my-crate/target/debug/deps should probably be removed since C-Reduce will be running our driver script in parallel, using a separate temporary directory for each run.

We expect (or hope) that any extern crate lines will be eliminated by C-Reduce anyway, so going to the extra trouble of supporting external crates lets us skip manually extricating any external dependencies from our code — a big win.
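
For readers who would rather not decipher the sed expression, the extraction and the two adjustments listed above can be sketched in Rust. Both extract_rustc_command and adjust_rustc_command are hypothetical helpers written for illustration. The sketch assumes that Cargo prints each invocation on a single line as Running `rustc ...`, that the output directory is passed as two separate words (--out-dir followed by a path), and that no argument contains whitespace.

```rust
/// Returns the last rustc invocation found in the verbose output of
/// `cargo build -v`, without the surrounding backticks.
fn extract_rustc_command(cargo_output: &str) -> Option<&str> {
    cargo_output
        .lines()
        .filter_map(|line| {
            let start = line.find("Running `rustc")? + "Running `".len();
            let end = line.rfind('`')?;
            if end > start { Some(&line[start..end]) } else { None }
        })
        .last()
}

/// Replaces the crate root with `source` and drops `--out-dir <path>`.
/// Assumption: arguments are separated by plain whitespace.
fn adjust_rustc_command(command: &str, source: &str) -> String {
    let mut words = Vec::new();
    let mut iter = command.split_whitespace();
    while let Some(word) = iter.next() {
        if word == "--out-dir" {
            // Skip the directory argument that follows the flag.
            iter.next();
        } else if word == "src/lib.rs" || word == "src/main.rs" {
            words.push(source);
        } else {
            words.push(word);
        }
    }
    words.join(" ")
}
```

Feeding the captured output of cargo build -v through extract_rustc_command and then adjust_rustc_command(cmd, "crate.rs") should yield a command line suitable for the driver script. If Cargo considers the crate up to date it prints no rustc invocation at all, in which case the first helper returns None.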

Tying it all together

While we could do all this manually, typing the commands directly into a terminal, it’s both cumbersome and unnecessary. Enter my favorite underappreciated tool, GNU Make.

Using Make allows us to automate most of the steps involved.

Main target

The first rule defined in a makefile is the default goal, used when no goal is specified on Make’s command line. We probably want that target to be the one that calls creduce:

.PHONY: reduce
SOURCE ?= crate.rs
DRIVER ?= driver.sh

reduce: $(DRIVER) $(SOURCE)
	creduce $^

For the sake of flexibility we’ve made it possible to specify an alternate driver script or source file on the Make command line with e.g.

$ make DRIVER=alternate-driver.sh

or

$ make SOURCE=alternate-source.rs

Specifying both would just be an overly verbose way of calling C-Reduce directly; specifying only one of them allows us to depend on the Makefile’s machinery for the other.

Source file

Next we’ll define the rule for creating our expanded source file on-demand:

crate.rs: $(shell find src -name '*.rs')
	cargo rustc -- -Z unstable-options --pretty=normal > $@ \
	    || (rm -f $@; exit 1)

Defined this way, it will be updated whenever we change one of our crate’s sources. Good.

Driver script

Now let’s tackle the driver script. Its build rule is this:

driver.sh: Makefile
	$(file > $@,$(DRIVER_SCRIPT))
	chmod +x $@

We’ve used GNU Make’s file function (available since GNU Make 4.0) instead of an echo shell command in order to bypass the shell quoting we would otherwise need.

We’ll define the script itself right in the Makefile (thus the dependency), which enables us to fetch and transform the rustc invocation when DRIVER_SCRIPT is expanded.

# To use shell variables in the driver script, we'll need to escape
# Make's variable expansion using `$$`.
define DRIVER_SCRIPT
#!/bin/sh

# If you need to check that C-Reduce hasn't removed important parts of
# your source, do it here (before we go to the trouble of compiling).
#
#grep -q 'some_required_pattern' $(SOURCE) || exit 1

# Compile the source, capturing everything rustc prints.
$(RUSTC_CMD) > rustc.out 2>&1

# Check for expected output from rustc.
grep -q 'internal compiler error' rustc.out
exit $$?
endef

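RUSTC_CMD is defined as follows (patsubst works on one whitespace-separated word at a time, so it can swap in our source file but cannot match the two-word --out-dir option; sed strips that instead),

RUSTC_CMD = $(patsubst src/%.rs,$(SOURCE),$(shell $(CARGO_BUILD_CMD) | sed -e 's/--out-dir [^ ]*//'))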

with CARGO_BUILD_CMD defined statically (:=) as

CARGO_BUILD_CMD := cargo build -v 2>&1 | grep 'Running `rustc' | tail -n 1 | sed -e 's/^.*Running `\(.*\)`/\1/g'