[RFC] Type-Directed Relay Fuzzing Library

Flandini · March 10, 2022, 11:43pm

I have some additional comments, though I don’t work on TVM/Relay; I just helped Steven out with some parts of the fuzzer.

I’m in favor of keeping the main driver of the fuzzer in the top-level repo since it is pretty small, but I don’t think keeping it in another repo would make much of a difference. As for where to store component-specific fuzzing parts like oracles or pass-specific fuzzing harnesses, I think the two best options are:

1. Keep these things next to the code. For example, for transformation-specific fuzzing wrappers, just put these wrappers next to their transforms in src/relay/transforms.
2. Keep these things in some tooling or test directory. For example, those same transformation-specific fuzzing wrappers would instead go in tests/python/relay (or some subdir there).

I’m not super familiar with Python, but given the way Python modules work and that the fuzzer is currently written in Python, I think the second option might be easier (?).
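To make the "wrapper plus oracle" idea a bit more concrete, here is a minimal sketch of what a transformation-specific harness could look like if it lived in a test directory. Everything in it is a hypothetical stand-in: `generate_expr`, `simplify`, `same_value`, `fuzz_pass`, and the tuple-based expression encoding are made up for illustration and are not TVM, Relay, or fuzzer APIs. The oracle simply checks that an expression evaluates to the same value before and after the pass.

```python
import random

# Toy expression encoding (hypothetical, not Relay):
# an int literal, the variable "x", or a tuple (op, lhs, rhs).

def generate_expr(rng, depth=3):
    """Randomly build a small expression tree."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", rng.randint(-3, 3)])
    op = rng.choice(["add", "mul"])
    return (op, generate_expr(rng, depth - 1), generate_expr(rng, depth - 1))

def evaluate(expr, x):
    """Reference interpreter used by the oracle."""
    if expr == "x":
        return x
    if isinstance(expr, int):
        return expr
    op, lhs, rhs = expr
    a, b = evaluate(lhs, x), evaluate(rhs, x)
    return a + b if op == "add" else a * b

def simplify(expr):
    """Stand-in for the pass under test: rewrites e + 0 to e and e * 0 to 0."""
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    lhs, rhs = simplify(lhs), simplify(rhs)
    if op == "add" and rhs == 0:
        return lhs
    if op == "mul" and rhs == 0:
        return 0
    return (op, lhs, rhs)

def same_value(before, after):
    """Oracle: the pass must not change the value for a few sample inputs."""
    return all(evaluate(before, x) == evaluate(after, x) for x in (-2, 0, 3))

def fuzz_pass(transform, oracle, iterations=200, seed=0):
    """Generate inputs, run the pass, and collect inputs that fail the oracle."""
    rng = random.Random(seed)
    failures = []
    for _ in range(iterations):
        expr = generate_expr(rng)
        try:
            ok = oracle(expr, transform(expr))
        except Exception:
            ok = False  # a crash inside the pass also counts as a finding
        if not ok:
            failures.append(expr)
    return failures
```

Calling `fuzz_pass(simplify, same_value)` returns the list of failing inputs, so an empty list means no mismatch was found. With this shape, the generic driver (`fuzz_pass`) can stay in one place while each pass only contributes a small wrapper and an oracle.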

Probably the most important part of the infrastructure to figure out is how this is going to run. Since it sounds like the fuzzer will run quite frequently, the runs should probably be automated by some CI-fuzzing tool that has statistics and bug-reporting abilities. I don’t have much experience with CI-based fuzzing toolchains, but I would like to throw ClusterFuzzLite (google/clusterfuzzlite on GitHub) into the mix for consideration. It is a simplified version of Google’s ClusterFuzz, runs in CI, has the reporting facilities, and can handle mixed Python/C++ projects. I found a ujson fuzzer for ClusterFuzzLite (projects/ujson/hypothesis_structured_fuzzer.py in google/oss-fuzz) which mixes Python and C; this could be used as a starting point or template for getting the Relay fuzzer up and running.
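For context on what such a harness has to provide: libFuzzer-style engines hand the harness a raw byte buffer per test case, so a structured fuzzer needs a deterministic way to turn bytes into a well-formed input. The sketch below shows that shape using only the standard library. All names (`ByteCursor`, `decode_expr`, `fuzz_one_input`) and the toy expression encoding are hypothetical, and the wiring into an actual engine or into ClusterFuzzLite is deliberately left out.

```python
class ByteCursor:
    """Consumes a byte buffer; yields 0 once the buffer is exhausted."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def next_byte(self):
        if self.pos >= len(self.data):
            return 0
        value = self.data[self.pos]
        self.pos += 1
        return value

def decode_expr(cur, depth=3):
    """Map bytes to a toy expression: an int literal or (op, lhs, rhs)."""
    tag = cur.next_byte() % 3
    if depth == 0 or tag == 0:
        return cur.next_byte() % 7 - 3  # literal in [-3, 3]
    op = "add" if tag == 1 else "mul"
    return (op, decode_expr(cur, depth - 1), decode_expr(cur, depth - 1))

def evaluate(expr):
    """Stand-in for the code under test."""
    if isinstance(expr, int):
        return expr
    op, lhs, rhs = expr
    a, b = evaluate(lhs), evaluate(rhs)
    return a + b if op == "add" else a * b

def fuzz_one_input(data: bytes) -> None:
    """Entry point shape: bytes in, exception out if a property is violated."""
    expr = decode_expr(ByteCursor(data))
    result = evaluate(expr)
    assert isinstance(result, int)
```

Because decoding is deterministic and total (every byte string maps to some valid expression), any failing input can be replayed from its bytes alone, which is what makes crash reports from a CI fuzzing tool reproducible.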

For an earlier evaluation of the fuzzer, we used LLVM’s SourceBasedCodeCoverage and SanitizerCoverage to instrument and track code coverage of the C++ parts of Relay. These inject coverage-tracking code at compile time, so only the compiled C++ side is instrumented, and we couldn’t get full coverage data because of the Python/C++ split in the fuzzer. Unless others have better ways of getting coverage data, I’m guessing this is how coverage will be collected to start.