Our current test suite takes a while to run. One major reason is that tests that only require a cpu are also run on testing nodes that have gpus. When several PRs are open, the gpu nodes become the limiting factor: demand is high, so a PR has to wait until a gpu node frees up before its tests can begin.
I propose we explicitly mark tests that require a gpu and run only marked tests on the gpu.
Pytest provides a mechanism to do this: markers. Markers allow tests to be decorated with `@pytest.mark.gpu` (for example), and pytest can then select only the tests with this marker using `pytest -m gpu`. Markers can be combined with `pytest.mark.skipif` to make sure that tests are only run when a required gpu is present.
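As a rough illustration of the mechanism, here is a minimal sketch of a marked test next to an unmarked one. The `gpu_present` helper and the `HAS_GPU` variable are hypothetical stand-ins for whatever check we end up using; only `pytest.mark.gpu` and `pytest.mark.skipif` are real pytest features.

```python
import os

import pytest


def gpu_present() -> bool:
    """Hypothetical probe: treat a gpu as present when HAS_GPU=1 is set."""
    return os.environ.get("HAS_GPU") == "1"


# Marked test: selected by `pytest -m gpu`, skipped when no gpu is found.
@pytest.mark.gpu
@pytest.mark.skipif(not gpu_present(), reason="no gpu available")
def test_vector_add_gpu():
    assert [a + b for a, b in zip([1, 2], [3, 4])] == [4, 6]


# Unmarked test: selected by `pytest -m "not gpu"`, so it can run on cpu nodes.
def test_vector_add_cpu():
    assert [a + b for a, b in zip([1, 2], [3, 4])] == [4, 6]
```

With this layout, gpu nodes would run `pytest -m gpu` and cpu nodes would run `pytest -m "not gpu"`. Registering the `gpu` marker under the `markers` option of the pytest configuration avoids unknown-marker warnings.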
I propose we use the following markers:
- `tvm.testing.uses_gpu` for tests that use both the gpu and cpu (see below).
- `tvm.testing.requires_gpu` for tests that require the gpu.
- `tvm.testing.requires_cuda` for tests that require cuda.
- `tvm.testing.requires_...` for tests that require rocm, opencl, etc.
Many tests use a variety of different devices, like llvm, cuda, and rocm. There are three main ways that tests use devices:
1. tests iterate through `tvm.relay.testing.config.ctx_list`,
2. tests iterate through `tests/python/topi/python/common.py:get_all_backend`, and
3. tests iterate through a hand-picked list of targets and check whether each device is enabled.
These methods do not allow us to separate out the gpu parts from the cpu parts.
To do this separation, I propose we merge 1. and 2. into a function called `tvm.testing.enabled_devices` and replace 3. with a function `tvm.testing.device_enabled`. These two functions would use an environment variable to determine which devices are enabled (a subset of the ones supported by the current build of TVM).
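To make the idea concrete, here is a minimal sketch of how the two functions could read the environment. The variable name `TVM_TEST_TARGETS`, the `;` separator, and the hard-coded `SUPPORTED_TARGETS` tuple are all assumptions for illustration; a real version would ask the TVM build which devices it supports.

```python
import os

# Assumption: stands in for "devices supported by the current build of TVM".
SUPPORTED_TARGETS = ("llvm", "cuda", "rocm", "opencl")
# Assumption: placeholder name for the controlling environment variable.
ENV_VAR = "TVM_TEST_TARGETS"


def enabled_devices(environ=None):
    """Return the requested targets that the build also supports.

    An unset or empty variable means "test every supported target".
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(ENV_VAR, "")
    requested = [name.strip() for name in raw.split(";") if name.strip()]
    if not requested:
        return list(SUPPORTED_TARGETS)
    # Keep only the subset that the build supports, in the requested order.
    return [name for name in requested if name in SUPPORTED_TARGETS]


def device_enabled(target, environ=None):
    """Return True if `target` is among the enabled devices."""
    return target in enabled_devices(environ)
```

A cpu node could then export `TVM_TEST_TARGETS=llvm`, and tests that loop over `enabled_devices()` or guard a branch with `device_enabled("cuda")` would skip the gpu parts without any change to the test body.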
- Devices we test against are controlled by an environment variable. Environment variables can be hard to discover, so we should document this one well.
- Tests that use `tvm.testing.enabled_devices` must also mark their testing function with `tvm.testing.uses_gpu`. If they don’t, the test will never be run with gpu devices. A fix would be a special decorator that parameterizes the test over the devices and sets markers appropriately (using `pytest.mark.parametrize`). Unfortunately, this would require rewriting a large number of tests.
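For reference, such a decorator might look roughly like the sketch below. The name `parametrize_devices` and the `GPU_TARGETS` set are hypothetical; `pytest.param(..., marks=...)` and `pytest.mark.parametrize` are the real pytest pieces it relies on.

```python
import pytest

# Assumption: which targets count as gpu devices for marking purposes.
GPU_TARGETS = {"cuda", "rocm", "opencl"}


def parametrize_devices(targets):
    """Parameterize a test over `targets`, marking only the gpu cases."""
    params = [
        pytest.param(
            target,
            marks=[pytest.mark.gpu] if target in GPU_TARGETS else [],
        )
        for target in targets
    ]
    return pytest.mark.parametrize("target", params)


@parametrize_devices(["llvm", "cuda"])
def test_elementwise_add(target):
    # A real test would build and run a kernel for `target` here.
    assert isinstance(target, str)
```

Because the marker is attached per parameter, `pytest -m gpu` would select only the `cuda` case and `pytest -m "not gpu"` only the `llvm` case. The cost noted above remains: each test has to take the target as an argument instead of looping over devices internally.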