See also: strawman consolidated dependency list
- NOTE: the strawman was made a few weeks ago and may be out of date. It will be manually rebuilt before merging.
Author’s note: dependency management tools, like text editors, are often the subject of holy wars. This RFC seeks only to improve our dependency management in TVM—we can consider using any dependency management tool that fits our requirements. For the purposes of maintaining a constructive conversation here, let’s focus debates between dependency management tools around their impacts on the TVM project as a whole, not any one developer’s individual workflow.
Background
TVM has historically avoided over-specifying its Python package requirements in order to stay as lightweight and flexible as its applications allow. However, this practice has scattered Python dependencies around the codebase, to the point that it’s now quite difficult for the average TVM developer to create a virtualenv for local development that comes close to matching the one used in the CI regression.
This RFC proposes that we move all Python package requirements into a single location, then source or otherwise generate files from it where needed. Further, it proposes that we check in the output of `pip freeze` from each CI container, so that it’s easy to look up the actual package versions TVM tests against.
Challenges
C0: Requirement Groups
TVM’s Python code is organized into several parts:
- A core portion, required to use TVM at all
- A set of Relay importers, which depend on a variety of third-party Python packages necessary to parse formats foreign to TVM
- A set of optional components, such as microTVM, which may depend on third-party libraries specific to one use case of TVM
TVM’s Python package requirements therefore depend on which parts of TVM you want to use. Further, TVM developers have an additional set of requirements (`pylint`, `pytest`, etc.) that should not be included as dependencies of any built TVM package.
C1: Varying Constraints on Dependency Versions
Python dependencies can also be placed into three categories, orthogonal to the groups from C0:
- Loose dependencies — where any non-ancient version will likely do
- Range dependencies — where some version preference (i.e. in major/minor version) exists, but generally any package in that range will do. Sometimes users may want to purposefully install a package outside the range i.e. to break one feature of TVM but enable another for a particular model. Dependencies that follow semantic versioning are likely candidates for this group.
- Exact dependencies — where using any other version than that used in CI is unworkable
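As an illustration (package names and pins here are placeholders, not drawn from the actual survey), the three categories might look like this in requirements-file syntax:

```
tornado                # loose: any non-ancient version will do
scipy >= 1.4, < 2.0    # range: prefer the 1.x line at or above 1.4
onnx == 1.6.0          # exact: must match the version used in CI
```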
A survey of TVM’s present dependencies is included in the strawman `pyproject.toml` from the PoC PR.
C2: Python Package Requirements
The de facto standard Python package manager, `pip`, resolves and installs package dependencies by default (i.e. unless `--no-deps` is given). However, `pip install` produces a set of installed packages that is neither deterministic (even given a frozen index) nor consistent. Specifically, if a user executes `pip install a b`, they may see a different set of installed packages than if they execute `pip install b a`.
Tools such as `pipenv` and `poetry` have been written to work around this.
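To make the order-dependence concrete, here is a toy model (hypothetical packages `a`, `b`, and `c`; a deliberate simplification, not real pip internals) of the legacy resolver behavior, where the last install of a shared dependency wins:

```python
def legacy_install(order, index):
    """Naive model of pre-resolver pip: requirements are processed in
    order, and the last install of a shared dependency wins."""
    installed = {}
    for pkg in order:
        version, requires = index[pkg]
        installed[pkg] = version
        for dep, dep_version in requires:
            installed[dep] = dep_version  # overwrites any earlier pin
    return installed

# Hypothetical index: "a" and "b" both depend on "c", at different pins.
INDEX = {
    "a": ("1.0", [("c", "1.0")]),
    "b": ("1.0", [("c", "2.0")]),
}

print(legacy_install(["a", "b"], INDEX)["c"])  # 2.0 -- b's pin wins
print(legacy_install(["b", "a"], INDEX)["c"])  # 1.0 -- a's pin wins
```

The final virtualenv contents depend on installation history, which is exactly the state a lock-file-based tool is designed to eliminate.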
C3: Updating CI containers
Generally speaking, TVM’s policy is to avoid restricting dependency versions when possible. This allows the TVM CI to remain up-to-date with respect to its dependencies as the CI containers are updated. However, users of TVM would ideally like to install it alongside the same set of Python dependencies used in the regression, since this gives a predictable user experience.
Currently, Python packages are installed into each container using a series of `pip install` commands, and CI container updates are made individually (i.e. `ci-cpu` is updated independently of `ci-gpu`). This means it’s entirely possible, and indeed expected, that we test TVM against differing and unpredictable versions of its dependencies.
Topics for this RFC
In general, our Python dependencies are scattered around the codebase and I’d like to argue that any solution going forward should at least centralize these. The topics I’d like to debate with this RFC are:
T0. In what format should we store dependencies?
T1. Which dependencies should be listed in `setup.py`?
T2. What changes, if any, do we need to make to the CI process in order to produce a tested list of Python dependencies?
T3. What pathway should we provide to the user to install the dependencies they need to use TVM?
T4. What pathway should we provide to the developer to install dependencies for developing TVM?
Approaches
A0. De-centralized dependency tracking (current approach)
Currently, dependencies are tracked in a decentralized fashion:
- `setup.py` reports the bare minimum dependencies required to use TVM. Some extras are provided, but there is no coordination between the versions specified in `setup.py` and the versions used in CI test containers (i.e. `pip install -e python` is not executed in the CI test container).
- CI containers are built with Python package version restrictions specified in the install script. Where versions are not restricted, no checking is performed to ensure that package versions are compatible (i.e. only `pip install` is used, not `pipenv`, `poetry`, or another tool that checks the integrity of the version graph).
- To run developer tools such as `pylint`, the suggested approach is to run the `ci-lint` container locally with docker.
- To build docs, developers install dependencies specified in `docs/README.md`. Developers can ensure they have the correct dependencies by comparing against `pip freeze` from `ci-gpu` (note, however, that `ci-gpu` requires a GPU instance to run locally).
There are some benefits to this approach:
- It is simple to perform each step of the process separately
- Tests for the most part run against recent Python deps as containers are updated regularly, and non-pinned Python packages update automatically with each container rebuild.
- Since containers run slightly different versions of Python packages, some diversity is present in the set of Python packages TVM is tested against.
However, there are drawbacks:
- It’s very difficult for a developer to tell exactly which versions of dependencies TVM is tested against, short of pulling a multi-gigabyte docker image and running `pip freeze`.
- When building documentation, it’s actually impossible to deduce this unless the developer happens to have a machine that can run `nvidia-docker`. The `ci-gpu` container, which is used to build docs, can’t be started without binding to a GPU.
- Although there is diversity in the dependencies tested, we have no control over this and limited visibility into it.
- End users installing TVM (i.e. from `pip install tlcpack` or from `pip install -e ./python`) can’t expect it to depend on the specific tested versions of Python packages. While loose dependency pinning is standard practice in the Python community, having the ability to pin to a known-good configuration can be helpful. Further, there isn’t even a “simple command” a user could run; they would need to download `ci-cpu` and `ci-gpu` and cherry-pick package versions from `pip freeze`.
- There is a tool to run containers for local developer use, but it doesn’t work well with `git-subtree` and requires developers to look up the relevant container versions in `Jenkinsfile`. It’s unwieldy. When using `git-subtree`, the only way to run the linter locally is to check out your development branch in the original git repo.
A1. Centralized Management with a set of `requirements.txt` files
Create a set of `requirements.txt` files that contain the reference versions of TVM’s package dependencies. More than one `requirements.txt` file is necessary because the set of dependencies needed to use TVM varies with your use case, and we wish to maintain flexibility. For instance, to use the `pytorch` importer, `torch` is needed; but we don’t wish to require users to install it for basic TVM usage, or when using TVM purely for running inference. Therefore, a new file `requirements-torch.txt` would be generated for this case, and would correspond to a `torch` `extras_require` entry in `setup.py`.
Additionally, a `requirements-dev.txt` would be created to capture developer requirements such as `pylint`, plus the docs-building requirements.
When building CI containers, care needs to be taken to install Python packages only from the `requirements.txt` files in the repo. Either all Python packages need to be installed in one `pip install` command, or a shell script helper should be written to verify that the packages requested are present in `requirements.txt`. When installation is finished, `docker/build.sh` should run `pip freeze` and write the output to `docker/python-versions/v0.62-ci-cpu-constraints.txt`.
Finally, `setup.py` must read the `requirements.txt` files and fill `install_requires` and `extras_require` from them.
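A sketch of how `setup.py` might do this (the file layout follows the proposal above; the helper names and parsing details are assumptions for illustration, not existing TVM code):

```python
# Hypothetical helpers a setup.py could use to build install_requires
# and extras_require from the proposed requirements files.
import glob
import os


def parse_requirements(path):
    """Return the non-empty, non-comment lines of a requirements file."""
    with open(path) as f:
        lines = [line.split("#", 1)[0].strip() for line in f]
    return [line for line in lines if line]


def collect_requirements(root="."):
    """Map requirements.txt -> install_requires and each
    requirements-<extra>.txt -> an extras_require entry."""
    install_requires = parse_requirements(
        os.path.join(root, "requirements.txt"))
    extras_require = {}
    for path in sorted(glob.glob(os.path.join(root, "requirements-*.txt"))):
        extra = os.path.basename(path)[len("requirements-"):-len(".txt")]
        if extra == "dev":
            continue  # developer deps should not ship as a package extra
        extras_require[extra] = parse_requirements(path)
    return install_requires, extras_require
```

`setup()` would then be called with `install_requires=install_requires, extras_require=extras_require`, so the built wheel always agrees with the checked-in requirements files.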
Pros:
- It is easy to determine the set of TVM dependencies and the actual package versions used in test.
- `requirements.txt` is a universally-consumable format, so no additional tooling is imposed on TVM developers or users.
- `setup.py` will agree with `requirements.txt` as pip wheels are built.
Cons:
- CI containers may continue to diverge from one another in terms of dependency management.
- The set of installed Python packages could still differ depending on the order of `pip install -r requirements.txt` and e.g. `pip install -r requirements-torch.txt`. Developers may not remember the order in which these commands were invoked, or the full history of their local virtualenv, so bug reports could arise from dependency problems that are hard to document and reproduce.
- The set of installed Python packages may still not be consistent: a Python package mentioned later in a `requirements.txt` may install a dependency incompatible with a previously-mentioned Python package.
- Developer usage is still somewhat tricky, since multiple `pip install -r` commands are needed.
- When syncing, developers need to remember to rebuild their virtualenv.
A2. Consistent centralized dependencies with a tool such as `poetry` or `pipenv`
This approach is similar to A1, but instead of creating a set of `requirements.txt` files, a more advanced dependency management tool such as `poetry` or `pipenv` is used. These tools tend to favor a centralized file; for instance, poetry stores dependencies in `pyproject.toml` at the root of the TVM repo. The set of `requirements.txt` files could still be auto-generated for developers who prefer that approach, and a unit test could verify they are in sync with the authoritative `pyproject.toml`.
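Such a sync unit test might look roughly like the following stdlib-only sketch. The file contents are invented, and a real test would likely use a proper TOML parser rather than a regex over the `[tool.poetry.dependencies]` table:

```python
# Hypothetical check that a generated requirements.txt names the same
# packages as the authoritative pyproject.toml (names only; comparing
# version constraints would need real specifier parsing).
import re


def poetry_dependency_names(pyproject_text):
    """Extract package names from the [tool.poetry.dependencies] table."""
    table = re.search(
        r"\[tool\.poetry\.dependencies\](.*?)(\n\[|\Z)",
        pyproject_text, re.S).group(1)
    names = re.findall(r"^([A-Za-z0-9_.-]+)\s*=", table, re.M)
    return {n for n in names if n != "python"}  # python is not a package


def requirements_names(requirements_text):
    """Extract package names from requirements.txt-style lines."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.add(re.split(r"[<>=~!\[ ]", line, maxsplit=1)[0])
    return names
```

The test body would then simply assert `poetry_dependency_names(...) == requirements_names(...)` for each generated file.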
Pros:
- It is easy to determine the set of TVM dependencies and the actual package versions used in test.
- Local developer virtualenv management is automated.
- The set of installed packages is always consistent.
- Version specification is a little better in poetry, which has operators dedicated to semantic versioning (i.e. `^0.4` means anything `>=0.4.0` and `<0.5`).
- `setup.py` will agree with `pyproject.toml` as pip wheels are built.
Cons:
- Additional tooling is needed for the optimal developer experience.
- Holy wars abound with respect to developer tooling, though this could be mitigated by tools such as `dephell`.
- `setup.py` would need to parse `pyproject.toml` and so could become more complex. It would also need to map semantic versions to pip-compatible versions (translating `^0.4` to the pip constraint `>=0.4, <0.5` is straightforward, but `setup.py` may wish to loosen the version constraints on some dependencies).
- Nothing in this approach fixes the problem of building CI containers with different dependency sets; however, it does provide a way forward (see Consistent CI Containers below).
- The container dependency snapshot needs to be manually checked in under `docker/python-versions`.
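For illustration, the caret-to-pip translation mentioned above can be sketched in a few lines (a hypothetical helper; real-world specs with wildcards or pre-releases would need more care):

```python
def caret_to_pip(spec):
    """Translate a caret requirement like "^0.4" into a pip-style range.

    Caret semantics (as in poetry): the upper bound bumps the leftmost
    non-zero version component, so "^0.4" -> ">=0.4, <0.5" and
    "^1.2.3" -> ">=1.2.3, <2.0.0". Assumes plain numeric versions.
    """
    assert spec.startswith("^")
    parts = [int(p) for p in spec[1:].split(".")]
    upper = parts[:]
    for i, value in enumerate(upper):
        # Bump the leftmost non-zero component (or the last, if all zero)
        if value != 0 or i == len(upper) - 1:
            upper[i] += 1
            upper[i + 1:] = [0] * (len(upper) - i - 1)
            break
    lower = ".".join(str(p) for p in parts)
    return ">={}, <{}".format(lower, ".".join(str(p) for p in upper))


print(caret_to_pip("^0.4"))    # >=0.4, <0.5
print(caret_to_pip("^1.2.3"))  # >=1.2.3, <2.0.0
```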
Consistent CI Containers
The topic of ensuring that all CI containers run the same Python package versions belongs in another RFC. But approach A2 enables a fairly straightforward, if notably more complex, flow, which I’ll sketch here:
- When it is time to update the CI containers due to a change in Python package versions, a script launches the base container (i.e. `ubuntu:18.04`), installs `poetry`, and runs `poetry lock`. The output is written to `docker/python-versions/poetry.lock-v1.01`.
- When a new container is built, the corresponding `poetry.lock-v1.01` file is copied from `docker/python-versions` to the root of the repository. All `pip install` commands are replaced with `poetry install`, and no further change is needed because the `poetry.lock` file specifies the exact version of each package to install, plus any dependencies needed (and their exact versions).
- When one CI container is updated, all of them are updated, and all containers with the same version number share the same `poetry.lock` file. This is why I’ve bumped the container major version to `1` in this example.
- The new set of containers is tested against a PR that submits the new `poetry.lock` file and bumps the global container version number in `Jenkinsfile`. Additionally, a `docker/python-versions/poetry.lock-latest` file could be included so that diffs against the previous lock-file can be viewed in code review.
A more thorough testing flow should be specified when this is baked into an RFC. An additional challenge a hypothetical RFC would need to address is support for executing Python code on non-Linux OSes (currently the CI does not do this, but we should not add any impediments to it).
Discussion
Here are some points for discussion. There are probably things I haven’t considered in this RFC; let’s discuss those as well.
C0. Which approach should we take to this problem?
C1. Do you care if TVM adds an extra tool such as `poetry` as the preferred way to manage your local development virtualenv? Do you suggest a different tool? (Please argue based on the merits of that tool, not simply that it’s the one you use.)
C2. How important is it to standardize the CI container build process? Should we further consider a standardized CI container build pipeline?
C3. Is loose dependency specification in `setup.py` the right thing to aim for? At what level should we specify dependency versions there?