Reproducibility of CI environment

Hi, attempting to run the CI lint checks locally using:

./docker/build.sh ci_lint ./tests/scripts/task_lint.sh

results in a bunch of pylint failures that I don’t observe in the Jenkins CI. For example, the local run produces:

python/tvm/rpc/proxy.py:389:8: R1705: Unnecessary “elif” after “return” (no-else-return)

and numerous other similar errors.

My hunch is that since the Dockerfile and associated scripts behind ./docker/build.sh do not pin tool versions, either via apt or pip, my Docker image contains newer versions of the various tools than the CI instance. I believe (but I’m not certain) that R1705 is a relatively recent addition to pylint.

This makes me wonder:

  • Is ./docker/build.sh … the appropriate mechanism to launch local tests, or should I be using a different mechanism?

  • Is the lack of pinning in the Dockerfiles, and hence the lack of reproducibility of the resulting Docker image, a conscious design decision or an oversight that should be fixed?

  • The complete set of pylint diagnostics I see includes: R1705 (no-else-return), R1716 (chained-comparisons), R1714 (consider-using-in), W0107 (unnecessary-pass), W1308 (duplicate-string-formatting-argument). Are these codes that the project would prefer to fix in the code, or suppress with in-code pylint disable directives?
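To make the last question concrete, here is a minimal sketch of the two options for R1705. The function names are hypothetical and are not taken from python/tvm/rpc/proxy.py; they only illustrate the pattern pylint flags, the rewritten form, and the in-code disable directive.

```python
def classify_flagged(value):
    # Pattern reported as R1705 (no-else-return): "elif" and "else"
    # follow branches that already return.
    if value < 0:
        return "negative"
    elif value == 0:
        return "zero"
    else:
        return "positive"


def classify_fixed(value):
    # Option 1: fix the code. Each branch returns, so plain "if"
    # statements and a final return behave identically.
    if value < 0:
        return "negative"
    if value == 0:
        return "zero"
    return "positive"


def classify_suppressed(value):  # pylint: disable=no-else-return
    # Option 2: keep the original shape and silence the check
    # for this function only.
    if value < 0:
        return "negative"
    elif value == 0:
        return "zero"
    else:
        return "positive"
```

All three functions return the same results; the choice is purely about which style the project wants to enforce.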

While looking at PR3520 (I’m still debugging that one), I realized that the version of tensorflow I had installed depended on what I had done in the setup. While pylint appears to have a fixed version, it seems to me that we should pin the versions of the various frameworks we work with and have a plan for upgrading them, depending on what the community thinks about this.
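As a rough sketch of how drift like this could be detected, assuming Python 3.8+ for importlib.metadata: the script below compares installed package versions against a table of pins. The PINNED table and the check_pins helper are hypothetical, and the version numbers are placeholders, not the versions the CI actually uses.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical pins for illustration only; NOT the versions used by the CI.
PINNED = {"pylint": "2.3.1", "tensorflow": "1.13.1"}


def check_pins(pinned, get_version=version):
    """Return {package: (expected, installed)} for every package that
    is missing or whose installed version differs from the pin."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

Running something like this inside both the local image and the CI image would at least show where the two environments differ, whatever pinning policy the community settles on.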

Any thoughts on this?