RFC: Introduce automatic formatting of Python code

Relatively recently in TVM’s development we introduced the use of clang-format to all of our C++ code. Overall I think this has been a huge win and makes it much simpler to maintain consistent style across the code base and reduces the amount of time spent fighting the linter in CI.

I propose that we adopt a Python formatting tool such as Black to the code base such that we no longer have to spend cycles manually formatting code. Futhermore there are many parts of the code base, such as tests in which the linter is currently disabled due the painful need to maintain style in those files.

If we adopt an automatic formatter I propose we more aggressively turn on linting and by extension auto-formatting for those parts of the code base as well.

9 Likes

Agreed, 100%. I’ve been using a yapf configuration file I got from @comaniac, but I’m happy to standardize.

I’m expecting this for a long while so thanks for bringing up! I used yapf is just because I know this package. It’s fine to use Black or autopep8 after a brief investigation.

+1 for using Black as a global formatter.

I know very few about Black so I did a quick investigation. The biggest difference is that yapf may have non-deterministic results in the same code; while Black guarantees the deterministic results.

In addition, although Black has much fewer options than yapf, looks like we can still customize the column width (Black sets the default to 88).

In summary, Black seems more suitable to play this role.

Reference:

One thing to note about black is it does not support partial formatting of files, so we cannot run in on the diffs of every PR. It would be probably be best if we run it over the whole codebase before/when it is enabled.

I think that’s indeed the must. We did it for C++ codes when enabling clang-format.

+1 it will be super nice!

Personally when using vscode, I use the black integration with “format on save” to make sure the code is formatted well. It will be great if we provide those tooling instructions in the documentation page as well.

It would be useful to comeup with something like git-clang-format(perhaps we could reuse some of the script that enables git-clang-format) to allow us to format only the files that are touched by the commit.

@tqchen Is your worry performance here? The reason we use Black on other code at Octo for example is that its push button on the entire code base like gofmt and rustfmt.

Yah, it was perf but perhaps it is a non-issue.

I was able to recreate a version of the script using black’s functionality see PR: https://github.com/apache/incubator-tvm/pull/6437.