[RFC] A proposed update to the Docker images ci_* tag pattern

For years we’ve been using the same pattern for our Docker image tags, composed of v0.INCREMENTAL_NUMBER, e.g. tlcpack/ci_cpu:v0.80. Whenever we update an image we bump the minor number, simply as a way to release a new version.

As the project grows, and given that we have recently been improving the Docker image build process, I think it is time to consider a more meaningful tagging scheme, so that it is easier to identify what is included in the images we’re running in our CI.

When looking at or using a Docker image, we are usually interested in:

  1. How long ago was this image generated?
  2. What is the last change included in the images currently used in production?

Neither of these questions can be answered by our current numbering scheme. To get that information we usually need to dig into the image and inspect what is in there.

Proposal

As an improvement to the current scheme, I’m suggesting we adopt the tagging scheme currently used in the tlcpackstaging repository, composed of:

  1. a timestamp YYYYMMDD-HHMMSS
  2. the short git hash of the last commit included in that image

There were some previous mentions of this topic by @manupa-arm in https://github.com/apache/tvm/issues/9659.

One example would be tlcpackstaging/ci_arm:20220201-115323-2af42ba8e. This tells us when the image was generated and up to which commit in the repository it includes.
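To illustrate how such a tag could answer the two questions above, here is a minimal Python sketch that composes and parses a tag of this shape. The helper names (`make_tag`, `parse_tag`, `image_age_days`) are hypothetical and not part of any existing TVM tooling. The accepted hash length is an assumption, and the timestamp is treated as a naive datetime since the time zone is not stated here.

```python
import re
from datetime import datetime
from typing import Tuple

# Assumed layout, inferred from the example tag above:
# YYYYMMDD-HHMMSS-<short git hash>. The 7-40 hex digit range is an assumption.
TAG_PATTERN = re.compile(r"^(\d{8}-\d{6})-([0-9a-f]{7,40})$")
TIMESTAMP_FORMAT = "%Y%m%d-%H%M%S"


def make_tag(built_at: datetime, short_hash: str) -> str:
    """Compose a tag from a build time and a short git hash."""
    return f"{built_at.strftime(TIMESTAMP_FORMAT)}-{short_hash}"


def parse_tag(tag: str) -> Tuple[datetime, str]:
    """Split a tag into its build time and its git hash."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    built_at = datetime.strptime(match.group(1), TIMESTAMP_FORMAT)
    return built_at, match.group(2)


def image_age_days(tag: str, now: datetime) -> int:
    """Return how many whole days passed between the build time and `now`."""
    built_at, _ = parse_tag(tag)
    return (now - built_at).days
```

With the example above, `parse_tag("20220201-115323-2af42ba8e")` yields the build time (2022-02-01 11:53:23) and the hash `2af42ba8e`, which can then be looked up in the repository to see the last change included.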

I’d like to see what others think about this idea, as I’m planning to submit a proper RFC in tvm-rfcs.

cc @manupa-arm @areusch @Mousius @ramana-arm @driazati @gromero @tqchen

Having recently used the current system to update a number of images (https://github.com/apache/tvm/issues/10120), I think this is a very welcome improvement that simplifies the update process and gives at least a hint as to what might’ve been in each image :smile_cat:!

This sounds great!

Tangentially: has anyone considered building the Docker images on every new job? It would avoid both having to pin versions in the first place and lengthy Docker-updating PRs. For speed, we could still keep the tlcpack Docker Hub images and continually update them from main, so that most PRs would get a fully cached build that takes only a second or two. This needs more investigation (and shouldn’t block this RFC), but it could potentially make maintaining the images less painful.

NOTE: Tangentially (let’s not hijack this thread)

@driazati great suggestion!

This is something I brought up in the TVM CI discussion at TVMCon 2021 (@leandron @areusch). I think this is a great way to make sure we do the “correct” verification; we just need to sort out the security concerns. Would you be able to start a new thread to discuss this? I personally fully support it.

On the original proposal:

This is great! Looking forward to the formal RFC.

Thanks @leandron for picking this up! I had just suggested reusing the tlcpackstaging job’s versioning scheme, which I found to be informative.

This looks like a great proposal @leandron, I’m very supportive! Looking forward to the formal RFC. Would we consider using git describe in place of the hash? It might be a little overkill, or it might help.
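For context on the git describe idea: when HEAD is not exactly on a tag, `git describe` prints `<nearest tag>-<commits since that tag>-g<abbreviated hash>`. A rough Python sketch of splitting such a string into its parts is below. The `Describe` type, the `parse_describe` helper and the sample value `v0.8.0-42-g2af42ba8e` are made up for illustration and are not taken from the TVM repository.

```python
from typing import NamedTuple


class Describe(NamedTuple):
    base_tag: str       # nearest reachable tag
    commits_since: int  # commits between that tag and HEAD
    short_hash: str     # abbreviated commit hash, without the "g" prefix


def parse_describe(text: str) -> Describe:
    """Parse the long form of `git describe` output: <tag>-<n>-g<hash>."""
    # Split from the right, because the tag itself may contain dashes.
    base_tag, count, prefixed_hash = text.strip().rsplit("-", 2)
    if not count.isdigit() or not prefixed_hash.startswith("g"):
        raise ValueError(f"not a long git-describe string: {text!r}")
    return Describe(base_tag, int(count), prefixed_hash[1:])
```

Compared with a bare hash, this form also carries the nearest release tag and the distance from it, at the cost of a longer image tag.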

This is now an officially proposed RFC at:
