Tensor Packing in TVM?

Does TVM have any built-in or automated support for tensor packing transformations?

I’m referring to optimizations like those described in the “Packed Convolution” chapter of the Dive into Deep Learning Compiler documentation, where the data layout is changed (e.g., from NCHW to a packed format such as NCHW[x]c, like NCHW16c) to improve cache locality and SIMD utilization on CPUs.
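For concreteness, here is a minimal sketch of the transformation I have in mind, written against the classic TE API used in that book. The shapes and the inner factor of 16 are just illustrative assumptions on my part:

```python
import tvm
from tvm import te

# Illustrative shapes; c_factor would typically match the SIMD width.
N, C, H, W = 1, 64, 56, 56
c_factor = 16

X = te.placeholder((N, C, H, W), name="X")

# NCHW -> NCHW16c: split the channel axis into (C // 16, 16) and make
# the inner 16-wide block the innermost (contiguous) dimension.
PackedX = te.compute(
    (N, C // c_factor, H, W, c_factor),
    lambda n, co, h, w, ci: X[n, co * c_factor + ci, h, w],
    name="PackedX",
)

s = te.create_schedule(PackedX.op)
print(tvm.lower(s, [X, PackedX], simple_mode=True))
```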

I’d like to know:

  1. Can this kind of packing be applied automatically, e.g., by MetaSchedule or other TVM auto-tuning/IR passes?
  2. Or is it generally done manually, through scheduling primitives and explicit layout transforms? (A rough sketch of what I mean by “manual” is after this list.)
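
For question 2, the closest thing I’ve found is TensorIR’s `transform_layout` scheduling primitive. Something like the following sketch is what I’d count as “manual” (the ReLU kernel, buffer choice, and the factor 16 are my own assumptions; syntax may differ slightly across TVM versions):

```python
import tvm
from tvm import tir
from tvm.script import tir as T

@T.prim_func
def relu(X: T.Buffer((1, 64, 56, 56), "float32"),
         Y: T.Buffer((1, 64, 56, 56), "float32")):
    for n, c, h, w in T.grid(1, 64, 56, 56):
        with T.block("relu"):
            vn, vc, vh, vw = T.axis.remap("SSSS", [n, c, h, w])
            Y[vn, vc, vh, vw] = T.max(X[vn, vc, vh, vw], T.float32(0))

sch = tir.Schedule(relu)

# Manually rewrite the layout of the first read buffer (X)
# from NCHW to NCHW16c via an index map.
sch.transform_layout(
    block="relu",
    buffer=("read", 0),
    index_map=lambda n, c, h, w: [n, c // 16, h, w, c % 16],
)
sch.mod.show()
```

What I can’t tell is whether MetaSchedule’s search space ever proposes this kind of layout rewrite on its own, or whether it only tunes loop structure around a layout I’ve already fixed by hand like this.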