AutoTVM and CPU vectorization: should I split?

A similar question: suppose we parallelize some axis y, like so:

s[A].parallel(y)

Is setting the length of y equal to OMP_NUM_THREADS (4 in my case) guaranteed to be the best choice?
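
To make the question concrete, here is a minimal sketch in plain Python (not TVM code) of how evenly a parallel loop of a given length would spread across 4 threads. It assumes the runtime gives each thread one contiguous chunk of ceil(extent / num_threads) iterations; that static chunking is an assumption on my part, and the helper names `iterations_per_thread` and `load_imbalance` are made up for illustration.

```python
import math


def iterations_per_thread(extent, num_threads):
    """Hypothetical helper: iterations each thread gets under assumed static chunking."""
    chunk = math.ceil(extent / num_threads)
    # Thread t takes up to `chunk` iterations starting at t * chunk,
    # and gets fewer (possibly zero) once the axis is exhausted.
    return [max(0, min(chunk, extent - t * chunk)) for t in range(num_threads)]


def load_imbalance(extent, num_threads):
    """Busiest thread's work divided by the ideal even share (1.0 means perfectly even)."""
    work = iterations_per_thread(extent, num_threads)
    return max(work) / (extent / num_threads)


if __name__ == "__main__":
    # Compare a few candidate lengths for y with 4 threads.
    for extent in (4, 5, 8, 64):
        print(extent, iterations_per_thread(extent, 4), load_imbalance(extent, 4))
```

Under that assumption, any length of y that is a multiple of the thread count balances just as evenly as len(y) == 4, so load balance alone would not single out that choice. Whether it is actually the fastest would still have to be measured.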