Auto-scheduler performance on AMDGPU: the first attempt

Is it worth continuing to tune if, after about 7000 trials, the improvement from each round is almost negligible (say, only 0.001 ms faster)? How do you decide whether tuning has converged?