Why are constant variables like cse_var_1: T.bool = not T.bool(False) in TIR not optimized away after tvm.build? cse_var_1 is generated as a function parameter

I’m tuning a depthwise convolution (dwconv), and the IRModule looks like this:

@I.ir_module
class Module:
    @T.prim_func
    def main(p0: T.Buffer((1, 128, 128, 128), "float32"), p1: T.Buffer((128, 1, 5, 5), "float32"), p2: T.Buffer((1, 128, 1, 1), "float32"), T_add: T.Buffer((1, 128, 128, 128), "float32")):
        T.func_attr({"from_legacy_te_schedule": T.bool(True), "tir.noalias": T.bool(True)})
        cse_var_1: T.bool = not T.bool(False)
        blockIdx_x = T.launch_thread("blockIdx.x", 512)
        DepthwiseConv2d = T.allocate([64], "float32", "local")
        ........
        ........

I build this with lib = tvm.build() and get the CUDA source code with lib.imported_modules[0].get_source(). The CUDA code looks like this:

extern "C" __global__ void __launch_bounds__(64) default_function_kernel(float* __restrict__ T_add, float* __restrict__ p0, float* __restrict__ p1, float* __restrict__ p2, bool cse_var_1) {
  float DepthwiseConv2d[64];
  __shared__ float PaddedInput_shared[4896];
  __shared__ float p1_shared[50];
  ..........
  ..........
  if (((cse_var_1 && (1 <= (((((int)blockIdx.x) & 3) * 16) + (((((int)threadIdx.x) >> 1) + 2) % 18)))) && ((((((int)blockIdx.x) & 3) * 16) + (((((int)threadIdx.x) >> 1) + 2) % 18)) < 65))) {
    condval_40 = p0[((((((((((int)blockIdx.x) >> 3) * 32768) + (((((int)threadIdx.x) + 2560) / 2448) * 16384)) + (((((int)blockIdx.x) & 7) >> 2) * 8192)) + (((((int)threadIdx.x) + 112) / 36) * 128)) + ((((int)blockIdx.x) & 3) * 32)) + ((((int)threadIdx.x) + 4) % 36)) - 258)];
  } else {
    condval_40 = 0.000000e+00f;
  }
  PaddedInput_shared[(((int)threadIdx.x) + 2560)] = condval_40;

cse_var_1 looks like a boolean constant, so why is it a kernel parameter in the CUDA code? Do I need some extra settings to optimize it away?
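
For illustration, the folding expected here can be sketched in plain Python. This is a toy, not TVM code and not how TVM implements simplification: the Const, Var, Not and And classes, the fold function, and the in_bounds variable are all hypothetical names made up for this example. It only shows the expected result: not False folds to True, and a guard of the form cse_var_1 && (bounds check) then reduces to the bounds check alone, so nothing would need to be passed to the kernel.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: bool


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Not:
    operand: "Expr"


@dataclass(frozen=True)
class And:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Var, Not, And]


def fold(expr: Expr) -> Expr:
    """Recursively fold boolean constants in a toy expression tree."""
    if isinstance(expr, Not):
        inner = fold(expr.operand)
        if isinstance(inner, Const):
            # not <constant> is itself a constant
            return Const(not inner.value)
        return Not(inner)
    if isinstance(expr, And):
        lhs, rhs = fold(expr.lhs), fold(expr.rhs)
        if isinstance(lhs, Const):
            # True && x -> x, False && x -> False
            return rhs if lhs.value else Const(False)
        if isinstance(rhs, Const):
            # x && True -> x, x && False -> False
            return lhs if rhs.value else Const(False)
        return And(lhs, rhs)
    # Constants and variables are already in simplest form
    return expr


# Mirrors: cse_var_1: T.bool = not T.bool(False)
cse_var_1 = fold(Not(Const(False)))

# Mirrors the shape of the generated guard: cse_var_1 && (bounds check)
guard = fold(And(cse_var_1, Var("in_bounds")))

print(cse_var_1, guard)
```

In this toy model the guard no longer mentions cse_var_1 after folding, which is the behavior I expected from the generated kernel.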