TVM runtime and build implementation details for custom hardware

Hi

We are creating hardware for deep learning model inference. We have an initial workflow through TVM for most of the operators, but I want to understand whether we are using TVM the way other backends do. Currently we are using TVM from a user perspective:

```python
def tepac(self):
    build_dir = "./build/add"
    os.makedirs(build_dir, exist_ok=True)
    lib_path = f"{build_dir}/model.so"
    target = tvm.target.Target("tepac")
    # partition_for_celestial will annotate and partition the graph
    mod = partition_for_celestial(self.mod)
    with tvm.transform.PassContext(opt_level=0):
        # relay.build generates C code with host and target device calls
        tvm_tepac_lib = relay.build(mod, target=target, params=self.params)
        # during export_library we cross-compile this C code to create model.so
        tvm_tepac_lib.export_library(lib_path, workspace_dir=build_dir)
        graph_json = tvm_tepac_lib.get_graph_json()

    lib = tvm.runtime.load_module(lib_path)
    dev = tvm.device("tepac")
    m = graph_executor.GraphModule(lib["default"](dev))
    # tepacsim.simulate() creates our hardware run pointer
    pacdev = tepacsim.simulate()
    # attach our hardware run pointer to the TVM device APIs
    tepactvm.bindtvm(pacdev, tepacsim.runtime(pacdev))
    m.set_input('x', tvm.nd.array(self.ip0))
    m.set_input('y', tvm.nd.array(self.ip1))
    m.run()
    r = m.get_output(0)
    pypac.destroy(pacdev)
    pac = r.numpy()
    print(pac)
    return pac
```

My questions are:

1. How can I move the annotate-and-partition step so that it happens as part of `relay.build`?
2. For some reason, without calling `export_library` I can't do any inference, i.e. use the built module in the runtime (fundamentally I am missing something here).
3. How or where can I add our simulator APIs inside TVM so the application user need not call them?
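To make question 3 concrete: the shape I am after is a thin wrapper that hides partitioning, build/export, and simulator setup behind one call, so the application only ever invokes `infer(...)`. A minimal stand-alone sketch of that structure, with stub functions standing in for the real `tvm` / `tepacsim` calls (all names here are hypothetical, just to show the control flow):

```python
# Stub stand-ins for the real TVM / simulator calls, so the intended
# control flow of a one-call wrapper is visible without TVM installed.

def partition(mod):
    # stand-in for partition_for_celestial (annotate + partition)
    return f"partitioned({mod})"

def build(mod, target):
    # stand-in for relay.build + export_library + load_module
    return f"lib({mod}@{target})"

def attach_simulator():
    # stand-in for tepacsim.simulate() + tepactvm.bindtvm(...)
    return "pacdev"

def destroy(dev):
    # stand-in for pypac.destroy(pacdev)
    pass

def run(lib, inputs):
    # stand-in for GraphModule set_input / run / get_output
    return sum(inputs)

def infer(mod, target, inputs):
    """The single entry point the application user would call."""
    lib = build(partition(mod), target)  # partitioning folded into the build step
    dev = attach_simulator()             # simulator attached internally, not by the user
    try:
        return run(lib, inputs)
    finally:
        destroy(dev)                     # teardown also hidden from the user

print(infer("add", "tepac", [1, 2, 3]))  # prints 6
```

The open question is where the equivalents of `partition`, `attach_simulator`, and `destroy` should live inside TVM itself (a registered pass pipeline for the target, and device-API hooks for the simulator) rather than in an external wrapper like this.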

I want to achieve something like this in the end:

```python
def cpu(self):
    build_dir = "./build/add"
    os.makedirs(build_dir, exist_ok=True)
    lib_path = f"{build_dir}/model_cpu.so"
    dev = tvm.cpu(0)
    target = tvm.target.Target("llvm", host="llvm")
    with tvm.transform.PassContext(opt_level=0):
        lib = relay.build(self.mod, target=target, params=self.params)
        lib.export_library(lib_path, workspace_dir=build_dir)
        graph_json = lib.get_graph_json()

    lib = tvm.runtime.load_module(lib_path)
    m = graph_executor.GraphModule(lib["default"](dev))
    m.set_input('x', tvm.nd.array(self.ip0))
    m.set_input('y', tvm.nd.array(self.ip1))
    m.run()
    cpu = m.get_output(0)
    print(cpu)
    return cpu
```