Required: Steps to run a TVM program on Hexagon DSP simulator

kausthubtm · January 17, 2023, 4:22am

Have installed the Hexagon DSP SDK.
Have installed Apache TVM.
Made changes to config.cmake: set(USE_LLVM "path/to/llvm-config"), set(USE_HEXAGON ON), set(USE_HEXAGON_SDK "/local/mnt/workspace/Qualcomm/Hexagon_SDK/5.2.0.0"), set(USE_HEXAGON_RPC ON), and ran make runtime -j$(nproc).
Have got the libtvm_runtime.so.

My queries are :

What are the next steps to run a program on Hexagon DSP simulator?
My build folder is named hexagon-build rather than the default “build”. What changes should I make?
When I currently run a program, I get: Check failed: (allow_missing) is false: Device API hexagon is not enabled.

Please feel free to ask for more information if needed.

config.cmake file :

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

#--------------------------------------------------------------------
#  Template custom cmake configuration for compiling
#
#  This file is used to override the build options in build.
#  If you want to change the configuration, please use the following
#  steps. Assume you are on the root directory. First copy this
#  file so that any local changes will be ignored by git
#
#  $ mkdir build
#  $ cp cmake/config.cmake build
#
#  Next modify the according entries, and then compile by
#
#  $ cd build
#  $ cmake ..
#
#  Then build in parallel with 8 threads
#
#  $ make -j8
#--------------------------------------------------------------------

#---------------------------------------------
# Backend runtimes.
#---------------------------------------------

# Whether enable CUDA during compile,
#
# Possible values:
# - ON: enable CUDA with cmake's auto search
# - OFF: disable CUDA
# - /path/to/cuda: use specific path to cuda toolkit
set(USE_CUDA OFF)

# Whether enable ROCM runtime
#
# Possible values:
# - ON: enable ROCM with cmake's auto search
# - OFF: disable ROCM
# - /path/to/rocm: use specific path to rocm
set(USE_ROCM OFF)

# Whether enable SDAccel runtime
set(USE_SDACCEL OFF)

# Whether enable Intel FPGA SDK for OpenCL (AOCL) runtime
set(USE_AOCL OFF)

# Whether enable OpenCL runtime
#
# Possible values:
# - ON: enable OpenCL with OpenCL wrapper to remove dependency during build
#       time and trigger dynamic search and loading of OpenCL in runtime
# - OFF: disable OpenCL
# - /path/to/opencl-sdk: use specific path to opencl-sdk
set(USE_OPENCL OFF)

# Whether enable Metal runtime
set(USE_METAL OFF)

# Whether enable Vulkan runtime
#
# Possible values:
# - ON: enable Vulkan with cmake's auto search
# - OFF: disable vulkan
# - /path/to/vulkan-sdk: use specific path to vulkan-sdk
set(USE_VULKAN OFF)

# Whether to use spirv-tools and SPIRV-Headers from Khronos github or gitlab.
#
# Possible values:
# - OFF: not to use
# - /path/to/install: path to your Khronos spirv-tools and SPIRV-Headers installation directory
#
set(USE_KHRONOS_SPIRV OFF)

# whether enable SPIRV_KHR_DOT_PRODUCT
set(USE_SPIRV_KHR_INTEGER_DOT_PRODUCT OFF)

# Whether enable OpenGL runtime
set(USE_OPENGL OFF)

# Whether enable MicroTVM runtime
set(USE_MICRO OFF)

# Whether enable RPC runtime
set(USE_RPC ON)

# Whether to build the C++ RPC server binary
set(USE_CPP_RPC OFF)

# Whether to build the iOS RPC server application
set(USE_IOS_RPC OFF)

# Whether embed stackvm into the runtime
set(USE_STACKVM_RUNTIME OFF)

# Whether enable tiny embedded graph executor.
set(USE_GRAPH_EXECUTOR ON)

# Whether enable tiny graph executor with CUDA Graph
set(USE_GRAPH_EXECUTOR_CUDA_GRAPH OFF)

# Whether enable pipeline executor.
set(USE_PIPELINE_EXECUTOR OFF)

# Whether to enable the profiler for the graph executor and vm
set(USE_PROFILER ON)

# Whether enable microTVM standalone runtime
set(USE_MICRO_STANDALONE_RUNTIME OFF)

# Whether build with LLVM support
# Requires LLVM version >= 4.0
#
# Possible values:
# - ON: enable llvm with cmake's find search
# - OFF: disable llvm, note this will disable CPU codegen
#        which is needed for most cases
# - /path/to/llvm-config: enable specific LLVM when multiple llvm-dev is available.
set(USE_LLVM "/usr/lib/llvm-14/bin/llvm-config")

#---------------------------------------------
# Contrib libraries
#---------------------------------------------
# Whether to build with BYODT software emulated posit custom datatype
#
# Possible values:
# - ON: enable BYODT posit, requires setting UNIVERSAL_PATH
# - OFF: disable BYODT posit
#
# set(UNIVERSAL_PATH /path/to/stillwater-universal) for ON
set(USE_BYODT_POSIT OFF)

# Whether use BLAS, choices: openblas, atlas, apple
set(USE_BLAS none)

# Whether to use MKL
# Possible values:
# - ON: Enable MKL
# - /path/to/mkl: mkl root path
# - OFF: Disable MKL
# set(USE_MKL /opt/intel/mkl) for UNIX
# set(USE_MKL ../IntelSWTools/compilers_and_libraries_2018/windows/mkl) for WIN32
# set(USE_MKL <path to venv or site-packages directory>) if using `pip install mkl`
set(USE_MKL OFF)

# Whether use DNNL library, aka Intel OneDNN: https://oneapi-src.github.io/oneDNN
#
# Now matmul/dense/conv2d supported by -libs=dnnl,
# and more OP patterns supported in DNNL codegen(json runtime)
#
# choices:
# - ON: Enable DNNL in BYOC and -libs=dnnl, by default using json runtime in DNNL codegen
# - JSON: same as above.
# - C_SRC: use c source runtime in DNNL codegen
# - path/to/oneDNN：oneDNN root path
# - OFF: Disable DNNL
set(USE_DNNL OFF)

# Whether use OpenMP thread pool, choices: gnu, intel
# Note: "gnu" uses gomp library, "intel" uses iomp5 library
set(USE_OPENMP none)

# Whether use contrib.random in runtime
set(USE_RANDOM ON)

# Whether use NNPack
set(USE_NNPACK OFF)

# Possible values:
# - ON: enable tflite with cmake's find search
# - OFF: disable tflite
# - /path/to/libtensorflow-lite.a: use specific path to tensorflow lite library
set(USE_TFLITE OFF)

# /path/to/tensorflow: tensorflow root path when use tflite library
set(USE_TENSORFLOW_PATH none)

# Required for full builds with TFLite. Not needed for runtime with TFLite.
# /path/to/flatbuffers: flatbuffers root path when using tflite library
set(USE_FLATBUFFERS_PATH none)

# Possible values:
# - OFF: disable tflite support for edgetpu
# - /path/to/edgetpu: use specific path to edgetpu library
set(USE_EDGETPU OFF)

# Possible values:
# - ON: enable cuDNN with cmake's auto search in CUDA directory
# - OFF: disable cuDNN
# - /path/to/cudnn: use specific path to cuDNN path
set(USE_CUDNN OFF)

# Whether use cuBLAS
set(USE_CUBLAS OFF)

# Whether use MIOpen
set(USE_MIOPEN OFF)

# Whether use MPS
set(USE_MPS OFF)

# Whether use rocBlas
set(USE_ROCBLAS OFF)

# Whether use contrib sort
set(USE_SORT ON)

# Whether to use Arm Compute Library (ACL) codegen
# We provide 2 separate flags since we cannot build the ACL runtime on x86.
# This is useful for cases where you want to cross-compile a relay graph
# on x86 then run on AArch.
#
# An example of how to use this can be found here: docs/deploy/arm_compute_lib.rst.
#
# USE_ARM_COMPUTE_LIB - Support for compiling a relay graph offloading supported
#                       operators to Arm Compute Library. OFF/ON
# USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR - Run Arm Compute Library annotated functions via the ACL
#                                     runtime. OFF/ON/"path/to/ACL"
set(USE_ARM_COMPUTE_LIB OFF)
set(USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR OFF)

# Whether to build with Arm Ethos-N support
# Possible values:
# - OFF: disable Arm Ethos-N support
# - path/to/arm-ethos-N-stack: use a specific version of the
#   Ethos-N driver stack
set(USE_ETHOSN OFF)
# If USE_ETHOSN is enabled, use ETHOSN_HW (ON) if Ethos-N hardware is available on this machine
# otherwise use ETHOSN_HW (OFF) to use the software test infrastructure
set(USE_ETHOSN_HW OFF)

# Whether to build with Arm(R) Ethos(TM)-U NPU codegen support
set(USE_ETHOSU OFF)

# Whether to build with CMSIS-NN external library support.
# See https://github.com/ARM-software/CMSIS_5
set(USE_CMSISNN OFF)

# Whether to build with TensorRT codegen or runtime
# Examples are available here: docs/deploy/tensorrt.rst.
#
# USE_TENSORRT_CODEGEN - Support for compiling a relay graph where supported operators are
#                        offloaded to TensorRT. OFF/ON
# USE_TENSORRT_RUNTIME - Support for running TensorRT compiled modules, requires presence of
#                        TensorRT library. OFF/ON/"path/to/TensorRT"
set(USE_TENSORRT_CODEGEN OFF)
set(USE_TENSORRT_RUNTIME OFF)

# Whether use VITIS-AI codegen
set(USE_VITIS_AI OFF)

# Build Verilator codegen and runtime
set(USE_VERILATOR OFF)

#Whether to use CLML codegen
set(USE_CLML OFF)
# USE_CLML_GRAPH_EXECUTOR - CLML SDK PATH or ON or OFF
set(USE_CLML_GRAPH_EXECUTOR OFF)

# Build ANTLR parser for Relay text format
# Possible values:
# - ON: enable ANTLR by searching default locations (cmake find_program for antlr4 and /usr/local for jar)
# - OFF: disable ANTLR
# - /path/to/antlr-*-complete.jar: path to specific ANTLR jar file
set(USE_ANTLR OFF)

# Whether use Relay debug mode
set(USE_RELAY_DEBUG OFF)

# Whether to build fast VTA simulator driver
set(USE_VTA_FSIM OFF)

# Whether to build cycle-accurate VTA simulator driver
set(USE_VTA_TSIM OFF)

# Whether to build VTA FPGA driver (device side only)
set(USE_VTA_FPGA OFF)

# Whether use Thrust
set(USE_THRUST OFF)

# Whether use cuRAND
set(USE_CURAND OFF)

# Whether to build the TensorFlow TVMDSOOp module
set(USE_TF_TVMDSOOP OFF)

# Whether to build the PyTorch custom class module
set(USE_PT_TVMDSOOP OFF)

# Whether to use STL's std::unordered_map or TVM's POD compatible Map
set(USE_FALLBACK_STL_MAP OFF)

# Whether to enable Hexagon support
set(USE_HEXAGON ON)
set(USE_HEXAGON_SDK "/local/mnt/workspace/Qualcomm/Hexagon_SDK/5.2.0.0")

# Whether to build the minimal support android rpc server for Hexagon
set(USE_HEXAGON_RPC ON)

# Hexagon architecture to target when compiling TVM itself (not the target for
# compiling _by_ TVM). This applies to components like the TVM runtime, but is
# also used to select correct include/library paths from the Hexagon SDK when
# building runtime for Android.
# Valid values are v65, v66, v68, v69.
set(USE_HEXAGON_ARCH "v68")

# Whether to use QHL library
set(USE_HEXAGON_QHL OFF)

# Whether to use ONNX codegen
set(USE_TARGET_ONNX OFF)

# Whether enable BNNS runtime
set(USE_BNNS OFF)

# Whether to use libbacktrace
# Libbacktrace provides line and column information on stack traces from errors.
# It is only supported on linux and macOS.
# Possible values:
# - AUTO: auto set according to system information and feasibility
# - ON: enable libbacktrace
# - OFF: disable libbacktrace
set(USE_LIBBACKTRACE AUTO)

# Whether to install a signal handler to print a backtrace on segfault. This
# may replace existing signal handlers specified by other libraries.
set(BACKTRACE_ON_SEGFAULT OFF)

# Whether to build static libtvm_runtime.a, the default is to build the dynamic
# version: libtvm_runtime.so.
#
# The static runtime library needs to be linked into executables with the linker
# option --whole-archive (or its equivalent). The reason is that the TVM registry
# mechanism relies on global constructors being executed at program startup.
# Global constructors alone are not sufficient for the linker to consider a
# library member to be used, and some of such library members (object files) may
# not be included in the final executable. This would make the corresponding
# runtime functions to be unavailable to the program.
set(BUILD_STATIC_RUNTIME OFF)

# Caches the build so that building is faster when switching between branches.
# If you switch branches, build and then encounter a linking error, you may
# need to regenerate the build tree through "make .." (the cache will
# still provide significant speedups).
# Possible values:
# - AUTO: search for path to ccache, disable if not found.
# - ON: enable ccache by searching for the path to ccache, report an error if not found
# - OFF: disable ccache
# - /path/to/ccache: use specific path to ccache
set(USE_CCACHE AUTO)

# Whether to enable PAPI support in profiling. PAPI provides access to hardware
# counters while profiling.
# Possible values:
# - ON: enable PAPI support. Will search PKG_CONFIG_PATH for a papi.pc
# - OFF: disable PAPI support.
# - /path/to/folder/containing/: Path to folder containing papi.pc.
set(USE_PAPI OFF)

# Whether to use GoogleTest for C++ unit tests. When enabled, the generated
# build file (e.g. Makefile) will have a target "cpptest".
# Possible values:
# - ON: enable GoogleTest. The package `GTest` will be required for cmake
#   to succeed.
# - OFF: disable GoogleTest.
# - AUTO: cmake will attempt to find the GTest package, if found GTest will
#   be enabled, otherwise it will be disabled.
# Note that cmake will use `find_package` to find GTest. Please use cmake's
# predefined variables to specify the path to the GTest package if needed.
set(USE_GTEST AUTO)

# Enable using CUTLASS as a BYOC backend
# Need to have USE_CUDA=ON
set(USE_CUTLASS OFF)

# Enable to show a summary of TVM options
set(SUMMARIZE OFF)

# Whether to use LibTorch as backend
# To enable pass the path to the root libtorch (or PyTorch) directory
# OFF or /path/to/torch/
set(USE_LIBTORCH OFF)

# Whether to use the Universal Modular Accelerator Interface
set(USE_UMA OFF)

# Set custom Alloc Alignment for device allocated memory ndarray points to
set(USE_KALLOC_ALIGNMENT 64)

Sample program that I am trying to run:

import tvm
import tvm.contrib.hexagon
import numpy as np
from tvm import te, tir
from tvm.ir.module import IRModule

# Size of the matrices.
N = 32

target = tvm.target.hexagon('v66', hvx=0)
dtype = 'int16'

# Construct the TVM computation.
A = te.placeholder((N, N), name='A', dtype='int16')
B = te.placeholder((N, N), name='B', dtype='int16')
k = te.reduce_axis((0, N), name='k')
C = te.compute((N,N), lambda i, j: te.sum(A[i][k] * B[k][j], axis=k), name='C')

# Create the schedule.
func = te.create_prim_func([A, B, C])
func = func.with_attr("global_symbol", "main")
ir_module = IRModule({"main": func})
func = tvm.build(ir_module, target=target, name='mmult')

# Prepare inputs as numpy arrays, and placeholders for outputs.
ctx = tvm.hexagon(0)
a = tvm.nd.array(np.random.rand(N, N).astype(dtype), ctx)
b = tvm.nd.array(np.random.rand(N, N).astype(dtype), ctx)
c = tvm.nd.array(np.zeros((N, N), dtype=dtype), ctx)

func(a, b, c)

# Time the kernel on the same device the arrays live on.
evaluator = func.time_evaluator(func.entry_name, ctx, number=1)
print("Baseline: %f" % evaluator(a, b, c).mean)

sanirudh · January 17, 2023, 5:28am

To run on the Hexagon simulator, you need to set up RPC. Detailed instructions on how to build and run tests on the Hexagon simulator are given in this readme.

At a high level you need to perform these steps (all of which are detailed in the Readme above):

Build TVM with the Hexagon SDK
Build HexagonLauncherRPC
Set up the RPC tracker
Run the test with the tracker enabled, starting the server and the session in your code.

kausthubtm · January 18, 2023, 9:08am

Thank you! I was able to run the first two steps. Could you tell me how to get the port number and the host IP address for this step? An example would be really helpful.

# Run RPC Tracker in the background
export TVM_TRACKER_HOST="Your host IP address or 0.0.0.0"
export TVM_TRACKER_PORT="Port number of your choice."
python -m tvm.exec.rpc_tracker --host $TVM_TRACKER_HOST --port $TVM_TRACKER_PORT&

Also is the ‘&’ at the end of the last command required ?

sanirudh · January 18, 2023, 10:19am

If you’re running the tracker and the test on the same machine you’re working on, then TVM_TRACKER_HOST value should just be TVM_TRACKER_HOST=0.0.0.0.
The port number can be technically any unused port number which would now be bound to the TVM tracker. For example I use TVM_TRACKER_HOST=0.0.0.0 and TVM_TRACKER_PORT=9190.
The & at the end of the python command is just so that the tracker starts running in the background allowing you to keep using the shell. That way you wouldn’t need to keep the tracker running in one window and open a new command line window to run the test.
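Before exporting TVM_TRACKER_PORT, it can help to confirm that the chosen port really is unused. The following is a minimal sketch using only the Python standard library; is_port_free is a hypothetical helper name and is not part of TVM, and 9190 is simply the example port mentioned above.

```python
import socket


def is_port_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something else owns the port.
            return False
    return True
```

For example, is_port_free("0.0.0.0", 9190) should return True when nothing else is bound to that port, in which case 9190 is a reasonable value for TVM_TRACKER_PORT.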

kausthubtm · January 18, 2023, 10:57am

Thank you! I have run into another error:

I’m getting this error when I run the command :

python -m tvm.exec.rpc_tracker --host $TVM_TRACKER_HOST --port $TVM_TRACKER_PORT & pytest tests/python/contrib/test_hexagon/test_launcher.py

My command while building TVM with hexagon SDK was :

cmake -DUSE_LLVM="/usr/lib/llvm-14/bin/llvm-config" \
        -DUSE_RPC=ON \
        -DCMAKE_CXX_COMPILER="/usr/bin/c++" \
        -DUSE_HEXAGON_SDK="/local/mnt/workspace/Qualcomm/Hexagon_SDK/5.2.0.0" \
        -DUSE_HEXAGON=ON ..

Could you help me out on this ?

sanirudh · January 18, 2023, 3:01pm

This looks like an error in the build, as the error says neither the hexagon target nor the llvm target is supported. Could you please mention the full cmake command you used and which commit of TVM you’re using? Is it the latest main branch?

One other thing, and I’m sorry I should have mentioned this earlier, when you start the TVM tracker, it prints the host:port combination like INFO bind to 0.0.0.0:9193 in your screenshot above. The printed port should be the same as what you’ve mentioned in $TVM_TRACKER_PORT, because if the port you mentioned is not available, the tracker tries a different port and prints that. So in that case, you would need to check the printed port number and modify the $TVM_TRACKER_PORT before running the test.
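One way to avoid a mismatch is to read the port back from the tracker's own log line instead of assuming it. This is a small sketch that parses the "bind to HOST:PORT" fragment quoted above; parse_bound_address is a hypothetical helper, and the rest of the log line format is an assumption.

```python
import re
from typing import Optional, Tuple

# Matches the "bind to HOST:PORT" fragment quoted above from the tracker log.
_BIND_RE = re.compile(r"bind to (?P<host>[\w.\-]+):(?P<port>\d+)")


def parse_bound_address(log_line: str) -> Optional[Tuple[str, int]]:
    """Extract (host, port) from a tracker log line, or None if absent."""
    match = _BIND_RE.search(log_line)
    if match is None:
        return None
    return match.group("host"), int(match.group("port"))
```

If the returned port differs from $TVM_TRACKER_PORT, update the variable before running the test.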

kausthubtm · January 18, 2023, 3:42pm

Every time I run the command, the port number increases by 1, and now I’m getting a new error stating ValueError: cannot bind to any port in [9193, 9199). Do you have a command to free the port? Also, the error mentioned in the previous post is there even when the port numbers match.

The cmake command that I used for the TVM build with Hexagon was:

cmake -DUSE_LLVM="/usr/lib/llvm-14/bin/llvm-config" \
        -DUSE_RPC=ON \
        -DCMAKE_CXX_COMPILER="/usr/bin/c++" \
        -DUSE_HEXAGON_SDK="/local/mnt/workspace/Qualcomm/Hexagon_SDK/5.2.0.0" \
        -DUSE_HEXAGON=ON ..

The cmake command for HexagonLauncherRPC was:

cmake -DANDROID_ABI=arm64-v8a \
        -DANDROID_PLATFORM=android-28 \
        -DUSE_ANDROID_TOOLCHAIN="/local/mnt/workspace/Qualcomm/Hexagon_SDK/5.2.0.0/tools/android-ndk-r19c/build/cmake/android.toolchain.cmake" \
        -DUSE_HEXAGON_ARCH=v68 \
        -DUSE_HEXAGON_SDK="/local/mnt/workspace/Qualcomm/Hexagon_SDK/5.2.0.0" \
        -DUSE_HEXAGON_TOOLCHAIN="/local/mnt/workspace/Qualcomm/Hexagon_SDK/5.2.0.0/tools/HEXAGON_Tools/8.6.05/Tools" \
        -DUSE_OUTPUT_BINARY_DIR="/home/hp/Desktop/tvm/build/hexagon_api_output" ..

I have attached the git commit history. I believe it’s the latest main branch.

I also have a doubt about export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"/usr/lib/llvm-14/lib/clang/14.0.0/lib". The README.md file says
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"Path to 'llvm-clang/lib' sub-directory. Currently we use LLVM-13 in TVM CI." but I couldn’t find the llvm-clang/lib folder, so I used the closest matching one. I’m not sure if it’s the right one.

sanirudh · January 18, 2023, 5:22pm

Every time I run the command, the port number increases by 1, and now I’m getting a new error stating ValueError: cannot bind to any port in [9193, 9199). Do you have a command to free the port? Also, the error mentioned in the previous post is there even when the port numbers match.

The port numbers change because every time you run the command a new tracker is created. All of those could still be in your shell's list of background jobs, so you can find their process IDs with jobs -p and then stop the ones you don’t want with the kill command.
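An alternative to cleaning up stray background jobs by hand is to start the tracker from Python and stop it when the work is done. The sketch below uses only the standard library; background_process and tracker_command are hypothetical helper names, the command line is the one quoted earlier in this thread, and whether the tracker's own child processes exit cleanly on termination is not verified here.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def background_process(cmd):
    """Start cmd in the background and make sure it is stopped on exit."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        proc.terminate()  # polite stop request (SIGTERM on POSIX)
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()  # force stop if it did not exit in time
            proc.wait()


def tracker_command(host: str, port: int):
    """Command line for the tracker, as quoted earlier in the thread."""
    return [sys.executable, "-m", "tvm.exec.rpc_tracker",
            "--host", host, "--port", str(port)]
```

Wrapping the test run in with background_process(tracker_command("0.0.0.0", 9190)): means each run starts one tracker and stops it afterwards, so the port should not creep upward from leftover instances.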

As for your cmake command, I’m not sure what is going wrong, but could you please make sure that you’ve set up TVM properly after building: $PYTHONPATH should include the /path/to/tvm_src/python directory, and $LD_LIBRARY_PATH should include /path/to/build_dir, which should contain libtvm.so.
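A quick way to check this setup is to test whether both variables actually list the expected directories. This is a minimal sketch, assuming the two placeholder paths above are replaced with real ones; missing_tvm_paths is a hypothetical helper and does not check that libtvm.so itself exists.

```python
import os


def missing_tvm_paths(env, tvm_python_dir, build_dir):
    """Return the names of variables that do not list the expected directory."""
    expected = {"PYTHONPATH": tvm_python_dir, "LD_LIBRARY_PATH": build_dir}
    problems = []
    for name, directory in expected.items():
        # Split the variable on the platform's path separator, skipping empties.
        entries = [e for e in env.get(name, "").split(os.pathsep) if e]
        normalized = {os.path.normpath(e) for e in entries}
        if os.path.normpath(directory) not in normalized:
            problems.append(name)
    return problems
```

Calling missing_tvm_paths(os.environ, "/path/to/tvm_src/python", "/path/to/build_dir") returns an empty list when both variables are set as described.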

If llvm target is not available, then that only points to some build/setup issue.

Alternatively you could try the docker image first, which can be used by just running the below commands to download and build TVM with hexagon.

docker/bash.sh ci_hexagon ./tests/scripts/task_config_build_hexagon.sh build
docker/bash.sh ci_hexagon ./tests/scripts/task_build.py --sccache-bucket tvm-sccache-prod
docker/bash.sh ci_hexagon ./tests/scripts/task_build_hexagon_api.sh

The latest set of commands can be found in any of the Jenkins runs that are part of TVM CI.

kausthubtm · January 19, 2023, 3:33am

Thank you! Tried the Docker method. I think this is the expected output. (?)

sanirudh · January 19, 2023, 4:25am

Yes I think that works. There are lots of test files in that test_hexagon directory, and you can use those as an example to write something you would like to test. Let me know if you face any other issues when writing tests.

One last thing: if you wish to move out of Docker, you can use the same commands used in Docker to set up a build outside it. You just need to make sure the dependencies are set up properly; you can find them from the Docker image, and by running the env command inside it to check the important environment variables to reproduce locally.
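To compare the two environments, the output of env inside and outside Docker can be saved and diffed. The sketch below is one possible way to do that with plain Python; parse_env_output and env_differences are hypothetical helpers, the choice of which variable names to compare is left to the caller, and values that span multiple lines are not handled.

```python
def parse_env_output(text):
    """Turn the KEY=VALUE lines printed by `env` into a dict."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key:
            result[key] = value
    return result


def env_differences(docker_env, local_env, names):
    """Map each name in `names` whose value differs to (docker, local)."""
    return {
        name: (docker_env.get(name), local_env.get(name))
        for name in names
        if docker_env.get(name) != local_env.get(name)
    }
```

Variables that are unset in both environments are treated as equal and are left out of the result.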

kausthubtm · January 19, 2023, 5:17am

I’m trying to create a program that executes a simple sigmoid function. I took a few references from test_hexagon/test_sigmoid.py, but I’m not sure how to create the hexagon_session. I tried something like:

rpc_info = {
    "rpc_tracker_host" : '0.0.0.0',
    "rpc_tracker_port" : '9190'
}
launcher = HexagonLauncherSimulator(rpc_info=rpc_info)
launcher.start_server()
with launcher.start_server() as hexagon_session:
     ...

but I get an error :

Commands that I ran :

Open Terminal at /tvm/
./docker/bash.sh ci_hexagon
python3 sigmoid.py

Do I need to set up the RPC tracker before I run the Python file, since I’m already specifying it in the Python file? Even if I run the RPC tracker setup commands, I get the same error.

sigmoid.py :

import sys
sys.path.append('./tests/python/contrib/test_hexagon')

import numpy as np

import tvm
import tvm.testing
from tvm import te
from tvm import tir
from tvm import topi
from tvm.contrib.hexagon.session import Session
from tvm.contrib.hexagon import allocate_hexagon_array
from tvm.contrib.hexagon.build import HexagonLauncherSimulator

from infrastructure import get_hexagon_target


dtype = "float16"


def sigmoid_compute(sigmoid_input):
    return topi.sigmoid(sigmoid_input)


def sigmoid_stir_schedule(sigmoid_input, sigmoid_output):
    sigmoid_func = te.create_prim_func([sigmoid_input, sigmoid_output])
    sch = tir.Schedule(sigmoid_func, debug_mask="all")
    block = sch.get_block("compute")

    (n,) = sch.get_loops(block)
    sch.vectorize(n)
    return sch


rpc_info = {
    "rpc_tracker_host" : '0.0.0.0',
    "rpc_tracker_port" : '9190'
}
launcher = HexagonLauncherSimulator(rpc_info=rpc_info)
launcher.start_server()
with launcher.start_server() as hexagon_session:

    # numpy implementation
    input_np = np.random.uniform(low=-8.0, high=8.0, size=(64,)).astype(dtype)
    ref_output_np = 1 / (1 + np.exp(-input_np))


    input_tensor = te.placeholder((64,), name="input_tensor", dtype=dtype)
    output_tensor = sigmoid_compute(input_tensor)

    tir_s = sigmoid_stir_schedule(input_tensor, output_tensor)
    input_data = allocate_hexagon_array(hexagon_session.device,data=input_np,)
    output_data = allocate_hexagon_array(hexagon_session.device,tensor_shape=ref_output_np.shape,dtype=ref_output_np.dtype,)

    func_name = "sigmoid"
    with tvm.transform.PassContext(opt_level=3):
        runtime_module = tvm.build(tir_s.mod, target=get_hexagon_target("v69"), name=func_name)

    assert "hvx_sigmoid" in runtime_module.get_source("asm")
    assert "vmin" in runtime_module.get_source("asm")
    assert "vmax" in runtime_module.get_source("asm")
    mod = hexagon_session.load_module(runtime_module)

    mod(input_data, output_data)
    output_np = output_data.numpy()

    print(output_np)

sanirudh · January 19, 2023, 7:36am

Yes you need to start the rpc tracker as well for the example to work. I also noticed a couple of small mistakes.

The hexagon_session should be created as with launcher.create_session() as hexagon_session:.

Notice the create_session instead of start_server.

Then, you need to mention the "rpc_tracker_port" as an integer without the quotes like "rpc_tracker_port" : 9190

And finally, add a launcher.stop_server() outside the with scope so that the code stops executing instead of having to kill it manually.
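Putting these three corrections together, the launcher lifecycle could be sketched as follows. The class and method names (HexagonLauncherSimulator, start_server, create_session, stop_server) are the ones used in this thread; make_rpc_info and run_with_session are hypothetical helper names, and the TVM import is done inside the function so that TVM is only needed when it is called. The RPC tracker still has to be running first.

```python
def make_rpc_info(host="0.0.0.0", port=9190):
    """Build the rpc_info dict; the port is coerced to an integer."""
    return {"rpc_tracker_host": host, "rpc_tracker_port": int(port)}


def run_with_session(body, rpc_info):
    """Start the simulator server, run body(session), then stop the server."""
    # Imported here so that defining this helper does not require TVM.
    from tvm.contrib.hexagon.build import HexagonLauncherSimulator

    launcher = HexagonLauncherSimulator(rpc_info=rpc_info)
    launcher.start_server()  # called once, before creating the session
    try:
        # create_session (not start_server) provides the session object.
        with launcher.create_session() as hexagon_session:
            return body(hexagon_session)
    finally:
        # Outside the with scope, so the script exits instead of hanging.
        launcher.stop_server()
```

The body of the original with block can be moved into a function that takes hexagon_session and passed as body, for example run_with_session(my_sigmoid_test, make_rpc_info()), where my_sigmoid_test is a placeholder name.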

kausthubtm · January 19, 2023, 8:56am

Thank you very much! I’ll work on more such examples and if there are any doubts I’ll start a new thread.

MrJungle1 · June 17, 2024, 5:39am

@sanirudh hello! Seeing so many replies from you, I think you must be very familiar with how to use MetaSchedule to tune Hexagon code. I am now facing the same problem as the one described in “Metaschedule with hexagon stuck on N/A latency”. Do you know what is going on?

sanirudh · June 20, 2024, 3:28pm

Hi @MrJungle1,

Sorry for the late reply. It’s been a while since I used MetaSchedule on the Relay flow, as we’ve mostly been trying to move to Relax for our flow. Anyway, I’ll try out the example and let you know if I find the problem.