I found that these two lines take about 6% CPU on an Android ARM device for a 3x112x112 input. That seems a little too high for a mobile device.
NDArray inputNdArray = NDArray.empty(new long[]{1, IMG_CHANNEL, MODEL_INPUT_SIZE, MODEL_INPUT_SIZE}, new TVMType("float32"));
inputNdArray.copyFrom(imgRgbTranValues);
I found that it ultimately calls memcpy to copy the image. Is there any way to reduce the high CPU usage of these two lines?
void VTAMemCopyFromHost(void* dst, const void* src, size_t size) {
  memcpy(dst, src, size);
}
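
For scale, here is a rough baseline sketch in plain Java. It does not use or measure the TVM API. It only computes how many bytes a 1x3x112x112 float32 input holds and times an ordinary array copy of that size into a buffer that is allocated once and reused. The class name InputCopySketch and its methods are hypothetical, and the constants are assumed from the shape quoted above.

```java
public class InputCopySketch {
    // Assumed from the shape in the post: 3 channels, 112x112, float32.
    static final int IMG_CHANNEL = 3;
    static final int MODEL_INPUT_SIZE = 112;

    /** Number of float elements in one 1 x C x H x W input. */
    static int elementCount() {
        return IMG_CHANNEL * MODEL_INPUT_SIZE * MODEL_INPUT_SIZE;
    }

    /** Number of bytes that one input copy has to move. */
    static long bytesPerFrame() {
        return (long) elementCount() * Float.BYTES;
    }

    /** Copies src into a caller-owned buffer, so nothing is allocated per frame. */
    static float[] copyInto(float[] src, float[] reusable) {
        System.arraycopy(src, 0, reusable, 0, src.length);
        return reusable;
    }

    public static void main(String[] args) {
        float[] frame = new float[elementCount()];
        float[] reusable = new float[elementCount()]; // allocated once, outside the loop

        int iterations = 1000;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            frame[0] = i; // change the source so every pass copies fresh data
            copyInto(frame, reusable);
        }
        long elapsedNs = System.nanoTime() - start;

        System.out.println("bytes per frame: " + bytesPerFrame());
        System.out.println("average copy time (ns): " + elapsedNs / iterations);
    }
}
```

One input is 37,632 floats, or 150,528 bytes. If the baseline copy turns out much cheaper than the cost seen in the app, the overhead may come from something other than the memcpy itself, for example allocating a new NDArray for every frame. That is only a guess and should be confirmed with a profiler. If it holds, calling NDArray.empty once and only calling copyFrom per frame may be worth trying.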