Supporting Faster RCNN and Mask RCNN models

masahi · April 1, 2019, 4:29pm

Hi, I’m interested in running Faster RCNN and Mask RCNN models with TVM.

Thanks to @vinx13, we now have ROIPooling, ROIAlign, Proposal, and box-related ops. With @Laurawly’s PR we will have argsort and AdaptiveAvgPooling. It seems we have all the pieces needed to run Faster RCNN and Mask RCNN models from GluonCV. The only missing piece I could find is the Relay frontend for the gather_nd op, which is easy to add.

Are there any other MXNet ops that are missing? @vinx13 @Laurawly @kevinthesun
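
For readers unfamiliar with the gather_nd op mentioned above, here is a minimal pure-Python sketch of its semantics as described in the MXNet documentation: the first axis of the index array selects which data axis is being indexed, and each column of indices picks one element (or sub-array). This is only an illustration on nested lists, restricted to a 2-D index array; it is not the TVM or MXNet implementation.

```python
def gather_nd(data, indices):
    """Illustrative gather_nd for nested lists and a 2-D index array.

    indices has shape (M, N): M leading axes of data are indexed,
    and N items are gathered.
    """
    num_dims = len(indices)       # M
    num_points = len(indices[0])  # N
    out = []
    for j in range(num_points):
        item = data
        # Walk down one axis of data per row of indices.
        for k in range(num_dims):
            item = item[indices[k][j]]
        out.append(item)
    return out


data = [[0, 1], [2, 3]]
indices = [[1, 1, 0], [0, 1, 0]]
# Gathers data[1][0], data[1][1], data[0][0]
result = gather_nd(data, indices)
print(result)
```

When fewer axes are indexed than data has, each gathered item is a sub-array rather than a scalar.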

vinx13 · April 1, 2019, 5:20am

I have run some Faster RCNN variants, and all the ops are supported. The Proposal op may have some performance issues with sorting.

Laurawly · April 5, 2019, 6:26pm

We have all the ops supported. GluonCV models recently started using deconvolution for RCNN models, which causes performance issues since we don’t have a well-optimized deconv schedule in TVM.

kbyran · April 8, 2019, 2:04pm

Odd-even sort does not seem to perform well in the Proposal op. Possible reasons:

synchronization is expensive on every loop iteration;
shared memory is not used in some cases.
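
To make the synchronization cost concrete, here is a minimal sequential sketch of odd-even transposition sort in Python. It is not the TVM kernel; it only shows the structure of the algorithm. On a GPU, the inner loop would be spread across threads, and every phase boundary would need a barrier, so sorting n elements implies on the order of n synchronizations.

```python
def odd_even_sort(values):
    """Odd-even transposition sort; returns (sorted list, number of phases)."""
    a = list(values)
    n = len(a)
    phases = 0
    for phase in range(n):
        # Even phases compare pairs (0,1), (2,3), ...; odd phases (1,2), (3,4), ...
        start = phase % 2
        # In a parallel version, each pair below is handled by its own thread.
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
        # A parallel version needs a barrier here before the next phase starts.
        phases += 1
    return a, phases


sorted_scores, phases = odd_even_sort([0.3, 0.9, 0.1, 0.7, 0.5])
print(sorted_scores, phases)
```

The number of phases grows linearly with the input length, which is why the per-phase barrier dominates when the Proposal op sorts many candidate boxes.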

vinx13 · April 9, 2019, 5:33am

Agreed. Replacing it with a function from an external CUDA library might be a solution.

kbyran · April 12, 2019, 8:00pm

Is it possible to use Thrust in TVM?

vinx13 · April 15, 2019, 2:19am

Yes, you can wrap it in a packed function.

Jiungyao · May 2, 2019, 11:16am

Hi, what is the target for this Faster RCNN discussion? GPU? Does this discussion also apply to CPU or LLVM? Thanks

masahi · May 2, 2019, 12:38pm

I think we have GPU in mind here. But extending support to CPU is not difficult.

Jiungyao · May 2, 2019, 12:52pm

Great, thank you for your prompt reply.

Jiungyao · May 2, 2019, 1:38pm

Yes, you are right. We also successfully compiled and ran it on CPU. Thank you for the tips.

tico · July 15, 2019, 12:14pm

Hi,

I would like to evaluate a Faster RCNN model on TVM. Is there any existing example doing this? Could someone share it here?

Thanks

javier · July 26, 2019, 9:01am

Hi,
I’m replying to this thread as I think the issues are closely related. I’m interested in running maskrcnn_benchmark in TVM (specifically e2e_mask_rcnn_X-152-32x8d-FPN-IN5k_1.44x_caffe2). I’ve tried pytorch_tvm (using torch.jit.trace()), and also converting to ONNX first (and then using relay.frontend.from_onnx()), without success (missing operators in the latter case). I’ve also tried converting from PyTorch to MXNet, but no luck either. Is there any plan to support the missing operators through ONNX or pytorch_tvm? (btw my target would be CPU). Thx in advance.

tico · July 26, 2019, 9:21am

At least for TensorFlow there is some ongoing work to support NonMaxSuppression, which is a key operator for this kind of model. A PR is expected soon from @yongwww.
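
As background on why NonMaxSuppression matters for detection models, here is a minimal pure-Python sketch of greedy NMS. It is only an illustration of the general technique, not the TensorFlow or TVM operator; the box format (x1, y1, x2, y2) and the default threshold of 0.5 are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

Note that the number of kept boxes depends on the input values rather than only on the input shape, which is part of what makes this operator harder to support in a compiler than fixed-shape ops.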

javier · July 26, 2019, 10:12am

Hi @tico, thx for the prompt reply! I’ll keep an eye on the mentioned PR, I hope it is uploaded soon

yongwww · July 26, 2019, 9:17pm

We are working on enabling support for Mask RCNN, Fast RCNN, Faster RCNN, SSD, etc. in TVM. Hopefully all of these models will be supported before the end of this year. Contributions are welcome!

javier · July 29, 2019, 6:04am

Hi @yongwww,

Thanks for your reply! I’m glad to know that we will have a working implementation in the coming months. About contributing: I understand that there is a group already working on this specific issue. Could you please give some direction on how to contact them? Thx again for your support.

lklcf · November 4, 2019, 9:59am

Are the ops supported on CPU?

yongwww · November 12, 2019, 9:12am

@tico @javier I have sent out the PR for dynamic NMS for TensorFlow - https://github.com/apache/incubator-tvm/pull/4312

tico · November 18, 2019, 7:05am

Hi @yongwww, awesome! Thanks for the effort! Looking forward to giving it a try once it is merged into master!