Caffe Frontend
Background & Motivation
Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is popular for its simplicity, good scalability, and speed. According to RISELab's statistics on papers collected on arxiv.org, Caffe ranks among the top four deep learning frameworks, which suggests that its user base is still large (see the blog post for details). In addition, our company's market research shows that demand for Caffe in production environments remains strong, and many Caffe-based models still need to be deployed. For example, existing deployment frameworks such as MNN, NCNN, and MACE directly support Caffe models.
TVM currently supports only Caffe2, which differs considerably from Caffe. There are two ways to deploy a Caffe model in TVM today: convert it to a TensorFlow or PyTorch model, or convert it to ONNX and then to Relay IR. The two approaches are essentially the same: both reach Relay IR indirectly through a third-party conversion. The problem is that some ops fail during conversion, and even when conversion succeeds, the converted model's results may differ from the original's.
Given this situation, we decided to open-source our Caffe frontend code, hoping to broaden TVM's use cases.
Implementation Approach
The Caffe frontend imports a model in the following steps:
- Read Model: The model graph and its parameters are read through Caffe's protobuf API.
- Rebuild Graph: Traverse the graph, replace the top of each in-place layer with the name of that layer, and update all layers that reference it.
- Model Conversion: Read the parameters of each layer and convert them into the corresponding TVM ops and parameters.
- Layer Fusion: Fuse BatchNorm and Scale layers.
- Convert to Relay IR: The result mainly consists of the module, the params, and the real names of the output layers.
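The Rebuild Graph and Layer Fusion steps can be illustrated with a small sketch. This is not the frontend's actual code: layers are modeled as plain dictionaries rather than protobuf messages, the function names rebuild_inplace and fuse_bn_scale are hypothetical, and the fusion assumes the BatchNorm mean and variance have already been normalized by Caffe's moving-average factor.

```python
import math


def rebuild_inplace(layers):
    """Give every in-place layer a unique top named after the layer itself.

    `layers` is a list of dicts with "name", "type", "bottom" and "top" keys,
    listed in execution order (a simplified stand-in for Caffe's LayerParameter).
    """
    alias = {}  # original blob name -> name of the layer that last overwrote it
    rebuilt = []
    for layer in layers:
        # Redirect inputs to the most recent producer of each blob.
        bottoms = [alias.get(b, b) for b in layer["bottom"]]
        tops = []
        for top in layer["top"]:
            if top in layer["bottom"]:
                # In-place layer: rename its output to the layer name.
                alias[top] = layer["name"]
                tops.append(layer["name"])
            else:
                # A fresh blob with this name shadows any earlier alias.
                alias.pop(top, None)
                tops.append(top)
        rebuilt.append({"name": layer["name"], "type": layer["type"],
                        "bottom": bottoms, "top": tops})
    return rebuilt


def fuse_bn_scale(mean, var, gamma, beta, eps=1e-5):
    """Fold BatchNorm followed by Scale into one per-channel affine y = a*x + b.

    BatchNorm computes (x - mean) / sqrt(var + eps); Scale computes
    gamma * x + beta. All arguments are per-channel lists of floats.
    """
    a = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    b = [bt - ai * m for bt, ai, m in zip(beta, a, mean)]
    return a, b


net = [
    {"name": "conv1", "type": "Convolution", "bottom": ["data"], "top": ["conv1"]},
    {"name": "relu1", "type": "ReLU", "bottom": ["conv1"], "top": ["conv1"]},
    {"name": "pool1", "type": "Pooling", "bottom": ["conv1"], "top": ["pool1"]},
]
# After rebuilding, relu1 produces "relu1" and pool1 consumes "relu1".
rebuilt_net = rebuild_inplace(net)
```

Renaming in-place tops gives every tensor in the graph a unique producer, which a functional IR such as Relay needs, and folding the two affine layers into one removes an elementwise pass at inference time.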
Finally, we can import the Caffe model as follows:
from google.protobuf import text_format
from tvm import relay
from tvm.relay.frontend import caffe_pb2 as pb

# proto_file: path to the text network definition (.prototxt)
# blob_file: path to the binary trained weights (.caffemodel)
init_net = pb.NetParameter()
predict_net = pb.NetParameter()

# load model
with open(proto_file, 'r') as f:
    text_format.Merge(f.read(), predict_net)

# load blob
with open(blob_file, 'rb') as f:
    init_net.ParseFromString(f.read())

# input name -> shape and dtype
shape_dict = {'data': [1, 3, 224, 224]}
dtype_dict = {'data': 'float32'}

mod, params, model_outputs = relay.frontend.from_caffe(init_net, predict_net, shape_dict, dtype_dict)
Work Done
The work completed so far is listed below:
1. List of supported Ops
- BatchNorm
- Concat
- Convolution
- Crop
- Deconvolution
- Dropout
- Eltwise
- Flatten
- InnerProduct
- Input
- LRN
- Normalize
- Permute
- Pooling
- PReLU
- PriorBox
- proposal
- Python
- ReLU
- Reshape
- Resize
- ROIPooling
- Scale
- Sigmoid
- Slice
- Softmax
- TanH
- Upsample
2. List of supported complete models
- Alexnet
- Resnet50
- Mobilenetv1
- Mobilenetv2
- Inceptionv1
- Inceptionv3
- Inceptionv4
- Vgg16
- Squeezenetv1
- SSDMobilenetv1
- SSDMobilenetv2
- YOLOv3
- ENet
3. Caffe frontend test cases
4. Caffe frontend tutorial
TODO
- [ ] Support more ops and more complete models.
Following the implementation approach above and the frontend framework we built, adding a new op takes two steps: first, add a method to the OperatorConverter class that extracts the layer's parameters and implements the conversion to TVM ops; second, register the method in convert_map.
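The two-step pattern for adding an op might look like the sketch below. It is a standalone illustration, not the frontend's real class: SketchConverter is a hypothetical stand-in, layers are plain dictionaries instead of protobuf messages, and each converter returns a list of (op name, argument) steps where the real frontend would build Relay expressions. Caffe's Power layer, which computes (shift + scale * x) ^ power, serves as the example of a newly added op.

```python
class SketchConverter:
    """Hypothetical stand-in showing the converter/registration pattern."""

    def __init__(self):
        # Step 2: register each converter method under its Caffe layer type.
        self.convert_map = {
            "ReLU": self.convert_relu,
            "Power": self.convert_power,  # the newly added op
        }

    def convert_relu(self, layer):
        # ReLU needs no parameters in this simplified sketch.
        return [("relu", None)]

    def convert_power(self, layer):
        # Step 1: extract the layer parameters (Caffe defaults: 1, 1, 0) ...
        param = layer.get("power_param", {})
        power = param.get("power", 1.0)
        scale = param.get("scale", 1.0)
        shift = param.get("shift", 0.0)
        # ... and describe the conversion as a sequence of elementwise ops.
        return [("multiply", scale), ("add", shift), ("power", power)]

    def convert(self, layer):
        # Dispatch on the layer type; unknown types are reported explicitly.
        op_type = layer["type"]
        if op_type not in self.convert_map:
            raise NotImplementedError("Unsupported layer type: " + op_type)
        return self.convert_map[op_type](layer)


converter = SketchConverter()
steps = converter.convert({
    "name": "pow1",
    "type": "Power",
    "power_param": {"power": 2.0, "scale": 0.5, "shift": 1.0},
})
```

Keeping the dispatch in a single map means supporting a new layer type touches only the new method and one dictionary entry.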