It’s interesting to see this — there are lots of common requirements between machine learning and HPC:
There is significant overlap between requirements for deep learning and HPC applications: (1) abundant parallelism; (2) large data sets demanding optimizations to manage data movement; (3) a diversity of target architectures; and, (4) need for scalability.
However, HPC applications tend to be more complex. So the long-term question is: will TVM head toward HPC? If so, what might we need in the TVM stack?
There are several factors that already make TVM a good fit, such as the ability to quickly support diverse target architectures. The other elements mainly depend on the application:
For typical ML, the application of interest usually runs a single kernel on a single device and distributes the workload across the nodes. This means quite a few optimizations can be done at the node level.
For scientific computing, or certain applications that require networking and splitting a single op across multiple devices, there is a natural extension of the schedule layer: treat the network as a separate memory hierarchy, and add efficient networking support to the runtime. This is quite interesting and fits into TVM’s scope, but it is something we have not done yet.
Assuming that HPC here refers to scientific computing: it covers a very broad spectrum and usually relies on existing compute libraries (not necessarily high-performance ones) for execution. TVM can definitely play a role here, but we need people from this area to drive it.
BTW, I think the target platforms for HPC are much more limited.
@tqchen @yidawang thanks for the replies.
One problem we have seen for HPC is that some complex operators are difficult to implement. For example, Cholesky decomposition is hard to describe through `compute`; we implemented it with hybrid script, but how to schedule it is a big challenge. The loop structure of this op is not the regular kind TVM can handle for now: it’s triangular.
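To make the triangular-loop issue concrete, here is a plain NumPy-style sketch of the Cholesky kernel (illustrative only, not the hybrid script we wrote). The inner loop bound depends on the outer index, so the iteration space is a triangle rather than a rectangle, which is what existing schedule primitives assume:

```python
import numpy as np

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix.

    The bound of the j loop depends on i (j <= i), and the k loop depends
    on j, so the iteration space is triangular -- not the rectangular
    pattern that TVM's split/tile/vectorize primitives currently expect.
    """
    n = a.shape[0]
    L = np.zeros_like(a, dtype=float)
    for i in range(n):
        for j in range(i + 1):  # triangular bound: depends on outer index i
            s = sum(L[i, k] * L[j, k] for k in range(j))
            if i == j:
                L[i, j] = np.sqrt(a[i, i] - s)
            else:
                L[i, j] = (a[i, j] - s) / L[j, j]
    return L
```

Tiling or vectorizing such loops needs either padding to a rectangle (wasting half the work) or genuinely non-rectangular iteration-space support in the scheduler.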
@yidawang About the targets, I think the core issue is how to do vectorization and parallelization. We may need to bring in more principled compiler techniques, together with domain-specific abstractions in the language.