[DISCUSS] TVM Core Strategy for Emerging Needs

Great to see this topic brought up again and people converging on merging Unity into the main branch. As an active ex-contributor, I still follow the community's updates closely and have always strongly supported making Unity part of the main branch.

The recent trends in LLM inference and deployment definitely make Unity even more appealing than before. In particular, there are already quite a few data points showing good performance compared to other open source projects. I’d like to see this happen in the community ASAP (preferably landed as T1), and I volunteer to help review PRs if that would make the process smoother.


Thanks everyone for the input so far. I would also encourage us to look at the strategy and share thoughts on how it can help accelerate our future development. The original post mainly proposes that, as a community, we adopt a different development strategy moving forward:

  • Before: use the build-centric approach for everything, which offers no solution for emerging needs.
  • Proposed: take a per-sub-area approach: keep build-centric for some existing use cases while enabling an on-demand shift to abstraction-centric, use the abstraction-centric approach for emerging needs, and accelerate new solutions (LLM) with abstraction-centric.

This is the key change that we as a community can make, and it goes beyond a simple set of features. It means different modules/areas and their respective contributors can set their own pace and technical approaches while keeping things scoped.

The “before” approach has not empowered the members who would love to amplify thrusts in foundational models and emerging needs (accounting for 90% per a recent poll). These members have made great strides and done the groundwork to preserve the TVM community's chance in the LLM/genAI space, plus extra groundwork to maintain the main modules and sync the branches. That is a lot of service to this community.

The “proposed” approach helps empower concrete groundwork to amplify foundational model thrusts. It offers a concrete solution to emerging needs and future growth, with complexity management and scope isolation. The tradeoff is that we need to be open-minded about different approaches at the meta level and look at concrete cases collectively. Let's give the choice to the community and the groundwork.


Given there is quite some interest in this direction, I just want to chime in about the timeline. First, it would be great for the community to review and collectively settle on a strategy (i.e. the proposed approach or another); this is the primary part of the initial post.

As for the transition timeline, we will leave at least one to two months so community members can ask questions and dive into the new modules if needed; these are covered in the unity category over the past year.

Then the groundwork will hopefully happen as we continue to develop with a strategy for foundational models.


Will OpenAI Triton’s approach be integrated?

  1. THE TRITON LANGUAGE | PHILIPPE TILLET - YouTube
  2. Some thoughts on understanding OpenAI Triton - Zhihu (zhihu.com)

We might be able to leverage library dispatch support to enable Triton and others. We will also incorporate some of Triton’s insights to improve the tensor-level abstraction.


Sorry for the late participation. ARM China, as a commercial user of TVM, has done a lot of work to meet customers’ requirements, just as tianqi pointed out.

1. The build-centric approach can’t work for us

Early on, we tried our best to add customized passes or logic to Relay’s build-centric flow, but we found we had to spend a lot of time fixing failing official test cases. These fixes were all very specific to our business scenario rather than general solutions, so they shouldn’t be contributed to the community. Finally, as business requirements grew, we chose to use our own build flow, which lets us control the passes flexibly.
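To illustrate what "controlling the passes flexibly" can look like, here is a minimal sketch in plain Python. It does not use TVM's API; `ToyModule`, `fuse_conv_relu`, `drop_identity` and `run_pipeline` are hypothetical names made up for this example, and the "module" is only a list of op names per function. The point is just that a custom flow is an explicit, ordered list of passes the downstream user owns, and that the order can change the result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ToyModule:
    """Hypothetical stand-in for an IR module: function name -> op names."""
    functions: Dict[str, List[str]] = field(default_factory=dict)


# A pass is any function that maps a module to a new module.
Pass = Callable[[ToyModule], ToyModule]


def drop_identity(mod: ToyModule) -> ToyModule:
    """Hypothetical cleanup pass: remove no-op 'identity' ops."""
    return ToyModule(
        {name: [op for op in ops if op != "identity"]
         for name, ops in mod.functions.items()}
    )


def fuse_conv_relu(mod: ToyModule) -> ToyModule:
    """Hypothetical vendor-specific pass: fuse adjacent conv2d + relu."""
    out: Dict[str, List[str]] = {}
    for name, ops in mod.functions.items():
        fused: List[str] = []
        i = 0
        while i < len(ops):
            if ops[i] == "conv2d" and i + 1 < len(ops) and ops[i + 1] == "relu":
                fused.append("conv2d_relu")
                i += 2
            else:
                fused.append(ops[i])
                i += 1
        out[name] = fused
    return ToyModule(out)


def run_pipeline(mod: ToyModule, passes: List[Pass]) -> ToyModule:
    """Apply the passes in the order the caller chose."""
    for p in passes:
        mod = p(mod)
    return mod


mod = ToyModule({"main": ["conv2d", "identity", "relu", "dense"]})

# Our own flow: clean up first so the fusion pattern becomes visible.
cleanup_then_fuse = run_pipeline(mod, [drop_identity, fuse_conv_relu])
# Same passes, fixed in the other order: the fusion opportunity is missed.
fuse_then_cleanup = run_pipeline(mod, [fuse_conv_relu, drop_identity])

print(cleanup_then_fuse.functions["main"])
print(fuse_then_cleanup.functions["main"])
```

With a fixed pipeline, the second ordering is what we would be stuck with; owning the list of passes lets us pick the first one without patching the shared flow.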

2. TVM Script (TensorIR) is the key to our next-step DSL work

Besides the Relay/Relax graph-level work, our next step will focus on a DSL; in other words, we need a higher-level and more abstract programming method than the traditional OpenCL C way. The key to achieving this is TVM script, because it resolves the expressive power issue of TE. So we need and look forward to the latest updates to TVM script and related work.

Many TVM users may know that TVM script and Relax are the two key points of the Unity branch. Some community contributors such as @Hzfengsy and @junrushao spend a lot of extra time keeping the TVM script work synced between main and Unity; sending the same changes as two different PRs to two branches consumes a lot of the contributors’ precious energy and time. So we agree with using Unity as the main branch.

3. The work of Unity is the key to everyone’s next-step success.

Everyone knows the work of Unity is focused on LLM/AIGC. If a company can’t keep up with this technical revolution, we believe it can’t succeed in the future. Using Unity as the main branch will make it easier for downstream organizations in the TVM community like us to use the newest work such as dynamic shape, Relax and so on.

4. The transition will be smooth for Relay users.

As the previous discussion pointed out, Unity will still keep all of the Relay work, so downstream users like us can transition smoothly. This is very important for our customers, so many thanks to the community contributors for maintaining this.


I’ve passed all mainline tests in the unity branch. I think it’s a good time to consider migration :)


Thanks everyone for chiming in, and @Hzfengsy for the effort to make sure all existing modules are supported. Now that we are in the new year and genAI is becoming ever more important, it is a good time to do the proposed transition. We will open a vote in the coming week.


Formal voting thread: [VOTE] Transition Main to Unity · Issue #16368 · apache/tvm · GitHub

Thanks to everyone, unity is now main. Here is a follow-up post on bringing the documentation in line with the core strategies discussed in this post: [DISCUSS] TVM Unity Transition Docs Refactor