Introducing MLC-LLM: Running large language models on any device

Hello, community,

We are excited to share a project we released recently: MLC-LLM, a universal solution that allows any language model to be deployed natively on a diverse set of hardware backends and native applications, plus a productive framework for everyone to further optimize model performance for their own use cases. Everything runs locally with no server support and is accelerated by the local GPU on your phone or laptop.

We have a runnable demo app for iPhone that you can try out, and you are welcome to check out our GitHub repo for more details.

This project is only possible thanks to the open-source ecosystems whose shoulders we stand on. We want to thank the Apache TVM community and the developers of the TVM Unity effort.