Introducing and modernizing the FFI system

As we capture several years of lessons, it is important for us to summarize them and bring solid modules for our future development. We are happy to share that we have reached a milestone in modernizing the FFI foundation of the project: we are introducing a new minimal and lightweight module, tvm ffi, based on our lessons from the past few years. It implements a modern version of the Unified Packed and Object RFC, which unifies the packed function call and object systems.

Summary of the change:

  • A dedicated clean Any/AnyView that can store strong and weak references of items
  • A Function (previously PackedFunc) system built on top of Any/AnyView
  • A minimal C API that backs the overall calls. We are stabilizing the API with the goal of bringing clean, stable FFI conventions for both compiled and registered code
  • A rewrite of the core Python binding and generated code based on the module
  • Updates of existing code and test cases to the new module
  • Latest dlpack support
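To illustrate the idea of a function system built on top of an Any value, here is a minimal, self-contained sketch. It is not the tvm ffi implementation: the namespace sketch, the std::variant-based Any, and the names PackedArgs, FromTyped, CallUnpacked and ExampleAdd are all hypothetical and chosen only for this example. It shows how a typed callable can be wrapped behind one type-erased signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

namespace sketch {  // hypothetical namespace, not part of tvm ffi

// Toy stand-in for Any: one slot that holds a POD value or a string.
using Any = std::variant<std::monostate, int64_t, double, std::string>;

// Every packed function shares this one type-erased signature.
using PackedArgs = std::vector<Any>;
using Function = std::function<Any(const PackedArgs&)>;

// Unpack args[I] as the I-th parameter type and call the typed function.
// std::get throws std::bad_variant_access when a stored type does not match.
template <typename R, typename... Args, std::size_t... I>
Any CallUnpacked(const std::function<R(Args...)>& f, const PackedArgs& args,
                 std::index_sequence<I...>) {
  return Any(f(std::get<Args>(args[I])...));
}

// Wrap a typed callable into the type-erased Function signature.
template <typename R, typename... Args>
Function FromTyped(std::function<R(Args...)> f) {
  return [f](const PackedArgs& args) -> Any {
    if (args.size() != sizeof...(Args)) {
      throw std::invalid_argument("argument count mismatch");
    }
    return CallUnpacked(f, args, std::index_sequence_for<Args...>{});
  };
}

// Usage: wrap a typed add, then call it through the erased signature.
inline int64_t ExampleAdd() {
  Function add = FromTyped(std::function<int64_t(int64_t, int64_t)>(
      [](int64_t a, int64_t b) { return a + b; }));
  Any result = add({Any(int64_t{1}), Any(int64_t{2})});
  assert(std::holds_alternative<int64_t>(result));
  return std::get<int64_t>(result);
}

}  // namespace sketch
```

The point of the sketch is only the shape of the convention: the caller and the callee agree on a single signature over Any values, and typed code is recovered at the boundary.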

The new module brings many benefits thanks to the cleaner design, to name a few:

  • Any can support both POD types (e.g. int) and object types.
  • Containers (e.g. Array) can now also contain Any values, e.g. Array<int> is now supported with no need for boxed types
  • Error handling is now object-based, allowing cleaner tracebacks across languages
  • Map now preserves insertion order
  • A path toward an isolated, stable, minimal core ABI/API foundation module
  • Type traits based design that cleanly defines how values interact with the Any system
  • Automatic conversion of different types based on traits if needed
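As a rough illustration of what a traits-based design with automatic conversion can look like, the sketch below uses a per-type trait to decide which stored values a type accepts. Everything here is an assumption made for the example (the sketch namespace, the std::variant-based Any, TypeTraits::TryConvert, ConvertAll, ExampleWiden); the actual traits in the new module may be shaped differently.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

namespace sketch {  // hypothetical namespace, not part of tvm ffi

// Toy stand-in for Any: one slot that holds a POD value or a string.
using Any = std::variant<std::monostate, int64_t, double, std::string>;

// Primary trait: T only accepts a value that is stored as exactly T.
template <typename T>
struct TypeTraits {
  static std::optional<T> TryConvert(const Any& v) {
    if (const T* p = std::get_if<T>(&v)) return *p;
    return std::nullopt;
  }
};

// Specialization: double also accepts a stored int64_t (automatic conversion).
template <>
struct TypeTraits<double> {
  static std::optional<double> TryConvert(const Any& v) {
    if (const double* p = std::get_if<double>(&v)) return *p;
    if (const int64_t* p = std::get_if<int64_t>(&v)) {
      return static_cast<double>(*p);
    }
    return std::nullopt;
  }
};

// Convert a container of Any values to T, failing if any element does not fit.
template <typename T>
std::optional<std::vector<T>> ConvertAll(const std::vector<Any>& items) {
  std::vector<T> out;
  out.reserve(items.size());
  for (const Any& item : items) {
    std::optional<T> v = TypeTraits<T>::TryConvert(item);
    if (!v) return std::nullopt;
    out.push_back(*v);
  }
  return out;
}

// Usage: a mixed int/double container converts to doubles via the trait.
inline std::vector<double> ExampleWiden() {
  std::vector<Any> items = {Any(int64_t{1}), Any(2.5)};
  std::optional<std::vector<double>> widened = ConvertAll<double>(items);
  assert(widened.has_value());
  return *widened;
}

}  // namespace sketch
```

In this sketch the conversion policy lives entirely in the trait specialization, so adding a new accepted source type does not touch the container or the Any type itself.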

Because the FFI upgrade is at the heart of the project, the change touches every component of the system. Importantly, this is an upgrade of the ABI, so the change is not backward compatible: code compiled under the old FFI won’t work under the new one. We do provide example ABI translation functions (e.g. LegacyTVMArgValueToFFIAny) for compatibility. The PR tries to leave files in their old places while creating redirections. The goal is to have the first milestone landed and the infrastructure in place, so we can do further refactors to complete features and clean up legacy code as trackable PRs. As of now, the Python binding and compiled code are under the new convention, while RPC and some other bindings still rely on legacy ABI translation. We will work on upgrades in the coming PRs, including areas such as reflection, phasing out legacy redirections etc.

Please check out the PR here, and feel free to bring up questions. We will also update this post with some of the latest updates on the FFI.

The changes started from an initial code module implementation in collaboration with @junrushao one year ago.

Then it independently evolved, driven by the needs of a full upgrade in tree, while keeping things minimal and lightweight.

Upgrade Note

Some upgrade notes for dependent code:

  • For general containers like Map<ObjectRef, ObjectRef>, consider using Map<Any, Any> instead
  • Use ffi::Function in place of PackedFunc
  • Any now requires an explicit cast via cast<T>() or .as<T>() (which returns an optional) for better type safety
    • In most cases one can insert args[i].cast<T>() to explicitly cast to T, or use the typed version
  • Check out the test cases for some example usages
  • For places where some form of boxing was needed, the attributes mostly become POD types, e.g. we now use int64_t and bool for int and bool attributes
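The difference between an explicit cast and an optional-returning accessor can be sketched as follows. This is a toy class written for this post, not the real Any: the sketch namespace, the constructors, the error type thrown, and the AddArgs helper are assumptions, and the real cast<T>() and as<T>() may differ in which conversions they allow.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

namespace sketch {  // hypothetical namespace, not part of tvm ffi

// Toy stand-in for Any with the two access styles from the notes above.
class Any {
 public:
  Any() = default;
  // Callers must pass int64_t{...} rather than a plain int literal,
  // otherwise the int64_t and double overloads are ambiguous.
  Any(int64_t v) : data_(v) {}
  Any(double v) : data_(v) {}
  Any(std::string v) : data_(std::move(v)) {}

  // Checked access: returns nullopt when the stored type is not T.
  template <typename T>
  std::optional<T> as() const {
    if (const T* p = std::get_if<T>(&data_)) return *p;
    return std::nullopt;
  }

  // Explicit cast: throws when the stored type is not T.
  template <typename T>
  T cast() const {
    std::optional<T> v = as<T>();
    if (!v) throw std::runtime_error("Any: stored type does not match");
    return *v;
  }

 private:
  std::variant<std::monostate, int64_t, double, std::string> data_;
};

// Example callee that reads its arguments through explicit casts,
// in the spirit of inserting args[i].cast<T>().
inline int64_t AddArgs(const std::vector<Any>& args) {
  assert(args.size() == 2);
  return args[0].cast<int64_t>() + args[1].cast<int64_t>();
}

}  // namespace sketch
```

In this sketch, as<T>() suits call sites that want to branch on the stored type, while cast<T>() suits call sites where a mismatch is a programming error that should surface immediately.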

There will also be follow-up PRs to remove some of the legacy files and redirections. The first commit contains some of the redirections (e.g. using PackedFunc=ffi::Function), so it might be useful to first rebase to the first PR commit, then add follow-up PRs.

As a follow-up on the DLPack-based enhancement, we can now take in torch tensors as arguments: https://github.com/apache/tvm/pull/17927