VTA for a new FPGA

Dear community members:

I have read through the VTA paper (arXiv:1807.04188v3) and would like to ask a few questions about customizing the hardware for a new FPGA.

----- quote -----
p.1: a microcode-ISA which implements a wide variety of operators with
single-cycle tensor-tensor operations.

p.5: Architectural knobs include
GEMM hardware intrinsic shape, data types, number of parallel
arithmetic units in the tensor ALU, ALU operations, BRAM
distribution between on-chip memories.

----- questions -----

  1. Can I change the GEMM / ALU cores to be multi-cycle, so that they can be pipelined internally to increase Fmax?

  2. The paper says the number of parallel arithmetic units in the tensor ALU can be customized. What about the number of parallel GEMM cores? Suppose I have a very large FPGA with one million 6-input LUTs and thousands of hardware MAC units. How can I maximize the utilization of these resources by implementing multiple parallel GEMM cores?

  3. How can I change the source code to use a different type of off-chip memory, e.g. DDR3 -> GDDR6?
    How do I specify the DDR controller and the memory characteristics, such as bus throughput and latency?

  4. Should I also change the tsim code to reflect the changes for the new FPGA, so that the tsim simulation accurately represents the actual hardware performance and function?
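
To make question 2 more concrete, here is a rough back-of-the-envelope sketch of how I am estimating MAC utilization. It assumes a GEMM intrinsic of shape (batch, block_in, block_out) performs batch * block_in * block_out multiply-accumulates per cycle. The function names, the 1 x 16 x 16 shape, and the 3000-MAC budget are only illustrative assumptions of mine, not taken from the VTA source or the paper.

```python
def gemm_macs_per_cycle(batch: int, block_in: int, block_out: int) -> int:
    """MACs used by one GEMM core per cycle, assuming one MAC unit per
    (batch, block_in, block_out) element of the intrinsic shape."""
    return batch * block_in * block_out


def max_parallel_cores(available_macs: int, batch: int,
                       block_in: int, block_out: int) -> int:
    """How many identical GEMM cores fit into a given hardware MAC budget."""
    return available_macs // gemm_macs_per_cycle(batch, block_in, block_out)


def mac_utilization(available_macs: int, batch: int,
                    block_in: int, block_out: int) -> float:
    """Fraction of the hardware MAC budget used by the cores that fit."""
    cores = max_parallel_cores(available_macs, batch, block_in, block_out)
    used = cores * gemm_macs_per_cycle(batch, block_in, block_out)
    return used / available_macs


if __name__ == "__main__":
    # Hypothetical numbers: a 1 x 16 x 16 intrinsic and a 3000-MAC device.
    shape = (1, 16, 16)
    budget = 3000
    print("MACs per core:", gemm_macs_per_cycle(*shape))
    print("Cores that fit:", max_parallel_cores(budget, *shape))
    print("Utilization: %.3f" % mac_utilization(budget, *shape))
```

With a single core of this example shape, most of the MAC budget would sit idle, which is why I am asking whether multiple GEMM cores (or a much larger intrinsic shape) are supported.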

Thanks very much!

Kevin