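I used GCC's -ftime-report option to investigate this issue. The bottleneck is the parsing phase, not the optimization phase: parsing takes 91% of the wall time, while optimization and code generation take only 5%. So if we could write the tvm_dev_mblob array into the object file directly, skipping the C front end, the problem should be resolved.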
Execution times (seconds)
phase setup            :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     1408 kB (  0%) ggc
phase parsing          :  78.28 (89%) usr  47.62 (99%) sys 125.91 (91%) wall 14727874 kB (100%) ggc
phase lang. deferred   :   3.02 ( 3%) usr   0.00 ( 0%) sys   3.01 ( 2%) wall        0 kB (  0%) ggc
phase opt and generate :   6.81 ( 8%) usr   0.24 ( 1%) sys   7.06 ( 5%) wall        5 kB (  0%) ggc
phase finalize         :   0.00 ( 0%) usr   0.04 ( 0%) sys   1.80 ( 1%) wall        0 kB (  0%) ggc
garbage collection     :   3.02 ( 3%) usr   0.00 ( 0%) sys   3.01 ( 2%) wall        0 kB (  0%) ggc
callgraph construction :   6.81 ( 8%) usr   0.24 ( 1%) sys   7.06 ( 5%) wall        5 kB (  0%) ggc
preprocessing          :  14.37 (16%) usr  24.75 (52%) sys  39.85 (29%) wall       23 kB (  0%) ggc
parser (global)        :  63.91 (73%) usr  22.87 (48%) sys  86.06 (62%) wall 14727850 kB (100%) ggc
TOTAL                  :  88.11            47.98           137.87            14729306 kB
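To illustrate why parsing dominates, here is a minimal sketch of how a binary blob ends up as C source. The helper name emit_c_array and the exact output layout are my assumptions for illustration, not the actual generator code. Each blob byte becomes about five characters of source text (a hex literal plus a comma), and the front end has to lex and parse every one of those initializer elements before any optimization starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not the real generator): render `len` bytes as a
 * C array definition into `out`. Returns the number of characters
 * written (excluding the terminating NUL), or 0 if `out` is too small.
 */
size_t emit_c_array(const unsigned char *data, size_t len,
                    const char *name, char *out, size_t cap)
{
    assert(data != NULL && name != NULL && out != NULL);

    /* Array header, e.g. "const unsigned char tvm_dev_mblob[3] = {" */
    int n = snprintf(out, cap, "const unsigned char %s[%zu] = {", name, len);
    if (n < 0 || (size_t)n >= cap)
        return 0;
    size_t pos = (size_t)n;

    /* One hex literal per byte: this is what the parser must chew through. */
    for (size_t i = 0; i < len; i++) {
        n = snprintf(out + pos, cap - pos, "%s0x%02x", i ? "," : "", data[i]);
        if (n < 0 || (size_t)n >= cap - pos)
            return 0;
        pos += (size_t)n;
    }

    n = snprintf(out + pos, cap - pos, "};\n");
    if (n < 0 || (size_t)n >= cap - pos)
        return 0;
    return pos + (size_t)n;
}
```

With this layout the source file is roughly five times the size of the blob, and its parse cost grows with the blob size. Writing the bytes straight into an object file would avoid that text round trip entirely.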
I also tested the tcc compiler, whose parsing is very fast: compiling the same file takes only 5.04 s.