Android device reboots during tuning

Hello,

I’m using TVM to tune models through the Android RPC Java app. On some of my phones everything works and the tuning process completes, but other phones crash and reboot. I would like to debug this further, but I don’t know how to dig deeper into TVM. I grabbed some “reboot logs” from /sys/fs/pstore/console-ramoops-0, and I can see the following:

[20727.518618] [2022-03-15 22:00:52 GMT+1] QCOM-STEPCHG: handle_vbatt_limit: FCC=550000, vbat=4391010
[20731.324307] [2022-03-15 22:00:56 GMT+1] kgsl kgsl-3d0: kgsl: possible gpu syncpoint deadlock for context 2 timestamp 0
[20731.324355] [2022-03-15 22:00:56 GMT+1] kgsl kgsl-3d0:   context[2]: queue=228798, submit=228792, start=228792, retire=228792
[20731.324380] [2022-03-15 22:00:56 GMT+1] kgsl kgsl-3d0:   possible deadlock. Context 2 might be blocked for itself
[20731.324415] [2022-03-15 22:00:56 GMT+1] kgsl kgsl-3d0:   context[2]: submit times: 4.274 1.646 134.329 110.266 136.517 19.123 16.209 
[20731.324440] [2022-03-15 22:00:56 GMT+1] kgsl kgsl-3d0:       pending events:
[20731.324464] [2022-03-15 22:00:56 GMT+1] kgsl kgsl-3d0:        [0] FENCE kgsl-timeline kgsl-3d0_11-ndroid.systemui(230: 204186
[20731.324489] [2022-03-15 22:00:56 GMT+1] kgsl kgsl-3d0:        [0] FENCE kgsl-timeline kgsl-3d0_16-ache.tvm.tvmrpc(658: 220422
[20731.324512] [2022-03-15 22:00:56 GMT+1] kgsl kgsl-3d0: --gpu syncpoint deadlock print end--
(...)

[22587.836130] [2022-03-15 22:31:52 GMT+1] kgsl kgsl-3d0: kgsl: possible gpu syncpoint deadlock for context 2 timestamp 0
[22587.836180] [2022-03-15 22:31:52 GMT+1] kgsl kgsl-3d0:   context[2]: queue=229488, submit=229482, start=229482, retire=229482
[22587.836209] [2022-03-15 22:31:52 GMT+1] kgsl kgsl-3d0:   possible deadlock. Context 2 might be blocked for itself
[22587.836246] [2022-03-15 22:31:52 GMT+1] kgsl kgsl-3d0:   context[2]: submit times: 30.44 31.870 26.725 4.277 1.645 39.308 54.989 
[22587.836270] [2022-03-15 22:31:52 GMT+1] kgsl kgsl-3d0:       pending events:
[22587.836293] [2022-03-15 22:31:52 GMT+1] kgsl kgsl-3d0:        [0] FENCE kgsl-timeline kgsl-3d0_11-ndroid.systemui(230: 205356
[22587.836315] [2022-03-15 22:31:52 GMT+1] kgsl kgsl-3d0: --gpu syncpoint deadlock print end--
(...)

[22639.548412] [2022-03-15 22:32:44 GMT+1] kgsl kgsl-3d0: |adreno_drawctxt_detach| Wait for global ctx=18 ts=14523 type=2 error=-110
[22639.548645] [2022-03-15 22:32:44 GMT+1] kgsl kgsl-3d0: mrpc:RPCProcess[7685]: gpu fault ctx 18 ctx_type CL ts 159 status 00E61015 rb 02c2/052b ib1 00000007FF6EF000/0000 ib2 00000007FFE51DEC/0000
[22639.548701] [2022-03-15 22:32:44 GMT+1] kgsl kgsl-3d0: mrpc:RPCProcess[7685]: gpu fault rb 2 rb sw r/w 02c2/052b
[22639.560822] [2022-03-15 22:32:44 GMT+1] kgsl kgsl-3d0: |kgsl_iommu_fault_handler| GPU PAGE FAULT: addr = 7FF943A00 pid= 0 name=unknown
[22639.560863] [2022-03-15 22:32:44 GMT+1] kgsl kgsl-3d0: |kgsl_iommu_fault_handler| context=gfx3d_user TTBR0=0x1d0001492e4000 CIDR=0x1e05 (write translation fault)
[22639.645852] [2022-03-15 22:32:44 GMT+1] kgsl: kgsl_snapshot_push_object: snapshot: Can't find entry for 0x00000007FF6EF000
[22639.647893] [2022-03-15 22:32:44 GMT+1] kgsl: kgsl_snapshot_push_object: snapshot: Can't find entry for 0x00000007FFE51DEC
[22639.647910] [2022-03-15 22:32:44 GMT+1] kgsl kgsl-3d0: |kgsl_device_snapshot| GPU snapshot created at pa 0x00000001ec400000++0xdfb20
[22639.648111] [2022-03-15 22:32:44 GMT+1] kgsl kgsl-3d0: |kgsl_snapshot_save_frozen_objs| snapshot: Active IB1:00000007ff6ef000 not dumped
[22644.652133] [2022-03-15 22:32:49 GMT+1] platform 506d000.qcom,rgmu: RGMU CX gdsc off timeout
[22646.652107] [2022-03-15 22:32:51 GMT+1] kgsl kgsl-3d0: CP initialization failed to idle
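
To compare crashes across phones, it may help to pull only the GPU-related events out of each pstore dump. Below is a minimal sketch in Python, assuming the dump has already been copied off the device into a local text file and that its lines follow the `[uptime] [wall-clock] message` layout seen above. The function names and the event labels (`deadlock`, `gpu_fault`, `page_fault`) are hypothetical and chosen only for illustration. The marker strings are copied from the log excerpts in this post, and other kernels may word them differently.

```python
import re

# One kernel log line as it appears in the dump above:
# "[uptime] [wall-clock timestamp] message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\[([^\]]+)\]\s+(.*)$")

# Substrings copied from the excerpts above; the labels are made up for this sketch.
MARKERS = {
    "possible gpu syncpoint deadlock": "deadlock",
    "gpu fault ctx": "gpu_fault",
    "GPU PAGE FAULT": "page_fault",
}


def find_gpu_events(text):
    """Return (uptime_seconds, label, message) for every line matching a marker."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips blank lines and "(...)" elisions
        uptime, _wall_clock, message = match.groups()
        for needle, label in MARKERS.items():
            if needle in message:
                events.append((float(uptime), label, message))
                break
    return events


def print_gpu_events(path):
    """Print the matching events from a local copy of the pstore dump."""
    with open(path, errors="replace") as handle:
        for uptime, label, message in find_gpu_events(handle.read()):
            print(f"{uptime:12.6f}  {label:10s}  {message}")
```

Calling `print_gpu_events("console-ramoops-0")` on a local copy would list the deadlock warnings and GPU faults in order. That makes it easier to see how long before the reboot the first fault shows up, and whether it is attributed to the TVM RPC process.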