'debug' executor produces different prediction results from the vm/graph executor

Description:

Hi all. For the same model and input, the ‘debug’ executor produces different prediction results from the ‘graph’ / ‘vm’ executors, which is strange.

What causes this? Thanks for any comments in advance!

The reproducible script:

import tvm
from tvm import relay
from tvm.ir.transform import Sequential
from tvm.contrib import graph_runtime
import numpy as np

var_0 = relay.var("var_0", dtype = "int64", shape = (5, 7, 9, 5)) # shape=(5, 7, 9, 5)
var_1 = relay.ones_like(var_0) # shape=(5, 7, 9, 5)
var_2 = relay.round(var_1) # shape=(5, 7, 9, 5)
const_3 = relay.const([[[[[699,894,823,381,897],[873,263,659,756,83],[321,639,937,145,940],[570,255,627,939,43],[287,564,26,732,670],[753,225,228,374,509],[73,72,402,247,453],[651,119,67,309,874],[501,981,864,437,477]]]]], dtype = "int64") # shape=(1, 1, 1, 9, 5)
var_4 = relay.left_shift(var_2, const_3) # shape=(1, 5, 7, 9, 5)
tuple = relay.Tuple([var_4])
F = relay.Function([var_0], tuple)
mod = tvm.IRModule()
mod['main'] = F
mod = relay.transform.InferType()(mod)
print(mod.astext(show_meta_data=False))
intrp1 = relay.build_module.create_executor('debug', mod, tvm.device('llvm',0),'llvm')
intrp2 = relay.build_module.create_executor('graph', mod, tvm.device('llvm',0),'llvm')
input_0= np.array([[[[197,900,595,369,502],[225,178,41,364,348],[585,489,771,301,933],[148,493,300,610,439],[702,893,815,433,495],[486,853,162,478,490],[635,26,742,581,394],[243,805,923,635,168],[622,572,8,392,872]],[[292,540,716,592,501],[507,645,393,673,77],[239,158,930,753,635],[771,387,12,512,319],[757,107,475,680,741],[994,653,312,1,45],[536,292,936,603,235],[436,109,879,180,781],[956,419,938,237,523]],[[924,7,261,287,871],[579,44,977,405,75],[717,398,727,381,750],[123,916,41,58,518],[276,845,979,506,25],[759,813,795,697,49],[317,972,408,929,259],[278,507,302,606,263],[728,322,660,454,54]],[[761,577,321,801,986],[191,428,831,169,934],[207,927,746,1,975],[147,669,947,906,597],[557,535,455,210,140],[717,937,813,728,390],[867,488,318,187,640],[304,377,420,134,897],[705,340,824,802,692]],[[150,948,712,448,205],[660,4,739,466,213],[230,534,501,43,261],[891,261,100,208,799],[91,863,528,510,996],[424,566,687,599,368],[378,749,315,441,548],[872,100,552,610,565],[116,192,98,617,586]],[[710,859,846,161,418],[996,252,281,523,113],[628,299,679,315,897],[46,44,997,712,485],[545,935,584,448,897],[501,915,88,950,883],[673,660,741,870,172],[511,865,775,143,740],[888,770,390,918,84]],[[286,315,480,635,378],[964,531,313,899,978],[209,399,244,296,349],[127,320,360,219,189],[531,81,405,306,223],[496,545,993,885,462],[428,523,776,907,157],[153,222,687,817,121],[16,25,519,259,672]]],[[[219,737,991,578,308],[531,461,388,936,766],[611,431,310,955,316],[123,382,838,898,641],[346,402,862,32,219],[982,399,595,853,9],[267,71,746,257,1],[53,788,461,792,75],[578,402,505,887,356]],[[172,361,90,361,610],[82,58,363,943,89],[581,277,839,528,129],[848,794,551,945,50],[551,997,189,363,140],[263,940,542,768,178],[249,291,538,690,4],[147,771,61,510,714],[502,442,990,340,969]],[[470,539,762,20,483],[164,923,831,352,285],[971,615,225,864,734],[754,464,376,292,154],[379,790,924,440,651],[989,293,93,330,632],[413,151,523,527,171],[5,690,93,836,393],[729,158,7,305,21]],[[92,59,484,468,702],[989,198,491,
265,989],[142,253,281,586,935],[265,998,85,787,524],[607,791,565,51,978],[958,780,135,964,84],[507,408,494,343,875],[195,331,72,686,595],[61,179,200,693,764]],[[134,957,113,570,743],[637,529,886,201,579],[215,510,710,702,474],[146,560,233,639,902],[459,834,233,530,519],[827,942,697,378,987],[812,863,943,924,433],[38,912,313,275,113],[243,841,974,953,542]],[[447,98,102,31,736],[3,489,569,587,371],[439,766,664,487,143],[650,298,6,945,574],[790,982,485,102,608],[597,344,448,571,648],[990,369,97,443,752],[833,797,240,753,384],[610,544,501,274,30]],[[643,275,680,1000,219],[253,789,552,737,242],[159,686,938,607,608],[585,948,328,682,742],[79,866,538,671,970],[273,280,513,773,905],[895,416,180,926,415],[398,178,556,950,266],[797,460,303,86,66]]],[[[910,23,365,590,704],[106,20,921,644,690],[242,916,322,755,41],[226,1,456,757,926],[222,507,455,777,808],[72,926,267,375,11],[685,636,33,49,577],[88,155,597,8,150],[638,602,417,959,356]],[[457,537,708,264,645],[985,486,151,439,614],[310,510,539,577,236],[550,261,872,934,661],[448,22,167,396,381],[668,34,982,85,344],[689,893,880,396,509],[525,380,994,27,170],[959,337,32,498,265]],[[267,399,525,490,332],[537,290,353,56,685],[86,723,70,67,159],[414,756,404,645,503],[912,521,235,257,548],[404,215,884,435,712],[148,54,462,24,895],[794,912,184,498,967],[221,583,42,290,650]],[[200,55,757,603,700],[259,866,220,493,122],[767,897,337,2,683],[400,501,88,862,876],[983,655,140,166,152],[106,386,735,147,28],[736,699,82,492,653],[781,750,519,353,595],[640,119,843,976,473]],[[525,728,325,965,589],[553,947,595,692,112],[746,149,850,832,648],[229,567,346,310,410],[998,443,512,868,795],[106,508,265,948,835],[89,824,562,414,140],[502,318,86,96,9],[550,194,157,751,377]],[[804,979,296,501,640],[705,851,82,216,718],[228,673,577,845,972],[764,933,148,677,698],[287,179,15,373,626],[23,274,819,532,24],[548,335,2,843,836],[641,899,38,75,467],[755,302,139,684,146]],[[463,447,431,610,123],[128,248,301,495,972],[927,869,245,97,400],[620,644,87,621,486],[274,614,737,311,688],
[203,417,341,341,100],[839,155,546,269,764],[669,748,364,969,242],[335,247,111,932,696]]],[[[510,551,339,948,172],[177,221,137,913,531],[176,115,948,516,807],[47,706,962,593,974],[77,613,722,440,933],[315,127,532,425,58],[227,287,608,917,234],[131,445,807,267,357],[337,442,823,636,310]],[[982,35,15,295,979],[341,371,591,414,163],[523,728,289,54,505],[346,632,143,953,549],[728,84,993,534,702],[702,223,496,876,858],[157,209,244,171,503],[222,863,226,812,276],[740,687,356,28,92]],[[860,725,76,354,29],[624,81,464,968,967],[166,669,189,661,897],[398,817,105,642,339],[960,215,202,185,27],[829,924,65,184,303],[508,395,27,583,748],[55,558,829,871,526],[795,36,546,335,48]],[[442,84,216,899,77],[906,858,292,459,42],[670,288,317,734,823],[619,593,218,997,176],[965,403,733,145,273],[258,939,660,156,625],[707,597,709,274,495],[785,180,352,428,638],[745,97,277,61,182]],[[100,31,775,317,27],[302,633,430,34,778],[54,644,68,714,151],[693,420,747,753,46],[594,889,225,297,317],[214,42,765,843,454],[947,942,485,73,610],[863,374,242,644,759],[371,698,754,439,411]],[[904,483,182,3,235],[579,596,475,803,892],[143,369,285,908,211],[739,206,504,575,278],[465,437,3,58,433],[761,429,130,867,219],[892,770,701,425,772],[287,4,367,113,158],[611,256,878,247,515]],[[440,337,72,295,911],[701,759,700,703,169],[132,463,949,613,329],[167,856,451,219,632],[222,505,987,941,969],[145,903,224,374,149],[90,814,486,161,108],[396,861,219,447,563],[739,930,378,687,894]]],[[[58,205,101,860,423],[733,82,279,719,374],[247,215,276,823,589],[776,912,754,261,425],[861,9,637,431,807],[552,521,737,929,207],[982,338,763,83,198],[537,815,631,167,885],[356,766,100,983,588]],[[40,758,851,793,19],[627,5,379,264,436],[185,815,956,273,95],[515,255,432,629,337],[981,166,151,963,332],[35,670,97,486,652],[36,525,410,887,317],[780,513,322,158,128],[109,694,942,416,967]],[[36,282,221,820,911],[557,152,76,59,467],[759,445,136,208,931],[788,595,455,549,833],[772,328,698,445,837],[825,553,530,767,968],[848,154,250,68,973],
[160,976,125,587,34],[591,697,479,78,256]],[[409,217,851,863,765],[683,986,444,732,430],[632,557,982,514,323],[302,361,476,903,781],[801,414,756,277,1000],[790,219,48,620,296],[656,380,513,506,594],[629,540,580,73,272],[361,704,180,695,569]],[[854,996,930,681,898],[710,481,311,465,757],[662,606,975,61,577],[623,716,956,135,573],[902,763,465,833,187],[88,193,243,267,887],[811,120,234,740,800],[483,801,633,145,618],[741,806,575,716,219]],[[152,690,934,459,824],[859,360,938,323,192],[125,410,737,367,676],[623,177,147,209,269],[298,691,69,282,836],[686,23,993,261,90],[563,412,779,497,222],[954,707,582,891,29],[125,15,438,861,733]],[[465,836,910,963,44],[178,612,86,246,894],[273,284,916,266,544],[357,180,307,135,28],[528,440,734,461,330],[762,938,697,551,798],[429,367,985,338,329],[28,515,941,466,113],[186,738,396,453,355]]]], dtype='int64')
res1 = intrp1.evaluate()(input_0)[0].asnumpy()
res2 = intrp2.evaluate()(input_0)[0].asnumpy()
np.testing.assert_allclose(res1,res2, atol=1e-3, rtol=1e-3)

The result of the script:

image

I remember you asked a similar question here, and some people replied that it was caused by the incomplete specification of right_shift in TVM. Is that also the reason for left_shift in your example?

@Haoyang I don’t think they have the same root cause. The `right_shift` inconsistency showed up as different results on different backend devices (i.e., ‘llvm’ and ‘cuda’). This inconsistency, however, appears between different execution methods on the same ‘llvm’ target. My guess is that it is caused by internal differences between the executors.

@kparzysz @wrongtest @Haoyang Could you give me some advice? Thanks!

Hi, I tried your script, but it just finished without errors :rofl:

Could you also kindly provide more platform information?

@wrongtest Thanks for your reply!

I ran this program with the latest TVM version (TVM 0.9) on two different servers, and both reproduce the failure.

Platform1 info:

  • OS: CentOS
  • CPU architecture: x86_64
  • GPU: RTX 3090

Platform2 info:

  • OS: Ubuntu 16.04.6
  • CPU architecture: x86_64
  • GPU: GTX 1080

Same for me! I cannot reproduce the bug on a Mac M1, but on x86 Linux I do hit it.

BTW, I once ran into a similar inconsistency between Mac M1 and Linux, described here. I think it may be caused by a minor problem in TVM.

Yeah, in general all of these are due to platform-specific differences in how edge conditions are handled. Technically, most of the edge conditions we are looking at have undefined behavior, so platform differences are allowed without violating the underlying interface.
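As a rough illustration of such an edge condition, here is a plain-Python sketch (not TVM code). It assumes the relevant edge case in the script is a 64-bit left shift whose count is 64 or more, which applies to most entries of `const_3`. The two helper functions are hypothetical and only model two plausible evaluation strategies; neither is claimed to be what any TVM executor actually does.

```python
# Illustrative only: two plausible ways to evaluate a 64-bit left shift
# when the shift count is >= 64. Neither is claimed to match TVM.
MASK64 = (1 << 64) - 1


def shl_wide_then_truncate(x, n):
    """Shift in arbitrary precision, then keep only the low 64 bits."""
    return (x << n) & MASK64


def shl_masked_count(x, n):
    """Use only the low 6 bits of the count, as x86-64 shift instructions do."""
    return (x << (n & 63)) & MASK64


# First row of const_3 from the script, plus one in-range count (26) from it.
for n in [699, 894, 823, 381, 897, 26]:
    a = shl_wide_then_truncate(1, n)
    b = shl_masked_count(1, n)
    status = "in range" if n < 64 else "out of range"
    print(n, status, "agree" if a == b else "disagree")
```

The two strategies agree for the in-range count and disagree for every out-of-range one, so any pair of code paths that happen to pick different strategies would produce exactly this kind of mismatch.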

You can try `mod, _ = relay.build_module.optimize(mod, target="llvm")` before building.

Since debug is implemented via an interpreter, it does not optimize the graph the way graph or vm mode does. I think that's the difference: for undefined behavior, different optimizations can produce different results.
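One way to check this explanation is to compare the two executors' outputs only at positions where the shift count is well defined. The sketch below is plain Python and assumes out-of-range shift counts are the only source of disagreement. The helper name `mismatches_in_defined_range` is hypothetical, and the toy values stand in for the real, flattened `res1`, `res2`, and broadcast shift counts.

```python
def mismatches_in_defined_range(res_a, res_b, counts, bits=64):
    """Return indices where results differ although 0 <= count < bits."""
    return [
        i
        for i, (a, b, n) in enumerate(zip(res_a, res_b, counts))
        if 0 <= n < bits and a != b
    ]


# Toy data: the two result lists differ only where the count is out of range.
counts = [26, 699, 43, 894]
res_a = [1 << 26, 0, 1 << 43, 0]
res_b = [1 << 26, 1 << 59, 1 << 43, 1 << 62]

print(mismatches_in_defined_range(res_a, res_b, counts))  # []
```

An empty list would be consistent with the undefined-behavior explanation; a non-empty one would point to a genuine discrepancy between the executors. For the real script, the counts would first have to be broadcast to the output shape (for example with `np.broadcast_to`) and flattened together with both results.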
