Floating point operations result in a slight amount of rounding error, with the amount of error being affected by the precision, the choice of operations (such as FMA), optimizations, order of operations, etc.
My professor told me there is shared memory (non-deterministic), so the factorial inside the printf won’t know the last value it received from the loop, and he said I should use the atomic pragma with it.

In this case you can use a reduction, and the compiler feedback messages show that the compiler is implicitly adding one for you:

% nvc++ fact.cpp -acc -Minfo=accel -V20.7
Generating implicit copy(factorial)
#pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
Generating implicit reduction(*:factorial)

Reductions can be performed in parallel by having each thread gather a partial reduction and then launching a second kernel to gather the partial reductions into a final reduction.

bitcast with differing sizes -4 (Acc_factorial.cpp: 12)

This is a known compiler error that will be fixed in the upcoming 20.11 release. Sans this error, you could use an atomic to accomplish the same thing, but reductions are more performant, so they should be used in this case.

It’s best to think of all floating point computation as always “wrong”, in that almost all floating point values can’t be represented exactly. The result isn’t necessarily wrong, just different.
My question is why a smaller input gives me the correct answer while a huge input gives me the wrong result. Is that related to my GPU model or something else, and does atomic work for this type of issue or not? When I increase the setprecision, I get the same issue as in the previous factorial code, and my professor said it is the same issue that happens in factorial.