Determinism with non-NaN results? #1385

As I read the spec: if neither the inputs to nor the output from a floating-point instruction is NaN, then the output is deterministic (and agrees across all platforms). Clearly this isn't the case with transcendental functions like sine, but since you don't seem to implement those, I think this assertion is correct. Is it? If so, perhaps you could make that clearer in the spec.

Comments
Yes, it's correct. Wasm's floating-point is deterministic, other than NaN bit patterns.
@sunfishcode "and agrees across all platforms"? Remember, this includes weird shit like negative zero and infinity and denormals, none of which are NaN. My guess is that they intended for this to be the case, but no one has actually tested it very well. Hopefully I'm wrong. Hopefully they also didn't make the same mistake as GCC, which on 32-bit platforms had inconsistent floating-point behavior due to accidental mixing of 64-bit and 80-bit precisions. |
Yes. The spec testsuite has tests for negative zero, infinity, and subnormals; for cases which would round differently in 80-bit precision; for cases which would round differently in x87's double-precision mode; and more.
@sunfishcode OK, I'm 90% convinced, so thank you for the detailed feedback. This stuff really matters for scalability and accuracy, but I don't need to discuss all that unless someone wants to know.

I did notice that there's a square root instruction (fsqrt), where it wouldn't be unreasonable to expect cross-platform inconsistencies in some cases, depending upon the algorithm used under the hood to converge on an answer. I have deep suspicions of any instruction that returns rational approximations of irrational numbers. My concern is not unfounded: in 2014 (see below), it was discovered that Intel's sine (FSIN) function might be accurate to only 4 bits of significance in certain cases. Under such a scenario, the right way to fix it would probably be to leave the existing bad behavior as-is for the sake of consistency, then add another function which actually works.

In theory, square roots are trivially easy to compute, but things could go awry if one is using a Taylor or binomial series, which might require much more dynamic range than the old divide-and-average method, or if the circuit designer didn't have a rigorous comprehension of the error term. For these and other reasons, it would be useful to define fsqrt (WebAssembly) in terms of FSQRT (x86, stating even the CPU model if necessary). That way, if something were to go wrong with some inputs, there would be no spec violation; instead, it could be fixed with "fsqrt2" or something that doesn't have the problem. It never literally returns a "square root" in any event. But this is just documentary pedanticism.

The real concern here is just that I wonder whether non-NaN inputs to fsqrt generate the same result on Intel, AMD, Nvidia, ARM, etc. One could sweep the entire 32-bit input space in a short time. That would at least provide substantial confidence that the 64-bit space is also correct. I assume that's been done because it's so obvious, but I don't know, and it's less obvious that any comparison would have been done cross-platform. I also wonder whether divide-and-average has swept the same space for the sake of comparison with fsqrt. If they produce identical results, then all is well. If they produce trivially different results, then the documentary pedanticism above actually matters. If they sometimes produce wildly different results, then we have another manifestation of the FSIN problem (really unlikely, but who knows).

https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion
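For reference, here is a minimal sketch (mine, not from this thread) of the "divide-and-average" method mentioned above: Heron's iteration carried out in double precision and rounded to float at the end. The name heron_sqrtf is made up, and this is only an illustration for the comparison idea; it is not guaranteed to be correctly rounded for every input.

```c
#include <math.h>
#include <stdio.h>

/* Divide-and-average (Heron/Newton) square root, evaluated in double and
 * rounded to float at the very end. Illustration only; not a hardware model. */
static float heron_sqrtf(float x) {
    if (x < 0.0f) return NAN;                      /* sqrt of a negative is NaN   */
    if (x == 0.0f || isinf(x) || isnan(x)) return x;
    double a = (double)x;
    int e;
    frexp(a, &e);                                  /* a = m * 2^e, m in [0.5, 1)  */
    double guess = ldexp(1.0, e / 2);              /* start within ~2x of sqrt(a) */
    for (int i = 0; i < 20; i++) {                 /* quadratic convergence       */
        double next = 0.5 * (guess + a / guess);
        if (next == guess) break;
        guess = next;
    }
    return (float)guess;
}

int main(void) {
    const float samples[] = { 2.0f, 0.25f, 1e-38f, 3.0e38f };
    for (int i = 0; i < 4; i++)
        printf("heron_sqrtf(%g) = %.9g   sqrtf(%g) = %.9g\n",
               samples[i], heron_sqrtf(samples[i]),
               samples[i], sqrtf(samples[i]));
    return 0;
}
```

Compile with something like `cc -O2 heron.c -lm`. Whether something like this agrees bit-for-bit with a hardware sqrt for every input is exactly the open question in this thread.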
Interesting question. Here's what we know: Wasm's f32.sqrt and f64.sqrt are defined as IEEE 754's squareRoot operation, which IEEE 754 requires to be correctly rounded, so every conforming implementation has exactly one permitted answer for any given non-NaN case. The dedicated hardware sqrt instructions on mainstream CPUs (e.g. SSE2's SQRTSS/SQRTSD and their ARM counterparts) are likewise specified to produce the correctly rounded result.
Sweeping the entire 32-bit input space, either to compare between hardware platforms or to compare against divide-and-average, is a good idea. I don't know that anyone has done either of these. If anyone does, please post the results somewhere! (One way to run such a sweep is sketched below.)
I appreciate the idea here; however, it wouldn't be practical for Wasm because, by definition, it'd be impossible to implement this kind of FSQRT on any non-x86 chip. Running on multiple architectures is a big part of the point of WebAssembly, so we wouldn't be able to use such an instruction at all. So the tradeoff is having a usable, portable sqrt instruction rather than one tied to the behavior of a specific x86 implementation.
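As an aside on the exhaustive-sweep idea: a standard double-rounding result says that because binary64 carries more than 2·24+2 significand bits, rounding a correctly rounded double sqrt of a float back down to float cannot change the correctly rounded binary32 answer, so `(float)sqrt((double)x)` can serve as a reference. Below is a rough sketch of such a sweep (my code, not anything from the thread); running all 2^32 iterations takes a while.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    uint64_t mismatches = 0;
    uint32_t bits = 0;
    do {
        float x, got, ref;
        memcpy(&x, &bits, sizeof x);              /* reinterpret the bit pattern  */
        got = sqrtf(x);                           /* the platform's binary32 sqrt */
        ref = (float)sqrt((double)x);             /* correctly rounded reference  */
        /* Skip cases where both results are NaN: payload bits may differ. */
        if (!(isnan(got) && isnan(ref)) && memcmp(&got, &ref, sizeof got) != 0)
            mismatches++;
        bits++;
    } while (bits != 0);                          /* stops after wrapping 2^32-1  */
    printf("non-NaN mismatches vs reference: %llu\n",
           (unsigned long long)mismatches);
    return 0;
}
```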
@sunfishcode Brilliant, deep answer. Let me see if I can squeeze in some time to do the divide-and-average comparison...
@sunfishcode I just realized that this is a harder question than it seems. On the old x87 FPU, FSQRT operated at full precision (generally 80 bits, but programmable down to 32 or 64). Assuming 80 bits, truncating down to 32 or 64 would almost certainly provide the correct result, subject to the rounding policy. However, various flavors of AMD64 SIMD, such as AVX, provide their own square root instructions (e.g. SQRTPS) which are defined to produce only those lesser precisions (and which surely are not derived from 80-bit results). So in order to really test this, one would need to compare the SIMD results to the equivalently-downconverted results from 80-bit FSQRT. I could hash all those downconverted 32-bit results and publish the hash here, but then someone else would need to run the equivalent WASM code on various platforms and compare hashes.
Promoting 32-bit or 64-bit inputs to 80-bit, computing the square root there, and then rounding the result back down to 32 or 64 bits involves two rounding steps, so it isn't guaranteed to match the correctly rounded result computed directly at the target precision (the classic double-rounding problem).
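To make the double-rounding concern concrete, here is a small probe (mine, and only meaningful on targets where `long double` is the x87 80-bit format; elsewhere it will typically report nothing). It compares `sqrt` computed directly in binary64 against `sqrtl` computed in extended precision and then rounded down, over a batch of pseudo-random inputs.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    uint64_t mismatches = 0;
    uint64_t bits = 0x243F6A8885A308D3ULL;            /* arbitrary seed           */
    for (uint64_t i = 0; i < 50000000ULL; i++) {
        /* Cheap LCG walk over binary64 bit patterns. */
        bits = bits * 6364136223846793005ULL + 1442695040888963407ULL;
        double x;
        memcpy(&x, &bits, sizeof x);
        if (!isfinite(x) || x < 0.0) continue;        /* finite, non-negative only */
        double direct  = sqrt(x);                        /* one rounding to binary64 */
        double twostep = (double)sqrtl((long double)x);  /* extended, then binary64  */
        if (memcmp(&direct, &twostep, sizeof direct) != 0) {
            if (++mismatches <= 3)
                printf("double rounding differs for x = %a\n", x);
        }
    }
    printf("total mismatches: %llu\n", (unsigned long long)mismatches);
    return 0;
}
```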
@sunfishcode I see your point. Then the only solution here is divide-and-average (or square-and-average). I'd be glad to run that test and spit out 2^32 32-bit floats to a file, take its hash, and publish it here. (One could even use a high-precision library to make errors even less likely.) That would make it easy to track down any disagreements between my own results and those of someone running WASM on some other hardware. (One could just binary search for the offset of disagreement by truncating the file at different points.) If you or anyone else is willing to participate, I'll do it.
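Here is a sketch of one way to run that experiment natively (my code; the use of FNV-1a and the NaN canonicalization are my choices, not anything agreed in this thread). It hashes the result bits of sqrtf over every binary32 input in ascending bit-pattern order; a Wasm build doing f32.sqrt over the same inputs and hashed the same way should produce the same value if the determinism claim holds. Dumping the raw results to a file instead would enable the binary search for the first point of disagreement.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    uint64_t hash = 0xcbf29ce484222325ULL;        /* FNV-1a 64-bit offset basis */
    const uint64_t prime = 0x100000001b3ULL;      /* FNV-1a 64-bit prime        */
    uint32_t bits = 0;
    do {
        float x, r;
        memcpy(&x, &bits, sizeof x);
        r = sqrtf(x);                             /* or heron_sqrtf, etc.       */
        uint32_t rbits;
        memcpy(&rbits, &r, sizeof rbits);
        if (isnan(r)) rbits = 0x7FC00000u;        /* canonicalize NaNs: only
                                                     non-NaN bits are guaranteed */
        for (int i = 0; i < 4; i++) {             /* hash little-endian bytes   */
            hash ^= (rbits >> (8 * i)) & 0xFF;
            hash *= prime;
        }
        bits++;
    } while (bits != 0);
    printf("FNV-1a over all f32 sqrt results: %016llx\n",
           (unsigned long long)hash);
    return 0;
}
```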
I believe the answers above are as much as we can say at this time. If anyone does any experiments, or if anyone learns of a floating-point implementation bug anywhere, please do post about it, here or in a new issue in this repo.