Floating point exact representation of integers

There are two main floating-point formats: single precision (float in Java), which stores a value in 4 bytes, and double precision (double), which uses 8 bytes.

The question is: what range of integers can these floating-point formats represent exactly?
In other words, what is the largest n such that every integer v with |v| <= n passes these round-trip checks:

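long v = 16_777_216L; // candidate integer, here 2^24
boolean exactAsFloat = (long) (float) v == v;   // survives a round trip through float?
boolean exactAsDouble = (long) (double) v == v; // survives a round trip through double?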

float represents every integer exactly up to 2^24 (16,777,216) in magnitude, while double does so up to 2^53 (9,007,199,254,740,992). These limits follow from the mantissa size: 23 bits are stored for single precision and 52 bits for double precision, and the implicit leading 1 bit raises the effective precision to 24 and 53 bits respectively.
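
The limits can be checked directly with a small program. This is only a sketch: the class name ExactIntegers and the two helper methods are hypothetical names chosen for illustration, and it assumes the standard Java behaviour of rounding to the nearest representable value when converting a long to float or double.

```java
public class ExactIntegers {
    // Hypothetical helper: true if v is unchanged after a round trip through float.
    static boolean roundTripsThroughFloat(long v) {
        return (long) (float) v == v;
    }

    // Hypothetical helper: true if v is unchanged after a round trip through double.
    static boolean roundTripsThroughDouble(long v) {
        return (long) (double) v == v;
    }

    public static void main(String[] args) {
        long floatLimit = 1L << 24;  // 16,777,216
        long doubleLimit = 1L << 53; // 9,007,199,254,740,992

        System.out.println(roundTripsThroughFloat(floatLimit));       // true
        System.out.println(roundTripsThroughFloat(floatLimit + 1));   // false
        System.out.println(roundTripsThroughDouble(doubleLimit));     // true
        System.out.println(roundTripsThroughDouble(doubleLimit + 1)); // false
    }
}
```

Note that some larger integers are still exact: 2^24 + 2, for instance, fits in a float because it is even. The limits above mark the end of the contiguous range in which every integer is representable, not the largest representable integer.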
