
Synergetic Stupidity in Java

In the Java “class” file format specification, the length of the bytecode of any method is limited to 64KB. This arbitrary limitation is bad in itself, but see below what it does in conjunction with another, even stupider Java ‘feature’.

In any normal language, a literal array constant (used for array initialization) is represented as a compact block of data. For example, when initializing an array with 1000 byte values, you’d expect the initialization data to take up 1KB (1000 x 1 byte). But not so in Java, not at all.

Let’s consider a long array of bytes, initialized with some data.
static final byte B[] = {10, 20, 30, 40, 50, 60, 70, 80, …};

In the “class” format there is no provision for storing the array initialization data in a compact format. Instead, bytecode instructions are generated for filling in the array, one small block of bytecodes for each entry of the array data:

dup          // 1 byte; duplicate the array address on the stack
sipush 300   // 3 bytes; the array index where to store the value
bipush 99    // 2 bytes; the value to store
bastore      // 1 byte; do the store

This adds up to 7 bytes for each entry of the array. So if you initialize an array of 5000 bytes, this would generate 35KB of bytecode instead of the 5KB that would normally be enough to store the data.
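
As a rough check of that arithmetic, here is a minimal sketch that multiplies the per-entry cost from the listing above by the number of entries. The class and method names are made up for illustration, and the flat 7-byte figure is an assumption taken from the listing: it ignores the array allocation itself and the shorter encodings the compiler can use for small indices and values.

```java
public class ByteArrayInitCost {
    // Per-entry cost taken from the listing above (an approximation):
    // dup (1) + sipush index (3) + bipush value (2) + bastore (1).
    static final int BYTES_PER_ENTRY = 1 + 3 + 2 + 1;

    // Estimated bytecode size needed to fill a byte[] literal with the given number of entries.
    static long estimatedBytecodeSize(int entries) {
        return (long) entries * BYTES_PER_ENTRY;
    }

    public static void main(String[] args) {
        int entries = 5000;
        // Prints: 35000 bytes of bytecode for 5000 bytes of data
        System.out.println(estimatedBytecodeSize(entries)
                + " bytes of bytecode for " + entries + " bytes of data");
    }
}
```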

For an array of int, float or larger types, the bytecode structure is a bit different: a value that does not fit an inline push instruction (for int, anything outside the 16-bit range of sipush) is not inlined in the bytecode, but is instead loaded from the constant pool.

dup          // 1 byte
sipush 300   // 3 bytes; the array index
ldc_w 400    // 3 bytes; load int from constant pool position
iastore      // 1 byte

Let’s put in a small table the amount of data Java uses for initializing arrays of different types:

Type              Bytes / entry    Stupidity boost
byte[]            7                7 x
short[]           8                4 x
int[], float[]    8 + 4            3 x
long[]            8 + 8            2 x

This form of doing array initialization is in my opinion the stupidest miss of the “class” file format.

But let’s see where Stupidity Synergy comes into play. Synergy means that sometimes, when you put together one stupid feature with another stupid feature, you get a result that is larger than the sum of the parts. Java achieves synergy by combining the 64KB method size limit with the peculiar way of compiling the array initialization.

As we’ve seen, bytecode (i.e. method code) is needed for any array initialization. Even initializing a static array at the class level produces an invisible method that contains the code for array initialization. Combining this with the method size limit produces very restrictive limits on the size of array initializations.
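
To make the “invisible method” concrete: for a static field, the compiler places the element stores in the class initializer (named <clinit> in the class file). The sketch below, with hypothetical class and field names, writes out by hand roughly what the array literal expands to. It illustrates the shape of the generated code, not the compiler’s exact output.

```java
import java.util.Arrays;

public class StaticInitEquivalence {
    // Array literal: the element stores are emitted into the class initializer.
    static final byte[] LITERAL = {10, 20, 30, 40};

    // Hand-written approximation of what the literal above expands to:
    // allocate the array, then store each element with its own statement.
    static final byte[] EXPLICIT;
    static {
        byte[] tmp = new byte[4];
        tmp[0] = 10;
        tmp[1] = 20;
        tmp[2] = 30;
        tmp[3] = 40;
        EXPLICIT = tmp;
    }

    public static void main(String[] args) {
        // Prints: true
        System.out.println(Arrays.equals(LITERAL, EXPLICIT));
    }
}
```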

Type          Array initialization size limit
byte          9K
short         8K
int, float    8K

Nice powerful language, where you get a “code too large” compilation error when you try to initialize a byte array with more than 9K entries…
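
The limits in the table can be reproduced with a back-of-the-envelope division. This is only a sketch under stated assumptions: the names are hypothetical, the per-entry costs are the 7 and 8 bytes counted earlier, and the few bytes needed to allocate the array, store it in the field and return are ignored, so the real limits are slightly lower.

```java
public class ArrayInitLimits {
    // The class file format requires a method's code length to be below 65536 bytes.
    static final int MAX_METHOD_BYTES = 65535;

    // Upper bound on the number of entries, given the bytecode emitted per entry.
    static int maxEntries(int bytecodePerEntry) {
        return MAX_METHOD_BYTES / bytecodePerEntry;
    }

    public static void main(String[] args) {
        // Prints: byte[]: 9362
        System.out.println("byte[]: " + maxEntries(7));
        // Prints: short[], int[], float[]: 8191
        System.out.println("short[], int[], float[]: " + maxEntries(8));
    }
}
```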

{ 4 } Comments

  1. ldso | 2009-03-25 at 13:35 | Permalink

    Hahaha yikes. That is surprising to me. Well, not really. I’ve been working on a Java ME project for about a year, and most of the work has been writing test code to see which cases the api actually works for, because most of them don’t. But I have all of the functionality I was trying to get so far. Good work figuring out why your “code too large” errors were happening. Arbitrary limits like that are not surprising in desktop software, but that is wrong. Thanks for making me laugh, Stupidity Boost made my day :)

  2. Neil Harding | 2010-02-01 at 17:07 | Permalink

    I wrote a bytecode optimizer for J2ME when I worked at I-Play (Digital Bridges at that time). Because of the cost of adding a new class, it makes sense to store objects inside int arrays.

    public final static int X = 0;
    public final static int Y = 1;
    public final static int Z = 2;
    public final static int IMAGE_INDEX = 3;
    public final static int SPRITE_SIZE = 4;

    So that you can do draw(sprites[index + X],sprites[index + Y],sprites[index + IMAGE_INDEX])

    However, javac is a very bad compiler and actually generates code for index + 0 (rather than recognizing that + 0 does nothing and removing it).

    So I removed that code, among other optimizations. Another one I did was to replace code like
    static final byte B[] = {10, 20, 30, 40, 50, 60, 70, 80, …};

    with static final byte B[] = Bytecode::ArrayFromString(“encodedString”,offset). The encoded string was generated by the optimizer, occurred once in the class’s constant pool, and had simple run-length encoding as well. (The other option would have been to load the data from a resource, but the string encoding was the simplest, since I didn’t need to worry about the resource not being available.)

  3. brezel | 2010-02-02 at 19:23 | Permalink

    this article is nonsense.

    public class Main {
        static int entries = 100000000;
        static byte[] barr = new byte[entries];

        static {
            System.err.println(Runtime.getRuntime().totalMemory());
            for (int i = 0; i < entries; i++) {
                barr[i] = new Byte("1");
            }
            System.err.println(barr.length);
            System.err.println(Runtime.getRuntime().totalMemory());
        }

        public static void main(String[] args) {

        }
    }

  4. wow | 2011-04-15 at 08:15 | Permalink

    brezel,
    the article is very far from nonsense. it is very accurate. your comment is nonsense and shows that you don’t understand the difference between run-time and compile-time. FAIL
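
The string-encoding workaround described in comment 2 can be sketched roughly as follows. This is not the optimizer’s actual code: the class name, the arrayFromString helper and the one-char-per-byte encoding are assumptions made for illustration, and the run-length encoding and offset parameter mentioned in the comment are left out. The point is that the data sits in the constant pool as a single string, so the initializer needs only a constant amount of bytecode regardless of the array length.

```java
import java.util.Arrays;

public class ArrayFromStringSketch {
    // Hypothetical decoder: each char of the string carries one byte value
    // in its low 8 bits (no run-length encoding in this sketch).
    static byte[] arrayFromString(String encoded) {
        byte[] out = new byte[encoded.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) encoded.charAt(i);
        }
        return out;
    }

    // Encodes the values 10, 20, 30, 40, 50, 60, 70, 80 as the chars
    // '\n', U+0014, U+001E, '(', '2', '<', 'F', 'P'.
    static final byte[] B = arrayFromString("\n\u0014\u001e(2<FP");

    public static void main(String[] args) {
        // Prints: [10, 20, 30, 40, 50, 60, 70, 80]
        System.out.println(Arrays.toString(B));
    }
}
```

Note that a string constant has its own size limit in the class file, so very large arrays would still need to be split across several strings or loaded from a resource.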

{ 1 } Trackback

  1. [...] Berlin, Heidelberg, 2005. Preda, Mihai: Synergetic Stupidity in Java. Mihai Mobile Blog. 2009. http://blog.javia.org/synergetic-stupidity-in-java/ [...]
