Author: Mikhail Vorontsov
Why reduce memory usage
This article will give you general advice on optimizing Java memory consumption.
Memory usage optimization is important in Java. System performance is mostly limited by memory access speed rather than by CPU frequency; otherwise, why would CPU manufacturers implement all these L1, L2, and L3 caches? This means that by reducing your application's memory footprint, you will most likely increase your program's data processing speed, because the CPU has to wait for a smaller amount of data. In short: saving memory improves performance!
Java memory layout
Let's start by reviewing the memory layout of Java objects: any Java Object occupies at least 16 bytes, of which 12 bytes are occupied by the Java object header. In addition, all Java objects are aligned on 8-byte boundaries. This means that, for example, an object with 2 fields, an int and a byte, will take not 17 bytes (12 + 4 + 1) but 24 bytes (17 aligned by 8 bytes).
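The alignment rule can be checked with a few lines of arithmetic. The sketch below is only an estimate built from the figures quoted in this article (12-byte header, 8-byte alignment); the class and method names are hypothetical, and it does not measure real JVM objects.

```java
class ObjectSizeEstimate {
    static final int HEADER_BYTES = 12; // object header size quoted in the article
    static final int ALIGNMENT = 8;     // objects are aligned on 8-byte boundaries

    // Rounds a raw size up to the next multiple of ALIGNMENT.
    static long align(long rawBytes) {
        return (rawBytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Estimated size of an object whose fields add up to fieldBytes.
    static long objectSize(long fieldBytes) {
        return align(HEADER_BYTES + fieldBytes);
    }

    public static void main(String[] args) {
        System.out.println(objectSize(4 + 1)); // int + byte: prints 24
        System.out.println(objectSize(0));     // no fields:  prints 16
    }
}
```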
If the Java heap is below 32G and the option -XX:+UseCompressedOops is enabled (it is enabled by default since JDK 6u23), each Object reference occupies 4 bytes. Otherwise, an Object reference occupies 8 bytes.
All primitive data types occupy their exact size in bytes:
Type | Size |
---|---|
byte, boolean | 1 byte |
short, char | 2 bytes |
int, float | 4 bytes |
long, double | 8 bytes |
Essentially, this information is sufficient for Java memory optimization. But it is more convenient if you also know the memory consumption of arrays, String, and the numeric wrappers.
Memory consumption of the most common Java types
Arrays consume 12 bytes plus their length times their element size (plus, of course, the padding required by 8-byte alignment).
As of Java 7 build 06, String contains 3 fields: a char[] with the string data plus 2 int fields with 2 hash codes computed by different algorithms. This means that String itself takes 12 (header) + 4 (char[] reference) + 4 * 2 (int) = 24 bytes (as you can see, it fits perfectly into 8-byte alignment). Besides that, the char[] with the String data takes 12 + length * 2 bytes (plus alignment). This means that a String takes 36 + length * 2 bytes, aligned by 8 bytes (by the way, this is 8 bytes less than the String memory consumption before Java 7 build 06).
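The String formula above can be written down as a small estimator. This is a sketch that simply encodes the article's figures (12-byte header, 4-byte references, 2 bytes per char, 8-byte alignment); `StringSizeEstimate` and its methods are hypothetical names, and the result is an estimate rather than a measurement.

```java
class StringSizeEstimate {
    // Rounds a raw size up to the next multiple of 8.
    static long align(long rawBytes) {
        return (rawBytes + 7) / 8 * 8;
    }

    // Array estimate from the article: 12 bytes + length * element size, aligned.
    static long arraySize(int length, int elementBytes) {
        return align(12 + (long) length * elementBytes);
    }

    // String object (header + char[] reference + two int fields) plus its char[].
    static long stringSize(int length) {
        long stringObject = align(12 + 4 + 2 * 4); // 24 bytes
        return stringObject + arraySize(length, 2);
    }

    public static void main(String[] args) {
        System.out.println(stringSize(20)); // 36 + 20*2 = 76, aligned: prints 80
        System.out.println(stringSize(0));  // empty string: prints 40
    }
}
```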
Numeric wrappers take 12 bytes plus the size of the underlying type, aligned by 8 bytes. Byte/Short/Character/Integer/Long values in the range -128 to 127 are cached by the JDK, so for such values the actual memory consumption may be smaller. Regardless, these types can be a source of severe memory overhead in collection-based applications:
Type | Size |
---|---|
Byte, Boolean | 16 bytes |
Short, Character | 16 bytes |
Integer, Float | 16 bytes |
Long, Double | 24 bytes |
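The table follows directly from "12 bytes plus the size of the underlying type, aligned by 8". The sketch below reproduces it and also shows the small-value cache in action; `WrapperSizeEstimate` and its methods are hypothetical names, and the sizes are estimates based on the article's figures.

```java
class WrapperSizeEstimate {
    // 12-byte header + primitive payload, rounded up to a multiple of 8.
    static long wrapperSize(int primitiveBytes) {
        return (12 + primitiveBytes + 7) / 8 * 8;
    }

    // True when two boxings of the same value yield the very same object.
    // Integer.valueOf is documented to cache at least the range -128..127.
    static boolean sharesCachedInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(wrapperSize(1)); // Byte, Boolean:    prints 16
        System.out.println(wrapperSize(2)); // Short, Character: prints 16
        System.out.println(wrapperSize(4)); // Integer, Float:   prints 16
        System.out.println(wrapperSize(8)); // Long, Double:     prints 24
        System.out.println(sharesCachedInstance(127)); // prints true
    }
}
```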
General Java Memory Optimization Tips
Armed with all this knowledge, it's not hard to give general Java memory optimization tips:
- Prefer primitive types over their Object wrappers. The main reason to use wrapper types is JDK collections, so consider using one of the primitive-type collection frameworks like Trove.
- Minimize the number of Objects you have. For example, prefer array-based structures like ArrayList/ArrayDeque over pointer-based structures like LinkedList.
Java memory optimization example
Here is an example. Suppose you have to create a map from an int to a String 20 characters long. The size of this map is one million entries, and all mappings are static and predefined (e.g. saved in some dictionary).
The first approach is to use a Map<Integer, String> from the standard JDK. Let's roughly estimate the memory consumption of this structure. Each Integer occupies 16 bytes plus 4 bytes for the reference to the Integer from the map. Every 20-character String occupies 36 + 20 * 2 = 76 bytes (see the String description above), aligned to 80 bytes, plus 4 bytes for the reference. The total memory consumption is approximately (16 + 4 + 80 + 4) * 1M = 104M.
A better approach is to replace String with a byte[] in UTF-8 encoding (see Part 1, the "Converting Characters to Bytes" article). Our Map will be a Map<Integer, byte[]>. Assume that all string characters belong to the ASCII set (0-127), which is the case in most English-speaking countries. A byte[20] occupies 12 (header) + 20 * 1 = 32 bytes, which conveniently fits into 8-byte alignment. The entire Map will now occupy (16 + 4 + 32 + 4) * 1M = 56M, almost two times less than the previous example.
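Converting between String and byte[] needs only the standard library. This is a minimal sketch assuming ASCII-only values, as the article does; `AsciiPacking`, `pack`, and `unpack` are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

class AsciiPacking {
    // Encodes a String as UTF-8 bytes; ASCII characters take one byte each.
    static byte[] pack(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes the bytes back into a String when the value is actually needed.
    static String unpack(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String value = "abcdefghijklmnopqrst"; // 20 ASCII characters
        byte[] packed = pack(value);
        System.out.println(packed.length);                // prints 20
        System.out.println(unpack(packed).equals(value)); // prints true
    }
}
```

The trade-off is that every read pays for a decode, so this suits values that are stored far more often than they are read as Strings.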
Now let's use Trove's TIntObjectMap<byte[]>. It stores keys in a normal int[] rather than as wrapper types, as JDK collections do. Each key will now occupy 4 bytes. The total memory consumption will drop to (4 + 32 + 4) * 1M = 40M.
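The three estimates so far can be recomputed side by side. The sketch below uses only the per-entry costs counted in this article (keys, values, and references); like the article, it ignores any internal bookkeeping of the map implementations. All names are hypothetical.

```java
class MapFootprintEstimate {
    static final long ENTRIES = 1_000_000L;
    static final int REF = 4;        // compressed object reference
    static final int INTEGER = 16;   // boxed Integer key
    static final int INT = 4;        // primitive int key
    static final int STRING_20 = 80; // 20-character String, aligned
    static final int BYTES_20 = 32;  // byte[20], aligned

    // Map<Integer, String>
    static long jdkMapOfStrings() {
        return (INTEGER + REF + STRING_20 + REF) * ENTRIES;
    }

    // Map<Integer, byte[]>
    static long jdkMapOfByteArrays() {
        return (INTEGER + REF + BYTES_20 + REF) * ENTRIES;
    }

    // TIntObjectMap<byte[]>
    static long primitiveKeyMapOfByteArrays() {
        return (INT + BYTES_20 + REF) * ENTRIES;
    }

    public static void main(String[] args) {
        System.out.println(jdkMapOfStrings());             // prints 104000000
        System.out.println(jdkMapOfByteArrays());          // prints 56000000
        System.out.println(primitiveKeyMapOfByteArrays()); // prints 40000000
    }
}
```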
The final structure will be more complex. All String values will be stored in a single byte[] one after another (we are still assuming ASCII text), with a 0 byte as a separator between them. The overall byte[] will occupy (20 + 1) * 1M = 21M. Our Map will store the offset of each string in this byte[] instead of the string itself. We will use Trove's TIntIntMap for this purpose. It will consume (4 + 4) * 1M = 8M. The total memory consumption in this example would be 8M + 21M = 29M. By the way, this is the first example that relies on the immutability of this dataset.
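The packed layout can be sketched as follows. To keep the example runnable without third-party libraries, a plain `HashMap<Integer, Integer>` stands in for Trove's TIntIntMap, so this version does not achieve the memory savings described above; it only illustrates the layout. `PackedDictionary` is a hypothetical name, and the code assumes ASCII values that contain no 0 byte and equally long `keys` and `values` arrays.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class PackedDictionary {
    private final byte[] data; // all values back to back, each followed by a 0 byte
    // Stand-in for a primitive int-to-int map such as Trove's TIntIntMap.
    private final Map<Integer, Integer> offsets = new HashMap<>();

    PackedDictionary(int[] keys, String[] values) {
        byte[][] encoded = new byte[values.length][];
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.US_ASCII);
            total += encoded[i].length + 1; // +1 for the separator
        }
        data = new byte[total];
        int pos = 0;
        for (int i = 0; i < keys.length; i++) {
            offsets.put(keys[i], pos); // remember where this value starts
            System.arraycopy(encoded[i], 0, data, pos, encoded[i].length);
            pos += encoded[i].length;
            data[pos++] = 0; // separator
        }
    }

    // Returns the value for key, or null if the key is absent.
    String get(int key) {
        Integer start = offsets.get(key);
        if (start == null) {
            return null;
        }
        int end = start;
        while (data[end] != 0) { // scan to the separator
            end++;
        }
        return new String(data, start, end - start, StandardCharsets.US_ASCII);
    }
}
```

For example, `new PackedDictionary(new int[] {7, 42}, new String[] {"alpha", "beta"}).get(42)` returns `"beta"`.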
Can we achieve better results? Yes, we can, but at the cost of CPU consumption. The obvious "optimization" is to sort the keys before storing the values into the large byte[]. Now we can store the keys in an int[] and use binary search to find a key. If a key is found, its index multiplied by 21 (remember, all strings have the same length) gives us the offset of its value in the byte[]. This structure takes "only" 21M + 4M (for the int[]) = 25M, at the cost of the lookup complexity going from O(1) in the hash map case to O(log N).
Is this the best we can do? No! We forgot that all values are 20 characters long, so we don't really need the separators in the byte[]. This means that if we agree to do lookups in O(log N), we can store our "Map" in 24M of memory (20M for the values plus 4M for the keys). Compared to the theoretical data size, there is absolutely no overhead, and this is more than 4 times less than what the original solution (Map<Integer, String>) required! Who told you that Java programs are memory-hungry?
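The separator-free layout with binary search can be sketched like this. `SortedFixedWidthMap` is a hypothetical name; the code assumes the keys are already sorted in ascending order, that `vals[i]` belongs to `sortedKeys[i]`, and that every value is ASCII and exactly `width` bytes long.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SortedFixedWidthMap {
    private final int[] keys;    // sorted ascending
    private final byte[] values; // value i occupies [i * width, (i + 1) * width)
    private final int width;     // fixed length of every value in bytes

    SortedFixedWidthMap(int[] sortedKeys, String[] vals, int width) {
        this.keys = sortedKeys.clone();
        this.width = width;
        this.values = new byte[sortedKeys.length * width];
        for (int i = 0; i < vals.length; i++) {
            byte[] encoded = vals[i].getBytes(StandardCharsets.US_ASCII);
            if (encoded.length != width) {
                throw new IllegalArgumentException(
                        "value " + i + " is not " + width + " bytes long");
            }
            System.arraycopy(encoded, 0, values, i * width, width);
        }
    }

    // O(log N) lookup: find the key's index, then read its fixed-width slot.
    String get(int key) {
        int index = Arrays.binarySearch(keys, key);
        if (index < 0) {
            return null; // key not present
        }
        return new String(values, index * width, width, StandardCharsets.US_ASCII);
    }
}
```

With one million keys and `width` = 20, the two arrays hold 4M + 20M bytes of payload, which matches the 24M figure above (array headers aside).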
Summary
- Prefer primitive types over their Object wrappers. The main reason to use wrapper types is JDK collections, so consider using one of the primitive-type collection frameworks like Trove.
- Minimize the number of Objects you have. For example, prefer array-based structures like ArrayList/ArrayDeque over pointer-based structures like LinkedList.
Recommended reading
If you want to learn more about clever data compression algorithms, it's worth reading "Programming Pearls" (Second Edition) by Jon Bentley. This is a wonderful collection of very unexpected algorithms. For example, in Section 13.8, the author describes how Doug McIlroy managed to fit a 75,000-word spell checker into 64 KB of RAM. That spell checker kept all the needed information in such a small amount of memory and did not use the disk! It might also be worth noting that Programming Pearls is one of the recommended preparation books for Google SRE interviews.