源码注释解读—HashMap

纸上得来终觉浅,绝知此事要躬行。

为了更贴近作者的实现意图,以及JDK中每个类的功能特点,决定从源码的注释中和实现来窥探其真谛。字斟句酌、查缺补漏;顺带提高英文阅读能力;首先从HashMap入手:

Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls. This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

Hash table based implementation of the Map interface.
哈希表基于Map接口实现。

This implementation provides all of the optional map operations, and permits null values and the null key.
这个实现提供了所有可选择的map操作,允许null的键和值。

The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.
HashMap类大致上和Hashtable等价,除了它是非线程安全的和允许null键值。

This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
这个类不保证map的顺序;特别是它也不能保证顺序会随时间保持不变。

This implementation provides constant-time performance for the basic operations (get,put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

This implementation provides constant-time performance for the basic operations (get,put), assuming the hash function disperses the elements properly among the buckets.
假定哈希函数将元素适当的分散在各个桶中,这个实现为基本的操作(读、写)提供了恒定时间的性能。

Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings).
集合视图的迭代需要时间与HashMap实例的容量(桶的数量)加上每个桶的大小(键值对的数量)成比例。

Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
因此,如果看中迭代性能的话,不要设置初始容量太大或者负载因子太小,这点很重要的。

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor.
HashMap实例有两个参数会影响其性能:初始化容量和负载因子。

The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created.
容量即hash表中桶的数量,那初始化容量就是hash表被创建时的容量。

The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased.
负载因子是在哈希表容量自动增长之前,哈希表所允许达到的最大容量的度量。

When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
当哈希表中实体的数量超过负载因子和当前容量的乘积时,哈希表会被rehash(即内部数据结构重建)这样哈希表的桶数量大约是原来的两倍。

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs.
一般来说,默认的负载因子0.75在时间和空间成本上提供了很好的权衡。

Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put).
更大的值减少了空间开销但是增加了查找成本(会反应在HashMap类的大多数操作中,包括get和put)。

The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations.
在设置它的初始化容量时,应该考虑到预期的map中的条目数量和它的负载因子,来最小化rehash操作的数量。

If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
如果初始化容量大于条目最大数量除以负载因子,就不会发生rehash操作。
entry count < capacity * loadfactor 不会发生rehash (即capacity>count/loadfactor 不会发生rehash)

If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table. Note that using many keys with the same hashCode is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are Comparable, this class may use comparison order among keys to help break ties.

If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table.
如果有很多的键值对要存到hashmap的实例中,创建一个足够大的容量的hashmap将会使得键值对的存储比让它根据需要自动做rehash以增长表更加有效。

Note that using many keys with the same hashCode is a sure way to slow down performance of any hash table.
注意,使用具有相同hashCode值的多个键确实会拖慢任何哈希表的性能。

To ameliorate impact, when keys are Comparable, this class may use comparison order among keys to help break ties.
为了减轻碰撞,当键有可比性时,这个类可以通过键之间的比较顺序来断绝关系。

If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method. This is best done at creation time, to prevent accidental unsynchronized access to the map.

If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method.
如果没有这样的对象存在,map应该通过Collections.synchronizedMap方法来封装。(所有方法都加上synchronized)

This is best done at creation time, to prevent accidental unsynchronized access to the map.
最好是在创建的时候完成,来防止意外的非线程安全的访问map。

The iterators returned by all of this class's "collection view methods" are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

The iterators returned by all of this class's "collection view methods" are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator will throw a ConcurrentModificationException.
这个类的所有集合视图方法的迭代器的返回都遵循fail-fast策略: 如果map在创建完迭代器之后的任何时机结构发生改变,除了通过迭代器自己的remove方法外,迭代器将会抛出ConcurrentModificationException。

Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
因此,当并发修改的情况下,迭代器会快速干净的失败而不是在将来某个不确定的时间冒着任意的、不确定行为的风险。

Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.

Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.
注意,迭代器本身的fail-fast行为不能被保证,通常来说,在非线程安全的并发修改存在的情况下,不可能做任何硬性的保证。

Fail-fast iterators throw ConcurrentModificationException on a best-effort basis.
迭代器的fail-fast机制抛出ConcurrentModificationException是最佳的处理方式。

Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
因此,编写依赖于这个异常的程序来保证其正确性是错误的做法:迭代器的fail-fast行为仅仅应该用来探测bugs。

核心词汇:
provide:提供
permit:允许
roughly:大体上、大致上
equivalent to:等于
except除了
guarantee:保证、担保
constant: 常数,常量、不变的事物
assume:假定、认为、假设
disperses:分散
proportional:成比例的
capacity:容量
measure:度量、测量
exceed:超过
internal:内部的
structure:结构
approximately:大约
tradeoff:折衷
overhead:开销、费用
expected:期望的、预期的
taken into account:考虑到
divide:分、划分、除以
sufficiently:足够地、充分地
ameliorate:改善、改良、减轻
impact:碰撞、影响
comparison:比较
ties:结、关系
multiple:并发的、多重的
structurally:在结构上的
externally:在..外部、在..外面
typically:通常、典型地
accomplish:完成
naturally:自然地、合理地、当然、不用说
encapsulate:封装、概述
prevent:预防、阻止
accidental:意外的、偶然的
arbitrary:随意的、任性的、随心所欲的
non-deterministic:非确定的
undetermined:未确定的
generally speaking:通常来说
presence:出席
best-effort basis:尽力而为、尽最大努力
correctness:正确性
detect:查明、发现、洞察

阅读 629

推荐阅读