Java Collection Common Knowledge Points & Interview Questions Summary (Part 1), the latest version in 2022!

Hello, I'm Guide. Autumn recruitment is approaching (early-batch hiring has already started), so I have refactored and improved the content of JavaGuide. The official account will publish the latest updates in sync, and I hope they help you.

You can also read it online at javaguide.cn, where the reading experience is better!

The first 3 articles:

Java Basics Common Knowledge Points & Interview Questions Summary (Part 1), the latest version in 2022!
Java Basics Common Knowledge Points & Interview Questions Summary (Part 2), the latest version in 2022!
Java Basics Common Knowledge Points & Interview Questions Summary (Part 3), the latest version in 2022!

Collection overview

Overview of Java Collections

Java collections, also known as containers, are mainly derived from two interfaces: one is the Collection interface, which is mainly used to store single elements; the other is the Map interface, which is mainly used for storing key-value pairs. The Collection interface has three main sub-interfaces: List, Set and Queue.

The Java Collections Framework is shown in the following figure:

Note: The figure only lists the main inheritance and derivation relationships, not all of them. For example, abstract classes such as AbstractList and auxiliary interfaces such as NavigableSet are omitted. If you want to know more, you can read the source code yourself.

What are the differences between List, Set, Queue and Map?

List (a good helper for dealing with order): The stored elements are ordered and may contain duplicates.
Set (focused on uniqueness): The stored elements are unordered and cannot contain duplicates.
Queue (a ticket machine with a queuing function): The order is determined by the specific queuing rules, and the stored elements are ordered and may contain duplicates.
Map (the expert at searching by key): Stores key-value pairs, similar to the mathematical function y=f(x), with "x" as the key and "y" as the value. Keys are unordered and cannot repeat, values are unordered and may repeat, and each key maps to at most one value.
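
As a rough illustration of the four descriptions above, the sketch below uses one common implementation of each interface. The class name CollectionKindsDemo and its helper methods are made up for this example, and ArrayList, HashSet, ArrayDeque and HashMap are only convenient representatives, not the only possible choices.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class CollectionKindsDemo {

    // List: keeps insertion order and allows duplicates.
    static List<String> toList(String... items) {
        return new ArrayList<>(Arrays.asList(items));
    }

    // Set: duplicate elements are dropped.
    static Set<String> toSet(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    // Queue: elements leave in the order defined by the queue (FIFO here).
    static String firstOut(String... items) {
        Queue<String> queue = new ArrayDeque<>(Arrays.asList(items));
        return queue.poll();
    }

    // Map: each key maps to at most one value, so a second put() replaces the first.
    static Map<String, Integer> lastValueWins() {
        Map<String, Integer> map = new HashMap<>();
        map.put("x", 1);
        map.put("x", 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(toList("a", "b", "a"));        // [a, b, a]
        System.out.println(toSet("a", "b", "a").size());  // 2
        System.out.println(firstOut("first", "second"));  // first
        System.out.println(lastValueWins());              // {x=2}
    }
}
```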

Summary of the underlying data structure of the collection framework

Let's first take a look at the collection under the Collection interface.

List

ArrayList : Object[] array
Vector : Object[] array
LinkedList : Doubly linked list (a circular linked list before JDK1.6; JDK1.7 removed the cycle)

Set

HashSet (unordered, unique): based on HashMap; the bottom layer uses a HashMap to store the elements
LinkedHashSet : LinkedHashSet is a subclass of HashSet and is implemented internally with LinkedHashMap. This is similar to LinkedHashMap itself, which is implemented on top of HashMap, although there are still small differences
TreeSet (ordered, unique): red-black tree (self-balancing sorted binary tree)

Queue

PriorityQueue : binary heap implemented on an Object[] array
ArrayDeque : Object[] array + two pointers

Let's take a look at the collection under the Map interface.

Map

HashMap : Before JDK1.8, HashMap is composed of array + linked list. The array is the main body of HashMap, and the linked list exists mainly to resolve hash conflicts (the "zipper method", i.e. separate chaining). Since JDK1.8 there has been a major change in how hash conflicts are resolved: when the length of a linked list is greater than the threshold (8 by default), the linked list is converted into a red-black tree to reduce search time. A check is made before the conversion: if the length of the current array is less than 64, the array is expanded first instead of converting the list into a red-black tree
LinkedHashMap : LinkedHashMap inherits from HashMap, so its bottom layer is still a zipper-style hash structure consisting of an array plus linked lists or red-black trees. On top of that structure, LinkedHashMap adds a doubly linked list so that the insertion order of key-value pairs can be maintained. By performing the corresponding operations on this linked list, it also implements the access-order logic. For details, see: "LinkedHashMap Source Code Detailed Analysis (JDK1.8)"
Hashtable : composed of array + linked list; the array is the main body of Hashtable, and the linked list exists mainly to resolve hash conflicts
TreeMap : red-black tree (self-balancing sorted binary tree)

How to choose a collection?

The choice mainly depends on the characteristics of each collection. For example, when we need to obtain an element value by its key, we choose a collection under the Map interface: TreeMap when sorting is required, HashMap when it is not, and ConcurrentHashMap when thread safety must be guaranteed.

When we only need to store element values, we choose a collection that implements the Collection interface. When the elements must be unique, choose a collection that implements the Set interface, such as TreeSet or HashSet; when uniqueness is not needed, choose one that implements the List interface, such as ArrayList or LinkedList. Then pick among those according to the characteristics of each implementation.
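
The rule of thumb for Map can be sketched as a small helper. This is only an illustration: MapChoiceDemo and chooseMap are hypothetical names, and the helper simplifies by letting thread safety take precedence over sorting when both are requested.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper that mirrors the rule of thumb above; not a JDK API.
class MapChoiceDemo {

    static <K, V> Map<K, V> chooseMap(boolean needSorting, boolean needThreadSafety) {
        if (needThreadSafety) {
            return new ConcurrentHashMap<>();
        }
        if (needSorting) {
            // Keys must be Comparable, since no Comparator is supplied here.
            return new TreeMap<>();
        }
        return new HashMap<>();
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = chooseMap(true, false);
        scores.put("carol", 3);
        scores.put("alice", 1);
        scores.put("bob", 2);
        // TreeMap iterates over its keys in sorted order.
        System.out.println(scores.keySet()); // [alice, bob, carol]
    }
}
```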

Why use collections?

When we need to save a group of data of the same type, we should use a container to hold it, and the simplest such container is an array. However, using an array to store objects has certain disadvantages.
In actual development the kinds of data we store vary widely, which is why "collections" exist; they are also used to store multiple pieces of data.

The disadvantages of an array are that its length cannot change once it is declared, that the declared element type fixes the type of data it can store, and that its data is always ordered and repeatable, so its characteristics are limited.
Collections, by contrast, make data storage more flexible. Java collections can store objects of different types and in varying numbers, and they can also store data with mapping relationships.

Collection sub-interface: List

Difference between ArrayList and Vector?

ArrayList is the main implementation class of List. Its bottom layer uses Object[] storage, it is suitable for frequent lookups, and it is not thread-safe;
Vector is an old, legacy implementation class of List. Its bottom layer uses Object[] storage, and it is thread-safe.

Difference between ArrayList and LinkedList?

Whether thread safety is guaranteed: ArrayList and LinkedList are both unsynchronized, that is, neither guarantees thread safety;
The underlying data structure: ArrayList uses an Object array at the bottom layer; LinkedList uses a doubly linked list (a circular linked list before JDK1.6; JDK1.7 removed the cycle. Note the difference between a doubly linked list and a doubly circular linked list, which is introduced below!)
Are insertions and deletions affected by element position:
- ArrayList uses array storage, so the time complexity of inserting and deleting elements depends on the element position. For example, add(E e) appends the specified element to the end of the list by default, which is O(1). Inserting or deleting at a specified position i ( add(int index, E element) ) is O(n-i), because the (n-i) elements from position i onward must each be shifted one position backward (for an insertion) or forward (for a deletion).
- LinkedList uses linked-list storage, so inserting or deleting at the head or tail is not affected by element position ( add(E e), addFirst(E e), addLast(E e), removeFirst(), removeLast() ) and is O(1). Inserting or deleting at a specified position i ( add(int index, E element), remove(Object o) ) is O(n), because the list must first be traversed to that position.
Whether to support fast random access: LinkedList does not support efficient random element access, while ArrayList does. Fast random access means quickly obtaining an element by its index (corresponding to the get(int index) method).
Memory space occupation: ArrayList wastes space mainly in that a certain amount of capacity is reserved at the end of the list, while the space cost of LinkedList is that each element consumes more space than in ArrayList (because each node stores its direct successor and direct predecessor in addition to the data).

We generally do not use LinkedList in projects. In almost every scenario where LinkedList might be used, ArrayList can be used instead, and the performance will usually be better! Even Josh Bloch, the author of LinkedList, says he never uses LinkedList.

In addition, don't subconsciously assume that LinkedList, being a linked list, is the best fit for scenarios that add and delete elements. As mentioned above, LinkedList is approximately O(1) only when inserting or deleting at the head or tail; adding and deleting elements in other positions is O(n).
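
To make the O(n-i) cost of a positional insert into an array concrete, here is a minimal sketch that performs the shift by hand and counts the moved elements. It is not the ArrayList source code; ArrayInsertCostDemo and insertAndCountShifts are hypothetical names, and the sketch assumes the array already has a spare slot so that no expansion is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; a hand-written stand-in for what an array-backed list must do.
class ArrayInsertCostDemo {

    // Inserts value at index into an array whose first `size` slots are in use.
    // Returns how many elements had to move one slot to the right.
    // Assumes data.length > size and 0 <= index <= size.
    static int insertAndCountShifts(int[] data, int size, int index, int value) {
        int shifted = size - index;
        System.arraycopy(data, index, data, index + 1, shifted);
        data[index] = value;
        return shifted;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 0}; // 4 elements in use, 1 spare slot
        System.out.println(insertAndCountShifts(data, 4, 1, 9)); // 3
        System.out.println(Arrays.toString(data));               // [1, 9, 2, 3, 4]

        // The real ArrayList produces the same visible result.
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        list.add(1, 9);
        System.out.println(list);                                 // [1, 9, 2, 3, 4]
    }
}
```

Appending at the end (index equal to size) shifts nothing, which matches the O(1) case described above.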

Supplementary Content: Doubly Linked List and Doubly Circular Linked List

Doubly linked list: contains two pointers, a prev points to the previous node, and a next points to the next node.

In addition, I recommend an article that makes the doubly linked list clear: https://juejin.cn/post/6844903648154271757

Figure: doubly linked list

Doubly circular linked list: the next of the last node points to the head, and the prev of the head points to the last node, forming a ring.

Figure: doubly circular linked list
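
The two structures differ only in how the ends are wired, which a small sketch can show. The Node class and build method below are written for this illustration and are not the JDK's internal LinkedList code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of both list shapes; not the java.util.LinkedList implementation.
class DoublyLinkedSketch {

    static final class Node<E> {
        final E item;
        Node<E> prev; // previous node
        Node<E> next; // next node

        Node(E item) {
            this.item = item;
        }
    }

    // Links the items in order and returns the head node (null for an empty list).
    // When circular is true, the tail's next points to the head and the head's prev to the tail.
    static <E> Node<E> build(List<E> items, boolean circular) {
        Node<E> head = null;
        Node<E> tail = null;
        for (E item : items) {
            Node<E> node = new Node<>(item);
            if (head == null) {
                head = node;
            } else {
                tail.next = node;
                node.prev = tail;
            }
            tail = node;
        }
        if (circular && head != null) {
            tail.next = head;
            head.prev = tail;
        }
        return head;
    }

    public static void main(String[] args) {
        Node<String> plain = build(Arrays.asList("a", "b", "c"), false);
        System.out.println(plain.prev == null);           // true: the head has no predecessor

        Node<String> ring = build(Arrays.asList("a", "b", "c"), true);
        System.out.println(ring.prev.item);               // c: the head's prev is the tail
        System.out.println(ring.next.next.next == ring);  // true: the tail's next is the head
    }
}
```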

Supplementary Content: RandomAccess interface

 public interface RandomAccess {
}

Looking at the source code, we find that nothing is actually defined in the RandomAccess interface. So, in my opinion, the RandomAccess interface is just a marker. What does it mark? It marks that classes implementing this interface have random access capability.

In the binarySearch() method, the code checks whether the incoming list is an instance of RandomAccess. If so, it calls the indexedBinarySearch() method; if not, it calls the iteratorBinarySearch() method

 public static <T>
    int binarySearch(List<? extends Comparable<? super T>> list, T key) {
        if (list instanceof RandomAccess || list.size()<BINARYSEARCH_THRESHOLD)
            return Collections.indexedBinarySearch(list, key);
        else
            return Collections.iteratorBinarySearch(list, key);
    }

ArrayList implements the RandomAccess interface, while LinkedList does not. Why? I think it again comes down to the underlying data structure! The bottom layer of ArrayList is an array, and the bottom layer of LinkedList is a linked list. Arrays naturally support random access with O(1) time complexity, which is why it is called fast random access. A linked list must be traversed to reach the element at a specific position, which takes O(n) time, so it does not support fast random access. ArrayList implements the RandomAccess interface to indicate that it has fast random access capability. The RandomAccess interface is only a marker: ArrayList does not gain fast random access by implementing RandomAccess; it implements the interface because it already has that capability.
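
The same instanceof check can be used in our own code to pick a traversal strategy. The sketch below is an illustration in the spirit of binarySearch(), with RandomAccessDemo and its methods being hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical demo class showing how the marker interface can be checked.
class RandomAccessDemo {

    static boolean supportsFastRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Sums the list, using get(i) only when indexed access is cheap.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);   // O(1) per access on an array-backed list
            }
        } else {
            for (int value : list) {    // the iterator avoids an O(n) get(i) on each step
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(supportsFastRandomAccess(arrayList));  // true
        System.out.println(supportsFastRandomAccess(linkedList)); // false
        System.out.println(sum(arrayList) + " " + sum(linkedList)); // 6 6
    }
}
```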

Let's talk about the expansion mechanism of ArrayList

For details, please refer to this article by the author: Analysis of ArrayList Expansion Mechanism

Collection sub-interface: Set

Difference between Comparable and Comparator

The Comparable interface comes from the java.lang package and has a compareTo(Object obj) method for sorting
The Comparator interface comes from the java.util package and has a compare(Object obj1, Object obj2) method for sorting

Generally, when we need custom sorting for a collection, we override the compareTo() method or the compare() method. When we need two different sort orders for one collection, for example sorting song objects once by song title and once by artist name, we can either override compareTo() for one order and write a separate Comparator for the other, or write two Comparators. In the second case we can only use the two-parameter version of Collections.sort().
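
The song scenario might look like the following sketch, which takes the first approach: compareTo() gives the order by title and a separate Comparator gives the order by artist. The Song class, the BY_ARTIST constant and the sample titles and artist names are all invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical example classes; the data is made up.
class SongSortDemo {

    static final class Song implements Comparable<Song> {
        final String title;
        final String artist;

        Song(String title, String artist) {
            this.title = title;
            this.artist = artist;
        }

        // Natural order: by song title.
        @Override
        public int compareTo(Song other) {
            return title.compareTo(other.title);
        }

        @Override
        public String toString() {
            return title + " by " + artist;
        }
    }

    // Second order: by artist name, supplied from outside the class.
    static final Comparator<Song> BY_ARTIST = Comparator.comparing(song -> song.artist);

    static List<Song> sortedByTitle(List<Song> songs) {
        List<Song> copy = new ArrayList<>(songs);
        Collections.sort(copy);            // one-parameter version uses compareTo()
        return copy;
    }

    static List<Song> sortedByArtist(List<Song> songs) {
        List<Song> copy = new ArrayList<>(songs);
        Collections.sort(copy, BY_ARTIST); // two-parameter version uses the Comparator
        return copy;
    }

    public static void main(String[] args) {
        List<Song> songs = Arrays.asList(
                new Song("Banana", "Xia"),
                new Song("Cherry", "Yan"),
                new Song("Apple", "Zed"));
        System.out.println(sortedByTitle(songs));  // [Apple by Zed, Banana by Xia, Cherry by Yan]
        System.out.println(sortedByArtist(songs)); // [Banana by Xia, Cherry by Yan, Apple by Zed]
    }
}
```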

Comparator custom sorting

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;

// Hypothetical wrapper class so that the snippet compiles on its own
public class ComparatorSortDemo {
    public static void main(String[] args) {
        ArrayList<Integer> arrayList = new ArrayList<Integer>();
        arrayList.add(-1);
        arrayList.add(3);
        arrayList.add(3);
        arrayList.add(-5);
        arrayList.add(7);
        arrayList.add(4);
        arrayList.add(-9);
        arrayList.add(-7);
        System.out.println("Original list:");
        System.out.println(arrayList);
        // void reverse(List list): reverse the order of the elements
        Collections.reverse(arrayList);
        System.out.println("Collections.reverse(arrayList):");
        System.out.println(arrayList);

        // void sort(List list): sort in ascending natural order
        Collections.sort(arrayList);
        System.out.println("Collections.sort(arrayList):");
        System.out.println(arrayList);
        // Custom sorting: this Comparator reverses the natural order
        Collections.sort(arrayList, new Comparator<Integer>() {

            @Override
            public int compare(Integer o1, Integer o2) {
                return o2.compareTo(o1);
            }
        });
        System.out.println("After custom sorting:");
        System.out.println(arrayList);
    }
}

Output:

Original list:
[-1, 3, 3, -5, 7, 4, -9, -7]
Collections.reverse(arrayList):
[-7, -9, 4, 7, -5, 3, 3, -1]
Collections.sort(arrayList):
[-9, -7, -5, -1, 3, 3, 4, 7]
After custom sorting:
[7, 4, 3, 3, -1, -5, -7, -9]

Override the compareTo method to sort by age

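// Person does not implement Comparable on its own, so it must do so here; otherwise the TreeMap
// below throws an error and cannot keep its keys in sorted order.
// The String class in the earlier example already implements Comparable (see the String API
// documentation), as do classes such as Integer, so they need no extra work.
public class Person implements Comparable<Person> {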
    private String name;
    private int age;

    public Person(String name, int age) {
        super();
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    /**
     * Override the compareTo method to sort by age
     */
    @Override
    public int compareTo(Person o) {
        if (this.age > o.getAge()) {
            return 1;
        }
        if (this.age < o.getAge()) {
            return -1;
        }
        return 0;
    }
}

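import java.util.Set;
import java.util.TreeMap;

// Hypothetical wrapper class so that the snippet compiles on its own
public class PersonSortDemo {
    public static void main(String[] args) {
        TreeMap<Person, String> pdata = new TreeMap<Person, String>();
        pdata.put(new Person("张三", 30), "zhangsan");
        pdata.put(new Person("李四", 20), "lisi");
        pdata.put(new Person("王五", 10), "wangwu");
        pdata.put(new Person("小红", 5), "xiaohong");
        // keySet() of a TreeMap returns the keys in sorted order, here ascending age via compareTo
        Set<Person> keys = pdata.keySet();
        for (Person key : keys) {
            System.out.println(key.getAge() + "-" + key.getName());
        }
    }
}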

Output:

 5-小红
10-王五
20-李四
30-张三

What do "unordered" and "non-repeatable" mean?

1. What is being unordered? Unordered does not mean random. It means that the data is not stored in the underlying array in the order of the array indexes as elements are added; instead, the position is determined by the hash value of the data.

2. What is non-repeatability? Non-repeatability means that an added element must compare as not equal to every existing element, that is, equals() returns false. For this to work, the equals() method and the hashCode() method must be overridden together.
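
A small sketch shows why both methods matter for a HashSet. Point overrides equals() and hashCode(), while RawPoint overrides neither; both classes are invented for this example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical classes used only to contrast the two behaviours.
class UniquenessDemo {

    // Overrides equals() and hashCode(), so equal coordinates count as the same element.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // Overrides neither method, so every instance is a distinct element.
    static final class RawPoint {
        final int x;
        final int y;

        RawPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static int distinctPoints() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2)); // rejected: equals() returns true for the existing element
        return set.size();
    }

    static int distinctRawPoints() {
        Set<RawPoint> set = new HashSet<>();
        set.add(new RawPoint(1, 2));
        set.add(new RawPoint(1, 2)); // accepted: the default equals() compares identity
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPoints());    // 1
        System.out.println(distinctRawPoints()); // 2
    }
}
```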

Compare the similarities and differences of HashSet, LinkedHashSet and TreeSet

HashSet, LinkedHashSet and TreeSet are all implementation classes of the Set interface, all guarantee that elements are unique, and none of them is thread-safe.
The main difference between HashSet, LinkedHashSet and TreeSet is the underlying data structure. The underlying data structure of HashSet is a hash table (implemented on top of HashMap). The underlying data structure of LinkedHashSet is a linked list plus a hash table, and the insertion and removal order of its elements follows FIFO. The underlying data structure of TreeSet is a red-black tree; its elements are ordered, and the sorting methods are natural sorting and custom sorting.
The different underlying data structures lead to different application scenarios. HashSet is used where the insertion and removal order of elements does not need to be guaranteed, LinkedHashSet is used where the insertion and removal order must follow FIFO, and TreeSet is used where custom sorting rules for the elements are needed.
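
The difference in iteration order can be seen by adding the same elements to each kind of set. SetOrderDemo and iterationOrder are hypothetical names; the HashSet order is deliberately not printed because it is unspecified.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class comparing the three Set implementations.
class SetOrderDemo {

    // Adds the items to the given set and returns them in the set's iteration order.
    static List<String> iterationOrder(Set<String> set, String... items) {
        Collections.addAll(set, items);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        String[] items = {"pear", "apple", "fig", "apple"};
        // LinkedHashSet keeps insertion order; the duplicate "apple" is ignored.
        System.out.println(iterationOrder(new LinkedHashSet<>(), items)); // [pear, apple, fig]
        // TreeSet keeps its elements sorted (natural String order here).
        System.out.println(iterationOrder(new TreeSet<>(), items));       // [apple, fig, pear]
        // HashSet only guarantees uniqueness; its order is unspecified.
        System.out.println(iterationOrder(new HashSet<>(), items).size()); // 3
    }
}
```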

Collection sub-interface: Queue

The difference between Queue and Deque

Queue is a single-ended queue: elements can only be inserted at one end and removed from the other. Implementations generally follow the first-in-first-out (FIFO) rule.

Queue extends the Collection interface. Its methods fall into two groups according to how they behave when an operation fails because of a capacity problem: one group throws an exception after the failure, and the other returns a special value.

`Queue` interface	Throws an exception	Returns a special value
Insert at the tail of the queue	add(E e)	offer(E e)
Remove the head of the queue	remove()	poll()
Query the head of the queue	element()	peek()
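
The two columns of the table can be compared with a capacity-bounded queue. ArrayBlockingQueue is used here only because it has a fixed capacity, which is a choice made for this sketch; QueueFailureModesDemo and its methods are hypothetical names.

```java
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical demo class contrasting the exception-throwing and special-value methods.
class QueueFailureModesDemo {

    // A queue with capacity 1 that is already full.
    static Queue<String> fullQueue() {
        Queue<String> queue = new ArrayBlockingQueue<>(1);
        queue.offer("only");
        return queue;
    }

    // offer() reports failure with a return value.
    static boolean offerOnFull() {
        return fullQueue().offer("extra");
    }

    // add() reports the same failure with an IllegalStateException.
    static boolean addOnFullThrows() {
        try {
            fullQueue().add("extra");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // poll() and peek() return null on an empty queue.
    static boolean pollAndPeekReturnNullOnEmpty() {
        Queue<String> empty = new ArrayBlockingQueue<>(1);
        return empty.poll() == null && empty.peek() == null;
    }

    // remove() and element() throw NoSuchElementException on an empty queue.
    static boolean removeAndElementThrowOnEmpty() {
        Queue<String> empty = new ArrayBlockingQueue<>(1);
        boolean removeThrew = false;
        boolean elementThrew = false;
        try {
            empty.remove();
        } catch (NoSuchElementException e) {
            removeThrew = true;
        }
        try {
            empty.element();
        } catch (NoSuchElementException e) {
            elementThrew = true;
        }
        return removeThrew && elementThrew;
    }

    public static void main(String[] args) {
        System.out.println(offerOnFull());                  // false
        System.out.println(addOnFullThrows());              // true
        System.out.println(pollAndPeekReturnNullOnEmpty()); // true
        System.out.println(removeAndElementThrowOnEmpty()); // true
    }
}
```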

Deque is a double-ended queue, and elements can be inserted or deleted at both ends of the queue.

Deque extends the Queue interface and adds methods for inserting and removing at both the head and the tail of the queue. These are likewise divided into two groups according to how they behave after a failure:

`Deque` interface	Throws an exception	Returns a special value
Insert at the head of the queue	addFirst(E e)	offerFirst(E e)
Insert at the tail of the queue	addLast(E e)	offerLast(E e)
Remove the head of the queue	removeFirst()	pollFirst()
Remove the tail of the queue	removeLast()	pollLast()
Query the head of the queue	getFirst()	peekFirst()
Query the tail of the queue	getLast()	peekLast()

In fact, Deque also provides other methods such as push() and pop() which can be used to simulate stacks.
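
As a minimal sketch of the stack usage, push() and pop() both work at the head of the Deque, so elements come back in last-in-first-out order. DequeStackDemo and pushThenPopAll are hypothetical names, and ArrayDeque is simply one Deque implementation chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical demo class using a Deque as a stack.
class DequeStackDemo {

    // Pushes all values, then pops until the stack is empty and returns the pop order.
    static List<Integer> pushThenPopAll(int... values) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int value : values) {
            stack.push(value);       // same effect as addFirst(value)
        }
        List<Integer> popped = new ArrayList<>();
        while (!stack.isEmpty()) {
            popped.add(stack.pop()); // same effect as removeFirst()
        }
        return popped;
    }

    public static void main(String[] args) {
        System.out.println(pushThenPopAll(1, 2, 3)); // [3, 2, 1]
    }
}
```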

Difference between ArrayDeque and LinkedList

ArrayDeque and LinkedList both implement the Deque interface and can both be used as queues, but what is the difference between the two?

ArrayDeque is implemented with a variable-length array and two pointers, while LinkedList is implemented with a linked list.
ArrayDeque does not support storing null, but LinkedList does.
ArrayDeque was introduced in JDK1.6, while LinkedList has existed since JDK1.2.
ArrayDeque may need to expand during insertion, but the amortized cost of an insertion is still O(1). LinkedList never needs to expand, but it must allocate new heap space every time data is inserted, so its average performance is slower.

From a performance point of view, ArrayDeque is a better choice than LinkedList for implementing a queue. Additionally, ArrayDeque can also be used to implement a stack.
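
The null-handling difference is easy to check. The sketch below uses hypothetical names (NullElementDemo and its methods) and only demonstrates the behaviour listed above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedList;

// Hypothetical demo class for the null-handling difference.
class NullElementDemo {

    // ArrayDeque rejects null elements with a NullPointerException.
    static boolean arrayDequeRejectsNull() {
        Deque<String> deque = new ArrayDeque<>();
        try {
            deque.offer(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // LinkedList stores null like any other element.
    static boolean linkedListAcceptsNull() {
        Deque<String> deque = new LinkedList<>();
        return deque.offer(null) && deque.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(arrayDequeRejectsNull()); // true
        System.out.println(linkedListAcceptsNull()); // true
    }
}
```

One practical consequence: with a LinkedList that contains null, a null returned by poll() no longer tells us whether the queue was empty.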

Talk about PriorityQueue

PriorityQueue was introduced in JDK1.5. It differs from Queue in that the dequeue order depends on priority: the element with the highest priority always leaves the queue first.

Here are some key points about it:

PriorityQueue is implemented with a binary heap, and the bottom layer uses a variable-length array to store the data
PriorityQueue sifts heap elements up and down to insert an element or remove the top of the heap in O(logn) time.
PriorityQueue is not thread-safe and does not support storing null or non-comparable objects.
PriorityQueue is a min-heap (small top heap) by default, but it can take a Comparator as a constructor parameter to customize the priority of elements.

In interviews, PriorityQueue tends to show up in hand-written algorithm questions. Typical examples include heap sort, finding the K-th largest number, and traversing a weighted graph, so you need to be proficient in using it.
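
As one example of such a question, the K-th largest number can be found by keeping a min-heap of at most k elements. This is a common approach rather than the only one, and PriorityQueueDemo, kthLargest and drainDescending are hypothetical names. The second method shows the Comparator constructor turning the default min-heap into a max-heap.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical demo class with two typical PriorityQueue usages.
class PriorityQueueDemo {

    // Returns the k-th largest value. Assumes 1 <= k <= nums.length.
    static int kthLargest(int[] nums, int k) {
        PriorityQueue<Integer> minHeap = new PriorityQueue<>();
        for (int num : nums) {
            minHeap.offer(num);
            if (minHeap.size() > k) {
                minHeap.poll(); // drop the smallest, keeping the k largest seen so far
            }
        }
        return minHeap.peek();  // the smallest of the k largest is the k-th largest
    }

    // Drains a max-heap, so the values come out in descending order.
    static List<Integer> drainDescending(int... values) {
        PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Comparator.reverseOrder());
        for (int value : values) {
            maxHeap.offer(value);
        }
        List<Integer> result = new ArrayList<>();
        while (!maxHeap.isEmpty()) {
            result.add(maxHeap.poll());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(kthLargest(new int[]{3, 2, 1, 5, 6, 4}, 2)); // 5
        System.out.println(drainDescending(3, 1, 2));                   // [3, 2, 1]
    }
}
```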

Postscript

I focus on sharing original, practical Java content. I open-sourced JavaGuide in my junior year (the "Java Learning + Interview Guide" covers the core knowledge that most Java programmers need to master. When preparing for Java interviews, JavaGuide is the first choice!), and it currently has 120k+ Stars.

If this article is helpful to you, please like and share, it is very important for me to continue to share & create high-quality articles. Thanks 🙏🏻