Java - Online interviewer | ByteDance first-round interview - Personal article

Friends, happy new year! Today I will share my interview experience with ByteDance's Douyin e-commerce team. I hope it is helpful to you~

Interviewer : Hello, I'm xxx, an interviewer at ByteDance. Is this Dabin?

: Hello, interviewer, I am Dabin

Interviewer : Is now a convenient time for the interview?

: Hmm, yes

Interviewer : Let's start the interview now

Interviewer : I read on your resume that you are familiar with collections. Do you know HashMap? Can you walk through its put method?

Monologue: Sure enough, HashMap came up...

: HashMap implements the Map interface and stores key-value mappings. Under the hood it uses an array + linked lists + red-black trees (the red-black tree part was added in JDK 1.8).

: The put method works as follows:

If the table has not been initialized, initialize it first
Compute the bucket index from the key's hash
If the bucket at that index is empty, insert the new node directly
If the bucket is not empty, traverse it. If an equal key is found, its value is overwritten. Otherwise there are two cases: for a linked list, the new node is appended at the tail; for a red-black tree, it is inserted according to the red-black tree rules
If the length of a linked list exceeds the threshold of 8, the list is converted into a red-black tree (when the table capacity is still below 64, the table is resized instead)
After a successful insert, check whether the table needs to be expanded
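The index step above can be sketched in a few lines. This is a minimal illustration, not the JDK source: the class name PutFlowDemo and the helpers spread and indexFor are names made up for this example, although the bit-spreading expression matches what JDK 1.8 HashMap does before masking with the table length.

```java
import java.util.HashMap;
import java.util.Map;

class PutFlowDemo {
    // Same bit-spreading step as JDK 1.8 HashMap.hash():
    // XOR the high 16 bits of hashCode() into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        Integer previous = map.put("a", 1); // empty bucket: the node is inserted directly
        Integer replaced = map.put("a", 2); // equal key found: the value is overwritten
        System.out.println(previous);                       // null
        System.out.println(replaced);                       // 1
        System.out.println(indexFor(spread("a"), 16));      // 1
    }
}
```

Because the table length is always a power of two, the mask (length - 1) keeps only the low bits of the hash, which is cheaper than a modulo operation.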

Interviewer : Good. You just mentioned HashMap expansion. Can you elaborate?

Monologue: Hmm, I dug a hole for myself...

: Take JDK 1.8 as an example. When an element is added and the number of elements exceeds the threshold, the HashMap is expanded: a new array with twice the capacity replaces the original array.

: Since the array capacity always grows by a power of 2, after a resize an entry ends up either at its original index or at the original index + the original capacity.

: The reason is that the array length has doubled, which in binary means one extra high-order bit of the hash now takes part in computing the array index.

: In other words, when elements are moved there is no need to recompute their position from scratch. It is enough to check whether that newly included bit of the original hash is 0 or 1. If it is 0, the index does not change; if it is 1, the index becomes "original index + oldCap" (the JDK checks this with (e.hash & oldCap) == 0).

: This saves the time of recomputing the index, and since the new bit can be considered randomly 0 or 1, the resize spreads previously colliding nodes evenly across the old bucket and the new bucket.
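The split rule can be checked with a small sketch. The class and method names here are made up for illustration; the only part taken from the JDK 1.8 resize logic is the test of the single bit hash & oldCap.

```java
class ResizeSplitDemo {
    // Index after the table doubles, derived without a full recomputation:
    // only the one bit newly exposed by the larger mask is inspected.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53}; // all four collide in bucket 5 when capacity is 16
        for (int h : hashes) {
            int recomputed = h & (2 * oldCap - 1); // full recomputation against the new mask
            System.out.println(h + " -> " + newIndex(h, oldCap) + " (recomputed: " + recomputed + ")");
        }
        // Hashes 5 and 37 stay in bucket 5; hashes 21 and 53 move to bucket 21 (5 + 16).
    }
}
```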

Interviewer : Good, your basics are solid. I see on your resume that you are proficient in MySQL. Can you talk about MySQL's index structure?

Monologue: Damn, I will never dare to write "proficient" again... Luckily I went through a big-tech interview handbook yesterday, so I am not panicking at all.

: The most commonly used index type in MySQL is the BTREE index, which is implemented with a B+ tree under the hood.

: A B+ tree is a B-tree variant whose leaf nodes are linked by sequential access pointers. It keeps the balance of a B-tree, and the sequential pointers improve the performance of range queries.

: To search for a key, first do a binary search within the root node to find the pointer that covers the key, then repeat the search on the node that pointer refers to. Once a leaf node is reached, do a binary search within the leaf to find the data item for the key.
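The search procedure described above might look like the following sketch. It is a toy in-memory structure for illustration only, not how MySQL lays out pages: the Node class, the sample tree, and the convention that child i + 1 holds keys greater than or equal to separator key i are all assumptions of this example.

```java
import java.util.Arrays;

class BPlusSearchDemo {
    static class Node {
        final int[] keys;
        final Node[] children; // non-null only for internal nodes
        final String[] values; // non-null only for leaf nodes

        Node(int[] keys, Node[] children, String[] values) {
            this.keys = keys;
            this.children = children;
            this.values = values;
        }

        boolean isLeaf() {
            return children == null;
        }
    }

    // Binary search inside each node on the way down, then inside the leaf.
    static String search(Node node, int key) {
        while (!node.isLeaf()) {
            int pos = Arrays.binarySearch(node.keys, key);
            // Found separator at pos: go right of it. Not found: go to the insertion point.
            int child = pos >= 0 ? pos + 1 : -(pos + 1);
            node = node.children[child];
        }
        int pos = Arrays.binarySearch(node.keys, key);
        return pos >= 0 ? node.values[pos] : null;
    }

    static Node sampleTree() {
        Node left = new Node(new int[]{1, 3}, null, new String[]{"row1", "row3"});
        Node mid = new Node(new int[]{5, 7}, null, new String[]{"row5", "row7"});
        Node right = new Node(new int[]{9, 11}, null, new String[]{"row9", "row11"});
        return new Node(new int[]{5, 9}, new Node[]{left, mid, right}, null);
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(search(root, 7)); // row7
        System.out.println(search(root, 4)); // null
    }
}
```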

Interviewer : Why does the index use a B+ tree instead of a binary tree?

: A B+ tree is short and wide: each node holds many keys, so the tree has few levels. This reduces the number of nodes visited per lookup and improves performance.

: A balanced binary tree also has good search performance, O(log2 N), but when N is large the tree becomes deep. Query time depends mainly on the number of disk I/Os, and the deeper the tree, the more I/Os a lookup needs and the worse the performance. An unbalanced binary search tree can even degenerate into a linked list in the worst case. Therefore, a B+ tree is more suitable as the MySQL index structure.
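A quick calculation shows the difference in depth. The fan-out of 1000 keys per node and the 10 million rows are illustrative assumptions, not MySQL figures, and the class name is made up.

```java
class TreeHeightDemo {
    // Smallest number of levels h such that fanout^h >= n,
    // i.e. how deep a full tree with the given fan-out must be to reach n keys.
    static int levels(long n, long fanout) {
        int h = 0;
        long capacity = 1;
        while (capacity < n) {
            capacity *= fanout;
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        long rows = 10_000_000L;
        System.out.println(levels(rows, 2));    // 24 levels for a balanced binary tree
        System.out.println(levels(rows, 1000)); // 3 levels with a fan-out of 1000
    }
}
```

If each level costs roughly one disk read, that is about 24 reads versus 3 for the same lookup.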

Interviewer : Then why not use B-trees?

Monologue: Interviews are so competitive these days, this is like being asked to build a rocket...

: Because every node of a B-tree, internal nodes included, stores data, scanning the data in key order requires an in-order traversal of the tree. In a B+ tree all data is stored in the leaf nodes, the internal nodes hold only index keys, and the leaves are linked in order, so scanning the table only requires one pass over the leaf nodes. B+ trees are therefore better suited to range queries, and since range queries are very frequent in databases, B+ trees are a better fit for database indexes.
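The leaf-level scan can be sketched as a walk along linked leaves. The Leaf class and the sample data are assumptions for this example, and the scan is assumed to start from the leaf that a normal tree search for the lower bound would reach.

```java
import java.util.ArrayList;
import java.util.List;

class LeafScanDemo {
    static class Leaf {
        final int[] keys; // sorted keys stored in this leaf
        Leaf next;        // sequential pointer to the next leaf

        Leaf(int... keys) {
            this.keys = keys;
        }
    }

    // Collect all keys in [low, high] by following the next pointers,
    // without going back up through the internal nodes.
    static List<Integer> rangeScan(Leaf start, int low, int high) {
        List<Integer> result = new ArrayList<>();
        for (Leaf leaf = start; leaf != null; leaf = leaf.next) {
            for (int key : leaf.keys) {
                if (key > high) {
                    return result; // keys are sorted, so nothing further can match
                }
                if (key >= low) {
                    result.add(key);
                }
            }
        }
        return result;
    }

    static Leaf sampleLeaves() {
        Leaf a = new Leaf(1, 3);
        Leaf b = new Leaf(5, 7);
        Leaf c = new Leaf(9, 11);
        a.next = b;
        b.next = c;
        return a;
    }

    public static void main(String[] args) {
        System.out.println(rangeScan(sampleLeaves(), 3, 9)); // [3, 5, 7, 9]
    }
}
```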

Interviewer : Do you know clustered indexes?

: Strictly speaking, a clustered index is not an index type but a way of storing data, and the details depend on the implementation. For example, the leaf nodes of the InnoDB clustered index store the complete row records of the table.

: A clustered index is like the pinyin index of a dictionary. The data in the table is stored according to the order of the clustered index, just as the Xinhua Dictionary arranges the whole dictionary from A to Z. This is why a table can have only one clustered index.

Interviewer : What advantages does a clustered index have over a non-clustered index?

: 1. Data access is faster. The clustered index keeps the index and the row data in the same B+ tree, so reading a row through it takes one tree lookup, whereas a non-clustered index in InnoDB stores only the primary key and needs a second lookup in the clustered index to fetch the row.
: 2. The leaf nodes of the clustered index are stored in logically continuous order, so sorting and range searches on the primary key are faster.
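The extra lookup can be modeled with two sorted maps. This is only an analogy: java.util.TreeMap is a red-black tree rather than a B+ tree, and the table contents and names below are invented for the example.

```java
import java.util.TreeMap;

class ClusteredLookupDemo {
    // Stand-in for the clustered index: primary key -> full row.
    static final TreeMap<Integer, String> CLUSTERED = new TreeMap<>();
    // Stand-in for a secondary (non-clustered) index: indexed column -> primary key.
    static final TreeMap<String, Integer> SECONDARY = new TreeMap<>();

    static {
        CLUSTERED.put(1, "1,alice,30");
        CLUSTERED.put(2, "2,bob,25");
        CLUSTERED.put(3, "3,carol,41");
        SECONDARY.put("alice", 1);
        SECONDARY.put("bob", 2);
        SECONDARY.put("carol", 3);
    }

    // One lookup: the row sits in the clustered index itself.
    static String byPrimaryKey(int id) {
        return CLUSTERED.get(id);
    }

    // Two lookups: find the primary key first, then go back to the clustered index.
    static String byName(String name) {
        Integer id = SECONDARY.get(name);
        return id == null ? null : CLUSTERED.get(id);
    }

    public static void main(String[] args) {
        System.out.println(byPrimaryKey(2));                       // 2,bob,25
        System.out.println(byName("carol"));                       // 3,carol,41
        System.out.println(CLUSTERED.subMap(1, true, 2, true));    // rows in primary key order
    }
}
```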

Interviewer : Good. Let's move on to something else. Do you know about thread pools?

: Thread pool, as the name suggests, is a pool for managing threads.

Interviewer : Then why use a thread pool?

: There are three main reasons for using the thread pool:

Reduce resource consumption. Reusing threads that have already been created lowers the cost of thread creation and destruction.
Improve responsiveness. When a task arrives, it can run immediately without waiting for a thread to be created.
Improve thread manageability. Threads are managed in one place, which prevents the system from creating a large number of similar threads and running out of memory.
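Thread reuse is easy to observe. The sketch below, with made-up class and method names, submits many tasks to a small fixed pool and counts how many distinct threads actually ran them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ThreadReuseDemo {
    // Runs taskCount tasks on a pool of poolSize threads and
    // returns the number of distinct worker threads that executed them.
    static int distinctThreads(int poolSize, int taskCount) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return names.size();
    }

    public static void main(String[] args) {
        // 100 tasks are served by at most 2 threads instead of 100 new ones.
        System.out.println(distinctThreads(2, 100));
    }
}
```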

Interviewer : Good. Can you go through the parameters of the thread pool?

Monologue: The same old "eight-legged essay" (a rote-memorized interview question), hehe~

: Let's take a look at the general constructor of ThreadPoolExecutor:

public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory, RejectedExecutionHandler handler);

: There are 7 parameters: corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, threadFactory, and handler.

: corePoolSize. When a new task arrives and the number of threads in the pool has not reached the core size, a new thread is created to execute the task; otherwise the task is put into the blocking queue. If the number of live threads in the pool is consistently greater than corePoolSize, consider increasing corePoolSize.

: maximumPoolSize. When the blocking queue is full and the number of threads in the pool is below the maximum, a new thread is created to run the task. Otherwise the new task is handled according to the rejection policy. Non-core threads are like temporarily borrowed resources: they should exit once they have been idle longer than keepAliveTime, to avoid wasting resources.

: BlockingQueue (the workQueue parameter). Stores tasks waiting to run.

: keepAliveTime. How long an idle non-core thread is kept alive. By default this parameter applies only to non-core threads (it also applies to core threads if allowCoreThreadTimeOut(true) is set). A value of 0 means surplus idle threads are terminated immediately.

: TimeUnit. Time unit, as follows:

TimeUnit.DAYS
TimeUnit.HOURS
TimeUnit.MINUTES
TimeUnit.SECONDS
TimeUnit.MILLISECONDS
TimeUnit.MICROSECONDS
TimeUnit.NANOSECONDS

: ThreadFactory. The thread pool creates every new thread through the thread factory. ThreadFactory defines a single method, newThread, which is called whenever the pool needs a new thread.

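import java.util.concurrent.ThreadFactory;

public class MyThreadFactory implements ThreadFactory {
    private final String poolName;

    public MyThreadFactory(String poolName) {
        this.poolName = poolName;
    }

    @Override
    public Thread newThread(Runnable runnable) {
        // MyAppThread is a custom Thread subclass that is not shown here.
        // The pool name is passed to its constructor so that threads from
        // different thread pools can be told apart.
        return new MyAppThread(runnable, poolName);
    }
}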
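For a version that runs without a custom thread class, a factory can simply name plain Thread objects. This is a sketch under that assumption: NamedThreadFactory and its "-thread-N" naming scheme are invented for the example.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class NamedThreadFactory implements ThreadFactory {
    private final String poolName;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String poolName) {
        this.poolName = poolName;
    }

    @Override
    public Thread newThread(Runnable runnable) {
        // Each thread carries the pool name plus a running number,
        // which makes thread dumps and logs easier to read.
        return new Thread(runnable, poolName + "-thread-" + counter.getAndIncrement());
    }
}
```

It can be passed as the threadFactory argument of ThreadPoolExecutor, or to Executors.newFixedThreadPool(int, ThreadFactory).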

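: RejectedExecutionHandler. When the queue is full and the pool has reached its maximum size, new tasks are handled according to the rejection policy.

AbortPolicy: the default policy, throws a RejectedExecutionException
DiscardPolicy: does nothing and silently discards the task
DiscardOldestPolicy: discards the task at the head of the waiting queue, then tries to execute the current task
CallerRunsPolicy: the calling thread runs the task itself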
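The order in which these parameters take effect (core threads, then the queue, then non-core threads, then rejection) can be seen with a deliberately tiny pool. The sizes below (1 core thread, 2 threads maximum, queue capacity 1) and the class name are arbitrary choices for this sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolGrowthDemo {
    // Returns {pool size after 3 tasks, queued tasks after 3 tasks, 1 if the 4th task was rejected}.
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocked = () -> {
            try {
                release.await(); // keep the worker busy until the demo is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int rejected = 0;
        try {
            pool.execute(blocked); // 1st task: starts the core thread
            pool.execute(blocked); // 2nd task: core thread is busy, task waits in the queue
            pool.execute(blocked); // 3rd task: queue is full, a non-core thread is started
            int poolSize = pool.getPoolSize();
            int queued = pool.getQueue().size();
            try {
                pool.execute(blocked); // 4th task: queue full and pool at maximum
            } catch (RejectedExecutionException e) {
                rejected = 1; // AbortPolicy throws
            }
            return new int[]{poolSize, queued, rejected};
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("pool size=" + r[0] + ", queued=" + r[1] + ", rejected=" + r[2]);
        // pool size=2, queued=1, rejected=1
    }
}
```

Swapping AbortPolicy for CallerRunsPolicy would make the 4th task run on the submitting thread instead of throwing.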

Interviewer : Okay. Do you know Spring AOP?

: AOP stands for aspect-oriented programming. It encapsulates common logic (transaction management, logging, caching, etc.) into aspects and separates it from business code, which reduces duplicated code in the system and lowers the coupling between modules. Aspects are the pieces of common logic that are unrelated to the business itself but are invoked by all business modules.

: Spring AOP is implemented with dynamic proxy technology.

Interviewer : Oh, what are the implementation methods of dynamic proxy?

: There are two ways to implement dynamic proxy technology:

Interface-based JDK dynamic proxy.
Inheritance-based CGLIB dynamic proxy. In Spring, if the target class does not implement an interface, Spring AOP uses CGLIB to proxy the target class dynamically.
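The interface-based variant can be tried with the JDK alone. In this sketch OrderService, OrderServiceImpl, and the LOG list are hypothetical names; only java.lang.reflect.Proxy and InvocationHandler are real JDK APIs. It shows the idea behind a logging aspect, not Spring's actual implementation.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class JdkProxyDemo {
    interface OrderService {
        String create(String item);
    }

    static class OrderServiceImpl implements OrderService {
        @Override
        public String create(String item) {
            return "order:" + item;
        }
    }

    // Records what the "aspect" did, so the effect is easy to inspect.
    static final List<String> LOG = new ArrayList<>();

    // Wraps the target in a proxy that logs before and after every interface method call.
    static OrderService withLogging(OrderService target) {
        InvocationHandler handler = (proxy, method, args) -> {
            LOG.add("before " + method.getName());
            Object result = method.invoke(target, args); // delegate to the real object
            LOG.add("after " + method.getName());
            return result;
        };
        return (OrderService) Proxy.newProxyInstance(
                OrderService.class.getClassLoader(),
                new Class<?>[]{OrderService.class},
                handler);
    }

    public static void main(String[] args) {
        OrderService service = withLogging(new OrderServiceImpl());
        System.out.println(service.create("book")); // order:book
        System.out.println(LOG);                    // [before create, after create]
    }
}
```

The business class contains no logging code; the cross-cutting logic lives entirely in the handler.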

Interviewer : You just mentioned the CGLIB dynamic proxy. Can you describe it in detail?

: CGLIB (Code Generation Library) is a powerful, high-performance code generation library that is widely used in AOP frameworks to provide method interception.

: A CGLIB proxy works by manipulating bytecode to introduce a level of indirection for an object, which lets it control access to that object.

: CGLIB dynamic proxy is much less restrictive than JDK dynamic proxy: the target object does not need to implement an interface.
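The inheritance-based idea can be illustrated without the library. The sketch below does not use the CGLIB API at all: PaymentServiceProxy is a hand-written stand-in for the kind of subclass that CGLIB would generate as bytecode at runtime, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SubclassProxyDemo {
    // A plain class with no interface, so an interface-based JDK proxy cannot target it.
    static class PaymentService {
        String pay(int amount) {
            return "paid:" + amount;
        }
    }

    static final List<String> LOG = new ArrayList<>();

    // Hand-written equivalent of a generated proxy subclass:
    // it overrides the method, adds the interception logic, and calls super.
    static class PaymentServiceProxy extends PaymentService {
        @Override
        String pay(int amount) {
            LOG.add("before pay");
            String result = super.pay(amount);
            LOG.add("after pay");
            return result;
        }
    }

    public static void main(String[] args) {
        PaymentService service = new PaymentServiceProxy();
        System.out.println(service.pay(10)); // paid:10
        System.out.println(LOG);             // [before pay, after pay]
    }
}
```

Because the proxy is a subclass, this approach cannot intercept final classes or final methods.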

Interviewer : Good. Get ready for the second round~

Writing takes effort. If you found this helpful, please show your support!

程序员大彬
