The large-scale websites and architectures we see today all evolved step by step from small websites and simple architectures. Of course, some are built on existing distributed architectures from the start; it depends on how the business develops. In the process of iterative architectural evolution, many problems will be encountered, much like leveling up against monsters in a game: the higher the level, the stronger the monsters you face.

A student once asked me what architecture is. I answered with an analogy: if we want to build a house, we need an architectural drawing before construction begins. This drawing describes the building's shape, internal structure, materials, equipment, and other information, and the construction work is carried out based on it. The same is true for software. A software architecture is the design drawing of a software system: it describes in detail the components of the system, the connections between them, and the communication mechanisms between them. In the implementation phase, programmers refine these abstract drawings into actual artifacts, such as concrete interface definitions and class definitions.

Next we will walk through a simple case from a purely technical point of view, to see the problems each stage of architecture brings and how to solve them. Through such an iteration, everyone can understand architecture more clearly. Throughout the process, the focus is on how growth in data volume and traffic drives changes in the architecture, not on specific business functions.

Start with an e-commerce website

To make things concrete, we will use an e-commerce website as our example. As a transaction-oriented site, it needs at least these functions: user (user registration, user management), product (product display, product management), and transaction (ordering, payment).

If we only need to support these basic functions, our initial architecture will look like this:

*(diagram)*

Note that at this stage, the functional modules interact with each other through method calls inside the JVM, while the application accesses the database through JDBC.

Single-machine load alarms: separating the database from the application

As the website goes live, traffic keeps increasing, so the load on the server is bound to keep rising, and we need measures to deal with it. Setting aside machine upgrades and software-level optimizations, we can adjust the structure of the architecture itself: split the database and the application from one machine onto two.

*(diagram)*

What changed:

The website went from one machine to two. This change has very little impact on us. On a single machine, our application connected to the database via JDBC; now that the database has been separated from the application, we only need to change the database address in the configuration file from localhost to the IP address of the database server. There is no impact on development, testing, or deployment.
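As a hypothetical illustration (the property names, database name, and IP are made up, not from the original), the whole change can be as small as one line in a typical JDBC configuration:

```properties
# Before: application and database share one machine
jdbc.url=jdbc:mysql://localhost:3306/shop

# After: database moved to its own server (IP is illustrative)
jdbc.url=jdbc:mysql://192.168.1.20:3306/shop
jdbc.username=app
jdbc.password=****
```

The application code itself does not change at all, which is why this is usually the first split a growing site makes.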

After the adjustment, we can alleviate the current system pressure, but as time goes by and the number of visits continues to increase, our system still needs to be modified

Why split this way? From the perspective of the machine itself, performance bottlenecks come down to CPU, file I/O, network I/O, and memory. Any of these dimensions can become a bottleneck, and when one resource is consumed too heavily, the system usually responds slowly. Adding a machine lets the database monopolize its own CPU and I/O resources, which improves performance.

As a brief digression, let's look at what consumes each of these resources.

CPU / I/O / memory:

  1. CPU: mainly context switching. Each CPU core can execute only one thread at a time, and there are several CPU scheduling strategies, such as preemptive and round-robin scheduling. Take preemptive scheduling as an example: each thread is allocated a time slice; when the slice is used up, when the thread blocks on I/O, or when a higher-priority thread needs to run, the CPU switches to another thread. During the switch, the execution state of the current thread must be saved and the state of the next thread restored; this process is a context switch. Scenarios such as I/O and lock waits also trigger context switches. When context switches become too frequent, the kernel itself consumes a significant share of the CPU.
  2. File I/O: for example, frequent log writes, or a disk that is slow by itself, will cause I/O performance problems.
  3. Network I/O: insufficient bandwidth.
  4. Memory: including memory overflow, memory leaks, and simply not having enough memory.

In fact, whether we tune at the application layer or upgrade the hardware, we are ultimately adjusting these same factors.

Application server load alarms: turning the application server into a cluster

If the pressure on the application server becomes heavy, we can use the application's profiling results to optimize the hotspots. Here we consider optimizing through horizontal scaling: turning the single machine into a cluster.

*(diagram)*

The application server has gone from one to two. There is no direct interaction between the two application servers; each relies on the database to provide services externally. This raises two problems:

  1. How does an end user choose which of the two application servers to access?

This can be solved with DNS, or with load balancing equipment.

  2. What happens to the session?

Horizontal and vertical expansion

For large-scale distributed architectures, we have always pursued a simple and elegant way to cope with growth in traffic and data volume. This way usually means solving the problem without changing the software at all, simply by upgrading hardware or adding machines. That is the scaling design of distributed architecture.

There are two types of scaling: vertical scaling and horizontal scaling.

Vertical scaling: supporting growth in traffic and data volume by upgrading or adding hardware to a single machine. The advantage of vertical scaling is that the technical difficulty is relatively low, and the operational and modification costs are relatively low. The disadvantage is that single-machine performance is a bottleneck, and upgrading to a high-performance minicomputer or mainframe is very expensive. This is also one of the reasons for Alibaba's "de-IOE" initiative (moving away from IBM minicomputers, Oracle databases, and EMC storage).

Increasing the number of CPU cores: with more cores, the service capacity of the system can be greatly increased, such as response speed and the number of threads that can be processed concurrently. But adding CPUs also brings some notable problems:

  • 1. Lock contention intensifies: when multiple threads access shared data concurrently, lock contention is involved. When contention is fierce, many threads are waiting on locks, so even with more CPUs, the threads cannot be processed faster. (There are tuning techniques that can reduce lock contention.)
  • 2. If the number of threads serving concurrent requests is fixed, adding CPUs will not immediately improve the system's service capacity.
  • 3. For single-threaded tasks, multi-core CPUs are of little use.

Increasing memory: more memory can directly improve the system's response speed, although it may have no effect in some cases, for example if the JVM heap size is fixed.

Horizontal scaling: supporting growth in traffic and data volume by adding machines is called horizontal scaling. In theory, horizontal scaling has no bottleneck, but the disadvantage is that the technical requirements are relatively high, and it also brings greater challenges to operations and maintenance.

Both vertical scaling and horizontal scaling have their own advantages. We will combine the two in actual use. On the one hand, we must consider the cost of hardware upgrades, and on the other hand, we must consider the cost of software transformation.

Introduce load balancing equipment

Service routing, based on load balancing equipment

*(diagram)*

After the load balancer is introduced, it will bring session-related problems

Load balancing algorithm

Round Robin method

Distribute requests to the back-end servers in turn, and treat each server in a balanced manner, regardless of the actual number of connections to the server and the current system load

Disadvantages: when the servers in the cluster have different hardware configurations and large performance differences, it cannot treat them differently.

Random method

Using the system's random function, one server is picked at random from the back-end server list. As the number of calls grows, the actual effect gets closer and closer to an even distribution of traffic across the back-end servers, i.e. the effect of round robin.

Advantages : Simple to use, no additional configuration and algorithms are required.

Disadvantages : The characteristic of random numbers is that balance can only be guaranteed when the amount of data is large, so if the amount of requests is limited, it may not meet the requirements of balanced load.

Source address hashing

A hash value is computed from the requesting client's IP address with a hash function, and that value modulo the size of the server list gives the index of the server to access. With source address hashing, as long as the back-end server list does not change, requests from the same client IP are always mapped to the same back-end server.

Weighted round robin method

Different back-end servers may have different machine configurations and current system loads, so their capacity to withstand pressure also differs. We assign higher weights to machines with high configuration and low load so that they handle more requests, and lower weights to machines with low configuration and high load to reduce their burden. Weighted round robin handles this well, distributing requests to the back end in order and according to weight.

Least connection method

The previous methods maximize server utilization by allocating the number of requests sensibly, but a balanced request count does not necessarily mean balanced load. Hence the least connections method: based on each back-end server's current connection count, it dynamically selects the server with the fewest in-flight connections to handle the current request, increasing back-end utilization as much as possible and distributing load across servers sensibly.
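The five algorithms above can each be implemented in a few lines. Here is a minimal sketch in Python (the server IPs, weights, and connection counts are all made up for illustration; a real load balancer would also handle health checks and connection release):

```python
import hashlib
import itertools
import random

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # illustrative back-end list

# Round robin: hand out servers in turn, ignoring their actual load.
_rr = itertools.cycle(servers)
def round_robin():
    return next(_rr)

# Random: pick any server; approaches round robin as the call count grows.
def random_pick():
    return random.choice(servers)

# Source address hashing: the same client IP always maps to the same server
# (as long as the server list stays unchanged).
def source_hash(client_ip):
    h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

# Weighted round robin: higher-weight machines receive proportionally
# more requests per cycle.
weights = {"10.0.0.1": 3, "10.0.0.2": 1, "10.0.0.3": 1}
_wrr = itertools.cycle([s for s in servers for _ in range(weights[s])])
def weighted_round_robin():
    return next(_wrr)

# Least connections: route to the server with the fewest in-flight requests.
active = {s: 0 for s in servers}
def least_connections():
    target = min(servers, key=lambda s: active[s])
    active[target] += 1  # the caller should decrement when the request finishes
    return target
```

Note how source address hashing is the only one of the five that incidentally keeps a client "sticky" to one server, which is why it is sometimes used as a crude answer to the session problem discussed below.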

session problem

When we open a webpage, the browser and the web server usually interact many times. We all know that the HTTP protocol itself is stateless; that was its original design intent. The client only needs to request certain files from the server, and neither the client nor the server needs to remember the other's past behavior. Each request is independent, like the relationship between a customer and a vending machine.

In practice, however, many scenarios require state, so the session + cookie mechanism was cleverly introduced to remember which session each request belongs to.

At the beginning of a session, a unique session ID (sessionid) is assigned to it, and this ID is passed to the browser through a cookie. On each subsequent request, the browser sends the session ID along, telling the web server which session the request belongs to. On the web server, each session has independent storage holding that session's data.

If you encounter a situation where cookies are disabled, the general approach is to put the session identifier in the URL parameters.
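The mechanism above can be sketched in a few lines. This is a toy in-memory model (the `handle_request` function and dict-based cookie jar are illustrative, not any real framework's API):

```python
import uuid

# Server-side session store: session id -> session data (in-memory sketch).
sessions = {}

def handle_request(cookies):
    """On each request, find or create the session from the sessionid cookie."""
    sid = cookies.get("sessionid")
    if sid is None or sid not in sessions:
        sid = uuid.uuid4().hex          # assign a unique session id
        sessions[sid] = {}              # independent storage per session
        cookies["sessionid"] = sid      # like Set-Cookie: tell the browser the id
    return sessions[sid]

# First request: no cookie yet, so a session is created and the id is set.
cookies = {}
session = handle_request(cookies)
session["user"] = "mic"

# A later request from the same browser carries the cookie,
# so the server finds the same session and the state is remembered.
session_again = handle_request(cookies)
```

The trouble starts exactly here: `sessions` lives in one server's memory, so once a load balancer may send the next request to a *different* server, that server's `sessions` dict knows nothing about the user.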

*(diagram)*

After our application server changes from one to two, we will encounter session problems

Session sharing in a distributed environment

Session sharing is not a new topic in the current Internet background, and there are actually many very mature solutions for how to solve session sharing.

Session replication implemented by the server. This kind of session sharing is tightly coupled to the web server.

We add synchronization of session data between the web servers, which guarantees the consistency of session data across different web servers. Most application containers support a Session Replication mode.

There is a problem:

  1. Synchronizing session data causes network bandwidth overhead. Whenever session data changes, it must be synchronized to all other machines; the more machines, the greater the bandwidth overhead of synchronization.
  2. Every web server must hold all session data. If the cluster as a whole has a lot of session data (many people visiting the site at once), the memory each machine spends holding session data becomes significant.

This solution relies on the application container to complete the copy of the Session to solve the problem of the Session, and the application itself does not care about this matter. This solution is not suitable for scenarios with a large number of cluster machines.

Use mature technology for centralized session storage, such as GemFire (used by 12306), or a common in-memory database such as Redis.

*(diagram)*

Session data is not saved on the local machine but in a centralized store, and session modifications also happen in that centralized store. Web servers read sessions from it, which guarantees that different web servers all read the same session data. The centralized store itself can be a database, among other options.
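A minimal sketch of the idea, with a plain dict standing in for the centralized store (in reality this would be Redis, GemFire, or a database; the `WebServer` class and its methods are made up for illustration):

```python
# Centralized session storage, sketched with a dict standing in for Redis/a DB.
central_store = {}

class WebServer:
    """Each web server reads and writes sessions in the shared central store,
    never in its own memory, so every server sees the same session data."""
    def __init__(self, name):
        self.name = name

    def write_session(self, sid, key, value):
        central_store.setdefault(sid, {})[key] = value

    def read_session(self, sid):
        return central_store.get(sid, {})

# The load balancer may send consecutive requests to different servers:
a, b = WebServer("web-1"), WebServer("web-2")
a.write_session("sid-123", "cart", ["book"])
cart = b.read_session("sid-123")["cart"]   # web-2 still sees web-1's write
```

The sketch also makes the two drawbacks visible: every `read_session`/`write_session` is a network hop in real life, and if `central_store` goes down, every web server loses its sessions at once.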

There is a problem:

  1. Reading and writing session data introduces network operations; compared with local reads, the problems are latency and instability. However, this communication basically happens on the intranet, so it is usually not a big problem.
  2. If there is a problem with the machine or cluster that centrally stores the Session, it will affect our application.

Compared with Session Replication, when the number of Web servers is relatively large and the number of Sessions is relatively large, the advantages of this centralized storage solution are very obvious.

Maintain the session on the client

It is natural to think of using cookies, but the client side is risky: the data is not secure, and the amount that can be stored is relatively small. So keeping the session on the client requires encrypting the information in the session.

Our session data is placed in a cookie, and the web server reconstructs the session data from the cookie on each request. It is like bringing our own bowl and chopsticks everywhere, so we can walk into any restaurant we like. Compared with the centralized storage solution, this does not depend on an external storage system, and there is no network latency or instability when reading and writing session data.

There is a problem:

Safety. Session data is server-side data by nature, and this solution exposes it to the external network and the client, so there is a security problem. We can encrypt the session data written into the cookie, but for true safety, data should not be physically accessible at all.
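The usual mitigation is to at least sign (and ideally also encrypt) the cookie payload so the client cannot tamper with it undetected. A sketch using Python's standard-library HMAC support (the secret, field names, and `.`-separated format are illustrative; signing gives integrity, not confidentiality):

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"   # illustrative key; must never reach the client

def encode_session(data):
    """Serialize session data and append an HMAC so tampering is detectable."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def decode_session(cookie):
    """Recreate the session dict, rejecting cookies the client has modified."""
    payload, sig = cookie.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):   # constant-time comparison
        return None
    return json.loads(base64.urlsafe_b64decode(payload))

cookie = encode_session({"user": "mic", "vip": False})
restored = decode_session(cookie)
# Flip one character, as a tampering client might:
tampered = decode_session(cookie[:-1] + ("0" if cookie[-1] != "0" else "1"))
```

Note that a signed-but-unencrypted cookie is still *readable* by the client, which is why the text above insists that sensitive server-side data should not be pushed to the client at all.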

Database pressure grows: separating reads and writes

As the business keeps growing, data volume and traffic keep increasing. In large websites, many businesses read far more than they write, and this pattern is reflected directly in the database. For this situation, we can consider read-write separation to relieve database pressure.

*(diagram)*

This structural change will bring two problems

  1. How is data synchronized?

We want the read replicas to share the read pressure on the primary, so the first problem to solve is how data is replicated to the read replicas. Database systems generally provide a data replication function, and we can use the database's own mechanism directly. Support differs across systems; MySQL, for example, supports a master + slave structure and provides a data replication mechanism.

  2. How does the application route to the right data source?

For the application, adding a read replica does change the structure: the application must now choose a different database source depending on the situation.
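The routing rule can be as simple as "SELECT goes to a replica, everything else goes to the master". A sketch (the class, the string labels standing in for connections, and the SQL-prefix rule are all illustrative; real routers must also deal with transactions and replication lag):

```python
import itertools

class RoutingDataSource:
    """Route writes to the master and spread reads across the replicas.
    The 'connections' here are just labels; a real version would hand back
    actual database connections."""
    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)  # round robin over replicas

    def route(self, sql):
        # Simple rule: SELECTs go to a read replica, everything else to master.
        if sql.lstrip().lower().startswith("select"):
            return next(self._replicas)
        return self.master

ds = RoutingDataSource("master", ["replica-1", "replica-2"])
w  = ds.route("UPDATE users SET name = 'mic' WHERE id = 1")
r1 = ds.route("SELECT * FROM users")
r2 = ds.route("SELECT * FROM orders")
```

One caveat worth remembering: because replication is asynchronous in the common MySQL setup, a read issued immediately after a write may not yet see that write on the replica; reads that must be fresh are often routed to the master as an exception to the rule.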

A search engine is essentially a read replica

A search engine can actually be understood as a read replica. Our products are stored in the database, and the website must provide users with real-time retrieval, especially product search. If all such read requests hit the read replicas, performance will still bottleneck. Using a search engine not only greatly increases retrieval speed, it also reduces read pressure on the database.

The most important job of a search engine is to build an index based on the data being searched, and as the data being searched changes, the index needs to change accordingly.

*(diagram)*

The search cluster is used in the same way as a read replica, but the index-building process basically has to be implemented by ourselves. We can plan how the search engine builds its index along two dimensions: full vs. incremental, and real-time vs. non-real-time.

The full method is used to create an index for the first time, which may be new or rebuilt. The incremental approach is to continuously update the index on a full basis.

Real-time vs. non-real-time refers to when the index is updated: real-time is ideal, while non-real-time updating mainly protects the data source from excessive load.

In general, search engine technology solves the problem of reading in certain scenarios when searching on the site, and provides better query efficiency.
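At the core of what the search cluster builds is an inverted index: a map from each term to the documents containing it. A toy sketch of the full-build vs. incremental-update distinction described above (product data and function names are made up; real engines like Elasticsearch handle analysis, ranking, and persistence):

```python
from collections import defaultdict

# Inverted index: term -> set of product ids containing that term.
index = defaultdict(set)

def index_product(pid, title):
    """Incremental update: fold one (changed) product into the index."""
    for term in title.lower().split():
        index[term].add(pid)

def full_build(products):
    """Full build: used for the first build or a complete rebuild."""
    index.clear()
    for pid, title in products.items():
        index_product(pid, title)

def search(term):
    return sorted(index[term.lower()])

products = {1: "red running shoes", 2: "blue canvas shoes"}
full_build(products)                  # full: first-time (re)build
index_product(3, "red rain coat")     # incremental: applied as data changes
```

Whether `index_product` runs the instant a product changes or in periodic batches is exactly the real-time vs. non-real-time choice mentioned above.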

Tools for accelerating data reads: caching and distributed storage

Large websites are basically in the business of solving storage and computation problems, so storage is a very important supporting system. In the initial stage of building a site, we all start with a relational database, and in many cases, for convenience, we push business logic into the database with things like triggers and stored procedures. Although this solves problems easily early on, it brings a lot of trouble later: once the data volume is large, database and table sharding become necessary, and once the business reaches a certain scale, a relational database can no longer fully satisfy the storage requirements.

Distributed file system

For things like images and large text, a database is not appropriate, so we use a distributed file system for file storage. There are many distributed file system products, such as Taobao's TFS, Google's GFS, and the open-source HDFS.

NoSQL

NoSQL can be understood as Not Only SQL, or No SQL. Both meanings are meant to express that in large-scale websites, relational databases can solve most problems, but the storage requirements for different content characteristics, access characteristics, and transaction characteristics are different. NoSQL is positioned between the file system and the SQL relational database.

Data caching is for better service

Large websites use data caches internally, mainly to share the read pressure on the database. A cache system generally stores and queries key-value pairs. The application puts hot data into the cache, and filling the cache is also the application's job: if the data is not in the cache, it is read from the database and then put into the cache. Over time, when the cache is full and must evict data, the data that has not been accessed recently is cleared first. Another approach is to actively push data into the cache whenever it changes in the database; the advantage is that the cached data is updated promptly when the data changes, so reads do not see stale entries.
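The first pattern described above (fill on miss, evict least-recently-used) can be sketched compactly with an ordered map. This is a minimal cache-aside LRU, with a dict standing in for the database (the class and the product data are illustrative):

```python
from collections import OrderedDict

class LRUCache:
    """Cache-aside: on a miss, load from the database and fill the cache;
    when capacity is exceeded, evict the least recently used entry."""
    def __init__(self, capacity, load_from_db):
        self.capacity = capacity
        self.load_from_db = load_from_db   # fallback loader (hits the database)
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)     # hit: mark as recently used
            return self.data[key]
        value = self.load_from_db(key)     # miss: read from the database...
        self.data[key] = value             # ...then put it into the cache
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used key
        return value

db = {"p1": "shoes", "p2": "coat", "p3": "hat"}   # stand-in for the database
cache = LRUCache(2, db.get)
cache.get("p1"); cache.get("p2")
cache.get("p1")            # touch p1, so p2 becomes least recently used
cache.get("p3")            # capacity exceeded: p2 is evicted
```

The second pattern (push updates into the cache on every database change) avoids stale reads at the cost of wiring every write path through the cache.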

*(diagram)*

Page cache

Besides data caching, we can also cache pages. Data caching speeds up data reads while the application handles a request, but what is finally shown to users are dynamically generated pages; for pages or content with particularly high traffic, we cache the page itself.

Make up for the shortcomings of relational databases and introduce distributed storage

We mainly use relational databases, but in some scenarios they are not a good fit, so we introduce distributed storage systems, such as Redis, MongoDB, Cassandra, HBase, etc.

Choosing a distributed storage system suited to the scenario and the data structure type can greatly improve performance. Through clustering, a distributed storage system provides high capacity, high concurrent access, data redundancy, and disaster recovery.

*(diagram)*

After read-write separation, the database encountered a bottleneck again

Through read-write separation, and by replacing the relational database with distributed storage in some scenarios, we can reduce the pressure on the primary database and solve data storage problems. However, as the business develops, our primary database will hit a bottleneck again. In our running example, the site's modules (transactions, products, and users) are all still stored in one database. Even with caching and read-write separation, database pressure keeps growing, so we can split the data vertically and horizontally to relieve it.

A dedicated database for each concern: splitting the data vertically

Vertical splitting means moving different business data into different databases. In the example we have been building, that means separating user, transaction, and product data.

*(diagram)*

The data of different businesses is split from one database into several, so we must consider how to handle transactions that used to span businesses on a single machine:

  1. Use a distributed transaction solution
  2. Remove the transactions, or stop requiring strong transactional guarantees

After the data is split vertically, the pressure problem of putting all business data in one database is solved, and more optimizations can be made according to the characteristics of different businesses

When vertical splitting hits a bottleneck, split the data horizontally

The counterpart of vertical splitting is horizontal splitting: splitting the data of the same table across two or more databases. The reason to split horizontally is that the data volume or update volume of some business table has reached the bottleneck of a single database; at that point, the table can be split across two or more databases.

The difference between horizontal data splitting and read-write separation is that read-write separation solves the problem of high read pressure, and does not work for large amounts of data or large updates.

The difference between horizontal data splitting and vertical data splitting is that vertical splitting splits different tables into different databases, while horizontal splitting splits the same table into different databases.

We can further split the user table across two databases. They have exactly the same user table structure, each database's user table covers only part of the users, and the two together are equivalent to the user table before the split.

*(diagram)*

The impact of horizontal split

  1. SQL routing: we must decide, based on some condition, which database the current request should go to
  2. Primary key handling: auto-increment IDs no longer work; a globally unique ID is needed

Because the data of one business is now split across databases, some queries need to fetch from both; if the data volume is large and paging is needed on top of that, it becomes quite hard to handle.
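The two impacts above can be sketched together: a routing condition (here, user id modulo the shard count) plus a global ID generator. The generator below fuses a timestamp with a sequence number, a much-simplified version of the snowflake idea (shard names, bit widths, and counts are illustrative):

```python
import itertools
import time

NUM_SHARDS = 2

def shard_for(user_id):
    """SQL routing: pick the database based on a condition on the key."""
    return f"user_db_{user_id % NUM_SHARDS}"

# Per-database auto-increment IDs would collide across shards, so IDs must
# come from a global generator. This one packs a timestamp (high bits) with
# a sequence number (low 20 bits), loosely following the snowflake scheme.
_seq = itertools.count()

def global_id():
    return (int(time.time()) << 20) | (next(_seq) & 0xFFFFF)

uid_a, uid_b = global_id(), global_id()   # distinct even within one second
```

Cross-shard queries are what this scheme cannot route: a query spanning both `user_db_0` and `user_db_1` must be sent to both and merged in the application, which is exactly the paging difficulty mentioned above.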

Challenges faced by the application after the database problem is solved

The aforementioned separation of reads and writes, distributed storage, vertical splitting and horizontal splitting of data are all solving data problems. Next, we will look at the changes in applications.

As the business develops, the application gains more and more functions and grows larger and larger. We need to think about how to keep the application from growing without bound, which means splitting one application into two or even more.

The first way

Split the application according to the characteristics of the business. In our example, the main business functions fall into three parts: users, products, and transactions. We can split the original application into two applications, one focused on transactions and one on products. Both will have places that involve users; we let each of the two systems handle its own user-related work, and basic user tasks such as registration and login can temporarily be handed to one of the two systems.

*(diagram)*

We could also split into three systems along user registration, user login, user information maintenance, and so on. But after such a split, similar code (for example the user-related code) appears in different systems. How to keep that code consistent, and how to offer it for reuse by other modules, are problems that still need solving. Moreover, the new systems produced by this split do not call each other directly.

The road to services

Let's look at the service-oriented approach. We divide the application into three layers: at the top are the web systems, which implement the different business functions; in the middle are the service centers, each providing a different business service; at the bottom are the business databases.

*(diagram)*

Compared with before, there are several important changes. First, calls between business functions are no longer only in-process method calls on a single machine but also remote service calls. Second, shared code is no longer scattered across different applications; those implementations live in the service centers. Finally, the database connections change: database interaction moves into the service centers, so the front-end web applications can focus on interacting with the browser instead of on business logic, and the job of connecting to the database belongs to the corresponding business service center, which also reduces the number of database connections.

The service center not only centrally manages some codes that can be shared, but also makes these codes better maintained.

The service-oriented approach brings many benefits. Structurally, the system architecture is clearer and more layered than before. For stability, code that used to be scattered across multiple application systems becomes services maintained by a dedicated team; this improves code quality, and because the basic core modules are relatively stable and are modified and released far less frequently than the business systems, the stability of the whole architecture improves. Finally, the lower-level resources are managed uniformly by the service layer, and the clearer structure greatly improves team development efficiency.

The service-oriented approach also has a great impact on research and development. The previous model was a few large teams responsible for a few large applications. With service-orientation in place, the number of applications grows rapidly, the dependencies inside the system become intricate, and the teams are split up accordingly. Each small team focuses on a specific service or application, and iteration becomes more efficient.

Copyright statement: All articles in this blog, except for special statements, adopt the CC BY-NC-SA 4.0 license agreement. Please indicate the reprint from Mic takes you to learn architecture!
If this article is helpful to you, please follow and like it. Your support is the motivation for my continued writing. You are also welcome to follow the WeChat public account of the same name for more technical content!

跟着Mic学架构

Author of 《Spring Cloud Alibaba 微服务原理与实战》 and 《Java并发编程深度理解及实战》. Co-founder of Gupao (咕泡) Education, with 12 years of development and architecture experience and extensive hands-on experience in distributed microservices and high concurrency.