Author

Chen, a participant in the research and development of Guosen Securities' infrastructure platform (DevOps, microservice governance, serverless).
Guosen Securities is a large national full-service securities company with 57 branches and 185 sales offices across 118 cities and regions. Its core indicators, such as net capital, operating income, and net profit, rank among the top in the industry.
Guosen Securities first engaged with TiDB in June 2020, and it took about half a year from technical pre-research to the first production launch. The first online business was Golden Sun billing, and TiDB was subsequently applied to the data center, service observability, and other systems. From deploying TiDB only in the Dongguan main server room to running TiDB across multiple server rooms in September 2021, along with starting a pilot on domestic Haiguang x86 servers, Guosen Securities has accumulated rich practical experience in exploring and applying open source NewSQL databases. At present, Guosen Securities runs a total of 7 TiDB clusters with 109 nodes, with the largest table holding 10 billion rows, supporting custody, brokerage, and proprietary trading businesses.
From 0 to 1: Guosen Golden Sun introduces TiDB
Guosen Golden Sun provides securities trading, wealth management, and market information services. The main function of our securities software is trading: when making a transaction, users care most about their yield and when to buy or sell stocks. At present, Guosen Golden Sun has about 10 million users. Guosen operates a big data platform that stores all users' transaction data over the years. Existing billing data exceeds 10 billion rows, existing position-clearing data is about 1 billion rows, and billing and clearing data grow by about 2 billion rows per year.
The following figure shows the approximate service architecture of Guosen Golden Sun. The Golden Sun app connects directly to the Golden Sun back end, whose architecture is based on a gRPC microservice framework developed in-house by Guosen. Like most brokerage systems, Golden Sun's back end connects to the counter system, which includes back-office applications such as trading, accounts, and clearing. All data is eventually pushed to the data center, which runs batch jobs daily or weekly. After a batch run, the relevant data is pushed to the front-end database, and the app queries that database through the query service to obtain bills and position-clearing data.
Figure: Guosen Golden Sun data service architecture
The billing service has gone through three versions. Billing 1.0 can be called the slash-and-burn era: it supported only one-year billing queries and used SQL Server with a single database and a single table. How was it implemented? Every day, the data center synchronized data into a temporary table and performed data conversion; the converted data was then appended to the formal table, which held one year of data. With a single database and single table, an intermediate program compressed all of a user's data for one year into a single column as a JSON string storing 365 days of records. What happened when a new day's data arrived? A scheduled background task pushed the new day's data into the JSON, shifted the 365-day window forward, dropped the earliest day, and wrote the result back to the formal table. As you can see, this was a very primitive implementation. Why use the JSON format? Because with more than 10 million users, storing one row per user per day would not fit in a single database and single table, so we used a rather hacky approach at the beginning: merging more than 360 rows into one. That kept the whole table at nearly ten million rows, and query efficiency was acceptable.
Figure: Billing 1.0 single-database, single-table implementation
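The daily rotation described above is simple enough to sketch. The following is a minimal illustration only, assuming a hypothetical per-user JSON array of daily bill records; the real intermediate program's format is not public.

```python
import json

WINDOW_DAYS = 365  # one year of daily records per user, as in Billing 1.0

def rotate_daily_json(row_json: str, new_day: str, new_bill: dict) -> str:
    """Append the newest day's bill to the per-user JSON blob and
    evict the oldest day so the window stays at 365 entries."""
    days = json.loads(row_json)            # e.g. [{"date": "2020-01-01", ...}, ...]
    days.append({"date": new_day, **new_bill})
    if len(days) > WINDOW_DAYS:
        days = days[-WINDOW_DAYS:]         # keep only the most recent 365 days
    return json.dumps(days, ensure_ascii=False)
```

The scheduled task would run this for every user's row and write the result back to the formal table, which is why updates were slow and error-prone at ten million rows.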
The problems with this approach: on the business side, users wanted to query a longer period, such as five years, which a single table could hardly support. On the technical side, data queries and subsequent updates were under heavy pressure and difficult to scale, and occasional data update errors meant that users querying the next day would see inaccurate data. To cope with these business and technical difficulties, Guosen adopted sharding-jdbc in Billing 2.0 to split the data across databases and tables (sub-database and sub-table). While introducing this technology we also heard about TiDB, but given the demanding stability requirements of the securities business, we had some concerns about TiDB's stability at the time, so we chose sharding-jdbc. After sharding, the system supported five-year billing queries using 16 MySQL instances with a total of 512 tables. How are the data center and the sharded databases synchronized? The data center still writes data to a temporary table first; a conversion service, configured with the sharding rules, fetches data from the temporary table and writes it to the formal tables. The data center has an ETL tool, but it does not understand the sharding rules, so it cannot write to the formal tables directly.
Figure: Billing 2.0 sub-database, sub-table implementation
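To make the 16-database, 512-table layout concrete, here is a hypothetical routing function in the spirit of the sharding-jdbc rules. The actual sharding keys and modulo rules used at Guosen are not public; this only illustrates the idea.

```python
DB_COUNT = 16        # 16 MySQL databases
TABLES_PER_DB = 32   # 16 x 32 = 512 physical tables in total

def route(user_id: int):
    """Map a user to a (database, table) pair. Names are illustrative."""
    db_index = user_id % DB_COUNT
    table_index = (user_id // DB_COUNT) % TABLES_PER_DB
    return f"bill_db_{db_index}", f"bill_{table_index:03d}"

print(route(123456789))  # -> ('bill_db_5', 'bill_017')
```

The conversion service applies the same rule when moving rows from the temporary table into the formal tables, which is exactly why a generic ETL tool that knows nothing about the rule cannot write to them directly.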
After running for about two years, we found new problems. Although sharding met the business needs, it imposed serious constraints on scalability. First, schema changes are difficult: with the data split across 512 tables, whenever a new business requires a new field, the DBA painfully has to add that field to every sub-table (see the sketch below). Second, capacity expansion is extremely troublesome: if the data volume is not estimated accurately at the beginning, the sharding rules must be changed later, say doubling from 512 tables to more than 1,000, and the DBA's migration is complex and error-prone. Third, synchronization still requires an intermediate table, so the data synchronization time remains long and the window for bringing the system online is limited. Fourth, the periodic creation and cleanup of sub-tables is cumbersome: every day some daily tables must be dropped, such as tables from five years ago, and tables for the next day created, so development always has to maintain timers for this cleanup and creation. Fifth, in terms of operations, multiple databases have to be operated and maintained.
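The first constraint is easy to underestimate, so here is a sketch of what "adding one field" actually means under this layout. The hostnames, credentials, and column are hypothetical; only the shape of the loop matters.

```python
# Adding one field means running the same ALTER on all 512 physical
# tables, one database at a time. Any failure midway leaves the shards
# with inconsistent schemas until the DBA reconciles them by hand.
import pymysql

DB_COUNT, TABLES_PER_DB = 16, 32
DDL = "ALTER TABLE {table} ADD COLUMN coupon_amount DECIMAL(12, 2) NULL"

for db_index in range(DB_COUNT):
    conn = pymysql.connect(host=f"mysql-{db_index}.example.internal",
                           user="dba", password="***",
                           database=f"bill_db_{db_index}")
    try:
        with conn.cursor() as cur:
            for table_index in range(TABLES_PER_DB):
                cur.execute(DDL.format(table=f"bill_{table_index:03d}"))
        conn.commit()
    finally:
        conn.close()
```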
The sub-database and sub-table system placed an extra burden on development and operations, so we introduced TiDB at Guosen and used this NewSQL database to support five-year billing queries, solving the problems caused by sharding; billing thus entered the 3.0 era. Under this architecture, the data center pushes data directly into the formal table every day, the query service queries TiDB directly, and a cache sits on top of the query service. After TiDB was introduced, the efficiency of the daily synchronization and loading improved by about 70%. At present, TiDB is deployed in Guosen's three server rooms, and a pilot on domestic Haiguang x86 servers recently started in the Dongguan server room.
Figure: Billing 3.0 distributed database implementation on TiDB
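The article does not describe the cache layer in detail; the following is a minimal cache-aside sketch, assuming Redis in front of TiDB with hypothetical host names, key scheme, and table schema.

```python
import json
import pymysql
import redis

cache = redis.Redis(host="cache.example.internal")
tidb = pymysql.connect(host="tidb.example.internal", port=4000,
                       user="app", password="***", database="billing")

def query_bills(user_id: int, day: str) -> list:
    """Cache-aside lookup: serve from cache when possible, else from TiDB."""
    key = f"bill:{user_id}:{day}"                  # hypothetical key scheme
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    with tidb.cursor(pymysql.cursors.DictCursor) as cur:
        cur.execute("SELECT * FROM bills WHERE user_id = %s AND bill_date = %s",
                    (user_id, day))
        rows = cur.fetchall()
    cache.setex(key, 3600, json.dumps(rows, default=str))  # 1-hour TTL
    return rows
```

Because bills are loaded at night and queried during the day, cached entries stay valid for the whole trading day, which is what makes this simple pattern a good fit here.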
Next, let's talk about our experience from more than a year of using TiDB. From a development perspective, the first lesson concerns deleting large amounts of data. With no experience at the beginning, we followed the old routine: to delete the data of a given day, we directly ran a DELETE with WHERE = "that day". It was a Saturday, and the operations alarms showed the TiDB machines going down one by one. Investigation found that the amount of data involved in that single DELETE was too large. We subsequently adjusted the transaction size limit to 10 GB and expanded the TiDB servers' memory to 64 GB; that was the system-level fix. On the application side, we also reworked the deletion to run in batches. For large deletions, consider a Range-partitioned table so that you can simply truncate or drop a partition.
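A sketch of the two application-side remedies, assuming a hypothetical `bills` table: delete in small chunks so no single transaction approaches TiDB's transaction size limit, or avoid row deletes entirely with Range partitions.

```python
import pymysql

conn = pymysql.connect(host="tidb.example.internal", port=4000,
                       user="app", password="***", database="billing",
                       autocommit=True)  # each batch commits as its own small txn

def delete_day_in_batches(day: str, batch_size: int = 10000) -> None:
    """Delete one day's rows in bounded chunks instead of one huge DELETE."""
    with conn.cursor() as cur:
        while True:
            affected = cur.execute(
                "DELETE FROM bills WHERE bill_date = %s LIMIT %s",
                (day, batch_size))
            if affected == 0:
                break

# With a Range partition per day (or month), cleanup becomes a
# metadata-only operation instead of row-by-row deletes, e.g.:
#   ALTER TABLE bills DROP PARTITION p20200601;
```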
The second lesson: for new TiDB workloads, try to use AUTO_RANDOM as the primary key; for continuous large-scale insert scenarios, this avoids insert hotspots in most cases. Also, for data synchronization across multiple server rooms, TiDB requires a primary key or a unique index: a table with neither will cause the synchronization program to OOM. And once a table already contains a large amount of data, adding a primary key afterwards is quite troublesome.
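A minimal sketch of such a table definition (AUTO_RANDOM is real TiDB syntax; the table and columns here are hypothetical):

```python
# AUTO_RANDOM scatters new rows across Regions instead of appending to the
# tail Region, which is what avoids the write hotspot of sequential IDs.
import pymysql

conn = pymysql.connect(host="tidb.example.internal", port=4000,
                       user="app", password="***", database="billing")
with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS trade_log (
            id         BIGINT AUTO_RANDOM,
            user_id    BIGINT NOT NULL,
            bill_date  DATE   NOT NULL,
            payload    JSON,
            PRIMARY KEY (id)   -- also satisfies the sync tool's requirement
        )
    """)
conn.commit()
```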
High-availability disaster recovery architecture across three sites
At the beginning, TiDB was deployed only in Guosen's Dongguan main server room as a pilot. Operations later required TiDB to support disaster recovery deployment, and the application had to achieve high availability and multi-active service across the three sites. Previously, the application in each server room accessed its own local TiDB, and disaster recovery drills were conducted every quarter to verify that, after a failure of the entire Dongguan main server room, the Shanghai site and the same-city Futian site remained available.
The teachers from PingCAP initially offered three solutions. The first is the simplest and most direct: deploy a separate TiDB cluster in each of the three server rooms, with reads and writes in the Dongguan room, using TiCDC or TiDB Binlog to synchronize data to the disaster recovery clusters in the other two rooms. The advantage of this solution is its simplicity: in a disaster recovery drill, if the Dongguan main room goes down, the applications in the other two rooms need to do almost nothing and remain usable. The drawbacks are the large number of replicas, nine in total, and potentially higher synchronization latency.
The second solution is the classic two-city, three-data-center deployment. It has higher network requirements, and if the Dongguan room fails, the Futian room must be recovered manually. The third is a same-city dual-center deployment that treats Dongguan and Futian as one cluster and synchronizes data to a Shanghai disaster recovery cluster, i.e., two clusters in total. However, recovery during disaster recovery drills is more complicated under this plan.
Figure: Comparison of multi-room solutions
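For the first solution, the cross-room sync channel can be set up with TiCDC. A sketch under assumed addresses (the `cdc cli changefeed create` command and its flags are real TiCDC CLI; everything else here is hypothetical):

```python
# Create a changefeed from the primary cluster's PD to a downstream
# MySQL-protocol sink (a TiDB cluster in another server room).
import subprocess

subprocess.run([
    "cdc", "cli", "changefeed", "create",
    "--pd=http://pd-dongguan.example.internal:2379",              # upstream PD
    "--sink-uri=mysql://sync_user:***@tidb-futian.example.internal:4000/",
    "--changefeed-id=dongguan-to-futian",
], check=True)
```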
After comparing the three schemes, Guosen finally adopted the simplest binlog synchronization scheme, deploying one TiDB cluster per server room. This choice also reflects the business characteristics: Guosen's workload is mostly queries, with no simultaneous writes in multiple rooms, so the simplest method could be adopted. During the multi-room deployment and implementation, we also did some migration and import work. At the beginning, TiDB was deployed only in the Dongguan room and, being unfamiliar with TiDB, we had created some business tables without primary keys or unique indexes. After building the new TiDB cluster in the Futian room, we found that synchronizing the two clusters made the synchronizer OOM directly, caused by the missing primary keys or unique indexes. By then the largest table had grown to more than 6 billion rows, and it was impossible to add a primary key or unique index to it directly.
So what did we do? First, use Dumpling to export the table to CSV files, which are named after the table. After the export completes, create a new table on the original database with the primary key or unique index added, rename the exported CSV files to match the new table's name, and import the data into the new table through Lightning. Finally, swap the old and new tables by renaming: the new table takes the formal table's name, and the old formal table is renamed as a backup table. In this way, the impact on the business is minimized; during the export and import, users are basically unaware of the change.

Figure: Handling tables without primary keys
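An end-to-end sketch of that rebuild, with hypothetical table and host names. Dumpling names CSV exports after the source table, so renaming the files lets Lightning load them into the new table instead; the Dumpling and Lightning invocations below use their real flags, while the unique-key columns are assumptions.

```python
import glob
import os
import subprocess
import pymysql

EXPORT_DIR = "/data/export"

# 1. Export the old table to CSV with Dumpling.
subprocess.run([
    "dumpling", "-h", "tidb.example.internal", "-P", "4000",
    "-u", "app", "-p", "***",
    "--filetype", "csv", "-T", "billing.bills", "-o", EXPORT_DIR,
], check=True)

conn = pymysql.connect(host="tidb.example.internal", port=4000,
                       user="app", password="***", database="billing")
with conn.cursor() as cur:
    # 2. Create the new table with the same columns plus a unique index
    #    (the key columns here are hypothetical).
    cur.execute("CREATE TABLE bills_new LIKE bills")
    cur.execute("ALTER TABLE bills_new "
                "ADD UNIQUE KEY uk_bill (user_id, bill_date, seq_no)")
conn.commit()

# 3. Rename the CSV files so Lightning maps them to the new table.
for path in glob.glob(os.path.join(EXPORT_DIR, "billing.bills.*.csv")):
    os.rename(path, path.replace("billing.bills.", "billing.bills_new."))

# 4. Import with Lightning (cluster and data-dir settings live in the config).
subprocess.run(["tidb-lightning", "--config", "lightning.toml"], check=True)

# 5. Swap names; the old table is kept as a backup.
with conn.cursor() as cur:
    cur.execute("RENAME TABLE bills TO bills_bak")
    cur.execute("RENAME TABLE bills_new TO bills")
conn.commit()
```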
Exploration of service observability

Finally, let's talk about our exploration of service observability. After an application adopts a microservice architecture, many nodes are deployed and the whole call chain becomes very complicated, making it hard to locate a problem. Following the currently popular industry concept of "service observability", we developed an application to assist in problem location. This "service observability" application mainly consists of three parts: logs, metrics, and the tracing chain. We enhanced the log part by loading the systems' request and response logs into TiDB through ETL tools and then building visualizations on top.
Figure: Observability of Golden Sun Service
Log visualization is a requirement that developers have long raised. Collected logs are generally viewed in the Kibana of an ELK stack, all as plain text, which is very unintuitive. Our optimization imports each microservice's request and response logs into TiDB and displays them in the "service observability" application. If a customer reports a problem, we enter the customer's mobile phone number to see what the customer did in a given period, so the problem can be located quickly. We also visualize the request content and responses, which is much more convenient to read. We encountered some TiDB ingestion problems at the time, because this workload differs from billing: bills are basically inserted at night and queried by users during the day, whereas log visualization performs a large number of continuous inserts during market hours (9:00 a.m. to 3:00 p.m.) with relatively few queries. Since this is an operations-related system, it was not given good disks. After the system went online, the entire TiDB cluster became very sluggish. At first we thought the insert or query programs were at fault; we did a lot of optimization and it still didn't help. Finally, after upgrading the disks, overall performance improved immediately. The lesson we learned: if you use TiDB, be sure to choose good disks to guarantee processing efficiency.
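The "locate by phone number" lookup amounts to a time-windowed query over the log table in TiDB. A sketch with a hypothetical `request_logs` schema:

```python
import pymysql

conn = pymysql.connect(host="tidb.example.internal", port=4000,
                       user="obs", password="***", database="observability")

def trace_customer(phone: str, start: str, end: str) -> list:
    """List a customer's requests and responses within a time window,
    ordered so the call sequence can be replayed visually."""
    sql = """
        SELECT request_time, service_name, request_body, response_body
        FROM request_logs
        WHERE phone = %s AND request_time BETWEEN %s AND %s
        ORDER BY request_time
    """
    with conn.cursor(pymysql.cursors.DictCursor) as cur:
        cur.execute(sql, (phone, start, end))
        return cur.fetchall()

rows = trace_customer("13800000000",
                      "2021-09-01 09:00:00", "2021-09-01 15:00:00")
```

Note that this workload is insert-heavy during market hours, which is exactly why disk quality mattered so much more here than in the billing cluster.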