Since the China Securities Regulatory Commission issued the "Management Measures for Client Transaction Settlement Funds", the main mode of operation of securities companies has changed from the branch business department model to the headquarters centralized model. Under the background of the construction of the centralized trading system, the importance of relational databases has become more prominent. Oracle The database products represented by , DB2 occupy almost all kinds of core business systems of the entire securities company.
The core systems of the securities and fund industry can be divided into two categories as a whole, one is online business centered on stock trading, and the other is "T + 1" business centered on fund/asset management. For databases, the former emphasizes extreme low latency and high availability; the latter usually involves batch processing, large transactions, complex queries, backup and recovery, and has high requirements on the comprehensive capabilities of the database.
This article focuses on describing the overall solution and implementation practice of OceanBase in the fund/asset management scenario in recent years.
The business structure and database practice of fund asset management
business model
The business structure of a typical asset management scenario is as follows:
Customer inquiry system: end users execute requests for buying, selling, and inquiries through the direct sales channels or Internet distribution channels of securities companies/funds;
TA registration and transfer system: register the user's transaction request, in each transaction link, calculate the user's actual buying share and selling amount according to the asset valuation; batch processing in the two dimensions of user and transaction;
Valuation system: calculate the net value of assets, interest income, and batch processing of asset dimensions;
Investment trading system: transaction core accounting processing; high concurrency and low latency, satisfying ACID;
Investment analysis system: statistical analysis, including a large number of large transactions and complex queries.
Disaster Recovery Requirements
- Computer room switching drill:
Online transaction business has high requirements for business continuity. Compared with the core of the bank, in the event of a disaster, the core of the bank's accounting needs to give priority to RPO to ensure data consistency. Under this premise, the system can accept a certain period of time. However, the core securities trading system needs to give priority to ensuring the availability of business, and the loss of system unavailability during its trading period is often calculated in seconds. Therefore, the securities industry needs to do various high-availability switching drills on a regular basis.
- Fast data recovery:
The T+1 type of transaction business has higher requirements on data consistency. Once an error occurs in batch running, the data needs to be restored to a certain point in time for re-running, and batch tasks need to be completed within the batch running time window. Therefore, on the one hand, it is necessary to Run batches stably and try not to make mistakes. Once an error occurs, the data needs to be restored at a reliable and fast point in time.
Database solution practice for business model
- Customer inquiry system:
In the context of the expected growth of their own sales channels, in order to support high concurrency scenarios in the future, some leading companies have started to specialize in dedicated databases, extracting the customer query module from the data warehouse platform and running it separately on a distributed relational database.
In the customer query module of the asset management system, although the SQL basically contains information such as "user ID" as an equivalent condition, its SQL is often not a simple check, but includes various calculations such as de-reordering and more. Moderately complex SQL for table joins. In this case, OceanBase's partition table + TableGroup + replication table can try to put the same data in the join on the same node to reduce the cross-machine RPC request brought by the distributed system. At the same time, the integrated design of OceanBase can avoid computing as much as possible. Data merge in the process. Combining various features, OceanBase can provide high-concurrency and low-latency query capabilities in such scenarios.
For the write load, if the data comes from the data warehouse platform, it is recommended to clean the data on the data warehouse platform before importing it; if you choose to clean the data directly in the customer query system, it is recommended to control the transaction size, and try to avoid one SQL processing one time. Data for a whole period of time (such as one month), although OceanBase 3.X already supports large transactions, in order to maintain the robustness of the system for continuous long-term operation, it is recommended to fix the transaction size, otherwise as the business grows, the transaction size by time dimension will also be Continued growth, after reaching a critical point, you still need to go back and adjust the business logic.
- TA registration and transfer system:
The introduction of a distributed database and its scalability can better cope with the demand for a substantial increase in the number of fund users and transactions in the future while keeping the business structure unchanged.
The system involves two dimensions: user records and transaction records. At the same time, part of the clearing must be completed in a serial manner. Therefore, a better solution in terms of architecture is unitization, that is, the business is divided into several data units that can independently complete key business processes. Interaction between DBs is avoided, different business units are deployed in tenant or user dimensions, and their master nodes are placed on different hosts, thereby improving overall batch processing efficiency and ensuring hardware resource utilization.
Within each unit, it is necessary to ensure that its data is relatively uniform and the execution efficiency is equivalent, because the overall processing time of the system is essentially the time when the last unit completes the liquidation, which requires the database to have good tenant resource isolation capabilities.
Compared with the sharding unit, it is the aggregation unit. The aggregation unit has all the data and is responsible for carrying the global analysis load. This requires the database to have good AP/TP load isolation capabilities.
- Valuation System:
The system only involves the capital dimension, and its batch processing logic is relatively simple. Through the hot large table partition + TableGroup, the load can be evenly distributed to each node to achieve high performance. The current Oracle production environment runs in a set of high-end Oracle databases. , the batch running time is at the hour level, and the CPU reaches full load during the peak period. OceanBase uses twice as much data as the production for batch running pressure testing. In the 2-2-2 cluster environment of domestic chips, it can be completed within 6 minutes. .
- Investment Analysis System:
A common problem online is that the occasional execution plan goes wrong, which leads to a longer time for complex query SQL. This problem is also one of the common factors that affect performance in Oracle databases. OceanBase supports HINT to describe how to fix the plan, and has bindings. The ability to determine the execution plan goes a step further. The current OCP 3.2.1 version already supports the white screen binding history execution plan.
For the above scenario, if the SQL has been running normally before the execution plan is changed, the previous normal execution plan can be found. If the root cause is indeed the execution plan change after locating, the historical execution plan can be directly bound with one key. This capability greatly reduces the operational risk and complexity of tuning complex SQL plans in an online environment.
In addition, complex analysis SQL often runs in the PL environment in large quantities, and business logic is in stored procedures, functions, and anonymous blocks. The degree of support of the database itself to the PL language also determines the difficulty of business migration and transformation. OceanBase is compatible with The basic common syntax of PL, in this project, the PL code originally running in Oracle has almost no business logic modification.
Database Deployment Architecture
- In-cluster high availability:
The OceanBase cluster is deployed in the same computer room, and the high availability and effect it provides can be considered as an Oracle RAC with a higher upper limit of the expansion capacity of a group of nodes and no single point of storage risk: automatic failover without the participation of manual or external tools, multi-node Multi-active provides services and adds nodes to complete automatic resource expansion. OceanBase's high-availability base based on the Paxos majority consensus protocol has experienced the load test of Alibaba/Ant Group's "Double Eleven" level, and TableGroup, intra-cluster load balancing and other daily operation and maintenance methods also rely on high-availability switching masters. It can be considered that the high availability in the cluster is a mature and stable capability, which can effectively resist the business interruption caused by the unplanned failure of the infrastructure.
- Inter-cluster high availability:
The high-availability protocol based on the majority consensus cannot effectively resist the risks in the dual-machine room scenario. If the majority node is placed in the main room, the main room will fail to provide services; if the majority is placed in the backup room, the main machine room , it has the capability of failover, but if the standby computer room fails, the availability of the main computer room will be affected, and the overall system availability will not be substantially improved. In the absence of three computer rooms in the same city, if you want to use the high availability in the cluster to achieve "active-active" ” level availability, at least a third computer room needs to be introduced to store log copies for Paxos voting.
In this context, OceanBase provides cluster-level active and standby capabilities, and provides high-availability capabilities at the computer room level for dual-computer room scenarios. For the switchover drill in the computer room, when the switchover drill is performed with production data, the roles of the active and standby clusters can be exchanged through switchover. After the drill, the business load returns to the main engine room, and the switchover is executed again on the database side. When performing drills without production data, the second standby cluster is decoupled from the main cluster so that it becomes a separate cluster. After the test data is run, the standby cluster is rebuilt.
Compared with the well-known stand-alone database, OceanBase is not well known by the majority of technical personnel in terms of development, use and operation and maintenance management. At the same time, the product architecture and functional details are constantly iterating. The successful launch of many core businesses is inseparable from the trust of customers and partners. OceanBase will also focus on solving real business scenario problems, continuously improve product capabilities, and use technology to make data management and use easier.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。