Author: Li Jingxia Source: Light Finance

In the era of "data is king", financial big data is known as a "gold mine waiting to be tapped", and its value is now widely recognized.

Since big data was first written into the government work report as a national strategy in 2014, financial institutions have continuously introduced big data platforms and built big data systems.

Today, big data is a key part of the core competitiveness of financial institutions. The data middle platform and the big data platform, in particular, have become central to their comprehensive digital transformation. Financial institutions increasingly depend on data to serve customers, innovate products, and manage internal operations.

It is worth noting that in recent years the data middle platform has dominated discussion in the financial industry, while the big data platform has received relatively little attention. With the rise of cloud computing, AI and other technologies and their deepening integration with big data, the big data platform now stands at a new juncture.

01 New juncture

The application of big data, artificial intelligence and related technologies is turning banks' data into a high-value asset, promoting technology empowerment and innovation in scenario-based applications, and in turn driving the reconstruction of internal IT systems and the transformation of banks' organizational structures.

"Establish and improve an enterprise-level big data platform, and fully release the core value of big data as a basic strategic resource," the central bank's "FinTech Development Plan (2019-2021)" stated. So what is a big data platform?

According to the definition in the "General Requirements for Financial Big Data Platforms" (hereinafter referred to as the "Requirements") released on December 29, 2021, a financial big data platform is an enterprise-level, distributed, open, and unified big data platform, which should include components for data access, data storage, data processing, data analysis and data services.

The overall goal of the financial big data platform is to help financial institutions develop, deploy and manage financial big data applications more efficiently and quickly, and to meet the challenges posed by more real-time data and Internet business.

No discussion of big data computing technology can avoid the open source big data suite Apache Hadoop. After Hadoop's incubation was complete in 2008, Cloudera, a commercial company, launched its own Hadoop distribution, CDH (Cloudera's Distribution Including Apache Hadoop). CDH is also open source, but it is friendlier in terms of stability, management, deployment, and operation and maintenance, which helped Hadoop get adopted in practice.

Around 2011, Hadoop technology matured. At the same time, data volumes expanded rapidly with the rise of Internet finance, and traditional data systems could no longer meet the needs of financial institutions, so distributed Hadoop systems made it onto these institutions' selection lists.

Two years later, financial institutions began rolling out Hadoop-based big data platforms in quick succession. For example, Agricultural Bank of China started to build an autonomous and controllable big data platform in 2013 and ultimately chose a hybrid architecture of MPP database plus Hadoop; in 2014, ICBC formally built a big data platform based on Hadoop technology.

After 2015, the mobile Internet accelerated the change in customer behavior patterns, and financial institutions entered a new era of digital transformation. They had to handle ever larger volumes of data and also analyze customer data in response to changing behavior patterns, for precision marketing and similar uses. At this point, many organizations moved functions such as data analysis onto Hadoop systems.

Statistics on the 40 to 50 big data platforms tested by the China Academy of Information and Communications Technology in 2019 show that more than 70% of them are products built through secondary development on the community editions of CDH and HDP.

The current big data platform is standing at a new juncture.

On the one hand, Cloudera previously announced the end of CDH6 and HDP3 service support at the end of 2021 and March 2022, and instead launched a new product CDP. This means that the CDH and HDP systems used by financial institutions in the past are facing a comprehensive migration, and new alternative solutions are urgently needed.

On the other hand, amid the wave of financial technology innovation, localizing the big data platforms of financial institutions is the prevailing choice. The central bank's "Fintech Development Plan (2022-2025)" calls for speeding up the formulation and implementation of security plans for the financial industry's key software and hardware information infrastructure, and for effectively improving the industry's capabilities in this area.

In this context, where should the big data platforms of financial institutions go? At this new juncture, domestic third-party financial technology vendors have stepped forward, drawing on capabilities and experience accumulated over the years to offer financial institutions a wide range of big data platform solutions.

02 New trends

In addition to changes in the industry environment, big data platform technology itself is showing new trends, leading financial institutions to set higher requirements and goals for their big data platforms.

The first is convergence. One aspect is the convergence of big data with cloud computing, AI and other technologies, which has made deploying platforms on the cloud a major trend. However, because the financial industry is cautious about the risk and security of using public clouds, hybrid cloud architecture currently dominates. Cloudera's CDP is a hybrid cloud/multi-cloud big data platform.

Another aspect is integration with AI: AI algorithms can be applied to big data. On the one hand, big data provides data support for AI; on the other hand, some conventional algorithms used by AI can be fed back into the big data platform and combined with the characteristics of the data to make accurate product recommendations to customers.

IDC China's 2021H1 big data platform market share report shows that the overall market size reached 5.42 billion yuan, up 43.5% from the same period a year earlier. "... construction and policy-driven new infrastructure, etc."

The second is real-time capability. After years of deploying big data platforms, financial institutions have the infrastructure largely in place, and supporting business scenarios with high timeliness has become a new requirement. With the deep integration of big data, cloud computing, AI and other technologies, the market also believes that "big data" is rapidly moving into the era of "fast data". For financial institutions, this means improving the real-time nature of big data.

For example, ICBC began building high-timeliness big data scenarios in 2020. In addition to batch computing, its big data platform also needs real-time computing, online analysis, data APIs and other platforms to shorten the end-to-end closed-loop time of data, provide highly concurrent online access, and improve the timeliness with which data empowers the business.

The third is being forward-looking. The big data platform helps financial institutions understand customers better and plan ahead in the services they provide. The "Requirements" also note that the specific functional and technical requirements for a financial big data platform are divided into basic requirements and enhanced requirements, with the enhanced requirements derived from technology trends and the forward-looking needs of financial users. This means that financial institutions need to proactively improve their big data platforms from the perspective of customer needs.

The last is security. Both the autonomy and controllability of the big data platform technology in use and the security requirements for the data itself have been raised to a higher level. This raises the bar for financial institutions when they select partners or build big data platforms.

With third-party vendors entering the market, financial institutions have more choices when it comes to independent and controllable technology. The localization trend has ushered in a period of strategic opportunity for third-party service providers.

NetEase Shufan has launched a data development and management platform, a one-stop platform for big data management and development. It comprises two core parts, the big data platform and the data middle platform, and mainly covers big data development, task scheduling, data quality, data governance and data services.

The big data platform layer is essentially a Hadoop distribution. Compared with the community version, it integrates the latest version of Spark and offers comprehensive permission control and auditing, which can greatly improve the efficiency of offline ETL workloads. In addition, Shufan has made extensive functional enhancements and performance optimizations to the Impala component to ensure stability and performance in use.

The questions worth asking are: can localized products meet the needs of financial institutions, and how should financial institutions choose a new direction for their big data platforms?

03 New options

To answer this question, we must first clarify what financial institutions currently need.

First, independent and controllable financial technology, secure and controllable data, cost control, and rapid service response are the keywords in what financial institutions currently demand from big data platforms. Finance puts security first, and its technical requirements for data security and business continuity are usually higher than those of other industries.

Take cost control. One financial institution with strong in-house IT capabilities runs more than a dozen clusters, with an estimated several hundred nodes. At this stage, its data platform incurs 2-3 million in software cooperation costs. In addition, CDH is no longer updated, so a dedicated team has to be trained to maintain it, which adds further cost.

As a result, financial institutions often choose third-party vendors' products for the base software of their big data platforms. Faced with this situation, they may either continue on to CDP or migrate to big data platform base software built on localized technology.

Second, whatever product they choose, financial institutions pay attention to the "popularity" of big data platform products, that is, whether the underlying technology, such as Hadoop or Spark, is widely adopted. They also prefer products that are open source in nature.

"The dependence of financial institutions on the whole big data system is becoming more and more obvious," Jiang Hongxiang, head of the NetEase Shufan big data basic technology platform and senior architect, told Light Finance. He said the big data platform runs on low-cost servers and can be scaled out in a distributed fashion indefinitely, so in terms of cost, scalability and stability it is a good choice for financial institutions.

Beyond the product itself, financial institutions are paying more and more attention to the strength of third-party fintech companies and to their product services. Strong technical support, broad ecosystem compatibility, timely bug fixes, and rapid update iterations are all capabilities that suppliers need to have.

Judging from the current environment, domestic big data platforms have developed the following advantages: they are independent and controllable, with control in the hands of the enterprise; local service is quick to respond and communication is smooth; and vendors co-create with customers, engage deeply with the business, and support customized requirements.

Take NetEase Shufan's data development and management platform as an example. It is built on an open source base and is compatible with the ecosystem of CDH core components. In a standard product project, it can also accommodate 20% to 30% of customized development requirements.

In co-building a big data platform with a securities company, NetEase Shufan mainly collaborated on development of several sub-modules, such as data management, the security center, data standards, and data quality. It also customizes for the special needs of the securities industry, such as enhanced user portraits and trading-day scheduling, in which data is processed only on trading days, so as to form a platform solution better suited to the industry's characteristics.
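
The trading-day scheduling mentioned above can be illustrated with a minimal sketch. This is not NetEase Shufan's implementation; the function names and the holiday set are hypothetical, and the sketch assumes that a trading day is simply a weekday that is not listed as an exchange holiday. A real deployment would load the official exchange calendar.

```python
from datetime import date, timedelta

# Hypothetical holiday set, for illustration only. A real system would
# load the official exchange trading calendar instead.
EXAMPLE_HOLIDAYS = {date(2022, 10, 3), date(2022, 10, 4)}


def is_trading_day(day, holidays=EXAMPLE_HOLIDAYS):
    """Assume a trading day is Monday to Friday and not a listed holiday."""
    return day.weekday() < 5 and day not in holidays


def trading_days_between(start, end, holidays=EXAMPLE_HOLIDAYS):
    """Return every trading day in the inclusive range [start, end]."""
    days = []
    current = start
    while current <= end:
        if is_trading_day(current, holidays):
            days.append(current)
        current += timedelta(days=1)
    return days


def run_if_trading_day(day, job, holidays=EXAMPLE_HOLIDAYS):
    """Run job(day) only on trading days; otherwise skip and return None."""
    if not is_trading_day(day, holidays):
        return None
    return job(day)
```

A scheduler could call `run_if_trading_day` once per calendar day, so that batch jobs are skipped on weekends and holidays without changing the jobs themselves.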


NetEase Shufan Financial Big Data Solution Architecture

NetEase Shufan also offers a one-stop data middle platform and a rich set of data products. Building on the underlying components of the big data distribution, users can optionally adopt the one-stop data middle platform services and data products so that the business can use them out of the box. At present, NetEase Shufan serves a number of customers in the financial industry, including a financial technology subsidiary of a state-owned bank, Huatai Securities, Northeast Securities, Huaxia Wealth Management, and Huafu Securities, and its implementations have been thoroughly validated.

That NetEase Shufan can launch products meeting financial institutions' needs at this juncture is mainly due to its years of deep work in big data, during which it has built up a complete big data R&D ecosystem and rich experience in production operation and maintenance.

Before Hadoop came out, NetEase started to build its own distributed storage system in 2006. In 2011-2012, it introduced Hadoop to support email, news and other businesses. In 2015, to solve the problem of scattered components and the lack of unified management, NetEase began to develop big data platform tools and an integrated platform similar to CDH. In 2018, when big data was booming, NetEase Shufan developed a data middle platform, which became a common tool for all business units.

Over the past four years, NetEase Shufan has also developed its own methodology for the data middle platform.

Research and development in big data technology requires the support of a strong technical team. NetEase Shufan currently has hundreds of people on its big data platform and data middle platform teams, and can provide a three-in-one service guarantee covering technical support, customer operation and maintenance, and core R&D.

With its strong technology, product compatibility and service advantages, NetEase Shufan's big data platform products have attracted the attention of many financial institutions.

"Many financial customers prefer private deployment of cloud computing, so Shufan will be slightly slower in scenarios where big data platforms in the financial industry are deployed on the cloud. In non-financial industries, we have in fact already shifted toward cloud platforms," Jiang Hongxiang said of the future trend toward cloud-based big data platforms.

According to Statista's calculations, the global Hadoop and big data market was worth about US$34 billion in 2019, with a five-year compound annual growth rate of 28.5%. As the digital transformation of the financial industry deepens, financial institutions are becoming ever more dependent on big data, and the market for big data platforms will continue to grow.

Technology vendors entering the market with new, localized big data platform products are an inevitable choice for the financial industry, and financial institutions that move first can expect to gain an early advantage.

