Preface
With the rapid growth of public clouds such as AWS, Alibaba Cloud, Azure, and Tencent Cloud, more and more companies are considering a move to the public cloud.
Over years of architecture consulting, I have found that many traditional companies have no clear design in mind when they build their cloud infrastructure. They simply buy cloud products and use them, and assume that this is what going to the cloud means.
In fact, without a sound cloud infrastructure architecture, the environment becomes hard to maintain, the expected results are not achieved, and costs go up.
In this article, I share some public cloud design experience and ideas from past projects as a reference for designing cloud infrastructure architecture in microservice-based scenarios. The cloud here refers to public clouds such as Alibaba Cloud and AWS.
Why go to the cloud
The benefits and value of going to the cloud have already been discussed in many articles.
I will only list a few of them briefly here.
- Reduced costs
- Scalability
- Professional operation and maintenance
- Speed
Reduce costs
Cost reduction mainly concerns two types of cost.
- Companies no longer need to hire dedicated operations staff to maintain servers. (Large companies are the exception: they usually still need many operations staff to maintain their cloud assets or to build their own cloud.)
- Companies no longer need dedicated staff to develop operations tooling. Today's mainstream cloud platforms come with rich functionality and complete after-sales support.
Scalability
Scalability is hard to achieve with traditional servers, and it is a major reason why cloud computing keeps growing in popularity.
We can scale a service up when the business needs more capacity, and we can also scale it down again to save costs.
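As a rough illustration of scaling up and down with demand, here is a minimal sketch of a scaling decision. The function name, the 60% CPU target, and the instance limits are all assumptions made for this example. They are not taken from any cloud provider's API, and real auto scaling services add cooldowns and other safeguards.

```python
import math


def desired_instances(current: int, cpu_percent: float,
                      target_percent: float = 60.0,
                      min_instances: int = 1,
                      max_instances: int = 10) -> int:
    """Return how many instances to run so average CPU moves toward the target.

    Hypothetical helper: the target and the limits are illustrative defaults.
    """
    if current < 1 or target_percent <= 0:
        raise ValueError("current must be >= 1 and target_percent must be > 0")
    # Scale the instance count proportionally to the load, rounding up.
    wanted = math.ceil(current * cpu_percent / target_percent)
    # Never go below the minimum or above the maximum we are willing to pay for.
    return max(min_instances, min(max_instances, wanted))


print(desired_instances(4, 90.0))  # busy period: scale out
print(desired_instances(4, 30.0))  # quiet period: scale in to save costs
```

The upper limit is what keeps costs bounded, and the lower limit keeps the service available when traffic is very low.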
Professional operation and maintenance
Going to the cloud does not mean there is no operation and maintenance any more. It means that operation and maintenance of the infrastructure is handed over to the professionals at the cloud platform, and you only need to care about how to build your own products on top of that infrastructure.
Speed
Speed improves in every respect. For example, creating a new server or scaling out a service can take about a minute. Because setup and configuration take less time, development time is shortened as well. Trial and testing cycles get shorter, and customers get a usable product sooner.
Cloud infrastructure architecture maturity assessment
So after going to the cloud, how do we know whether our cloud infrastructure architecture is good enough?
For this we need a maturity assessment model for cloud infrastructure.
I put together this cloud infrastructure architecture maturity assessment model from my architecture consulting work over the years, drawing on multiple projects.
It is mainly divided into 8 dimensions and 5 levels.
The 8 dimensions are:
- Scalability: the cloud infrastructure can scale freely according to business needs
- Reproducibility: the cloud infrastructure can be replicated quickly according to business needs
- Recoverability: the cloud infrastructure can recover automatically or quickly after it goes down
- Availability: the cloud infrastructure is designed to keep services highly available
- Security: the cloud infrastructure is designed to a high security standard
- Quantitative management: the cloud infrastructure can be measured and managed quantitatively to optimize costs
- Maintainability: the cloud infrastructure is simple to maintain
- Composability: the cloud infrastructure can be combined according to business needs
The 5 levels are:
- Original level: no cloud infrastructure at all
- Basic level: some basic cloud infrastructure has been tried
- Standard level: all infrastructure is on the cloud
- Mature level: all infrastructure is on the cloud, and best practices for cloud infrastructure architecture have been mastered
- Leading level: a self-built cloud
Combining the above model, we can get the following scoring.
Based on this scoring, we can get the following evaluation chart.
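To make the idea of scoring more concrete, here is a minimal sketch of how such an assessment could be computed. The scoring rule is an assumption made for this example: each of the 8 items is rated from 1 to 5, the ratings are averaged, and the average is rounded down to one of the 5 levels. The identifier names, including `composability` for the eighth item, are also assumptions.

```python
# Hypothetical identifiers for the 8 assessment items listed above.
DIMENSIONS = (
    "scalability", "reproducibility", "recoverability", "availability",
    "security", "quantitative_management", "maintainability", "composability",
)
# The 5 levels, from lowest to highest.
LEVELS = ("original", "basic", "standard", "mature", "leading")


def assess(scores: dict[str, int]) -> tuple[float, str, list[str]]:
    """Return (average score, overall level, weakest dimensions).

    Assumed rule: the average of the 1-5 ratings, rounded down, picks the level.
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    for d in DIMENSIONS:
        if not 1 <= scores[d] <= 5:
            raise ValueError(f"score for {d} must be between 1 and 5")
    average = sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)
    level = LEVELS[int(average) - 1]
    lowest = min(scores[d] for d in DIMENSIONS)
    weakest = [d for d in DIMENSIONS if scores[d] == lowest]
    return average, level, weakest


example = {
    "scalability": 4, "reproducibility": 3, "recoverability": 3,
    "availability": 4, "security": 2, "quantitative_management": 2,
    "maintainability": 3, "composability": 3,
}
print(assess(example))
```

Reporting the weakest items alongside the overall level shows where to invest next, which is the kind of information an evaluation chart is meant to convey.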
Next, let's talk about the architecture design itself.
I will mainly cover VPC design, access control design, security design, and database design.
VPC design
VPC stands for Virtual Private Cloud, a logically isolated private network on the cloud.
The main purpose of using VPCs is isolation between environments: first, to keep environments from contaminating each other, and second, to ensure security.
Therefore, as shown in the figure below, an enterprise usually designs the following VPC environments:
- Production environment
- Test environment
- Development environment
- UAT environment
The production environment is the VPC in which all our online products run, and it is the only environment that users can reach.
The test environment is used for testing, and it is an environment that most of the company's developers and testers can access.
The development environment is where daily work happens. It is usually connected to the office network. This is where we put our Git server, pipelines, image registry, artifact repository, and so on.
The UAT environment is usually for customer verification. For example, before going live, customers need to verify that the system meets expectations. The test environment cannot be used for this, because customers usually need to import some real data for their testing, and the UAT environment has to be kept clean.
In addition, if the system has to be deployed in multiple regions, multiple VPCs are needed, because a VPC is a region-level resource and cannot span regions.
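When planning several VPCs, it usually helps to give each one its own non-overlapping address range, since overlapping ranges typically get in the way if the VPCs later need to communicate. The sketch below uses Python's standard `ipaddress` module to carve one block per environment out of a parent range. The parent range `10.0.0.0/14`, the `/16` size, and the function names are assumptions chosen for the example.

```python
import ipaddress


def plan_vpcs(parent_cidr: str, environments: list[str],
              new_prefix: int = 16) -> dict[str, ipaddress.IPv4Network]:
    """Assign one non-overlapping CIDR block per environment.

    Hypothetical helper: the parent range and prefix length are illustrative.
    """
    parent = ipaddress.ip_network(parent_cidr)
    blocks = parent.subnets(new_prefix=new_prefix)
    plan = {}
    for env in environments:
        try:
            plan[env] = next(blocks)
        except StopIteration:
            raise ValueError(f"{parent_cidr} is too small for {env}") from None
    return plan


def has_overlap(networks) -> bool:
    """Return True if any two networks share addresses."""
    nets = list(networks)
    return any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])


plan = plan_vpcs("10.0.0.0/14", ["production", "test", "development", "uat"])
for env, cidr in plan.items():
    print(env, cidr)
```

A multi-region deployment could reuse the same idea with one parent range per region.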
Access control design
Our cloud resources are not open to everyone. Access to the production environment in particular should be strictly controlled, first to prevent mistakes by insiders, and second to prevent intrusion by attackers.
Yet I have met many customers who have no access control design at all: all developers share one or a few accounts. This is a very dangerous practice. It is hard to manage and carries many risks.
Today's public clouds all provide access control, for example Resource Access Management (RAM) on Alibaba Cloud and Identity and Access Management (IAM) on AWS.
Access control design is mainly considered along the following dimensions:
User Management
- Users are divided into real users and virtual users.
- Real users are actual people, such as employees and users.
- A virtual user is an account assigned to a system. For example, a system may need permission to upload pictures.
Read and write separation
- Some people should only have read-only access.
- Some accounts provided to systems should only have read-only permission.
- For example, the account that reads user avatars from the object store should only have read-only permission.
- The administrator, or the image upload function, needs write permission.
Role management
- Different people may have the same role.
- The same person may have different roles.
- The role determines which set of permissions we have.
In addition, whether access control for the cloud infrastructure should be integrated with the enterprise's own single sign-on also needs to be considered in the design.
To sum up, access control should follow the principle of least privilege in order to maximize the security of the system.
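The ideas above (real and virtual users, read/write separation, and roles as sets of permissions) can be sketched in a few lines. This is only an illustrative model, not the policy format of any cloud provider. The class names, role names, and permission strings such as `avatars:read` are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """A role is a named set of permissions."""
    name: str
    permissions: frozenset[str]


@dataclass
class User:
    """A real person or a virtual user (an account assigned to a system)."""
    name: str
    is_virtual: bool = False
    roles: list[Role] = field(default_factory=list)

    def can(self, permission: str) -> bool:
        # Least privilege: anything not granted by some role is denied.
        return any(permission in role.permissions for role in self.roles)


# Hypothetical roles separating read access from write access.
AVATAR_READER = Role("avatar-reader", frozenset({"avatars:read"}))
AVATAR_UPLOADER = Role("avatar-uploader", frozenset({"avatars:read", "avatars:write"}))

web_frontend = User("web-frontend", is_virtual=True, roles=[AVATAR_READER])
upload_service = User("upload-service", is_virtual=True, roles=[AVATAR_UPLOADER])

print(web_frontend.can("avatars:write"))    # the read-only account cannot write
print(upload_service.can("avatars:write"))  # the upload function can
```

Because permissions hang off roles rather than individual accounts, the same role can be given to several people and one person can hold several roles.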
Security design
A common misconception is: I am already on the cloud, so if I buy a firewall or something similar, I am safe.
In fact, the public cloud is exposed to the Internet, so it also needs a complete security design to protect the cloud infrastructure.
Hide what should be hidden inside the private network, and expose as little as possible.
The security design of the infrastructure mainly covers the following aspects:
Network security
- Transmission security, such as how data is encrypted in transit
- Whether the network is exposed
- Whether the network design is reasonable
Data security
- Whether the data is adequately protected
- Whether the data is exposed
Permission security
- Whether permissions are designed according to least privilege
- Whether read and write permissions are separated
To sum up, two principles need to be followed when designing:
- Zero trust network
- Least Privilege Design
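One practical way to check whether the network is exposed is to review inbound rules and flag anything open to the whole Internet that was not meant to be public. The sketch below is an assumption-laden illustration: the `IngressRule` shape and the choice of port 443 as the only intentionally public port are made up for the example and do not mirror any provider's security group API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """Hypothetical inbound rule: who may connect, and to which port."""
    source_cidr: str
    port: int


# Assumption for this example: only HTTPS is meant to be public.
PUBLIC_PORTS = frozenset({443})


def exposed_rules(rules, public_ports=PUBLIC_PORTS):
    """Return rules that are open to any address on a non-public port."""
    findings = []
    for rule in rules:
        source = ipaddress.ip_network(rule.source_cidr)
        # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
        if source.prefixlen == 0 and rule.port not in public_ports:
            findings.append(rule)
    return findings


rules = [
    IngressRule("0.0.0.0/0", 443),     # public website, intended
    IngressRule("0.0.0.0/0", 3306),    # database open to the Internet
    IngressRule("10.0.0.0/16", 3306),  # database reachable only inside the VPC
    IngressRule("0.0.0.0/0", 22),      # SSH open to the Internet
]
for finding in exposed_rules(rules):
    print("exposed:", finding)
```

A check like this could run regularly against an export of the real rules, so that accidental exposure is noticed early.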
Database Design
This section is mainly about designing the database for high availability and high performance.
High-availability solutions usually go in 3 directions:
Primary/standby architecture
- Usually there are multiple nodes, placed in different availability zones
Disaster recovery
- Disaster recovery is mainly divided into remote (cross-region) and same-city disaster recovery
Backup and restore
- How to recover quickly and automatically after the database goes down
- How to retrieve data lost during recovery
The high-performance design here does not cover sharding (splitting databases and tables), because this article is only about the architecture design of the infrastructure.
We usually need to consider:
- How to scale elastically?
- Is read/write separation needed?
- How to ensure data consistency after read/write separation?
- How many read replicas are needed for the business scenario?
- How to design the cache?
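To illustrate the read/write separation questions above, here is a minimal routing sketch. It sends writes to the primary and spreads reads across read replicas in round-robin order. The class name, the host names, and the `require_fresh` flag are assumptions for the example. Sending reads that must see the latest write to the primary is only one possible answer to the consistency question, since replicas may lag behind.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router that picks a database endpoint per statement."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Round-robin over the replicas; None when there are no replicas.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str, require_fresh: bool = False) -> str:
        # Simplification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and not require_fresh and self._replicas is not None:
            return next(self._replicas)
        # Writes, and reads that must see the latest data, go to the primary.
        return self.primary


router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT name FROM users WHERE id = 1"))
print(router.route("UPDATE users SET name = 'a' WHERE id = 1"))
print(router.route("SELECT balance FROM accounts WHERE id = 1", require_fresh=True))
```

Adding a read replica then only means adding one more entry to the replica list, which is one simple way to scale reads.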
The following is a general high-performance and high-availability database architecture for your reference.
Cloud infrastructure architecture design
To sum up, a common microservice-based cloud infrastructure design will look roughly like the figure below.
When designing, we may need to consider far more than what the figure shows.
For example, we have to consider:
- How multiple VPCs communicate
- How the cluster is orchestrated
- Which database to choose
- Which tools to use for log collection so that logs are easy to collect and search
- Which tools to use for operations monitoring so that monitoring is comprehensive and alerting is intelligent
- Whether the MQ can handle certain special scenarios
- Whether third-party services have any special requirements
- Whether this architecture can scale dynamically, both horizontally and vertically
Summary
I hope what I have shared above gives you some reference points when designing cloud infrastructure architecture.
This kind of architecture design touches on a very wide range of topics, and different projects have different design requirements.
Therefore, the final design must fit the actual situation of the project. A design that meets the business needs is a good design.
Remember one thing: there is no single correct design, only a design that fits.