Welcome to the DataWorks feature practice series, which analyzes the pain points you may hit when implementing business workflows and helps you use DataWorks features more efficiently.

Recap of previous issues:

The first two issues covered the main points of data synchronization with DataWorks: synchronization solutions and resource groups. In practice, the development and production environments often need to be isolated: the development environment is used to test data synchronization, and the production environment is used to synchronize production data. This issue introduces the main points of isolating development and production environments in DataWorks.

Featured function: standard mode, which isolates the development environment from the production environment

To support production data with different security control requirements, DataWorks provides two workspace modes: simple mode and standard mode. Simple mode does not distinguish between a development environment and a production environment. Standard mode provides both, isolated from each other, so you can process data tasks in the development environment and the production environment separately.

## Part1: DataWorks workspaces in simple mode and standard mode

First, here is the main difference between the two workspace modes.

| Simple mode | Standard mode |
| --- | --- |
| In a simple mode workspace, one DataWorks workspace corresponds to one underlying MaxCompute project (or an EMR cluster, Hologres database, etc.), and that environment is regarded as the production (PROD) environment. | In a standard mode workspace, one DataWorks workspace corresponds to two underlying MaxCompute projects (or two EMR clusters, Hologres databases, etc.): one is regarded as the development (DEV) environment and the other as the production (PROD) environment. |
| ![Simple mode workspace](https://ucc.alicdn.com/pic/developer-ecology/5905f40277b34b25b42339b43191595c.png) | ![Standard mode workspace](https://ucc.alicdn.com/pic/developer-ecology/06025e5dcc2a4570bcea8a5c681a7e4a.png) |
As shown above, a standard mode workspace isolates the development and production environments. As a result, data access and permission control work differently in the two environments, and each has its own points to note.

## Part2: Data Access in Different Mode Workspaces

You can set the data access identity for each workspace mode under Workspace Configuration > Compute Engine Information.

| Compute engine type | Environment | Standard mode workspace | Simple mode workspace (development and production share one environment) |
| --- | --- | --- | --- |
| MaxCompute | Development environment | Tasks run on the page (fixed): the task executor, that is, the currently logged-in user. | Tasks run on the page (fixed): the task executor, that is, the currently logged-in user. Scheduled access identity (selectable): Alibaba Cloud main account, Alibaba Cloud RAM role, or task owner (the identity of the task owner's account). |
| MaxCompute | Production environment | Scheduled access identity (selectable): Alibaba Cloud main account, Alibaba Cloud RAM user, or Alibaba Cloud RAM role. | Same as the row above. |
| E-MapReduce | Development environment | Access identity in shortcut mode: the Hadoop user in the cluster, used uniformly. Access identity in safe mode: the task executor. | Access identity in shortcut mode: the Hadoop user in the cluster, used uniformly. Access identity in safe mode (selectable): task owner, Alibaba Cloud main account, or Alibaba Cloud RAM user. |
| E-MapReduce | Production environment | Access identity in shortcut mode: the Hadoop user in the cluster, used uniformly. Access identity in safe mode (selectable): task owner, Alibaba Cloud main account, or Alibaba Cloud RAM user. | Same as the row above. |
| Hologres | Development environment | Tasks run on the page (fixed): the task executor, that is, the currently logged-in user. | Tasks run on the page (fixed): the task executor, that is, the currently logged-in user. Scheduled access identity (selectable): Alibaba Cloud main account. |
| Hologres | Production environment | Scheduled access identity (selectable): Alibaba Cloud main account or Alibaba Cloud RAM user. | Same as the row above. |
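The MaxCompute rows of the access identity table can be written down as a small lookup. This is only an illustrative Python sketch: the dictionary, the function name `scheduled_identity_options`, and the `"standard"`/`"simple"` and `"dev"`/`"prod"` keys are hypothetical and are not part of any DataWorks API. The identity lists are transcribed from the table.

```python
# Scheduled access identities for the MaxCompute engine, transcribed from
# the table above. Keys are (workspace mode, environment). The dictionary
# and function names are hypothetical, not part of any DataWorks API.
MAXCOMPUTE_SCHEDULED_IDENTITIES = {
    ("standard", "prod"): (
        "Alibaba Cloud main account",
        "Alibaba Cloud RAM user",
        "Alibaba Cloud RAM role",
    ),
    # A simple mode workspace has a single environment, treated as PROD.
    ("simple", "prod"): (
        "Alibaba Cloud main account",
        "Alibaba Cloud RAM role",
        "task owner",
    ),
}


def scheduled_identity_options(mode: str, env: str) -> tuple:
    """Return the selectable scheduled access identities, or () if none are listed."""
    return MAXCOMPUTE_SCHEDULED_IDENTITIES.get((mode, env), ())


# The standard mode development environment lists no scheduled identity:
# page-run tasks there always use the currently logged-in executor.
print(scheduled_identity_options("standard", "dev"))   # ()
print(scheduled_identity_options("standard", "prod"))
```

Keeping the options in one table-like structure makes it easy to check a chosen identity against what the workspace mode allows before a task is scheduled.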
## Part3: Authorization Management Features of Different Mode Workspaces

DataWorks uses the RBAC permission model to manage which page functions users can see and which APIs they can use. This permission system maps naturally onto MaxCompute's RBAC role system. For details, see Member and Role Management and the relationship between member roles and permissions. Permission management features, advantages, and disadvantages differ between workspace types. The following table compares the two.
| Characteristic | Simple mode | Standard mode |
| --- | --- | --- |
| Permissions overview | In a simple mode workspace, the DataWorks "development" role is mapped to the "Role_Project_Dev" role of the bound MaxCompute project, so the DataWorks development role can read all data in the MaxCompute project by default. | In a standard mode workspace, the DataWorks "development" role is mapped to the "Role_Project_Dev" role of the bound MaxCompute project (DEV environment). The development role can therefore read all data in the MaxCompute project (DEV environment) by default. Because no role mapping exists with the MaxCompute project (PROD environment), the development role has no permission on MaxCompute (PROD environment) data by default. |
| Advantages | Simple, convenient, and easy to use. Granting a data developer the DataWorks development role is enough to complete all data warehouse development work. | Secure and standardized. A controlled code release process (including code review and code diff viewing) keeps the production environment stable and avoids unexpected problems such as dirty data spreading or task errors caused by faulty code logic. Data access is effectively controlled, which protects data security. |
| Disadvantages | Risk of instability and insecurity. The development role can add and modify code at any time without approval and submit it to the scheduling system, which can destabilize the production environment. For the MaxCompute compute engine, the development role by default has read and write permissions on all tables in the current MaxCompute project and can add, delete, and modify tables at will, which puts data security at risk. | The process is relatively complex. In general, one person cannot complete the entire data development and production process alone. |
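The default permission mapping in the table can be modeled in a few lines. This is an illustrative Python sketch rather than DataWorks or MaxCompute code: the names `ROLE_MAPPED_ENVIRONMENTS` and `developer_can_read_by_default` are hypothetical, and the model covers only the DataWorks development role and its mapping to "Role_Project_Dev".

```python
# Environments whose MaxCompute project maps the DataWorks "development"
# role to Role_Project_Dev, per workspace mode (taken from the table above).
# All names here are hypothetical and exist only for illustration.
ROLE_MAPPED_ENVIRONMENTS = {
    # Simple mode: one project, which is also the production environment.
    "simple": {"prod"},
    # Standard mode: only the DEV project carries the role mapping.
    "standard": {"dev"},
}


def developer_can_read_by_default(mode: str, env: str) -> bool:
    """Return True if the development role can read data in `env` by default."""
    if mode not in ROLE_MAPPED_ENVIRONMENTS:
        raise ValueError(f"unknown workspace mode: {mode!r}")
    return env in ROLE_MAPPED_ENVIRONMENTS[mode]


print(developer_can_read_by_default("simple", "prod"))    # True
print(developer_can_read_by_default("standard", "dev"))   # True
print(developer_can_read_by_default("standard", "prod"))  # False
```

The last line is the point of standard mode: a developer has no default access to production data and needs an explicit grant.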
## MaxCompute engine database table naming conventions in different modes

In simple mode, the development environment and the production environment are not distinguished, so the development database is the production database. In standard mode, the two environments are isolated and their database table names differ. If you need to access production tables from the development environment, strictly follow the naming conventions below to avoid operating on the production environment by mistake.
| Environment type | Table name in standard mode | Example |
| --- | --- | --- |
| Development environment | project name_dev.table name | If you create a development table user_info under the project projectA, the database table name is projectA_dev.user_info. |
| Production environment | project name.table name | If you create a production table user_info under the project projectA, the database table name is projectA.user_info. |
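The naming convention above can be captured in a small helper. This is a minimal Python sketch, not a DataWorks or MaxCompute API: the function name `qualified_table_name` and the `"dev"`/`"prod"` values are hypothetical, and only the `_dev` suffix rule comes from the table.

```python
def qualified_table_name(project: str, table: str, env: str) -> str:
    """Build a MaxCompute table name for a standard mode workspace.

    Hypothetical helper: it encodes only the convention described above
    (development = "<project>_dev.<table>", production = "<project>.<table>").
    """
    if env == "dev":
        return f"{project}_dev.{table}"
    if env == "prod":
        return f"{project}.{table}"
    raise ValueError(f"unknown environment: {env!r}")


print(qualified_table_name("projectA", "user_info", "dev"))   # projectA_dev.user_info
print(qualified_table_name("projectA", "user_info", "prod"))  # projectA.user_info
```

Building names through one function, instead of typing them by hand, reduces the chance of pointing a development query at a production table.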
For more differences between simple mode and standard mode, see the help center.

## Scenario practice: permission management and standardized data development

This practice walks you through the standard process and the recommended permission controls for data development in a standard mode workspace.

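Author: 阿里云开发者 (Alibaba Cloud Developer)

Alibaba's official technology account, sharing technical innovation, hands-on experience, and engineers' growth stories from across the Alibaba economy.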

