Abstract: There are two technical paths to prevent data leakage. The first is authority management, which uses the principle of minimal authorization to authorize users and applications that use data. The other is data encryption, including encryption using SQL functions and transparent encryption.

This article is shared from the HUAWEI CLOUD community " [Security is no small matter] Data warehouse security you should know-encryption function ", the original author: zhangkunhn.

Preface

I recently encountered a customer scenario involving the permission of shared schema. The scenario can be simply described as: some users are data producers and need to create tables and write data in the schema; other users are data consumers and read the data in the schema for analysis. One way to implement this schema permission management is that the data producer informs the administrator user to use the grant select on all tables in schema syntax to grant consumer permissions every time a new table is created. This method has certain limitations. If the producer creates some new tables under the schema, in order to authorize the consumer to use these new tables, you need to inform the administrator that the user must use grant select on all tables in schema again to authorize. Is there a simple solution? The answer is yes, you can use Alter default privilege. Alter default privilege is used to grant or reclaim permissions for objects created in the future.
image.png

Grammar introduction

 ALTER DEFAULT PRIVILEGES
     [ FOR { ROLE | USER } target_role [, ...] ]
     [ IN SCHEMA schema_name [, ...] ]
     abbreviated_grant_or_revoke;

The abbreviated_grant_or_revoke clause is used to specify which objects are authorized or reclaimed. The syntax for table authorization is:

 GRANT { { SELECT | INSERT | UPDATE | DELETE | TRUNCATE | REFERENCES } 
     [, ...] | ALL [ PRIVILEGES ] }
     ON TABLES 
     TO { [ GROUP ] role_name | PUBLIC } [, ...]

Parameter Description

  • target_role

The name of an existing role. If FOR ROLE/USER is omitted, the default value is the current role/user.

Value range: the name of an existing role.

  • schema_name

The name of the existing pattern.

The target_role must have the CREATE permission of the schema_name.

Value range: the name of the existing mode.

  • role_name

The name of the role that was granted or revoked permissions.

Value range: the name of an existing role.

See ALTER DEFAULT PRIVILEGES syntax description for details

Example scenario

testdb=# create user creator1 password 'Gauss_234';  
 CREATE USER
 testdb=# create user creator2 password 'Gauss_234';  
 CREATE ROLE
 testdb=# create user user1 password 'Gauss_234';
 CREATE USER
 --创建共享schema,授予creator1和creator2创建权限,授予user1使用权限
 testdb=# create schema shared_schema;  
 CREATE SCHEMA
 testdb=> grant create, usage on schema shared_schema to creator1;
 GRANT
 testdb=> grant create, usage on schema shared_schema to creator2;
 GRANT
 testdb=# grant usage on schema shared_schema to user1;
 GRANT
 --将creator1和creator2在shared_schema中创建表的select权限授予user1
 testdb=# alter default privileges for user creator1, creator2 in schema shared_schema grant select on tables to user1;
 ALTER DEFAULT PRIVILEGES
 --切到creator1,建表
 testdb=# \c testdb creator1
 You are now connected to database "testdb" as user "creator1".
 testdb=> create table shared_schema.t1 (c1 int);
 CREATE TABLE
 --切到creator2,建表
 testdb=> \c testdb creator2
 You are now connected to database "testdb" as user "creator2".
 testdb=> create table shared_schema.t2 (c1 int);
 CREATE TABLE
 --切到user1,查询OK
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t1 union select * from shared_schema.t2;
  c1 
 ----
 (0 rows)

View the status of granting default permissions

Query the system table pg_default_acl to see which schemas are currently granted default permissions. From the Defaclacl field, you can see that creator1 and creator2 respectively granted user1 the select permission on the objects in the shared_schema (r stands for read).

 testdb=# select r.rolname, n.nspname, a.defaclobjtype, a.defaclacl from
 testdb-#     pg_default_acl a, pg_roles r, pg_namespace n
 testdb-#     where a.defaclrole=r.oid and a.defaclnamespace=n.oid;
  rolname  |    nspname    | defaclobjtype |     defaclacl      
 ----------+---------------+---------------+--------------------
  creator1 | shared_schema | r             | {user1=r/creator1}
  creator2 | shared_schema | r             | {user1=r/creator2}
 (2 rows)

Some details

All users who create objects in the shared schema should appear in the list after alter default privileges for user. Otherwise, if a user creator3 is not in the list, the objects created in the shared schema or those whose Owner is creator3 will not be queried by user1. Because the table created by the creator3 user in the shared schema does not grant user1 the default permissions.

testdb=# create user creator3 password 'Gauss_234';
 CREATE USER
 testdb=# grant create, usage on schema shared_schema to creator3;
 GRANT
 testdb=# \c testdb creator3
 You are now connected to database "testdb" as user "creator3".
 testdb=> create table shared_schema.t3 (c1 int);
 CREATE TABLE
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t3;
 ERROR:  permission denied for relation t3

The administrator can add creator3 to the list through alter default privileges for user to grant user1 the default permissions to access the creator3 user creation table, or the creator3 user himself can authorize it to user1 through alter default privileges. In the previous syntax parameter description, if FOR is omitted ROLE/USER, the default value is the current user.

testdb=> \c testdb creator3
 You are now connected to database "testdb" as user "creator3".
 testdb=> alter default privileges in schema shared_schema grant select on tables to user1;
 ALTER DEFAULT PRIVILEGES
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t3;
 ERROR:  permission denied for relation t3
 testdb=> \c testdb creator3
 testdb=> create table shared_schema.t4 (c1 int);
 CREATE TABLE
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t4;
  c1 
 ----
 (0 rows)

The third line of the above code grants user1 the select permission of the table created by the current user under shared_schema. In line 7, user1 queries shared_schema.t3 and reports insufficient permissions, because alter default privileges only deal with future objects. shared_schema.t3 was created before. We create a new table shared_schema.t4, user1 user query is normal.

If you want to deal with the permissions of an existing table, use the grant statement. See grant syntax description.

testdb=> \c testdb creator3
 You are now connected to database "testdb" as user "creator3".
 testdb=> grant select on all tables in schema shared_schema to user1;
 ERROR:  permission denied for relation t1
 testdb=> grant select on table shared_schema.t3 to user1;
 GRANT
 testdb=> \c testdb user1
 You are now connected to database "testdb" as user "user1".
 testdb=> select * from shared_schema.t3;
  c1 
 ----
 (0 rows)

The shared_schema in line 3 of the code contains three user-created tables, and creator3 is just the creator (Owner) of table t3. Therefore, an error will be reported if the permissions granted to the entire schema are granted only to the table t3 whose creator3 is the Owner, the user1 user query is normal.

alter default privileges only deals with future objects, grant only deals with existing objects. Furthermore, the objects involved in granting permissions in these two grammars only include objects whose Owner is the current user. If you want to grant permissions to all Owner objects under the shared schema, you need to use the alter default privileges for user syntax and grant syntax for the administrator user.

Transparent encryption

Application scenarios of transparent encryption

Transparent encryption can ensure the security of user data. Disk replacement, disk outflow, or illegal direct reading of disk files by operation and maintenance will bypass authentication, rights management, and auditing, leading to the risk of data leakage. When customers have high confidentiality requirements for business data, transparent encryption is recommended.

The principle of transparent encryption

The transparent encryption function encrypts and stores user data stored on the hard disk, and is not aware of users and upper-layer applications using SQL. The meaning of transparency means that it is unaware to customers, and only needs to configure transparent encryption when creating a GaussDB (DWS) cluster. Currently, it supports encrypted storage of row-stored table and column-stored table files, and supports cluster-level transparent encryption configuration.

Transparent encryption at the cluster level means that all libraries in the cluster and all tables in the library are stored encrypted. Transparent encryption at the cluster level also means that it needs to be configured when the cluster is created. After the cluster is created, it cannot be modified. It is neither possible to modify a non-encrypted cluster to an encrypted cluster, nor can an encrypted cluster to a non-encrypted cluster.

Encryption Algorithm

The core of transparent encryption is algorithm and key. We use the AES-128 algorithm, and the encryption mode uses CTR. CTR stream encryption can ensure that the length of the plaintext and the ciphertext are equal, and will not cause the data storage space to expand after encryption.

Key management

Use Huawei public cloud KMS service management to ensure the security of users' keys.
The encryption key hierarchy has three levels. Arranged in hierarchical order, these keys are master key (CMK), cluster key (CEK), and database key (DEK).

  • The master key is stored in KMS and is used to encrypt CEK.
  • CEK is used to encrypt DEK, CEK plaintext is stored in the cluster memory, and ciphertext is stored in the service management plane.
  • DEK is used to encrypt data in the database, DEK plaintext is stored in the cluster memory, and ciphertext is stored in the service management plane.
    image.png

Key rotation

For security reasons, users can perform key rotation operations. The key rotation only rotates the cluster key, regardless of the database secret key.

The subsequent evolution of transparent encryption

The advantage of cluster-level transparent encryption is that all data including user tables and system tables are encrypted, which is suitable for all encryption requirements. The two sides of a coin tell us that advantages can also be disadvantages. Encrypting all database objects will bring performance overhead to data import and query.

To solve this problem, we will consider supporting fine-grained transparent encryption in the future. For example, table-level transparent encryption can be supported. The user specifies the attribute as an encrypted table when creating a table, and the data of the user table will be encrypted and stored. The user can turn on the encryption attribute in the table containing sensitive data, and does not perceive the encryption and decryption process during query and use. Because the encryption granularity is small, the impact on performance is also small.

Transparent encryption is an effective means to ensure the security of users' core data. It introduces the transparent encryption feature of GaussDB (DWS) data warehouse from the usage scenarios and principles, and points out the research direction of subsequent transparent encryption features.

SQL function encryption

technical background

Cryptographic algorithms in cryptography can be divided into three categories: hash functions, symmetric cryptographic algorithms and asymmetric cryptographic algorithms.

  • Hash function

Hash function is also called digest algorithm. For data, Hash function generates fixed-length data, that is, Hash(data)=result. This process is irreversible, that is, the Hash function does not have an inverse function, and data cannot be obtained from the result. In scenarios where plaintext should not be saved, such as passwords are sensitive information, and system administrator users should not know the user's plaintext passwords, hash algorithms should be used to store the one-way hash value of the password.

In actual use, the salt value and the number of iterations are added to prevent the same password from generating the same hash value to prevent rainbow table attacks.
image.png

  • Symmetric cryptographic algorithm

Symmetric cryptographic algorithms use the same key to encrypt and decrypt data. Symmetric cipher algorithms are divided into block cipher algorithms and stream cipher algorithms.

The block cipher algorithm divides the plaintext into fixed-length blocks, and encrypts each block with a key. Since the packet length is fixed, when the plaintext length is not an integer multiple of the packet length, the plaintext will be filled. Due to the existence of padding, the length of the cipher text obtained by the block cipher algorithm will be greater than the length of the plain text.

The stream cipher algorithm calculates the plaintext bit by bit with the key stream. The stream cipher algorithm does not require padding, and the length of the cipher text obtained is equal to the length of the plain text.
image.png

  • Asymmetric cryptographic algorithm

Asymmetric cryptographic algorithm, also known as public key cryptographic algorithm. The algorithm uses two keys: a public key and a private key. The public key is disclosed to everyone, and the private key is kept secret. Asymmetric cryptographic algorithms are used in key agreement, digital signatures, digital certificates and other fields.
image.png

Technical realization

GaussDB (DWS) mainly provides hash functions and symmetric cryptographic algorithms. The hash function supports sha256, sha384, sha512 and national secret sm3. Symmetric encryption algorithm supports aes128, aes192, aes256 and national secret sm4.

Hash function

  • md5(string)

Use MD5 encryption to string and use hexadecimal number as the return value. MD5 has low security and is not recommended.

  • gs_hash(hashstr, hashmethod)

Use the hashmethod algorithm to digest the hashstr string, and return the message digest string. Supported hashmethods: sha256, sha384, sha512, sm3.

testdb=# SELECT gs_hash('GaussDB(DWS)', 'sha256');
                             gs_hash                              
------------------------------------------------------------------
 cc2d1b97c6adfba44bbce7386516f63f16fc6e6a10bd938861d3aba501ac8aab
(1 row)

Symmetric cryptographic algorithm

  • gs_encrypt(encryptstr, keystr, cryptotype, cryptomode, hashmethod)
    Use the encryption algorithm composed of cryptotype and cryptomode and the HMAC algorithm specified by hashmethod to encrypt the encryptstr string with keystr as the key, and return the encrypted string.
    Supported cryptotypes: aes128, aes192, aes256, sm4.
    Supported cryptomode: cbc.
    Supported hashmethods: sha256, sha384, sha512, sm3.
testdb=# SELECT gs_encrypt('GaussDB(DWS)', '1234', 'aes128', 'cbc',  'sha256');
                                                        gs_encrypt                                                        
--------------------------------------------------------------------------------------------------------------------------
 AAAAAAAAAADlzZYiNQK1uB+p1gza4Lu3Moj3HdP4E1uJmqfDYBaXDLMt7RZoE0YVx9h2dMRYBQ5fhFNqqM49sUkeS72o8kX5vWRQvfW3fuocGyp+b+lX9A==
(1 row)
  • gs_decrypt(decryptstr, keystr,cryptotype, cryptomode, hashmethod)
    Use the encryption algorithm composed of cryptotype and cryptomode and the HMAC algorithm specified by hashmethod to decrypt the decryptstr string with keystr as the key, and return the decrypted string. The keystr used for decryption must be consistent with the keystr used for encryption in order to decrypt normally.

testdb=# SELECT gs_decrypt('AAAAAAAAAADlzZYiNQK1uB+p1gza4Lu3Moj3HdP4E1uJmqfDYBaXDLMt7RZoE0YVx9h2dMRYBQ5fhFNqqM49sUkeS72o8kX5vWRQvfW3fuocGyp+b+lX9A==', '1234', 'aes128', 'cbc', 'sha256');

  gs_decrypt  
--------------
 GaussDB(DWS)
(1 row)

Effectiveness analysis

There is a student table with three attributes: id, name and score. The name can be encrypted and saved using a hash function, and the score can be saved using a symmetric cryptographic algorithm.

testdb=# create table student (id int, name text, score text);
CREATE TABLE
testdb=# insert into student values (1, gs_hash('alice', 'sha256'), gs_encrypt('95', '12345', 'aes128', 'cbc', 'sha256'));
INSERT 0 1
testdb=# insert into student values (2, gs_hash('bob', 'sha256'), gs_encrypt('92', '12345', 'aes128', 'cbc', 'sha256'));
INSERT 0 1
testdb=# insert into student values (3, gs_hash('peter', 'sha256'), gs_encrypt('98', '12345', 'aes128', 'cbc', 'sha256'));
INSERT 0 1

Users without a key cannot see the encrypted data in the two columns of name and score even if they have the select permission.

testdb=# select * from student;
 id |                               name                               |                                                          score                                                           
----+------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------
  1 | 2bd806c97f0e00af1a1fc3328fa763a9269723c8db8fac4f93af71db186d6e90 | AAAAAAAAAAB26RmKZdGciLdOM1Z0sjsHg6Qh1b8taF3cY5KDVm+faJK5AT9tjufkr3Wogj3tIpFfiIEb6+miGqPHWcmKnFsArAMoBG9pPDawGs1Qze7xGg==
  2 | 81b637d8fcd2c6da6359e6963113a1170de795e4b725b84d1e0b4cfd9ec58ce9 | AAAAAAAAAAB26RmKZdGciLdOM1Z0sjsHZOHH7URkyme6r8Hfh1k0UsVbgbREjFMkgB52w+7GtUGqGgUik07ghajSD9PMIDLd/49wBCVROm2/HSOw6jzbxA==
  3 | 026ad9b14a7453b7488daa0c6acbc258b1506f52c441c7c465474c1a564394ff | AAAAAAAAAAB26RmKZdGciLdOM1Z0sjsHwv6p/OAfDUyVULAqpaHIrYJYMcqLmQSj3K/REyavfMoKB7hgUpEPXfHRutWur37bru68jjt5XcBHFBjZeMgowA==
(3 rows)

The user who has the key can view the encrypted data through decryption.


testdb=# select id, gs_decrypt(score, '12345', 'aes128', 'cbc', 'sha256') from student;
 id | gs_decrypt 
----+------------
  1 | 95
  2 | 92
  3 | 98
(3 rows)

to sum up

Data encryption is an effective technology to prevent unauthorized access and prevent data leakage. The basic principles of cryptographic algorithms and the encryption functions of GaussDB (DWS) data warehouses are introduced, including the hash function gs_hash and the symmetric cryptographic algorithm gs_encrypt/gs_decrypt. An example is the usage scenario of the encryption function.

For more information about GuassDB (DWS), welcome to search "GaussDB DWS" on WeChat and follow the WeChat official account to share with you the latest and most complete PB-level data warehouse black technology~

Click to follow and learn about Huawei Cloud's fresh technology for the first time~


华为云开发者联盟
1.4k 声望1.8k 粉丝

生于云,长于云,让开发者成为决定性力量