Encryption and decryption with the Node.js crypto module

Continue to share technical blog posts and follow WeChat public account 👉🏻 Front-end LeBron

In the Internet age, the volume of data on the network grows at an alarming rate every day, and network security problems keep emerging along with it. As information security becomes more important, developers need to understand security better and use technical means to harden their services. The crypto module provides general-purpose encryption and hashing algorithms. These could be implemented in pure JavaScript, but they would be very slow. Node.js implements them in C/C++ and exposes them through the crypto module as a JavaScript interface, which is both convenient and fast.

Encoding

Why is encoding required for information transmission?
When encrypting and decrypting data, we often need to turn the encrypted byte array into a string for network transmission. If the bytes are decoded directly with a character encoding such as UTF-8, some byte values will have no corresponding character (ASCII, for example, defines only 128 characters), so errors appear during encoding and parsing and the information cannot be represented correctly. The usual solution is a binary-to-text encoding such as Base64 or hex.
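The problem can be reproduced with a minimal sketch that uses only Node's built-in Buffer. The four byte values are arbitrary and chosen purely for illustration: decoding them as UTF-8 loses information, while hex and Base64 round-trip exactly.

```javascript
// Four arbitrary bytes; the first three are not valid UTF-8 on their own.
const raw = Buffer.from([0xff, 0xfe, 0x80, 0x41]);

// Decoding as UTF-8 replaces invalid bytes with U+FFFD, so the round trip is lossy.
const viaUtf8 = Buffer.from(raw.toString('utf8'), 'utf8');
console.log(viaUtf8.equals(raw)); // false

// Hex and Base64 map every byte value to printable characters, so nothing is lost.
const viaHex = Buffer.from(raw.toString('hex'), 'hex');
const viaBase64 = Buffer.from(raw.toString('base64'), 'base64');
console.log(viaHex.equals(raw));    // true
console.log(viaBase64.equals(raw)); // true
```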

Hex encoding

Coding principle

Represent each 8-bit byte with two hexadecimal digits:

Split the 8 bits of the byte into two 4-bit groups
Treat each group as the lower 4 bits of a new byte: one new byte carries the upper 4 bits of the original, the other carries the lower 4 bits
Fill the upper 4 bits of both new bytes with 0, then output the hexadecimal digit for each of the two bytes as the encoding

Example

ASCII code: A (65)

Binary: 0100 0001

Regrouped: 00000100  00000001

Hexadecimal: 4         1

Hex encoding: 41

Even if the original file is plain English text, the encoded content looks completely different and is hard for most people to read. Because only 16 characters are involved, though, some programmers reportedly memorize the mapping and can read hex almost as easily as the original text. Note also that hex-encoded data takes up twice the space of the original.
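As a small illustration of the steps above, the sketch below splits a byte into its two 4-bit halves by hand and then lets Buffer do the same job. The variable names are only for this example.

```javascript
// Manual split of the byte for 'A' (0x41) into its upper and lower 4 bits.
const byte = 'A'.charCodeAt(0);               // 65
const high = (byte >> 4).toString(16);        // upper 4 bits
const low = (byte & 0x0f).toString(16);       // lower 4 bits
console.log(high + low); // 41

// Buffer performs the same encoding for every byte of the input.
const hex = Buffer.from('A', 'utf8').toString('hex');
console.log(hex); // 41

// Each byte becomes two characters, so the encoded form is twice as long.
const sample = Buffer.from('Hello', 'utf8');
console.log(sample.length, sample.toString('hex').length); // 5 10

// Decoding restores the original text.
console.log(Buffer.from(hex, 'hex').toString('utf8')); // A
```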

base64 encoding

Coding principle

Base64 represents binary data using 64 printable characters: A-Z, a-z, 0-9, + and /. With 64 characters, each character carries 6 bits, so 4 Base64 characters represent 3 bytes (24 bits).

To encode, the input is split into 3-byte groups, and each group is cut into four 6-bit values that are looked up in the character table.

When the input length is not a multiple of 3, the last group is padded with zero bits and the output is padded with "=" so that its length stays a multiple of 4.

Because it uses 64 characters, Base64 is harder to read than hex. However, every 3 bytes are encoded as 4 characters, so the output takes 4/3 of the original space, which is more compact than hex. Also note that although Base64-encoded data is hard to read, it must not be treated as encryption, because decoding requires no key.
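A short sketch of Base64 in Node, using Buffer. The helper name toBase64 is made up for this example. It shows the 3-bytes-to-4-characters mapping, the "=" padding, and the fact that decoding needs no key.

```javascript
// Hypothetical helper for this example: UTF-8 string -> Base64 string.
const toBase64 = (s) => Buffer.from(s, 'utf8').toString('base64');

console.log(toBase64('Man')); // TWFu  (3 bytes -> 4 characters, no padding)
console.log(toBase64('Ma'));  // TWE=  (2 bytes -> one "=" of padding)
console.log(toBase64('M'));   // TQ==  (1 byte  -> two "=" of padding)

// Anyone can decode the data; no key is involved.
const decoded = Buffer.from('TWFu', 'base64').toString('utf8');
console.log(decoded); // Man
```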

urlencode

Coding principle

As the name suggests, urlencode is designed for encoding URLs. The characters a-z, A-Z, 0-9, ., - and _ are output unchanged, while every other byte is encoded as %xx, where xx is the hexadecimal value of that byte. Since English letters are kept as they are, English-based content stays readable and takes almost no extra space. For non-English content, each byte becomes the 3 characters %xx, so the result is 3 times the original size. urlencode is therefore an English-friendly encoding scheme.
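In JavaScript, the closest built-in is encodeURIComponent, which the sketch below uses as a stand-in for urlencode. Note that it also leaves a few extra characters (! ~ * ' ( )) unescaped, so it is not byte-for-byte identical to every urlencode implementation.

```javascript
// Letters, digits, '.', '-' and '_' pass through unchanged.
const plain = encodeURIComponent('hello-world_1.0');
console.log(plain); // hello-world_1.0

// Other ASCII characters become %xx.
const mixed = encodeURIComponent('a b&c');
console.log(mixed); // a%20b%26c

// Each Chinese character is 3 bytes in UTF-8, and each byte becomes 3 characters.
const encoded = encodeURIComponent('中文');
console.log(encoded);        // %E4%B8%AD%E6%96%87
console.log(encoded.length); // 18

console.log(decodeURIComponent(encoded)); // 中文
```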

Hash

Digest: a hash function takes a message of arbitrary length as input and produces a fixed-length output. This output is called the digest.
Applicable scenarios: storing sensitive information in verifiable form, and checking that a message is complete and has not been tampered with

Features

Fixed output length: the input length is arbitrary, while the output length is fixed (it depends on the algorithm; common ones are MD5 and the SHA family).
Irreversible: given the result, the original input cannot be recovered by an inverse operation.
Highly discrete: a small change in the input causes a huge difference in the result.
Weak collision: different inputs may produce the same hash value, but finding such inputs should be computationally infeasible.
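The first and third properties are easy to observe with a short sketch. SHA-256 is used here only as an example algorithm, and the helper name sha256 is made up for this illustration.

```javascript
const crypto = require('crypto');

// Hypothetical helper for this example: hex-encoded SHA-256 digest of a string.
const sha256 = (input) => crypto.createHash('sha256').update(input).digest('hex');

// Fixed output length: 1 character and 10,000 characters both give 64 hex characters.
const short = sha256('a');
const long = sha256('a'.repeat(10000));
console.log(short.length, long.length); // 64 64

// Highly discrete: changing a single character gives an unrelated digest.
const h1 = sha256('hello');
const h2 = sha256('hellp');
console.log(h1);
console.log(h2);
console.log(h1 === h2); // false
```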

Take MD5 as an example

MD5 (Message-Digest Algorithm 5) is a hash function (also called a hash algorithm or digest algorithm) widely used in computer security, mainly to ensure the integrity and consistency of messages.
Common application scenarios: password protection, download file verification, etc.

Application scenarios

File integrity verification: for example, a website that offers software for download usually publishes the MD5 value of the file on the page. After downloading, the user can compute the MD5 of the downloaded file and compare it with the value on the website to confirm that the file is intact
Password protection: store the MD5 digest of the password in the database instead of the plaintext, so that the plaintext is not exposed if the database is dumped by an attacker.
Anti-tampering: for example, digital certificates rely on a digest algorithm to detect tampering. (It must of course be combined with digital signatures and other means.)

Simple md5 calculation

hash.digest([encoding])

Computes the digest. encoding can be hex, base64 or others. If encoding is specified, a string is returned; otherwise a Buffer is returned. Note that the hash object can no longer be used after hash.digest() has been called; calling it again throws an error.

hash.update(data[, input_encoding])

input_encoding can be utf8, ascii or others. If data is a string and input_encoding is not specified, utf8 is used. Note that hash.update() can be called multiple times.

const crypto = require('crypto');
const fs = require('fs');

const FILE_PATH = './index.txt'
const ENCODING = 'hex';

const md5 = crypto.createHash('md5');
const content = fs.readFileSync(FILE_PATH);
const result = md5.update(content).digest(ENCODING);
console.log(result);

// f62091d58876a322864f5a522eb05052
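The two notes about hash.update() and hash.digest() can be checked with a small sketch. The input string is arbitrary and no file is needed.

```javascript
const crypto = require('crypto');

// One call to update() with the whole input.
const oneShot = crypto.createHash('md5').update('Hello Node.js').digest('hex');

// Several calls to update() with pieces of the same input give the same digest.
const chunked = crypto.createHash('md5');
chunked.update('Hello ');
chunked.update('Node.js');
const chunkedDigest = chunked.digest('hex');
console.log(oneShot === chunkedDigest); // true

// After digest() the hash object is finalized; a second call throws.
let secondCallFailed = false;
try {
    chunked.digest('hex');
} catch (err) {
    secondCallFailed = true;
}
console.log(secondCallFailed); // true
```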

Password protection

As mentioned earlier, storing plaintext passwords in the database is very insecure.
The bare minimum is to store the MD5 digest instead.
For example, if the user's password is 123456, running it through MD5 gives: e10adc3949ba59abbe56e057f20f883e

This has at least two benefits:

Protection against internal attacks: the site's developers do not know the user's plaintext password, so they cannot misuse it, which protects the user's privacy
Protection against external attacks: if the site is hacked, the attacker obtains only the MD5 digests rather than the plaintext passwords, which offers some protection for the passwords

const crypto = require('crypto');

const cryptPwd = (password) => {
    const md5 = crypto.createHash('md5');
    return md5.update(password).digest('hex');
}

const password = '123456';
const cryptPassword = cryptPwd(password);
console.log(cryptPassword);

// e10adc3949ba59abbe56e057f20f883e

As described above, applying MD5 to user passwords improves security.
- In practice, though, this protection is very weak. Why?
- Look again at the example above: the same plaintext password always produces the same MD5 value.
So when an attacker knows that the algorithm is MD5 and the value stored in the database is e10adc3949ba59abbe56e057f20f883e, they can work out that the user's plaintext password is 123456.
This is how a rainbow table attack works: the MD5 values of common plaintext passwords are computed and stored in advance, then matched against the values in the site's database to recover plaintext passwords quickly.

So is there any way to further improve security?
The answer is: salt the password.

`Password plus salt`

The word "salt" sounds mysterious, but the principle is very simple:
insert a specific string at a specific position in the password, then run MD5 on the modified string.

For the same password, different salt values give very different MD5 values.

Salting prevents the simple table lookup described above. If the attacker does not know the salt in advance, cracking becomes much harder.

const crypto = require('crypto');

const cryptPwd = (password, salt) => {
    // Append the salt to the password before hashing
    const saltPassword = `${password}:${salt}`;
    console.log(`Original password: ${password}`);
    console.log(`Salted password: ${saltPassword}`);

    const md5 = crypto.createHash('md5');
    // Hash the salted string, not the bare password
    const result = md5.update(saltPassword).digest('hex');
    console.log(`MD5 of salted password: ${result}`);
}

const password = '123456';
const salt = 'abc';
cryptPwd(password, salt);
/*
Original password: 123456
Salted password: 123456:abc
MD5 of salted password: 51011af1892f59e74baf61f3d4389092
*/

`Password plus salt: random salt value`

Salting the password improves its security considerably.
But the example above still has several problems.

Assuming the string concatenation scheme and the salt value have been leaked, the code above has at least the following problems:

Short salt: there are few possible values to enumerate, so brute forcing is easy. Using a long salt solves this.
Fixed salt: likewise, the attacker only needs to precompute one hash table of common passwords plus that salt.

Short salts should obviously be avoided.
- Why a fixed salt should be avoided needs more explanation. The salt is often hard-coded in the code base (for example in a configuration file). Once an attacker learns the salt by some means, they only need to brute force against that one fixed salt.
For example, with the code above, once you know the salt is abc, you can immediately work out that 51011af1892f59e74baf61f3d4389092 corresponds to the plaintext password 123456.

So, how can this be improved? The answer is: a random salt.

With a random salt, the same password 123456 produces a different result each time it is hashed.

The advantage is that when multiple users share the same password, the attacker has to crack each one separately.

Even with a purely numeric 3-digit salt, cracking random salts takes far more computation than cracking a fixed salt.

The sample code is as follows

const crypto = require('crypto');

// Demo only: a 3-digit numeric salt taken from Math.random()
const getRandomSalt = () => {
    return Math.random().toString().slice(2,5);
}

const cryptPwd = (password, salt) => {
    const saltPassword = `${password}:${salt}`;
    console.log(`Original password: ${password}`);
    console.log(`Salted password: ${saltPassword}`);

    const md5 = crypto.createHash('md5');
    const result = md5.update(saltPassword).digest('hex');
    console.log(`MD5 of salted password: ${result}`);
}

const password = '123456';

cryptPwd(password, getRandomSalt());

/*
Original password: 123456
Salted password: 123456:126
MD5 of salted password: 3aeb1848ff63aa32b262bc3f8dd5bd82
*/

cryptPwd(password, getRandomSalt());

/*
Original password: 123456
Salted password: 123456:232
MD5 of salted password: 21a427268a5094322146e18e47b135fb
*/
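In the sample above the salt is only printed, but to check a password later the salt has to be stored next to the digest. The following is a rough sketch of that idea, not a production scheme: hashPassword and verifyPassword are hypothetical helpers, the salt comes from crypto.randomBytes instead of Math.random, and MD5 is kept only to stay consistent with the article (a deliberately slow function such as crypto.scrypt would be a more usual choice for real passwords).

```javascript
const crypto = require('crypto');

// Hypothetical helper: hash a password with a fresh 16-byte random salt.
const hashPassword = (password) => {
    const salt = crypto.randomBytes(16).toString('hex');
    const digest = crypto.createHash('md5').update(`${password}:${salt}`).digest('hex');
    return { salt, digest }; // both values would be stored in the user's record
};

// Hypothetical helper: recompute the digest with the stored salt and compare.
const verifyPassword = (password, record) => {
    const digest = crypto.createHash('md5').update(`${password}:${record.salt}`).digest('hex');
    // Both buffers are 16 bytes long, as timingSafeEqual requires.
    return crypto.timingSafeEqual(Buffer.from(digest, 'hex'), Buffer.from(record.digest, 'hex'));
};

// Two users with the same password end up with different stored digests.
const userA = hashPassword('123456');
const userB = hashPassword('123456');
console.log(userA.digest !== userB.digest);    // true
console.log(verifyPassword('123456', userA));  // true
console.log(verifyPassword('654321', userA));  // false
```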

`HMAC function`

HMAC stands for Hash-based Message Authentication Code. It can be loosely understood as a hash with a secret key mixed in, similar to the salting shown above.
Usage is similar to the hash API: choose the hash algorithm and supply the key (the "salt").
The difference from the example above is that there the salt is concatenated by hand, while here crypto.createHmac() takes care of combining the key with the data.

const crypto = require("crypto")
const fs = require("fs")

const FILE_PATH = "./index.txt"
const SECRET = 'secret'
const content = fs.readFileSync(FILE_PATH,{encoding:'utf8'})
const hmac = crypto.createHmac('sha256', SECRET);

hmac.update(content)
const output = hmac.digest('hex')
console.log(`Hmac: ${output}`)

// Hmac: 6f438ef66d3806ae14d6692d9610e55c41ebb4eb3ee73911a4d512bd1cade976

Note: Large files can be streamed
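One possible way to stream a large file through an HMAC is sketched below. hmacFile is a hypothetical helper, and the demo file is a throwaway written to the system temp directory so that the example runs on its own.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: compute an HMAC over a file without loading it all into memory.
const hmacFile = (filePath, secret, algorithm = 'sha256') =>
    new Promise((resolve, reject) => {
        const hmac = crypto.createHmac(algorithm, secret);
        fs.createReadStream(filePath)
            .on('data', (chunk) => hmac.update(chunk)) // feed the HMAC chunk by chunk
            .on('error', reject)
            .on('end', () => resolve(hmac.digest('hex')));
    });

// Usage: write a throwaway file to the temp directory and compute its HMAC.
const demoPath = path.join(os.tmpdir(), 'hmac-stream-demo.txt');
fs.writeFileSync(demoPath, 'Hello Node.js\n'.repeat(1000));

const streamed = hmacFile(demoPath, 'secret');
streamed.then((digest) => console.log(`Hmac: ${digest}`));
```

Because hmac.update() can be called many times, the streamed result is the same as hashing the whole file in one call.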

`Encryption and decryption`

Encryption and decryption mainly use the following methods:

Encryption:
- crypto.createCipher(algorithm, password)
- crypto.createCipheriv(algorithm, key, iv)
Decryption:
- crypto.createDecipher(algorithm, password)
- crypto.createDecipheriv(algorithm, key, iv)

`crypto.createCipher / crypto.createDecipher`

First look at crypto.createCipher(algorithm, password). Its two parameters are the encryption algorithm and a password. Note that crypto.createCipher() and crypto.createDecipher() have been deprecated since Node.js 10 and were removed in Node.js 22, so new code should use crypto.createCipheriv() and crypto.createDecipheriv() instead.

algorithm: the encryption algorithm, such as aes192
- The available algorithms depend on the local version of OpenSSL
- You can list the supported algorithms with openssl list -cipher-algorithms (openssl list-cipher-algorithms on older OpenSSL versions) or, from Node.js, with crypto.getCiphers()
password: used to derive the key and the initialization vector (IV)

crypto.createDecipher(algorithm, password) can be regarded as the inverse of crypto.createCipher(algorithm, password)

const crypto = require("crypto")

// Note: crypto.createCipher/createDecipher are deprecated and were removed in Node.js 22
const SECRET = 'secret'
const ALGORITHM = 'aes192'
const content = 'Hello Node.js'
const encoding = 'hex'

// Encrypt: keep the output of update() and append the final block
const cipher = crypto.createCipher(ALGORITHM, SECRET)
let output = cipher.update(content, 'utf8', encoding)
output += cipher.final(encoding)
console.log(output)
// Prints the ciphertext as 32 hex characters (one 16-byte AES block)

// Decrypt
const decipher = crypto.createDecipher(ALGORITHM, SECRET)
let input = decipher.update(output, encoding, 'utf8')
input += decipher.final('utf8')
console.log(input)

// Hello Node.js

`crypto.createCipheriv / crypto.createDecipheriv`

Unlike crypto.createCipher(), which derives the key and iv from the password supplied by the user, crypto.createCipheriv() requires the caller to provide the key and iv directly.
The key and iv can be either Buffers or utf8-encoded strings. What matters is their length:

key: depends on the selected algorithm
- For example, aes128, aes192 and aes256 need keys of 128, 192 and 256 bits (16, 24 and 32 bytes)
iv: the initialization vector; for AES in CBC mode it is 128 bits (16 bytes). It plays a role similar to a password salt

const crypto = require("crypto")

const key = crypto.randomBytes(192 / 8)   // 24-byte key for aes192
const iv = crypto.randomBytes(128 / 8)    // 16-byte initialization vector
const algorithm = 'aes192'
const encoding = 'hex'

const encrypt = (text) => {
    const cipher = crypto.createCipheriv(algorithm, key, iv)
    // Keep the output of update() and append the final block
    return cipher.update(text, 'utf8', encoding) + cipher.final(encoding)
}

const decrypt = (encrypted) => {
    const decipher = crypto.createDecipheriv(algorithm, key, iv)
    return decipher.update(encrypted, encoding, 'utf8') + decipher.final('utf8')
}

const content = 'Hello Node.js'
const crypted = encrypt(content)
console.log(crypted)

// e.g. db75f3e9e78fba0401ca82527a0bbd62 (differs on every run because key and iv are random)

const decrypted = decrypt(crypted)
console.log(decrypted)

// Hello Node.js
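The example above reuses one iv for every message. Since the iv behaves like a salt, a common variation is to generate a fresh iv per message and send it along with the ciphertext. The sketch below illustrates that idea under some assumptions of its own: the "iv:ciphertext" payload format and the choice of aes-256-cbc are made up for this example.

```javascript
const crypto = require('crypto');

const ALGORITHM = 'aes-256-cbc';
const key = crypto.randomBytes(32); // 256-bit key, generated here only for the demo

// Encrypt with a fresh random iv and return "iv:ciphertext", both hex-encoded.
const encrypt = (text) => {
    const iv = crypto.randomBytes(16);
    const cipher = crypto.createCipheriv(ALGORITHM, key, iv);
    const encrypted = Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
    return `${iv.toString('hex')}:${encrypted.toString('hex')}`;
};

// Split the payload, then decrypt with the iv that was sent along.
const decrypt = (payload) => {
    const [ivHex, dataHex] = payload.split(':');
    const decipher = crypto.createDecipheriv(ALGORITHM, key, Buffer.from(ivHex, 'hex'));
    const decrypted = Buffer.concat([decipher.update(Buffer.from(dataHex, 'hex')), decipher.final()]);
    return decrypted.toString('utf8');
};

const first = encrypt('Hello Node.js');
const second = encrypt('Hello Node.js');
console.log(first !== second); // true: same plaintext, different ciphertext
console.log(decrypt(first));   // Hello Node.js
console.log(decrypt(second));  // Hello Node.js
```

The iv does not need to be secret, which is why it can travel in the clear next to the ciphertext; only the key must stay private.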

`digital signature / signature verification`

Assumptions:
- The server's original message is M, the digest algorithm is Hash, and Hash(M) gives the digest H
- The public key is Pub, the private key is Priv, the asymmetric encryption algorithm is Encrypt, and the asymmetric decryption algorithm is Decrypt
- The result of Encrypt(H, Priv) is the signature S
- The message received by the client is M1, and Hash(M1) gives H1
The steps of digital signature generation and verification are as follows:
- Generating the digital signature:
  - Use the digest algorithm Hash to compute the digest of M, that is, Hash(M) == H
  - Encrypt the digest with the private key, Encrypt(H, Priv), to get the digital signature S
- Verifying the digital signature:
  - Decrypt the digital signature with the public key, that is, Decrypt(S, Pub) == H
  - Compute the digest of M1, Hash(M1) == H1, and compare H with H1; if they are the same, verification passes

How to generate the key pair is not the focus here; the keys used below were generated with an online service.

With the principles of signature generation and verification in mind, the following code should be easy to follow:

const crypto = require('crypto');
const fs = require('fs');
const privateKey = fs.readFileSync('./private-key.pem');  // private key
const publicKey = fs.readFileSync('./public-key.pem');  // public key
const algorithm = 'RSA-SHA256';  // signature algorithm: RSA with a SHA-256 digest
const encoding = 'hex'

// Create the digital signature
function sign(text){
    const signer = crypto.createSign(algorithm);
    signer.update(text);
    return signer.sign(privateKey, encoding);
}

// Verify the signature
function verify(oriContent, signature){
    const verifier = crypto.createVerify(algorithm);
    verifier.update(oriContent);
    return verifier.verify(publicKey, signature, encoding);
}

// Sign the content
const content = 'hello world';
const signature = sign(content);
console.log(signature);

// Verify the signature; returns true if it is valid
const verified = verify(content, signature);
console.log(verified);
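The code above depends on two PEM files. For experimenting without them, a key pair can also be generated in memory with crypto.generateKeyPairSync. The sketch below is a variation under that assumption; it additionally shows that verification fails once the content has been modified.

```javascript
const crypto = require('crypto');

// Generate a throwaway 2048-bit RSA key pair in memory (no PEM files needed).
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Sign: hash the text with SHA-256 and sign the digest with the private key.
const sign = (text) => {
    const signer = crypto.createSign('SHA256');
    signer.update(text);
    return signer.sign(privateKey, 'hex');
};

// Verify: recompute the digest and check it against the signature with the public key.
const verify = (text, signature) => {
    const verifier = crypto.createVerify('SHA256');
    verifier.update(text);
    return verifier.verify(publicKey, signature, 'hex');
};

const signature = sign('hello world');
console.log(verify('hello world', signature)); // true
console.log(verify('hello w0rld', signature)); // false: the content was tampered with
```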

`DH(DiffieHellman)`

Diffie-Hellman key exchange, abbreviated as DH, is a security protocol that is often used for key exchange. It lets two communicating parties establish a shared key over an insecure channel without any prior knowledge of each other. This key can then be used as a symmetric encryption key for subsequent communication.

Principle analysis

Suppose the client and the server agree on a prime p and a generator a (both public), and then

Client: selects a natural number Xa, computes Ya = a^Xa mod p, and sends Ya to the server;
Server: selects a natural number Xb, computes Yb = a^Xb mod p, and sends Yb to the client;
Client: computes Ka = Yb^Xa mod p
Server: computes Kb = Ya^Xb mod p

Ka = Yb^Xa mod p
= (a^Xb mod p)^Xa mod p
= a^(Xb * Xa) mod p
= (a^Xa mod p)^Xb mod p
= Ya^Xb mod p
= Kb
So although neither side ever learns the other's private value (Xa or Xb), both compute the same secret.
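The exchange above can be traced by hand with toy numbers. The sketch below uses BigInt and deliberately tiny values (p = 23, a = 5, Xa = 4, Xb = 3), which are assumptions chosen for readability and offer no security at all; modPow is a helper written for this example.

```javascript
// Square-and-multiply modular exponentiation on BigInt values.
const modPow = (base, exponent, modulus) => {
    let result = 1n;
    let b = base % modulus;
    let e = exponent;
    while (e > 0n) {
        if (e & 1n) result = (result * b) % modulus; // multiply when the current bit is 1
        b = (b * b) % modulus;                       // square for the next bit
        e >>= 1n;
    }
    return result;
};

const p = 23n; // public prime
const a = 5n;  // public generator

const Xa = 4n; // client's private value
const Xb = 3n; // server's private value

const Ya = modPow(a, Xa, p); // sent to the server
const Yb = modPow(a, Xb, p); // sent to the client
console.log(`Ya = ${Ya}, Yb = ${Yb}`); // Ya = 4, Yb = 10

const Ka = modPow(Yb, Xa, p); // computed by the client
const Kb = modPow(Ya, Xb, p); // computed by the server
console.log(`Ka = ${Ka}, Kb = ${Kb}`); // Ka = 18, Kb = 18
```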

const crypto = require('crypto');

const primeLength = 1024;  // length of the prime p, in bits
const generator = 5;  // generator a

// Create the client's DH instance
const client = crypto.createDiffieHellman(primeLength, generator);
// Generate the key pair, Ya = a^Xa mod p
const clientKey = client.generateKeys();

// Create the server's DH instance with the same a and p as the client
const server = crypto.createDiffieHellman(client.getPrime(), client.getGenerator());
// Generate the key pair, Yb = a^Xb mod p
const serverKey = server.generateKeys();

// Compute Ka = Yb^Xa mod p
const clientSecret = client.computeSecret(server.getPublicKey());
// Compute Kb = Ya^Xb mod p
const serverSecret = server.computeSecret(client.getPublicKey());

// The prime p is generated dynamically, so the output differs on every run,
// but clientSecret and serverSecret always hold the same bytes
console.log(clientSecret.toString('hex'));
console.log(serverSecret.toString('hex'));
// 39edfedad4f1be731977436936ca844e50ebc90953ad208c71d7f2dc1772409962ec3eb90eaf99db5948f089e1d4951f148bd7ff76c18b53ff6be32f267fc54535928ce4acf15d923cfd0caec45db95b206e7636128210ea6813a20fb09cbfb06214b2f488716fea32788023d98cb4cb7fe39b68bd3563b3b34257e37f6b7fb7

// 39edfedad4f1be731977436936ca844e50ebc90953ad208c71d7f2dc1772409962ec3eb90eaf99db5948f089e1d4951f148bd7ff76c18b53ff6be32f267fc54535928ce4acf15d923cfd0caec45db95b206e7636128210ea6813a20fb09cbfb06214b2f488716fea32788023d98cb4cb7fe39b68bd3563b3b34257e37f6b7fb7

`ECDH(Elliptic Curve Diffie-Hellman)`

ECDH works on a similar principle to DH, and both are secure key agreement protocols.
Compared with DH, ECDH builds on elliptic curve cryptography (ECC), which reaches the same security level with much smaller keys and therefore uses less CPU.

A minimal ECDH (Elliptic Curve Diffie-Hellman) exchange looks like this

const crypto = require('crypto');

const G = 'secp521r1';
const encoding = 'hex'

const server = crypto.createECDH(G);
const serverKey = server.generateKeys();

const client = crypto.createECDH(G);
const clientKey = client.generateKeys();

const serverSecret = server.computeSecret(clientKey);
const clientSecret = client.computeSecret(serverKey);

console.log(serverSecret.toString(encoding));
console.log(clientSecret.toString(encoding));
// 01c418be1b479f936397d4c1653ad77fa28fade67ff058dc18264a72bd1fc208ea6cac4dad996fda55bf271e84f0faef085173257b67bf21f95b09acee4d0a204517

// 01c418be1b479f936397d4c1653ad77fa28fade67ff058dc18264a72bd1fc208ea6cac4dad996fda55bf271e84f0faef085173257b67bf21f95b09acee4d0a204517

`ECDHE(Elliptic Curve Diffie-Hellman Ephemeral)`

Plain ECDH also has a weakness: during key negotiation, one party's private key never changes. Typically the server's private key is fixed while the client's is randomly generated. Over time, an attacker can record a large number of key negotiations (part of the data is public). If the server's fixed private key is ever cracked or leaked, the attacker can compute the session key of every recorded session and decrypt the captured data. To remove this risk, both parties generate a random, temporary private key for each key exchange. This enhanced version of ECDH is called ECDHE, where the E stands for Ephemeral (temporary).
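The ephemeral idea can be sketched with the same crypto.createECDH API: every session creates new key pairs on both sides and discards them afterwards. negotiateSession is a hypothetical helper, and the curve prime256v1 is just one possible choice.

```javascript
const crypto = require('crypto');

const CURVE = 'prime256v1';

// Hypothetical helper: one key exchange in which BOTH sides use fresh, throwaway key pairs.
const negotiateSession = () => {
    const client = crypto.createECDH(CURVE);
    const server = crypto.createECDH(CURVE);
    const clientPublic = client.generateKeys(); // new client key pair for this session
    const serverPublic = server.generateKeys(); // new server key pair for this session
    const clientSecret = client.computeSecret(serverPublic);
    const serverSecret = server.computeSecret(clientPublic);
    return { clientSecret, serverSecret };
};

const session1 = negotiateSession();
const session2 = negotiateSession();

// Within a session both sides agree on the same secret.
console.log(session1.clientSecret.equals(session1.serverSecret)); // true
// Across sessions the secrets are unrelated, so one leaked key exposes only one session.
console.log(session1.clientSecret.equals(session2.clientSecret)); // false
```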

`Expand`

While learning this material I also picked up a lot of cryptography-related knowledge, and found that the deeper you dig, the more there is to learn 😂. Interested readers can go on to look at the differences between related algorithms and their application scenarios, for example:

Asymmetric algorithms: DSA, RSA, DH, DHE, ECDHE
Symmetric encryption: AES, DES

RSA algorithm principle (2) - Ruan Yifeng's blog
Graphical ECDHE key exchange algorithm - Kobayashi coding
Data Encryption Standard (DES) - Wikipedia (https://en.wikipedia.org/wiki/Data_Encryption_Standard)
Advanced Encryption Standard (AES)

`related terms`

SPKAC: Signed Public Key and Challenge

MD5: Message-Digest Algorithm 5.

SHA: Secure Hash Algorithm.

HMAC: Hash-based Message Authentication Code, a message authentication code computed with a hash function and a secret key.

Symmetric encryption: such as AES, DES

Asymmetric encryption: such as RSA, DSA

AES: Advanced Encryption Standard; the key length can be 128, 192 or 256 bits.

DES: Data Encryption Standard, a symmetric-key encryption algorithm (now considered insecure).

DiffieHellman: Diffie-Hellman key exchange, abbreviated as DH, is a security protocol that lets two communicating parties establish a shared key over an insecure channel without any prior knowledge of each other. This key can then be used as a symmetric encryption key for subsequent communication. (The protocol is named after its inventors.)

Key exchange algorithm

Common key exchange algorithms include RSA, ECDHE, DH and DHE. Their characteristics are as follows:

RSA: simple to implement. Published in 1977, it has a long history, has withstood long-term scrutiny, and is considered highly secure. The disadvantage is that it needs a relatively large key (2048 bits is common at present) to ensure security strength, which costs CPU time. RSA is currently the only algorithm that can be used for both key exchange and certificate signing.
DH: the Diffie-Hellman key exchange algorithm also appeared early (it was published in 1976). The disadvantage is that it consumes more CPU.
ECDHE: the DH algorithm over elliptic curves (ECC). Its advantage is that a much smaller key (256 bits) achieves a security level comparable to RSA. The disadvantages are that the algorithm is complex to implement, its history in key exchange is short, and it has not been tested by attacks for as long.
ECDH: does not support PFS (perfect forward secrecy), is less secure, and cannot be used with false start.
DHE: does not support ECC and is very CPU-intensive.

It is recommended to give priority to the RSA and ECDHE_RSA key exchange algorithms. The reasons are:

ECDHE supports ECC acceleration, so computation is faster. It supports PFS, so it is more secure. It supports false start, so user access is faster.
At the time of writing, at least 20% of clients did not support ECDHE. RSA is recommended over DH or DHE, because the DH family is very CPU-intensive (roughly equivalent to two RSA operations).