Overview
User login and registration are among the most common features in application development, so which algorithm should we choose for storing user passwords? In this scenario, the algorithm needs to satisfy the following two conditions:
- The algorithm needs to be irreversible, so as to effectively prevent password leakage.
- The algorithm needs to be relatively slow, and the computational cost can be dynamically adjusted. Slowness is an effective way to deal with brute force cracking.
At present, several algorithms satisfy both conditions: PBKDF2, BCrypt and SCrypt. Let's first look at the older ways of protecting passwords.
old encryption
In the past, passwords were commonly hashed with MD5 or SHA. MD5 is an early cryptographic hash function that generates a hash very quickly. As computing power grew, it was broken, which led to hash functions with longer outputs, such as SHA-1 and SHA-256. Here is a comparison:
- MD5: Fast to generate short hashes (16 bytes). The probability of accidental collision is approximately: \( 1.47 \times 10^{-29} \) .
- SHA1: 20% slower than md5, produces a slightly longer hash (20 bytes) than MD5. The probability of accidental collision is approximately: \( 1 \times 10^{-45} \) .
- SHA256: The slowest, typically 60% slower than md5, and the resulting hash is long (32 bytes). The probability of an accidental collision is about: \( 4.3 \times 10^{-60} \) .
To be safe, you might choose SHA-512, which currently produces the longest hash. But hardware keeps getting faster and new vulnerabilities may be discovered one day; researchers will then release newer versions with ever longer outputs, and the underlying algorithms are public anyway. Length alone does not solve the problem, so we should look for a more suitable kind of algorithm.
salting operation
Beyond choosing a reliable algorithm, password security also depends on the strength of the input. Passwords are chosen by people, so their length and complexity vary widely, and hashing them directly makes brute-force and dictionary attacks more likely to succeed. This is where a salt comes in.
A salt, a concept that comes up often in cryptography, is simply random data mixed into the input before hashing. Here is an example of salt generation in Java:
public static byte[] generateSalt() {
    SecureRandom random = new SecureRandom();
    byte[] salt = new byte[16];
    random.nextBytes(salt);
    return salt;
}
SHA-512 salted hashed password
public static String sha512(String rawPassword, byte[] salt) {
    try {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        // add the salt before the password bytes
        md.update(salt);
        return Hex.encodeHexString(md.digest(rawPassword.getBytes(StandardCharsets.UTF_8)));
    } catch (GeneralSecurityException ex) {
        throw new IllegalStateException("Could not create hash", ex);
    }
}
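To check a login attempt against a salted hash, the stored salt is reused and the two digests are compared. The following is a minimal sketch of that step using only the JDK; the class and method names (`SaltedSha512`, `hash`, `matches`) are hypothetical and not part of the original article, and the hash is kept as raw bytes rather than hex to avoid the extra dependency.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

final class SaltedSha512 {
    // SHA-512 over the salt followed by the UTF-8 password bytes.
    static byte[] hash(String rawPassword, byte[] salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            md.update(salt);
            return md.digest(rawPassword.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-512 not available", ex);
        }
    }

    // Recompute the hash with the stored salt and compare the digests.
    static boolean matches(String candidate, byte[] salt, byte[] expectedHash) {
        return MessageDigest.isEqual(hash(candidate, salt), expectedHash);
    }

    public static void main(String[] args) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        byte[] stored = hash("correct horse", salt);
        System.out.println(matches("correct horse", salt, stored)); // true
        System.out.println(matches("wrong guess", salt, stored));   // false
    }
}
```

The salt has to be stored next to the hash, since verification is impossible without it. `MessageDigest.isEqual` is used instead of `Arrays.equals` so that the comparison time does not depend on where the first differing byte is.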
PBKDF2
PBKDF1 and PBKDF2 are key derivation functions whose role is to generate an encryption key from a given passphrase. They were mentioned before in Common Encryption Algorithms. Although PBKDF2 is not a cryptographic hash function, it is still suitable for password storage because it is deliberately expensive to compute. The PBKDF2 function is calculated as follows:
$$ DK = PBKDF2(PRF, Password, Salt, Iterations, HashWidth) $$
- \( PRF \) is a pseudo-random function with two arguments and outputs a fixed length (for example, HMAC);
- \( Password \) is the master password for generating the derived key;
- \( Salt \) is the encryption salt;
- \( Iterations \) is the number of iterations; the more iterations, the slower the derivation and the harder it is to brute-force;
- \( HashWidth \) is the length of the derived key;
- \( DK \) is the generated derived key.
With HMAC as the PRF, the iteration works roughly as follows: in the first round the Password is used as the HMAC key and the Salt (followed by a block index) as the message; in each later round the previous output becomes the message, and the outputs of all rounds are XORed together to form the derived key.
HMAC: a hash-based message authentication code that provides authentication using a shared secret. HMAC-SHA256, for example, takes a message and a key and outputs an authentication code the size of a SHA-256 hash (32 bytes).
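The iteration can be made concrete with a small sketch that computes the first 32-byte block of PBKDF2-HMAC-SHA256 by hand with `javax.crypto.Mac` and offers the JDK's own `PBKDF2WithHmacSHA256` as a reference. The class and method names are hypothetical, and the code is illustrative only; for real use the built-in `SecretKeyFactory` is the better choice.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class Pbkdf2FirstBlock {
    // U1 = HMAC(password, salt || INT(1)); Uj = HMAC(password, U(j-1)); T = U1 ^ U2 ^ ... ^ Uc
    static byte[] derive(String password, byte[] salt, int iterations) {
        try {
            Mac hmac = Mac.getInstance("HmacSHA256");
            hmac.init(new SecretKeySpec(password.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));

            // First message: the salt followed by the big-endian block index 1.
            byte[] firstInput = ByteBuffer.allocate(salt.length + 4).put(salt).putInt(1).array();
            byte[] u = hmac.doFinal(firstInput);
            byte[] t = u.clone();
            for (int j = 2; j <= iterations; j++) {
                u = hmac.doFinal(u);              // previous output becomes the next message
                for (int k = 0; k < t.length; k++) {
                    t[k] ^= u[k];                 // fold every round into the result
                }
            }
            return t;
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("HMAC-SHA256 not available", ex);
        }
    }

    // Reference value from the JDK implementation (256-bit output).
    static byte[] jdkReference(String password, byte[] salt, int iterations) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), salt, iterations, 256);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256").generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("PBKDF2 not available", ex);
        }
    }
}
```

Because every round depends on the previous one, the work cannot be skipped or reordered, which is why the iteration count translates directly into cost for an attacker.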
PBKDF2 differs from the MD and SHA hash functions in that it raises the cost of cracking by increasing the number of iterations, and that number can be configured to suit the situation, which gives it an adjustable computational cost.
With MD5 and SHA, an attacker can guess billions of passwords per second. With PBKDF2, the attacker can only make a few thousand guesses per second (or fewer, depending on configuration), so it is well suited to resisting brute-force attacks.
In 2021, OWASP recommended 310,000 iterations for PBKDF2-HMAC-SHA256 and 120,000 iterations for PBKDF2-HMAC-SHA512.
public static String pbkdf2Encode(String rawPassword, byte[] salt) {
    try {
        int iterations = 310000;
        // length of the derived key in bits
        int hashWidth = 256;
        PBEKeySpec spec = new PBEKeySpec(rawPassword.toCharArray(), salt, iterations, hashWidth);
        SecretKeyFactory skf = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        return Base64.getEncoder().encodeToString(skf.generateSecret(spec).getEncoded());
    } catch (GeneralSecurityException ex) {
        throw new IllegalStateException("Could not create hash", ex);
    }
}
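Since the iteration count is meant to be raised over time, it helps to store it together with the salt and the hash so that old records can still be verified. Below is a minimal sketch of that idea; the record format `iterations:salt:hash` and the names `Pbkdf2Record`, `encode` and `matches` are assumptions made for this example, not a standard.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class Pbkdf2Record {
    private static byte[] derive(char[] password, byte[] salt, int iterations) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 256);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256").generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("Could not create hash", ex);
        }
    }

    // Hypothetical storage format: iterations:base64(salt):base64(hash)
    static String encode(String rawPassword, int iterations) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        byte[] hash = derive(rawPassword.toCharArray(), salt, iterations);
        Base64.Encoder b64 = Base64.getEncoder();
        return iterations + ":" + b64.encodeToString(salt) + ":" + b64.encodeToString(hash);
    }

    // Re-derive with the stored salt and iteration count, then compare the results.
    static boolean matches(String rawPassword, String record) {
        String[] parts = record.split(":");
        if (parts.length != 3) {
            return false;
        }
        int iterations = Integer.parseInt(parts[0]);
        byte[] salt = Base64.getDecoder().decode(parts[1]);
        byte[] expected = Base64.getDecoder().decode(parts[2]);
        return MessageDigest.isEqual(derive(rawPassword.toCharArray(), salt, iterations), expected);
    }
}
```

With this layout, a record created with an older iteration count still verifies, and the application can re-encode the password with a higher count the next time the user logs in successfully.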
Bcrypt
Introduction
bcrypt is a password hashing function based on the eksblowfish algorithm. Its most important feature is an adjustable work factor (the number of iterations) that controls how long a hash takes to compute, so even as computing power continues to increase it can still resist brute-force attacks.
The eksblowfish algorithm is a block cipher with an expensive key setup whose cost (the number of iterations) can be set dynamically. A detailed introduction to the algorithm can be found in the following article:
https://www.usenix.org/legacy/publications/library/proceedings/usenix99/full_papers/provos/provos_html/node4.html
structure
The bcrypt function takes a password of at most 72 bytes, a computational cost, and a 16-byte (128-bit) salt. From these inputs it computes a 24-byte (192-bit) hash, and the final output, which also carries the algorithm identifier, has the following format:
$2a$12$DQoa2eT/aXFPgIoGwfllHuj4wEA3F71WWT7E/Trez331HGDUSRvXi
\__/\/ \____________________/\_____________________________/
Alg Cost      Salt                        Hash
- $2a$: bcrypt algorithm identifier or version;
- 12: work factor (2^12 means 4096 iterations);
- DQoa2eT/aXFPgIoGwfllHu: Base64-encoded salt;
- j4wEA3F71WWT7E/Trez331HGDUSRvXi: Base64-encoded hash value (24 bytes).
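The layout above can be taken apart with plain string handling. The sketch below splits the example string into its four parts; the class name `BcryptHashParts` and its fields are hypothetical, and the parser only checks the overall shape (a 22-character salt followed by a 31-character hash), not the validity of the Base64 alphabet.

```java
final class BcryptHashParts {
    final String version;
    final int cost;
    final String salt;
    final String hash;

    private BcryptHashParts(String version, int cost, String salt, String hash) {
        this.version = version;
        this.cost = cost;
        this.salt = salt;
        this.hash = hash;
    }

    static BcryptHashParts parse(String encoded) {
        // "$2a$12$<salt+hash>" splits into ["", "2a", "12", "<salt+hash>"]
        String[] fields = encoded.split("\\$");
        if (fields.length != 4 || fields[3].length() != 53) {
            throw new IllegalArgumentException("Not a bcrypt hash string: " + encoded);
        }
        int cost = Integer.parseInt(fields[2]);
        String salt = fields[3].substring(0, 22);  // first 22 characters
        String hash = fields[3].substring(22);     // remaining 31 characters
        return new BcryptHashParts(fields[1], cost, salt, hash);
    }

    // Number of key-setup iterations implied by the work factor: 2^cost.
    long rounds() {
        return 1L << cost;
    }

    public static void main(String[] args) {
        BcryptHashParts p = parse("$2a$12$DQoa2eT/aXFPgIoGwfllHuj4wEA3F71WWT7E/Trez331HGDUSRvXi");
        System.out.println(p.version + " " + p.cost + " " + p.rounds() + " " + p.salt + " " + p.hash);
    }
}
```

Because the cost and the salt travel inside the string, a verifier needs nothing but the stored value and the candidate password.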
bcrypt version
- $2a$: The revised specification, which requires the password string to be UTF-8 encoded and to include the null terminator.
- $2y$: Marks hashes produced after a bug in PHP's bcrypt implementation was fixed in June 2011.
- $2b$: Marks hashes produced after a bug in OpenBSD's bcrypt implementation was fixed in February 2014.
The OpenBSD bug, discovered in February 2014, was that the implementation used an unsigned 8-bit value to hold the length of the password. For passwords longer than 255 bytes, the password was truncated at the lesser of 72 or the length modulo 256, instead of being truncated to 72 bytes. For example, a 260-byte password was truncated to 4 bytes instead of 72 bytes.
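The arithmetic behind the truncation can be sketched in a few lines. This models only the behaviour described above, not the actual OpenBSD source; the class and method names are hypothetical.

```java
final class BcryptLengthBug {
    // Buggy behaviour: the length is first squeezed into an unsigned 8-bit value.
    static int buggyEffectiveLength(int passwordLength) {
        int stored = passwordLength & 0xFF;   // same as passwordLength modulo 256
        return Math.min(72, stored);
    }

    // Intended behaviour: simply cap the length at 72 bytes.
    static int fixedEffectiveLength(int passwordLength) {
        return Math.min(72, passwordLength);
    }

    public static void main(String[] args) {
        System.out.println(buggyEffectiveLength(260)); // 4
        System.out.println(fixedEffectiveLength(260)); // 72
    }
}
```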
practice
The key to using bcrypt is choosing an appropriate work factor. There is no fixed rule for the ideal value; it depends mainly on the performance of the server and the number of users of the application, and is generally a trade-off between security and application performance.
A high work factor makes the hash hard for an attacker to crack, but it also slows down login verification, which hurts the user experience, and it lets an attacker exhaust the server's CPU with a flood of login attempts, turning the hash into a denial-of-service vector. Generally speaking, computing the hash should not take more than one second.
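One way to apply the one-second guideline is to measure each work factor on the target server and keep the highest one that stays within the budget. The sketch below does this for an arbitrary timing function; `WorkFactorPicker`, `pick` and the `measureMillis` parameter are hypothetical names introduced for this example.

```java
import java.util.function.IntToLongFunction;

final class WorkFactorPicker {
    // Returns the highest cost in [minCost, maxCost] whose measured time stays within
    // budgetMillis, or minCost if even the lowest cost exceeds the budget.
    static int pick(IntToLongFunction measureMillis, int minCost, int maxCost, long budgetMillis) {
        int chosen = minCost;
        for (int cost = minCost; cost <= maxCost; cost++) {
            // Stop at the first cost over budget so the expensive ones are never run.
            if (measureMillis.applyAsLong(cost) > budgetMillis) {
                break;
            }
            chosen = cost;
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Stand-in timing model for illustration: time doubles with every step.
        IntToLongFunction fakeTiming = cost -> 1L << cost;
        System.out.println(pick(fakeTiming, 4, 31, 1000)); // 9, because 2^9 = 512 and 2^10 = 1024
    }
}
```

In a real application `measureMillis` would time an actual hash, for instance a call to `new BCryptPasswordEncoder(cost).encode(...)`, ideally averaged over a few runs on production-like hardware.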
We use Spring Security's BCryptPasswordEncoder to see how long hash generation takes under different factors. My computer configuration is as follows:
Processor: 2.2 GHz quad-core Intel Core i7
Memory: 16 GB 1600 MHz DDR3
Graphics Card: Intel Iris Pro 1536 MB
Map<Integer, BCryptPasswordEncoder> encoderMap = new LinkedHashMap<>();
for (int i = 8; i <= 21; i++) {
    encoderMap.put(i, new BCryptPasswordEncoder(i));
}
String plainTextPassword = "huhdfJ*!4";
for (int i : encoderMap.keySet()) {
    BCryptPasswordEncoder encoder = encoderMap.get(i);
    // time a single encode call for this work factor
    long start = System.currentTimeMillis();
    encoder.encode(plainTextPassword);
    long end = System.currentTimeMillis();
    System.out.println(String.format("bcrypt | cost: %d, time : %dms", i, end - start));
}
bcrypt | cost: 8, time : 39ms
bcrypt | cost: 9, time : 45ms
bcrypt | cost: 10, time : 89ms
bcrypt | cost: 11, time : 195ms
bcrypt | cost: 12, time : 376ms
bcrypt | cost: 13, time : 720ms
bcrypt | cost: 14, time : 1430ms
bcrypt | cost: 15, time : 2809ms
bcrypt | cost: 16, time : 5351ms
bcrypt | cost: 17, time : 10737ms
bcrypt | cost: 18, time : 21417ms
bcrypt | cost: 19, time : 43789ms
bcrypt | cost: 20, time : 88723ms
bcrypt | cost: 21, time : 176704ms
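Fitting an exponential curve to these measurements yields the following formula for the time in milliseconds, where x is the position of the measurement in the list (x = 1 for cost 8, so x = cost − 7):
$$ 10.3064 \cdot e^{0.696464 x} $$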
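The fit can be checked against the measurements with a few lines of code. The sketch below assumes that x is the 1-based index of the measurement (x = cost − 7, because the run starts at cost 8), which is the reading under which the formula reproduces the table; the class and method names are hypothetical.

```java
final class BcryptFit {
    // Fitted curve from the text, with x taken as cost - 7 (an assumption, see above).
    static double predictedMillis(int cost) {
        return 10.3064 * Math.exp(0.696464 * (cost - 7));
    }

    public static void main(String[] args) {
        int[] costs = {8, 12, 17, 21};
        long[] measured = {39, 376, 10737, 176704};   // taken from the output above
        for (int i = 0; i < costs.length; i++) {
            System.out.printf("cost %d: measured %d ms, fitted %.0f ms%n",
                    costs[i], measured[i], predictedMillis(costs[i]));
        }
    }
}
```

The growth rate e^0.696464 is roughly 2.007, which matches the expectation that each increment of the work factor doubles the time. The fit is loosest at the low end, where fixed overhead dominates the measurement.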
The BCryptPasswordEncoder factor range is 4-31 and the default is 10. Let's use the formula to estimate how long a factor of 31 would take.
/**
 * @param strength the log rounds to use, between 4 and 31
 */
public BCryptPasswordEncoder(int strength) {
    this(strength, null);
}
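The fitted x counts measurements starting from cost 8 (x = cost − 7; for example, x = 14 gives about 176,900 ms, matching the measured 176,704 ms for cost 21), so a work factor of 31 corresponds to x = 24:
$$ 10.3064 \cdot e^{0.696464 \cdot 24} \approx 1.87 \times 10^{8} \text{ ms} $$
A work factor of 31 therefore takes roughly two days per hash on this machine, so we know that bcrypt can easily scale the hashing process to accommodate faster hardware, leaving us plenty of leeway to prevent attackers from benefiting from future technological improvements.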
SCrypt
SCrypt came out later than the algorithms mentioned above and is a password-based key derivation function created by Colin Percival in March 2009. We need to understand the following two points about this algorithm:
- The algorithm is specifically designed to make large-scale custom hardware attacks expensive by requiring large amounts of memory.
- Like PBKDF2 mentioned above, it belongs to the category of key derivation functions.
Spring Security also implements this algorithm as SCryptPasswordEncoder. The input parameters are as follows:
- CpuCost: The CPU cost of the algorithm. Must be a power of 2 greater than 1. The default is currently 16,384 (2^14).
- MemoryCost: The memory cost of the algorithm. The default is currently 8.
- Parallelization: The parallelization of the algorithm currently defaults to 1. Note that this implementation does not currently utilize parallelization.
- KeyLength: The key length of the algorithm. The current default is 32.
- SaltLength: Salt length. The current default is 64.
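To get a feel for what these parameters mean in memory terms, the dominant term of scrypt's memory requirement is 128 · N · r bytes, where N corresponds to CpuCost and r to MemoryCost. The sketch below evaluates that term for the defaults listed above; it is an approximation that ignores the smaller per-thread buffers, and the class and method names are hypothetical.

```java
final class ScryptMemory {
    // Dominant memory term of scrypt: 128 * N * r bytes (N = CpuCost, r = MemoryCost).
    static long approxBytes(int cpuCost, int memoryCost) {
        return 128L * cpuCost * memoryCost;
    }

    public static void main(String[] args) {
        long bytes = approxBytes(16384, 8);
        System.out.println(bytes / (1024 * 1024) + " MiB"); // 16 MiB
    }
}
```

Every guess an attacker makes needs about this much memory as well, which is what makes massively parallel custom hardware costly. Doubling either CpuCost or MemoryCost doubles the figure.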
However, the article linked below advises against using SCrypt to store passwords in production systems. Its author argues that SCrypt was designed as a key derivation function rather than a password hash, and that it is not without flaws in that role. See the article for details.
https://blog.ircmaxell.com/2014/03/why-i-dont-recommend-scrypt.html
in conclusion
I would recommend using bcrypt. Why bcrypt?
For password storage, hashing the password is the best approach. First, bcrypt is itself a cryptographic hash function designed for passwords. Second, according to Moore's Law, the number of transistors per square inch on an integrated circuit doubles about every 18 months, so attackers' hardware keeps getting faster; with bcrypt, in two years we can simply increase the work factor to keep pace.
Of course, this does not mean that the other algorithms are not secure enough; you can still choose them. My recommendation is bcrypt first, followed by the key derivation functions (PBKDF2 and SCrypt), and lastly hash+salt (SHA256(salt)).