HTTPS
Drawbacks of HTTP:
- The communication is in clear text (not encrypted) and the content may be eavesdropped;
- The identity of the communicating party is not verified, so there is a possibility of masquerading;
- The integrity of the message cannot be proven, so it may have been tampered with;
To solve all of these problems together, encryption and authentication mechanisms need to be added to HTTP. HTTP with these mechanisms added is called HTTPS (HTTP Secure).
HTTPS
HTTPS is not a new application-layer protocol. Instead, the part of HTTP that handles the communication interface is replaced by the SSL or TLS protocol. With SSL, HTTP gains encryption, certificates, and integrity protection.
Normally, HTTP communicates directly with TCP. When SSL is used, HTTP communicates with SSL first, and SSL then communicates with TCP.
SSL: Secure Sockets Layer. Its main task is to provide privacy, message integrity, and authentication. SSL is independent of HTTP and sits between TCP/IP and the application-layer protocols, so other application-layer protocols such as SMTP and Telnet can also run over it. SSL and its successor TLS are among the most widely used network security technologies in the world today. For details, please refer to "Detailed explanation of SSL/TLS protocol".
TLS: Transport Layer Security, the standardized successor to SSL.
HTTPS: HTTP Secure, also known as Hypertext Transfer Protocol Secure.
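The layering described above (HTTP on top of SSL/TLS on top of TCP) can be sketched with Python's standard `socket` and `ssl` modules. This is only a minimal illustration, not a full HTTP client. The function names `make_context` and `fetch_head` and the 5-second timeout are assumptions made for this example.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    """Create a client-side TLS context that verifies certificates and host names."""
    return ssl.create_default_context()


def fetch_head(host: str, port: int = 443) -> bytes:
    """Send a HEAD request over TLS and return the first bytes of the reply."""
    context = make_context()
    # Layer 1: a plain TCP connection.
    with socket.create_connection((host, port), timeout=5) as tcp_sock:
        # Layer 2: TLS wrapped around the TCP socket.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # Layer 3: ordinary HTTP text, now carried inside the TLS channel.
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            return tls_sock.recv(4096)
```

Calling `fetch_head("example.com")` requires network access. The HTTP request text itself is the same as it would be without TLS; only the socket it is written to changes.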
HTTPS hybrid encryption mechanism
HTTPS adopts a hybrid encryption mechanism that uses both symmetric encryption and asymmetric encryption, making full use of their respective advantages.
Symmetric encryption
Encryption and decryption using the same key is called symmetric encryption.
In symmetric encryption, the key must also be sent to the other party.
Why not use only symmetric encryption?
If both communicating parties hold the same key and never leak it, the communication between them is secure. The question, however, is how to let both parties learn the key securely in the first place.
If the key is generated by the server and transmitted to the browser, and the communication is being monitored at that moment, the key may fall into the hands of an attacker, and encryption becomes pointless.
If the key of website A were pre-stored in the browser, and it could be guaranteed that no one except the browser and website A knew the key, symmetric encryption alone would theoretically work. But then the browser would have to pre-store the keys of every HTTPS website in the world! This is clearly unrealistic.
Asymmetric encryption
Asymmetric encryption solves the shortcomings of symmetric encryption well. Encryption and decryption using different keys is called asymmetric encryption.
Asymmetric encryption therefore has two keys: a private key and a public key. Information encrypted with the private key can only be decrypted with the public key, and information encrypted with the public key can only be decrypted with the private key. Why is this so? The answer lies in the RSA algorithm.
RSA algorithm
The calculation process of the RSA algorithm is as follows:
- Randomly choose two prime numbers p and q
- Calculate n = pq
- Calculate φ(n) = (p-1)(q-1)
- Find a small odd number e that is coprime to φ(n). Two numbers are coprime when their only common divisor is 1
- Calculate the multiplicative inverse d of e modulo φ(n), that is, find a d such that the following equation holds: (e*d) mod φ(n) = 1
- Get public key: (e, n), private key: (d, n)
- Encryption process: c = (m^e) mod n, (c is the encrypted ciphertext, m is the original text)
- Decryption process: m = (c^d) mod n
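The steps above can be followed by hand with very small numbers. The sketch below is a toy example only: the primes 61 and 53, the exponent 17, and the message 65 are values chosen for illustration, and the function names are hypothetical. Real RSA keys use primes that are hundreds of digits long together with a padding scheme.

```python
from math import gcd


def make_toy_keypair(p: int, q: int, e: int):
    """Follow the listed steps for two given primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    d = pow(e, -1, phi)  # multiplicative inverse of e modulo phi(n), Python 3.8+
    return (e, n), (d, n)


def encrypt(m: int, public_key) -> int:
    e, n = public_key
    return pow(m, e, n)  # c = (m^e) mod n


def decrypt(c: int, private_key) -> int:
    d, n = private_key
    return pow(c, d, n)  # m = (c^d) mod n


public_key, private_key = make_toy_keypair(p=61, q=53, e=17)
message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
print(public_key, private_key, ciphertext, recovered)
```

With these values n = 3233, φ(n) = 3120, and d = 2753. The message 65 encrypts to 2790, and decrypting 2790 gives 65 again.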
Why can't data encrypted with the public key be decrypted with that same public key? Because the encryption step (m^e) mod n is a modulo operation, and a modulo operation cannot be reversed directly. For example, 5 % 4 = 1, but if you only know that x % 4 = 1, then x could be 5, 9, 13, and so on without end. So even with the public key (e, n) and the ciphertext c, you cannot tell which value m^e took. Recovering m in practice requires d, and computing d requires φ(n), which in turn requires factoring n into p and q; this is considered infeasible when p and q are large. This is the core of what keeps asymmetric encryption confidential. The same holds in the other direction: data encrypted with the private key cannot be decrypted with the private key, only with the public key.
Why not use only asymmetric encryption?
Because asymmetric encryption is very time-consuming and inefficient, while symmetric encryption is much faster, the number of asymmetric encryptions should be minimized.
Request Process for Hybrid Encryption Mechanism
- A website has public key A and private key A' for asymmetric encryption.
- The browser sends a request to the website's server, and the server transmits the public key A to the browser in plaintext.
- The browser randomly generates a key X for symmetric encryption, encrypts it with the public key A and sends it to the server.
- After getting it, the server decrypts it with the private key A' to obtain the key X.
- Both parties have the key X, and then all data of both parties can be encrypted and decrypted through the key X.
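The request process above can be sketched end to end in a few lines. Everything here is a toy under stated assumptions: the RSA key uses the tiny primes 61 and 53, and the symmetric cipher is a simple XOR against a SHA-256-derived keystream invented for this example (`xor_with_keystream` is a hypothetical helper, not a real cipher such as AES). It only shows who holds which key at each step.

```python
import hashlib
import secrets

# Toy RSA key pair for the website (small primes, illustration only).
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))
PUBLIC_KEY_A = (E, N)    # sent to the browser in plaintext
PRIVATE_KEY_A = (D, N)   # never leaves the server


def xor_with_keystream(data: bytes, key_x: int) -> bytes:
    """Toy symmetric cipher: the same call both encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{key_x}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Browser: generate the symmetric key X and encrypt it with public key A.
key_x = secrets.randbelow(N - 2) + 2  # random value with 2 <= X < N
e, n = PUBLIC_KEY_A
encrypted_x = pow(key_x, e, n)

# Server: recover X with private key A'.
d, n = PRIVATE_KEY_A
server_key_x = pow(encrypted_x, d, n)

# Both sides now use X for the actual data.
request = xor_with_keystream(b"GET /account", key_x)
plaintext_on_server = xor_with_keystream(request, server_key_x)
print(plaintext_on_server)
```

The slow asymmetric operation happens once, to move X across the network; every message after that uses only the fast symmetric step.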
Why is the public key transmitted in plaintext?
If the public key were encrypted before transmission, the key used to encrypt it would itself have to be transmitted first, which is a chicken-and-egg problem...
Everything looks perfect at this point, but how can the browser prove that the public key it received really is the website's public key? Consider what happens if the request is attacked:
- A website has public key A and private key A' for asymmetric encryption.
- The browser sends a request to the website's server, and the server transmits the public key A to the browser in plaintext.
- The attacker intercepts the public key A, saves it, and replaces the public key A in the data packet with a fake public key B (the attacker holds the private key B' that corresponds to the public key B).
- The browser generates a key X for symmetric encryption, encrypts it with the public key B (the browser has no way of knowing that the public key has been replaced), and sends it to the server.
- The attacker intercepts this message, decrypts it with the private key B' to obtain the key X, then re-encrypts X with the public key A and forwards it to the server.
- After getting it, the server decrypts it with the private key A' to obtain the key X.
At this time, neither side will find any abnormality. The root cause of this situation is that the browser cannot confirm whether the received public key is correct.
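The attack can be reproduced with two toy RSA key pairs, one for the website and one for the attacker. The primes, the fixed key value 1234, and the helper names below are all assumptions made for this sketch; the point is only that every party ends up with a value that looks valid to it.

```python
def make_toy_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair from two small primes."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)


def rsa_apply(value: int, key) -> int:
    """Apply a public or private key: value^exponent mod modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


public_a, private_a = make_toy_keypair(61, 53)  # the website: A and A'
public_b, private_b = make_toy_keypair(89, 97)  # the attacker: B and B'

# The server sends A, but the attacker swaps it for B in transit.
key_seen_by_browser = public_b

# The browser encrypts its symmetric key X with what it believes is A.
key_x = 1234
to_attacker = rsa_apply(key_x, key_seen_by_browser)

# The attacker decrypts with B', learns X, and re-encrypts with the real A.
stolen_x = rsa_apply(to_attacker, private_b)
to_server = rsa_apply(stolen_x, public_a)

# The server decrypts with A' and sees nothing unusual.
server_x = rsa_apply(to_server, private_a)
print(stolen_x, server_x)
```

Both the attacker and the server end up holding X, so all later symmetric traffic is readable by the attacker even though neither endpoint observed an error.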
Digital certificate
To sum up, the browser cannot prove that the public key it received is the website's public key. The digital certificate was introduced to solve this problem. A digital certificate is issued to the server by a digital certificate authority (CA, Certificate Authority). (The CA is a third-party organization trusted by both the client and the server.)
The CA generates the certificate from the relevant information provided by the server. A digital certificate usually contains:
- public key;
- holder information;
- Information about the Certificate Authority (CA);
- The CA's digital signature of this document and the algorithm used;
- certificate validity period;
- Other extra info...
You can view a website's digital certificate in the browser, for example Baidu's.
[Figure: Baidu's digital certificate]
How to prevent certificate tampering?
To prevent the content of a certificate from being tampered with, the digital signature (Certificate Signature) was introduced. The CA packages the website's public key, purpose, issuer, validity period, and other basic information, and then computes a hash over this information to obtain a hash value. The CA then encrypts the hash value with its own private key to generate a digital signature; that is, the CA signs the certificate.
Business Process
- The server submits an application for a digital certificate to the CA. After the CA has confirmed the identity of the applicant, it generates a digital certificate containing the public key, purpose, issuer, validity period, digital signature, hash algorithm, and other information, and sends it to the server.
- The server will send this digital certificate to the client, and the client who receives the certificate will use the public key of the CA to verify the digital certificate.
The client uses the same hash algorithm to compute the hash value H1 of the certificate, decrypts the digital signature with the CA's public key to obtain a hash value H2, and compares H1 with H2. If the values are the same, the certificate is trusted; otherwise, the certificate is considered untrusted.
If the certificate verification passes, the client knows two things: (1) the certificate authority that certified the server's public key is a real and valid digital certificate authority; (2) the server's public key contained in the digital certificate is trustworthy.
- After the certificate verification passes, the client generates a key X for symmetric encryption, encrypts it with the server's public key A, and transmits it to the server.
- After getting it, the server decrypts it with the private key A' to obtain the key X. Subsequent communication is encrypted and decrypted with the key X.
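The signing and verification steps (hash the certificate, sign the hash with the CA's private key, compare H1 with H2 on the client) can be sketched as follows. This is a simplified model under several assumptions: the certificate is a plain dictionary serialized with JSON rather than a real X.509 structure, the CA key is a toy RSA key built from the two Mersenne primes 2^61 - 1 and 2^31 - 1, the hash is reduced modulo n instead of being padded, and all names are hypothetical.

```python
import hashlib
import json

# Toy CA key pair (far too small for real use, illustration only).
CA_P = 2**61 - 1
CA_Q = 2**31 - 1
CA_N = CA_P * CA_Q
CA_E = 65537                                       # CA public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))      # CA private exponent


def certificate_hash(fields: dict) -> int:
    """Hash the certificate fields and reduce the result into the toy key's range."""
    canonical = json.dumps(fields, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(canonical).digest()
    return int.from_bytes(digest, "big") % CA_N


def ca_sign(fields: dict) -> int:
    """CA side: encrypt the hash with the CA's private key."""
    return pow(certificate_hash(fields), CA_D, CA_N)


def client_verify(fields: dict, signature: int) -> bool:
    """Client side: compute H1, recover H2 with the CA's public key, compare."""
    h1 = certificate_hash(fields)
    h2 = pow(signature, CA_E, CA_N)
    return h1 == h2


cert_fields = {
    "subject": "example.com",
    "public_key": "A",
    "issuer": "Toy CA",
    "valid_until": "2030-01-01",
}
signature = ca_sign(cert_fields)

# An attacker who swaps in a different public key cannot produce a matching signature.
tampered = dict(cert_fields, public_key="B")
print(client_verify(cert_fields, signature), client_verify(tampered, signature))
```

Because the attacker does not hold the CA's private key, changing any field changes H1 while H2 stays tied to the original content, so the comparison fails.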
Why do you need a hash when making a digital signature?
Because a hash algorithm always produces a fixed-length hash code (for example, MD5 always yields a 128-bit value). The CA only has to encrypt this short hash code with its private key rather than the whole message. The recipient decrypts the signature with the public key to obtain one hash code, hashes the message content with the same algorithm to obtain another, and only needs to compare the two fixed-length hash codes. Since asymmetric encryption is slow, encrypting and decrypting a short hash performs much better than encrypting and decrypting the full message content directly.
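The fixed-length property is easy to observe directly. The sketch below uses SHA-256 from Python's `hashlib` instead of the MD5 mentioned above (a substitution made for this example), so the digest is 256 bits rather than 128; the two sample messages are made up.

```python
import hashlib

short_message = b"hi"
long_message = b"certificate body " * 10_000

short_digest = hashlib.sha256(short_message).digest()
long_digest = hashlib.sha256(long_message).digest()

# The input sizes differ enormously, but both digests are 32 bytes (256 bits).
print(len(short_message), len(short_digest))
print(len(long_message), len(long_digest))
```

Whatever the size of the certificate, the value that has to go through the slow private-key operation stays the same small size.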
How to prove that the public key of the CA authority is trusted?
The CA's public key must be delivered to the client securely, and doing so over the network is difficult. Therefore, most browsers ship with the public keys of commonly used certificate authorities built in. This is what ensures that the CA's public key can be trusted, and it breaks what would otherwise be another chicken-and-egg problem.
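Browsers are not the only clients that rely on pre-installed CA keys. As a rough analogy, Python's `ssl` module also uses a trust store that is already present on the machine, and the sketch below prints where it looks. The output depends entirely on the operating system and OpenSSL build, and the count may be 0 when certificates are loaded lazily from a directory.

```python
import ssl

# Locations OpenSSL was configured to search for trusted CA certificates.
paths = ssl.get_default_verify_paths()

# A default client context loads the system's trusted CA certificates.
context = ssl.create_default_context()
loaded = context.get_ca_certs()

print("CA file:", paths.cafile)
print("CA directory:", paths.capath)
print("CA certificates loaded so far:", len(loaded))
```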
certificate chain
Coming soon.