Speed, speed, and more speed: for a website to deliver a good experience, it must render as quickly as possible on the first visit. If MySQL queries are slow, you add a Redis layer as a cache. If website resources are slow to load, the answer is the HTTP cache
HTTP caching has been around since HTTP/1.0; its purpose is to reduce server load and speed up page responses
The target of the cache operation
In practice, the HTTP cache mainly stores responses to GET requests; responses to other request methods are generally not cached
Cache history
HTTP/1.0 introduced caching with two mechanisms: the strong cache header Expires and the negotiated cache header Last-Modified. HTTP/1.1 brought better alternatives: Cache-Control for the strong cache and ETag for the negotiated cache
Why are Expires and Last-Modified not enough?
Expires is an absolute expiration time, but that point in time is expressed in the server's clock. If the client's clock differs from the server's, the expiry check is inaccurate. Cache-Control: max-age replaces it with a relative lifetime in seconds, which does not depend on the two clocks agreeing
Last-Modified is the last modification time, and its resolution is one second. If a file changes several times within one second, the content has changed but the timestamp has not, so the client may keep showing the old version. ETag addresses this by identifying the resource by its content to decide whether it has changed
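To make the one-second limit concrete, here is a minimal Node.js sketch. The helper names `lastModifiedHeader` and `contentTag` are made up for this illustration, and hashing the body with SHA-1 is only one possible way to derive a content-based identifier, not what any particular server does.

```javascript
const crypto = require('crypto');

// Format a modification time (in milliseconds) as an HTTP date.
// HTTP dates have no sub-second part, so the milliseconds are lost.
function lastModifiedHeader(mtimeMs) {
  return new Date(mtimeMs).toUTCString();
}

// Hypothetical content-based identifier: a quoted SHA-1 hex digest of the body.
function contentTag(body) {
  return '"' + crypto.createHash('sha1').update(body).digest('hex') + '"';
}

// Two writes 800 ms apart, within the same second.
const firstWrite = { mtimeMs: 1649660238100, body: 'body { color: red; }' };
const secondWrite = { mtimeMs: 1649660238900, body: 'body { color: blue; }' };

const sameTimestamp =
  lastModifiedHeader(firstWrite.mtimeMs) === lastModifiedHeader(secondWrite.mtimeMs);
const sameTag = contentTag(firstWrite.body) === contentTag(secondWrite.body);

console.log(sameTimestamp); // true: Last-Modified cannot tell the two versions apart
console.log(sameTag); // false: the content-based tag changes with the content
```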
The following table summarizes the comparison
Version | Strong cache | Negotiated cache |
---|---|---|
HTTP/1.0 | Expires | Last-Modified |
HTTP/1.1 | Cache-Control | ETag |
Comparison of the two cache types
The previous section listed the cache headers in each HTTP version and mentioned the strong cache and the negotiated cache without explaining them. This section covers both in detail
Strong cache
Cache-Control
- Introduced in HTTP/1.1
- Controls caching through a lifetime rather than an absolute time, and has many directives, such as max-age
- For example, Cache-Control: max-age=3600 means the response can be reused for 3600 seconds, after which it is considered stale
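The max-age check can be sketched as a small function. This is a simplification under stated assumptions: `isFresh` is a made-up name, and a real cache also accounts for details such as the time the response spent in intermediate caches.

```javascript
// Decide whether a stored response may still be served from the strong cache.
// storedAtMs: when the response was stored (client clock, milliseconds)
// maxAgeSeconds: value of the max-age directive
// nowMs: current time on the same client clock
function isFresh(storedAtMs, maxAgeSeconds, nowMs) {
  const ageSeconds = (nowMs - storedAtMs) / 1000;
  return ageSeconds < maxAgeSeconds;
}

const storedAt = Date.UTC(2022, 3, 11, 6, 0, 0); // 06:00:00 GMT
console.log(isFresh(storedAt, 3600, storedAt + 59 * 60 * 1000)); // true (59 minutes old)
console.log(isFresh(storedAt, 3600, storedAt + 61 * 60 * 1000)); // false (61 minutes old)
```

Both timestamps come from the same clock, which is why max-age avoids the clock mismatch that affects Expires.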
Cache request directives:
- Cache-Control: max-age=<seconds>
- Cache-Control: max-stale[=<seconds>]
- Cache-Control: min-fresh=<seconds>
- Cache-Control: no-cache
- Cache-Control: no-store
- Cache-Control: no-transform
Cache response directives:
- Cache-Control: must-revalidate
- Cache-Control: no-cache
- Cache-Control: no-store
- Cache-Control: no-transform
- Cache-Control: public
- Cache-Control: private
- Cache-Control: proxy-revalidate
- Cache-Control: max-age=<seconds>
Key points:
Cache-Control: no-cache
- The response may be stored, but it must be revalidated with the server before each use, so the strong cache is skipped and an HTTP request is sent (if a negotiated cache validator exists, the request goes straight to the negotiation stage)
- In practice no-cache behaves much like max-age=0: the strong cache is skipped and the response is revalidated
Cache-Control: no-store
- Nothing is stored at all (so negotiated caching does not apply either)
Cache-Control: public, max-age=31536000
- Generally used for caching static resources
- public: the response can be cached by shared caches such as intermediate proxies and CDNs
- private: the response is for a single user's cache only; intermediate proxies, CDNs, and other shared caches must not store it
- max-age: the lifetime, in seconds
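To see how a header value such as `public, max-age=31536000` breaks down into directives, here is a minimal parser sketch. `parseCacheControl` is a hypothetical helper; it does not handle quoted values that contain commas, which a full implementation would need to.

```javascript
// Parse a Cache-Control header value into an object.
// Directives without a value become true; purely numeric values become numbers.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const trimmed = part.trim();
    if (trimmed === '') continue;
    const eq = trimmed.indexOf('=');
    const name = (eq === -1 ? trimmed : trimmed.slice(0, eq)).trim().toLowerCase();
    if (eq === -1) {
      directives[name] = true;
      continue;
    }
    const raw = trimmed.slice(eq + 1).trim().replace(/^"|"$/g, '');
    directives[name] = /^\d+$/.test(raw) ? Number(raw) : raw;
  }
  return directives;
}

const parsed = parseCacheControl('public, max-age=31536000');
console.log(parsed.public); // true
console.log(parsed['max-age']); // 31536000
console.log(parseCacheControl('no-store')['no-store']); // true
```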
For more directives, refer to the documentation
Expires
- Introduced in HTTP/1.0
- Syntax: Expires: <http-date>
- The expiration time, sent in the response header returned by the server
- Example: Expires: Mon, 11 Apr 2022 06:57:18 GMT
- Indicates that the resource expires at 06:57:18 GMT on April 11, 2022; once it has expired, a request is sent to the server
- If the Cache-Control response header sets the max-age or s-maxage directive, the Expires header is ignored
- Disadvantage: the server's clock may not match the browser's clock
For more details, refer to the documentation
Cache-Control vs Expires
- Cache-Control is more accurate than Expires
- When both exist, Cache-Control takes precedence over Expires
- Expires comes from HTTP/1.0, so its browser compatibility is better. Cache-Control comes from HTTP/1.1. The two can be sent together: a client that does not support Cache-Control falls back to Expires.
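The precedence rule can be sketched as follows. `freshnessLifetimeSeconds` is a made-up helper, and it assumes the headers arrive as a plain object with lower-case names; when only Expires is available, the lifetime is taken as the difference between Expires and the response's Date header.

```javascript
// Work out how long a response stays fresh, in seconds.
// headers: plain object with lower-case header names (an assumption of this sketch).
// Returns null when neither max-age nor Expires gives a lifetime.
function freshnessLifetimeSeconds(headers) {
  const cacheControl = headers['cache-control'] || '';
  const match = /(?:^|,)\s*max-age\s*=\s*(\d+)/i.exec(cacheControl);
  if (match) {
    return Number(match[1]); // max-age wins; Expires is ignored
  }
  if (headers.expires && headers.date) {
    const expiresMs = Date.parse(headers.expires);
    const dateMs = Date.parse(headers.date);
    if (!Number.isNaN(expiresMs) && !Number.isNaN(dateMs)) {
      return Math.max(0, (expiresMs - dateMs) / 1000);
    }
  }
  return null;
}

const withBoth = {
  'cache-control': 'max-age=10',
  date: 'Mon, 11 Apr 2022 05:57:18 GMT',
  expires: 'Mon, 11 Apr 2022 06:57:18 GMT',
};
const expiresOnly = {
  date: 'Mon, 11 Apr 2022 05:57:18 GMT',
  expires: 'Mon, 11 Apr 2022 06:57:18 GMT',
};

console.log(freshnessLifetimeSeconds(withBoth)); // 10
console.log(freshnessLifetimeSeconds(expiresOnly)); // 3600
```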
Negotiated cache
The negotiated cache works together with the strong cache. Negotiation only happens when the strong cache is not used, so to make the browser revalidate on every use, switch the strong cache off with Cache-Control: no-cache, Pragma: no-cache, or max-age=0, which tells the browser not to use the strong cache
Pragma is the HTTP/1.0 field for disabling page caching. Its value is no-cache, and it has the same effect as no-cache in Cache-Control.
ETag/If-None-Match
- Introduced in HTTP/1.1
- The server generates a unique identifier for the file and uses it to decide whether the cached copy is out of date. The value changes whenever the content changes
- Works together with If-None-Match. ETag is the identifier the server returns for each resource in its response. The browser stores it and, on the next request for that resource, sends it back in the If-None-Match request header. The server compares the If-None-Match value with the resource's current ETag. If they match, it returns 304 Not Modified without a body and the browser uses its local cache; if they differ, it returns 200 with the latest resource and its new ETag
- For more details, refer to the documentation
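The server side of this exchange can be sketched as a pure function. The names `makeETag` and `respondWithETag` are hypothetical, the SHA-1 digest is just one way to build an identifier, and the comparison is a plain string match; a real server also has to handle lists of tags, `*`, and weak validators.

```javascript
const crypto = require('crypto');

// Hypothetical ETag generator: quoted SHA-1 hex digest of the body.
function makeETag(body) {
  return '"' + crypto.createHash('sha1').update(body).digest('hex') + '"';
}

// Build the response for a GET, given the request's If-None-Match value
// (undefined when the header is absent) and the resource's current body.
function respondWithETag(ifNoneMatch, body) {
  const etag = makeETag(body);
  if (ifNoneMatch === etag) {
    return { status: 304, headers: { ETag: etag }, body: null };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

const first = respondWithETag(undefined, 'v1'); // first visit
const repeat = respondWithETag(first.headers.ETag, 'v1'); // unchanged resource
const changed = respondWithETag(first.headers.ETag, 'v2'); // resource was edited

console.log(first.status, repeat.status, changed.status); // 200 304 200
```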
Last-Modified/If-Modified-Since
- Introduced in HTTP/1.0
- The last modification time: whether the cached copy is out of date is judged by when the resource was last modified. On the browser's first request, the server adds this field to the response header
- Works together with If-Modified-Since. When the client requests a resource, the server puts Last-Modified, the time the resource was last modified on the server, in the response header, and the client caches the value. On the next request for that resource, the browser sends the cached value in the If-Modified-Since request header. If it matches the resource's last modification time on the server, the server returns 304 and the local cache is used; if it does not match, the server returns 200 with the latest resource and a new Last-Modified
Shortcomings:
- If a file is rewritten but its final content is unchanged, the modification time is still updated, so the cache is invalidated unnecessarily
- Some files change more than once per second, which a timestamp with one-second granularity cannot record
- Some servers cannot accurately obtain a file's last modification time
- For more details, refer to the documentation
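The same exchange for Last-Modified, again as a sketch with a made-up function name (`respondWithLastModified`). The last call shows the one-second shortcoming: an edit within the same second still yields 304.

```javascript
// Build the response for a GET, given the request's If-Modified-Since value
// (undefined when absent) and the file's modification time in milliseconds.
function respondWithLastModified(ifModifiedSince, mtimeMs, body) {
  // HTTP dates carry whole seconds, so drop the milliseconds before comparing.
  const mtimeSeconds = Math.floor(mtimeMs / 1000);
  const lastModified = new Date(mtimeSeconds * 1000).toUTCString();
  const sinceMs = ifModifiedSince ? Date.parse(ifModifiedSince) : NaN;
  if (!Number.isNaN(sinceMs) && mtimeSeconds * 1000 <= sinceMs) {
    return { status: 304, headers: { 'Last-Modified': lastModified }, body: null };
  }
  return { status: 200, headers: { 'Last-Modified': lastModified }, body };
}

const mtime = Date.UTC(2022, 3, 11, 6, 57, 18, 250);
const firstVisit = respondWithLastModified(undefined, mtime, 'v1');
const validator = firstVisit.headers['Last-Modified'];
const revisit = respondWithLastModified(validator, mtime, 'v1');
const afterEdit = respondWithLastModified(validator, mtime + 5000, 'v2');
const sameSecondEdit = respondWithLastModified(validator, mtime + 500, 'v2');

console.log(validator); // Mon, 11 Apr 2022 06:57:18 GMT
console.log(firstVisit.status, revisit.status, afterEdit.status, sameSecondEdit.status);
// 200 304 200 304
```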
ETag vs Last-Modified
Accuracy
- ETag > Last-Modified. ETag identifies the resource by its content, so it changes exactly when the resource changes. Last-Modified can be wrong in some scenarios: saving a file without changing its content invalidates the cache unnecessarily, and several changes within one second go unnoticed because one second is the smallest unit Last-Modified can express.
Performance
- Last-Modified > ETag. Last-Modified only records a point in time, whereas ETag requires generating a hash from the file's content
- If both are present, the server gives ETag priority
Conditional requests for the negotiated cache
As mentioned earlier, the negotiated cache adds If-None-Match or If-Modified-Since to the request header. What are these request headers, and what is the point of adding them?
The strong cache controls reuse purely through an expiration time or a lifetime. That causes a problem: if a file is modified on the server, the browser keeps showing the old data until the strong cache expires, so frequently changing data cannot rely on the strong cache. The negotiated cache exists for this case: before using the cached copy, the browser asks the server whether it is still the latest version.
Without dedicated support, the browser would need two consecutive requests to verify this:
- First a HEAD request to obtain meta information such as the resource's modification time and hash, which is compared with the cached data. If nothing has changed, the cache is used.
- Otherwise, a second GET request to fetch the latest version
The network cost of two requests is too high, so HTTP defines a series of conditional request fields starting with If, designed for checking whether a resource is still current. They merge the two requests into one and hand the responsibility for verification to the server
- If-Modified-Since: compared with Last-Modified; has the resource been modified since that time?
- If-None-Match: compared with ETag; does the identifier differ?
- If-Unmodified-Since: compared with Last-Modified; is the resource unmodified since that time?
- If-Match: compared with ETag; does the identifier match?
- If-Range
The most common are If-Modified-Since and If-None-Match, which correspond to Last-Modified and ETag respectively. The server has to provide Last-Modified and ETag in the first response; the second request can then carry the cached values to verify whether the resource is still up to date.
If the resource has not changed, the server responds with 304 Not Modified, indicating that the cache is still valid; the browser can refresh the cached entry's validity period and then use the cache
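What the client does with the revalidation result can be sketched like this. The cache entry shape (`body`, `etag`, `lastModified`, `storedAtMs`) and the function name `applyRevalidation` are assumptions made for illustration, not a browser API.

```javascript
// Update a cache entry from a revalidation response.
// entry: { body, etag, lastModified, storedAtMs } (field names are assumptions)
// response: { status, headers, body } with lower-case header names
function applyRevalidation(entry, response, nowMs) {
  if (response.status === 304) {
    // Not Modified: keep the stored body and restart the freshness clock.
    return { ...entry, storedAtMs: nowMs };
  }
  // 200 OK: replace the body and remember the new validators.
  return {
    body: response.body,
    etag: response.headers.etag,
    lastModified: response.headers['last-modified'],
    storedAtMs: nowMs,
  };
}

const stored = { body: 'v1', etag: '"abc"', lastModified: undefined, storedAtMs: 1000 };
const kept = applyRevalidation(stored, { status: 304, headers: {}, body: null }, 5000);
const replaced = applyRevalidation(
  stored,
  { status: 200, headers: { etag: '"def"' }, body: 'v2' },
  6000
);

console.log(kept.body, kept.storedAtMs); // v1 5000
console.log(replaced.body, replaced.etag); // v2 "def"
```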
When to use strong cache and when to use negotiated cache?
First, the strong cache takes priority over the negotiated cache: while a valid strong cache entry exists, the negotiated cache is not consulted. Second, HTTP/1.1 headers take priority over HTTP/1.0 headers: if Cache-Control is present it is used, and only if it is absent is Expires consulted. If the strong cache is set to Cache-Control: no-cache, Cache-Control: max-age=0, or Pragma: no-cache, the browser is told not to use the strong cache.
The browser then checks whether the last response carried an ETag. If so, it sends a conditional request with If-None-Match in the request header. If not, it checks whether the last response carried Last-Modified, and if so sends a conditional request with If-Modified-Since in the request header. If neither is present, there is no negotiated cache and a normal HTTP request is sent. For a request with If-None-Match or If-Modified-Since, the server decides whether the resource has changed and returns a status code. 304 means the cached resource has not changed and the local cache is used; 200 means the resource has changed, the response body carries the latest version, and the browser stores the ETag/Last-Modified from the response header
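The decision order described above can be written down as a small function. This is a sketch that follows the text, with assumed field names (`fresh`, `etag`, `lastModified`) and a made-up function name; browsers commonly send both validators when both are available, and the server then gives If-None-Match precedence.

```javascript
// Decide what the browser does for a resource it may have seen before.
// entry: { fresh, etag, lastModified }, where `fresh` says whether the
// strong cache is still valid (all field names are assumptions).
function planRequest(entry) {
  if (!entry) return { action: 'request', headers: {} };
  if (entry.fresh) return { action: 'use-cache', headers: {} };
  if (entry.etag) {
    return { action: 'revalidate', headers: { 'If-None-Match': entry.etag } };
  }
  if (entry.lastModified) {
    return { action: 'revalidate', headers: { 'If-Modified-Since': entry.lastModified } };
  }
  return { action: 'request', headers: {} };
}

console.log(planRequest({ fresh: true, etag: '"abc"' }).action); // use-cache
console.log(planRequest({ fresh: false, etag: '"abc"' }).headers); // { 'If-None-Match': '"abc"' }
console.log(planRequest({ fresh: false }).action); // request
```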
The general flow chart is as follows:
So which resources should use the strong cache, and which should use the negotiated cache?
Static resources that will not change for a long time should use the strong cache, while files that are modified often should use the negotiated cache. With the negotiated cache, if the resource has not changed, a returning user keeps using the cached copy; if it has been modified, the request fetches the latest version
When visiting a website, you can observe this in the browser's developer tools (F12). As shown in the figure, my project "five-year front-end three-year interview" is hosted on GitHub; in the Network panel, the response headers show that Cache-Control, Expires, ETag, and Last-Modified are all present
Cache location
As mentioned above, whether the strong cache or the negotiated cache is used, the resource is ultimately read from the browser's local storage. Where does the browser keep it, and how are these stores classified?
By location, the cache is divided into four parts: Memory Cache, Disk Cache, Service Worker, and Push Cache
Memory Cache
Because memory is limited, not all resource files are cached in memory. It mainly holds resources fetched through preloader-related instructions, such as <link rel="prefetch">. The preloader can parse js/css files while requesting the next resource from the network
Disk Cache
Cached on disk. Of all browser caches, the disk cache has the widest coverage. Based on the fields in the HTTP headers, it decides which resources to cache and which have expired and must be requested from the server again.
Service Worker
An independent thread that borrows the idea of the Web Worker: JS runs off the main thread. Because it is detached from the browser window it cannot access the DOM directly, but it can still do many things, such as
- Offline caching (Service Worker Cache)
- Message push
- Network proxying
- It is an important building block of PWAs
Push Cache
The push cache belongs to HTTP/2 and is the browser's last line of defense
Priority: Service Worker --> Memory Cache --> Disk Cache --> Push Cache.
Practice
After so much theory, it is easy to feel lost when it comes to real work. How do we get past that?
Everything above is only talk; only practice reveals the truth
Today, front-end projects are bundled with webpack or a webpack-like tool. Configure hashed file names in webpack and most of the front-end caching work is done
The effect we want to achieve is:
- HTML: negotiated cache
- CSS, JS, images, and other resources: strong cache, with a hash in the file name
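This policy can be expressed as a function from file path to Cache-Control value. A minimal sketch, assuming every non-HTML asset has a content hash in its file name; `cacheControlFor` and the example paths are made up for illustration.

```javascript
const path = require('path');

// Pick a Cache-Control value from the file type.
// Assumes every non-HTML asset has a content hash in its file name.
function cacheControlFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  if (ext === '.html') {
    return 'no-cache'; // always revalidate (negotiated cache)
  }
  return 'public, max-age=31536000'; // one year (strong cache)
}

console.log(cacheControlFor('/public/index.html')); // no-cache
console.log(cacheControlFor('/public/js/app.3f9a1c2b.js')); // public, max-age=31536000
```

With Express, a function like this could be called from the `setHeaders` option of `express.static`, for example `res.set('Cache-Control', cacheControlFor(filePath))`.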
webpack offers three kinds of hash: hash, chunkHash, contentHash
- hash: tied to the build of the entire project. If any project file changes, the hash for the whole build changes.
- chunkHash: tied to the chunks webpack produces; different entries produce different chunkHash values
- contentHash: derived from the file's content. If the content does not change, the contentHash stays the same.
Here, CSS is handled with contentHash, and other resources with chunkHash.
Non-engineered front-end projects
That is, traditional front-end pages, usually served from a static server. Here modified files need manual version control as the caching strategy, such as adding a version number to the file name (index-v2.min.js) or appending a timestamp to the entry file (index.js?time=1626226)
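The query-string variant of this approach can be sketched with a small helper. `versionedUrl` and the parameter name `v` are assumptions for illustration; any parameter name works, since its only job is to make the URL differ between versions.

```javascript
// Append a version marker to a resource URL so that a changed file
// gets a new URL and bypasses the old cached copy.
function versionedUrl(resourcePath, version) {
  const separator = resourcePath.includes('?') ? '&' : '?';
  return `${resourcePath}${separator}v=${encodeURIComponent(version)}`;
}

console.log(versionedUrl('index.js', '1626226')); // index.js?v=1626226
console.log(versionedUrl('style.css?theme=dark', '2')); // style.css?theme=dark&v=2
```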
Back-end cache practice
What actually makes caching work is the caching policy set on the back end, which tells the browser whether it may cache. Here is a demo for experimenting with the strong cache and the negotiated cache.
Strong cache scheme
The code is as follows:
    const path = require('path');
    const express = require('express');

    const app = express();
    const options = {
      etag: false, // disable the negotiated cache
      lastModified: false, // disable the negotiated cache
      setHeaders: (res, filePath, stat) => {
        res.set('Cache-Control', 'max-age=10'); // strong cache lifetime: 10 seconds
      },
    };
    app.use(express.static(path.join(__dirname, 'public'), options));
    app.listen(3008);
PS: The code comes from "Graphical HTTP caching". When testing, note that the strong cache cannot be observed by refreshing the page, because a refresh makes the browser request the page again; it takes effect when you navigate away through a link and then come back.
Negotiated cache scheme
The code is as follows:
    const path = require('path');
    const express = require('express');

    const app = express();
    const options = {
      etag: true, // enable the negotiated cache
      lastModified: true, // enable the negotiated cache
      setHeaders: (res, filePath, stat) => {
        res.set({
          'Cache-Control': 'max-age=0', // the browser skips the strong cache
          'Pragma': 'no-cache', // the browser skips the strong cache
        });
      },
    };
    app.use(express.static(path.join(__dirname, 'public'), options));
    app.listen(3001);
The effect is as follows:
Two demo addresses are attached for your reference
Summary
Why does HTTP cache? To take load off the server and to make pages load faster
By what means? HTTP's strong cache and negotiated cache. The strong cache suits resources that rarely change (such as imported libraries, js, css, etc.), and the negotiated cache suits frequently updated files (such as html)
What is the strong cache? In HTTP/1.0 it is based on Expires, which is not accurate. HTTP/1.1 replaced it with the new Cache-Control header; both can be present at the same time, and Cache-Control takes precedence.
What is the negotiated cache? In HTTP/1.0 it is based on Last-Modified, the last modification time, which is also inaccurate. HTTP/1.1 replaced it with the new ETag header. Both can be present at the same time, and ETag takes precedence.
Both Expires and Last-Modified are based on a point in time. That works in theory but breaks down in practice, which is why the newer mechanisms exist.
While the strong cache is valid, the browser serves from it; when the strong cache is disabled, the browser falls back to the negotiated cache as its caching strategy
That is the author's understanding of the HTTP cache
References
- In-depth understanding of browser caching mechanism
- Thorough understanding of browser caching mechanism
- Front-end caching best practices
- The power of browser caching
- Node practice thoroughly understands strong cache and negotiation cache
- A Brief Analysis of HTTP Cache
- MDN web docs
- Graphical HTTP caching