From the public account Gopher指北
It is widely agreed that a URL cannot contain a literal space, but the standards are not fully consistent about how a space should be represented, so different languages implement it differently.
RFC 2396 clearly states that spaces should be encoded as %20.
The W3C standard says that spaces can be replaced with + or %20.
This is immediately puzzling: if a space is replaced with +, then + itself has to be encoded. That being the case, why not encode the space directly? Of course, this is just Old Xu's own doubt. We can no longer trace the original background, nor can we change what has already been decided. However, whether a space is replaced with + or %20, and whether + needs to be encoded, are now questions we have to face directly.
Three commonly used URL encoding methods in Go
As Gophers, our first concern is the implementation in Go itself, so let us first look at the similarities and differences between three commonly used URL encoding methods in Go.
url.QueryEscape
fmt.Println(url.QueryEscape(" +Gopher指北"))
// Output: +%2BGopher%E6%8C%87%E5%8C%97
When encoding with url.QueryEscape, spaces are encoded as +, and + itself is encoded as %2B.
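The behaviour described above can be checked with a small round trip. This is only a minimal sketch: `queryRoundTrip` is a hypothetical helper name introduced for illustration, while `url.QueryEscape` and `url.QueryUnescape` are the standard library functions.

```go
package main

import (
	"fmt"
	"net/url"
)

// queryRoundTrip (hypothetical helper) escapes s with url.QueryEscape and
// decodes the result again with url.QueryUnescape, returning both forms.
func queryRoundTrip(s string) (escaped, decoded string, err error) {
	escaped = url.QueryEscape(s)
	decoded, err = url.QueryUnescape(escaped)
	return escaped, decoded, err
}

func main() {
	escaped, decoded, err := queryRoundTrip(" +Gopher指北")
	if err != nil {
		panic(err)
	}
	fmt.Println(escaped)        // +%2BGopher%E6%8C%87%E5%8C%97
	fmt.Printf("%q\n", decoded) // " +Gopher指北"
}
```

Because the literal + is escaped to %2B, the only + left in the output stands for a space, which is why the matching decoder can restore the original string.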
url.PathEscape
fmt.Println(url.PathEscape(" +Gopher指北"))
// Output: %20+Gopher%E6%8C%87%E5%8C%97
When encoding with url.PathEscape, spaces are encoded as %20, but + is not encoded.
url.Values
var query = url.Values{}
query.Set("hygz", " +Gopher指北")
fmt.Println(query.Encode())
// Output: hygz=+%2BGopher%E6%8C%87%E5%8C%97
When encoding with the (Values).Encode method, spaces are encoded as +, and + itself is encoded as %2B. A closer look at the source code of (Values).Encode shows that it still calls the url.QueryEscape function internally. The difference between (Values).Encode and url.QueryEscape is that the former only encodes the keys and values in the query, while the latter also encodes & and =.
Which of these three encoding methods should we developers use? Read on; the answer is given later in this article.
Implementation in different languages
Since spaces and + are encoded in different ways in Go, does the same situation exist in other languages? Let's take PHP and JS as examples.
URL encoding in PHP
urlencode
echo urlencode(' +Gopher指北');
// Output: +%2BGopher%E6%8C%87%E5%8C%97
rawurlencode
echo rawurlencode(" +Gopher指北");
// Output: %20%2BGopher%E6%8C%87%E5%8C%97
PHP's urlencode and Go's url.QueryEscape have the same effect, while rawurlencode encodes both spaces and +.
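Go has no single standard library function that produces the rawurlencode form. If a Go program needs it, one common workaround is to post-process the output of url.QueryEscape. The sketch below assumes that approach; `rawURLEncode` is a hypothetical helper, not a standard library function.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// rawURLEncode (hypothetical helper) mimics the rawurlencode result shown
// above. url.QueryEscape already turns a literal + into %2B, so every +
// left in its output stands for a space and can be rewritten as %20.
func rawURLEncode(s string) string {
	return strings.ReplaceAll(url.QueryEscape(s), "+", "%20")
}

func main() {
	fmt.Println(rawURLEncode(" +Gopher指北"))
	// %20%2BGopher%E6%8C%87%E5%8C%97
}
```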
URL encoding in JS
encodeURI
encodeURI(' +Gopher指北')
// Output: %20+Gopher%E6%8C%87%E5%8C%97
encodeURIComponent
encodeURIComponent(' +Gopher指北')
// Output: %20%2BGopher%E6%8C%87%E5%8C%97
JS's encodeURI and Go's url.PathEscape have the same effect, while encodeURIComponent encodes both spaces and +.
What should we do
Prefer encoding with the url.PathEscape function
The results of encoding the string " +Gopher指北" in Go, PHP and JS were summarized above. The following two-dimensional table summarizes whether each corresponding decoding operation restores the original string.
| Encode \ Decode | url.QueryUnescape | url.PathUnescape | urldecode | rawurldecode | decodeURI | decodeURIComponent |
|---|---|---|---|---|---|---|
| url.QueryEscape | Y | N | Y | N | N | N |
| url.PathEscape | N | Y | N | YY | Y | YY |
| urlencode | Y | N | Y | N | N | N |
| rawurlencode | Y | YY | Y | Y | N | Y |
| encodeURI | N | Y | N | Y | Y | Y |
| encodeURIComponent | Y | YY | Y | Y | N | Y |
In the table above, YY means the same as Y. Old Xu uses YY only to mark the recommended pairings: encode with url.PathEscape in Go and decode with rawurldecode in PHP or decodeURIComponent in JS. In the other direction, as the table shows, output from rawurlencode or encodeURIComponent is decoded with url.PathUnescape in Go.
In real development, a Gopher will certainly run into cases where a URL has to be decoded. In that case, coordinate with whoever encodes the URL to agree on a matching decoding method.
Encode the value
Is there a general approach that needs no URL encoding or decoding at all? Certainly. Take base32 as an example: its character set is A-Z and the digits 2-7, so once a value is base32-encoded it no longer needs URL encoding (provided the = padding character is left out).
Finally, I sincerely hope that this article can be helpful to readers.
The environment used in this article is PHP 7.3.29, Go 1.16.6 and JS in the Chrome 94.0.4606.71 console.