RFC3986

First, let's look at the RFC 3986 standard, which in short stipulates the following: digits, letters, and -_.~ are left unescaped, and every other character is escaped as a percent sign (%) followed by two hexadecimal digits, %{hex}.

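Next, the rule for www post form data, i.e. x-www-form-urlencode (the application/x-www-form-urlencoded format): every non-alphanumeric character except -_. (note: ~ is not in this set) is replaced by a percent sign (%) followed by two hexadecimal digits, %{hex}, and a space is encoded as a plus sign +.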

The difference between the two is as follows:

1. rfc3986 does not escape ~, while x-www-form-urlencode escapes ~ to %7E.
2. rfc3986 escapes a space to %20, while x-www-form-urlencode escapes a space to +.
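
The two rules can be put side by side in a small hand-written escaper. This is only a sketch of the rules as stated above, not any library's implementation; the names escape, rfc3986Escape and formEscape are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// escape percent-encodes s byte by byte. Letters, digits and the bytes in
// keep are copied through; a space becomes "+" when spaceAsPlus is set;
// every other byte becomes %XX with uppercase hex digits.
func escape(s, keep string, spaceAsPlus bool) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case 'a' <= c && c <= 'z', 'A' <= c && c <= 'Z', '0' <= c && c <= '9':
			b.WriteByte(c)
		case strings.IndexByte(keep, c) >= 0:
			b.WriteByte(c)
		case c == ' ' && spaceAsPlus:
			b.WriteByte('+')
		default:
			fmt.Fprintf(&b, "%%%02X", c)
		}
	}
	return b.String()
}

// rfc3986Escape keeps -_.~ and writes a space as %20.
func rfc3986Escape(s string) string { return escape(s, "-_.~", false) }

// formEscape keeps only -_. and writes a space as +.
func formEscape(s string) string { return escape(s, "-_.", true) }

func main() {
	fmt.Println(rfc3986Escape("hello233 ~-_.")) // hello233%20~-_.
	fmt.Println(formEscape("hello233 ~-_."))    // hello233+%7E-_.
}
```

The only inputs on which the two functions disagree are the space and ~, which is exactly the difference listed above.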

Next, let's look at how several high-level languages encode URLs.

 js encodeURIComponent
php urlencode/rawurlencode
go url.QueryEscape

js

encodeURIComponent

 console.log(encodeURIComponent("hello233 ~-_."))
hello233%20~-_.

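As the output shows, js behaves as rfc3986 prescribes for this input: ~-_. are kept and the space is escaped to %20. (encodeURIComponent additionally leaves !'()* unescaped, so it is not a strict implementation of the rfc3986 unreserved set.)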

php

urlencode

 <?php
echo urlencode("hello233 ~-_.");
hello233+%7E-_.

The space becomes +, and only -_. are kept while ~ is not: this is the x-www-form-urlencode rule.

rawurlencode

 <?php
echo rawurlencode("hello233 ~-_.");
hello233%20~-_.

This follows the rfc3986 rule.

http_build_query

Note that http_build_query by default applies urlencode to each key and value, while the = and & it inserts as separators are left unescaped (the source of go's url.Values.Encode, shown further down, makes this processing easy to see).

 <?php
echo http_build_query(["msg" => "hello233 ~-_.", "name" => "sqrtCat"]);
msg=hello233+%7E-_.&name=sqrtCat

go

go's encoding behavior is the more interesting case, and it is the reason I wrote this article.
go offers url.Values.Encode (comparable to php's ksort + http_build_query) and url.QueryEscape (comparable to php's urlencode/rawurlencode).

url.Values.Encode

It works like php's http_build_query: only the keys and values are escaped, while = and & are written as-is. Just look at the implementation.

 func (v Values) Encode() string {
    if v == nil {
        return ""
    }
    var buf strings.Builder
    keys := make([]string, 0, len(v))
    for k := range v {
        keys = append(keys, k)
    }
    sort.Strings(keys)
    for _, k := range keys {
        vs := v[k]
        keyEscaped := QueryEscape(k)
        for _, v := range vs {
            if buf.Len() > 0 {
                buf.WriteByte('&') // separator, written as-is
            }
            buf.WriteString(keyEscaped) // key, already escaped
            buf.WriteByte('=') // separator, written as-is
            buf.WriteString(QueryEscape(v)) // value, escaped here
        }
    }
    return buf.String()
}

As you can see, both key and val are escaped with url.QueryEscape, so let's look at its escaping rules next.

url.QueryEscape

 func main() {
    fmt.Println(url.QueryEscape("hello233 ~-_."))
}
hello233+~-_.

Are you a little confused?
~ is kept, so is this rfc3986? But the space became + instead of %20, which is x-www-form-urlencode behavior, and yet ~ was not turned into %7E.

Summary

  1. If your data contains neither spaces nor ~, you can ignore all of this.
  2. In practice, whether the query string is encoded per rfc3986, per x-www-form-urlencode, or with go's mixed scheme, the browser address bar handles it correctly: print the GET parameters on the server and all three values come back. (Strictly speaking, the two schemes are not interchangeable: a decoder that follows only one of them is not guaranteed to recover data encoded with the other, for instance a strict rfc3986 decoder leaves + as a literal plus sign. Just treat the lenient behavior here as a feature of the browser address bar.)
    http://0.0.0.0:8888/?rfc3986=hello233%20~-_.&urlencode=hello233+%7E-_.&go=hello233+~-_.

     array(3) {
      ["rfc3986"]=>
      string(12) "hello233 ~-_"
      ["urlencode"]=>
      string(13) "hello233 ~-_."
      ["go"]=>
      string(13) "hello233 ~-_."
    }
  3. The main reason I fell into this pit: some legacy services still verify signatures computed over the parameters sorted in dictionary order, and the data happened to contain spaces and ~. go's url.Values.Encode turns a space into + and keeps ~, while php's http_build_query also turns a space into + but turns ~ into %7E, so the signatures computed on the two ends never matched.

Solutions

  1. Follow the x-www-form-urlencode pattern.
    1.1 go end: take the output of url.Values.Encode and replace ~ with %7E.
    1.2 php end: ksort + http_build_query is enough.
  2. Follow the rfc3986 standard.
    2.1 Premise: the data contains no spaces. Without spaces, go's output is already in rfc3986 form.
    2.2 php end: rawurlencode(urldecode(http_build_query($params))).
    2.3 go end: url.QueryEscape(url.QueryUnescape(url.Values.Encode())). This is shorthand, since url.QueryUnescape also returns an error and the calls cannot be nested literally.
    2.4 http_build_query / url.Values.Encode() have already escaped each key and val of the queryString, so urldecode / url.QueryUnescape first recovers the original key=val&key=val, which is then encoded again.
  3. Do not escape the signed queryString.
    3.1 php end: compute the signature over urldecode(http_build_query($params)).
    3.2 go end: compute the signature over the result of url.QueryUnescape(url.Values.Encode()).

big_cat