Disclaimer
All content in this article is for learning and exchange only. Captured content, sensitive URLs, and data interfaces have been desensitized. Commercial or illegal use is strictly prohibited; the author bears no responsibility for any consequences arising from such use. If anything here infringes your rights, please contact me and it will be removed immediately.
Reverse-engineering target
- Target: a local public resource trading website
- Homepage: aHR0cDovL2dnenkuamNzLmdvdi5jbi93ZWJzaXRlL3RyYW5zYWN0aW9uL2luZGV4
- Interface: aHR0cDovL2dnenkuamNzLmdvdi5jbi9wcm8tYXBpLWNvbnN0cnVjdGlvbi9jb25zdHJ1Y3Rpb24vYmlkZGVyL2JpZFNlY3Rpb24vbGlzdA==
- Parameters to reverse: projectId and projectInfo in the URL
Reverse-engineering process
Packet capture analysis
When you open the website through the link, a loading spinner appears before the page is shown, so there is probably a render-and-load step. Open the developer tools, refresh the page, and scroll down to find the interface that fetches the data, together with its response: aHR0cDovL2dnenkuamNzLmdvdi5jbi9wcm8tYXBpLWNvbnN0cnVjdGlvbi9jb25zdHJ1Y3Rpb24vYmlkZGVyL2JpZFNlY3Rpb24vbGlzdA==
It is a GET request, and the Preview tab of the response shows the information for every announcement on the current page:
Query String Parameters contains the following parameters. The meaning of each type is explained in detail later:
- pageNum: current page number
- pageSize: page size
- informationType: announcement type
- projectType: project type
- informationName: information type
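As a minimal sketch of how these query parameters fit together, the snippet below builds a list-request URL with the standard library. The base URL is a hypothetical placeholder (the real interface is desensitized), and the parameter values are illustrative only, not confirmed values from the site.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real interface address is desensitized in this article.
LIST_API = "http://example.com/construction/bidder/bidSection/list"


def build_list_url(page_num, info_type, info_name, project_type="SZFJ", page_size=20):
    """Assemble the GET URL for one page of announcements."""
    params = {
        "pageNum": page_num,           # current page number
        "pageSize": page_size,         # page size
        "informationType": info_type,  # announcement type
        "projectType": project_type,   # project type
        "informationName": info_name,  # information type
    }
    return "%s?%s" % (LIST_API, urlencode(params))


# Illustrative values only; the valid type codes come from the site itself.
print(build_list_url(2, "EXAMPLE_TYPE", "ZBGG"))
```

Keeping the parameters in a dictionary makes it easy to change the page number or type codes without touching the rest of the URL.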
Next, click an announcement to open it in a new page. The page URL now has the form XXX/index?projectId=XXX&projectInfo=XXX, with two encrypted parameters, projectId and projectInfo. Testing shows that these two values are fixed for a given announcement page, so the next step is to find where they are encrypted.
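To compare the two values across page loads, it helps to pull them out of the detail-page URL. The following is a rough sketch using only the standard library; the host and the projectId value in the sample URL are made up for illustration, and the helper name is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def extract_encrypted_params(detail_url):
    """Return the projectId and projectInfo values from a detail-page URL."""
    query = parse_qs(urlparse(detail_url).query)
    # parse_qs maps each name to a list of values; take the first one.
    return {name: query[name][0] for name in ("projectId", "projectInfo")}


# Hypothetical URL: the host and the projectId value are invented for this example.
sample = "http://example.com/index?projectId=0123456789abcdef&projectInfo=ff15d186c4d5fa7a"
print(extract_encrypted_params(sample))
```

Running this on the same announcement opened twice should give identical dictionaries if the values are indeed fixed, as the test above suggests.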
Locating the encryption by debugging
From the homepage, press CTRL + SHIFT + F to search globally for projectId. Comparing the results shows that the two encrypted parameters, projectId and projectInfo, are defined in chunk-63628500.eb5f8d30.js. The code there is a ternary expression: if the project type matches, the first branch is executed; otherwise the second branch is executed:
What do the ZFCG and GTGC in the condition above mean? Searching globally (CTRL + SHIFT + F) for ZBGG leads to the corresponding definitions in chunk-043c03b8.34f6abab.js, where the meaning of each code is listed:
Set a breakpoint on line 267, at return t.stop(), and click any announcement. The breakpoint is hit, which confirms the location. Hovering over the values of projectId and projectInfo shows what they are:
- projectId: project number
- projectInfo: information type
Now that the meaning of the two parameters is known, the next step is to find where they are encrypted. Both projectId and projectInfo are passed through the a.parameterTool.encryptJumpPage method, and the name encryptJumpPage (encrypt the jump page) suggests this is exactly what we are looking for:
Hover over a.parameterTool.encryptJumpPage and follow it into the file where the method is defined, app.3275fd87.js:
From the code above, the meaning of the two arguments is clear:
- query: the data to be encrypted (projectId and projectInfo)
- nextPath: the route to jump to
Set a breakpoint at line 2389 and debug. As the following figure shows, the encrypted projectId and projectInfo values are held in a:
Tracing a further and scrolling up, lines 2335 to 2356 are clearly DES encryption:
What is not yet clear is which function encrypts the projectId and projectInfo values in query. Continuing to step through with breakpoints shows that the projectId value 424 and the projectInfo value ZBGG are both processed in function c(t), which confirms that this is the key encryption location:
function c(t) {
    return i.a.DES.encrypt(t, o.keyHex, {
        iv: o.ivHex,
        mode: i.a.mode.CBC,
        padding: i.a.pad.Pkcs7
    }).ciphertext.toString()
}
Breaking down this key piece of code:
- iv: ivHex, the initialization vector
- mode: CBC mode, a chaining mode in which each plaintext block is XORed with the previous ciphertext block (the IV for the first block) before being encrypted
- padding: PKCS7 padding. The number of bytes to add is block length - (data length % block length), and every padding byte is set to that number
- ciphertext.toString(): returns the ciphertext as a hexadecimal string
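The padding rule and the CBC chaining described in the list above can be sketched in a few lines of pure Python. This is an illustration only: toy_block_cipher is a hypothetical stand-in (a simple byte rotation) and is NOT DES, so the output will not match the site's ciphertext. It is there solely to make the XOR-then-encrypt chaining visible.

```python
BLOCK_SIZE = 8  # DES operates on 8-byte blocks


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes, each of value N, so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_cipher(block):
    """Hypothetical stand-in for a block cipher (NOT DES): rotate left by one byte."""
    return block[1:] + block[:1]


def cbc_encrypt(plaintext, iv, encrypt_block):
    """CBC: XOR each block with the previous ciphertext block, then encrypt it."""
    size = len(iv)
    padded = pkcs7_pad(plaintext, size)
    previous = iv  # the IV plays the role of the "previous ciphertext" for block 1
    out = b""
    for start in range(0, len(padded), size):
        mixed = xor_bytes(padded[start:start + size], previous)
        previous = encrypt_block(mixed)
        out += previous
    return out


print(pkcs7_pad(b"ZBGG"))  # 4 data bytes followed by four 0x04 padding bytes
print(cbc_encrypt(b"AAAAAAAA" * 2, b"54367819", toy_block_cipher).hex())
```

Note that input whose length is already a multiple of the block size still receives a full block of padding, and that two identical plaintext blocks produce different ciphertext blocks because of the chaining.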
Simulated execution
The JS is reused directly here, with the crypto-js module in Node.js performing the DES encryption. If a function is reported as undefined while debugging, simply add its definition. The complete rewritten JS code is as follows:
var CryptoJS = require('crypto-js');

// Key and IV: each falls back to "54367819" when the custom variable is not defined.
var o = {
    keyHex: CryptoJS.enc.Utf8.parse(Object({
        NODE_ENV: "production",
        VUE_APP_BASE_API: "/pro-api",
        VUE_APP_CONSTRUCTION_API: "/pro-api-construction",
        VUE_APP_DEV_FILE_PREVIEW: "/lyjcdFileView/onlinePreview",
        VUE_APP_FILE_ALL_PATH: "http://www.lyjcd.cn:8089",
        VUE_APP_FILE_PREFIX: "/mygroup",
        VUE_APP_LAND_API: "/pro-api-land",
        VUE_APP_PREVIEW_PREFIX: "/lyjcdFileView",
        VUE_APP_PROCUREMENT_API: "/pro-api-procurement",
        VUE_APP_WINDOW_TITLE: "XXXXXX",
        BASE_URL: "/"
    }).VUE_APP_CUSTOM_KEY || "54367819"),
    ivHex: CryptoJS.enc.Utf8.parse(Object({
        NODE_ENV: "production",
        VUE_APP_BASE_API: "/pro-api",
        VUE_APP_CONSTRUCTION_API: "/pro-api-construction",
        VUE_APP_DEV_FILE_PREVIEW: "/lyjcdFileView/onlinePreview",
        VUE_APP_FILE_ALL_PATH: "http://www.lyjcd.cn:8089",
        VUE_APP_FILE_PREFIX: "/mygroup",
        VUE_APP_LAND_API: "/pro-api-land",
        VUE_APP_PREVIEW_PREFIX: "/lyjcdFileView",
        VUE_APP_PROCUREMENT_API: "/pro-api-procurement",
        VUE_APP_WINDOW_TITLE: "XXXXXX",
        BASE_URL: "/"
    }).VUE_APP_CUSTOM_IV || "54367819")
};

// DES-CBC with PKCS7 padding; the ciphertext is returned as a hex string.
function c(t) {
    return CryptoJS.DES.encrypt(t, o.keyHex, {
        iv: o.ivHex,
        mode: CryptoJS.mode.CBC,
        padding: CryptoJS.pad.Pkcs7
    }).ciphertext.toString()
}

// Test
// console.log(c('ZBGG'))
// ff15d186c4d5fa7a
The value of VUE_APP_WINDOW_TITLE has been desensitized; testing confirms that this does not affect the output.
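The long Object({...}) expressions in the JS above look like environment variables inlined at build time. Neither VUE_APP_CUSTOM_KEY nor VUE_APP_CUSTOM_IV appears among them, so the || operator falls back to "54367819" for both the key and the IV. The sketch below mirrors that fallback logic in Python, under the assumption that the object shown is the whole environment; the dictionary is abbreviated for readability.

```python
# Abbreviated copy of the inlined environment object from the JS above.
env = {
    "NODE_ENV": "production",
    "VUE_APP_BASE_API": "/pro-api",
    "BASE_URL": "/",
}

DEFAULT_SECRET = "54367819"

# Equivalent of `env.VUE_APP_CUSTOM_KEY || "54367819"`: use the default when unset or empty.
key = (env.get("VUE_APP_CUSTOM_KEY") or DEFAULT_SECRET).encode("utf-8")
iv = (env.get("VUE_APP_CUSTOM_IV") or DEFAULT_SECRET).encode("utf-8")

print(key, iv, len(key))
```

The resulting key is 8 bytes long, which matches the key size DES expects, and the IV is the same 8 bytes.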
Complete code
Follow K哥爬虫 on GitHub, where crawler-related code is shared regularly. Stars are welcome! https://github.com/kgepachong/
The following shows only some of the key code and cannot be run directly. Complete code repository: https://github.com/kgepachong/crawler/
Code for this case: https://github.com/kgepachong/crawler/tree/main/ggzy_jcs_gov_cn
# =======================
# -*- coding: utf-8 -*-
# @Author : WeChat official account: K哥爬虫
# @FileName: ggzy.py
# @Software: PyCharm
# =======================

import urllib.parse

import execjs
import requests

url = 'Desensitized; see https://github.com/kgepachong/crawler/ for the complete code'


def encrypted_project_id(id_enc):
    # Encrypt the project number by calling the 'Public' function in ggzy_js.js
    # (the JS snippet earlier in this article names the encryption function c).
    with open('ggzy_js.js', 'r', encoding='utf-8') as f:
        public_js = f.read()
    project_id = execjs.compile(public_js).call('Public', id_enc)
    return project_id


def encrypted_project_info(info_enc):
    # Encrypt the information type with the same JS function.
    with open('ggzy_js.js', 'r', encoding='utf-8') as f:
        public_js = f.read()
    project_info = execjs.compile(public_js).call('Public', info_enc)
    return project_info


def get_project_info(info_name, info_type):
    # Read the informationName value from the query string of the index URL.
    index_url = 'Desensitized; see https://github.com/kgepachong/crawler/ for the complete code'
    urlparse = urllib.parse.urlparse(index_url)
    project_info = urllib.parse.parse_qs(urlparse.query)['informationName'][0]
    return project_info


def get_content(page, info_name, info_type):
    # Request one page of announcements from the list interface.
    headers = {
        "Connection": "keep-alive",
        "Pragma": "no-cache",
        "Cache-Control": "no-cache",
        "Accept": "application/json, text/plain, */*",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36",
        "Referer": "Desensitized; see https://github.com/kgepachong/crawler/ for the complete code",
        "Accept-Language": "zh-CN,zh;q=0.9"
    }
    url_param = "Desensitized; see https://github.com/kgepachong/crawler/ for the complete code"
    params = {
        "pageNum": page,
        "pageSize": "20",
        "releaseTime": "",
        "search": "",
        "informationType": info_type,
        "departmentId": "",
        "projectType": "SZFJ",
        "informationName": info_name,
        "onlyCanBidSectionFlag": "NO"
    }
    response = requests.get(url=url_param, headers=headers, params=params)
    return response


def main():
    print("Desensitized; see https://github.com/kgepachong/crawler/ for the complete code")
    info_name = input("Enter the information type: ")
    info_type = input("Enter the announcement type: ")
    page = input("Enter the page number to fetch: ")
    # Send the request once and parse the JSON once.
    response = get_content(page, info_name.upper(), info_type.upper())
    rows = response.json()['rows']
    print("Fetched %d projects in total" % len(rows))
    query_info = get_project_info(info_name.upper(), info_type.upper())
    project_info_enc = encrypted_project_info(query_info)
    # Iterate over the rows actually returned; a page may hold fewer than 20.
    for i, row in enumerate(rows, start=1):
        title = row['content']
        query_id = row['projectId']
        project_id_enc = encrypted_project_id(str(query_id))
        project_url = '%s?projectId=%s&projectInfo=%s' % (url, project_id_enc, project_info_enc)
        print("Project %d:" % i + "\n" + "Project name: %s Project number: %s " % (title, query_id) + "\n" + "Project link: %s" % project_url)


if __name__ == '__main__':
    main()
Result of running the code: