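The project uses nginx_lua to write HTTP logs into redis as JSON, and then reads them back out for analysis in Python. Most log entries parse fine, but a small fraction fail with an encoding error during parsing: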
Traceback (most recent call last):
  File "/home/dev/Spectre/spectre/libs/sqlmapd/request.py", line 41, in get
    req = json.loads(req)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xcf in position 0: invalid continuation byte
The log entry that triggers the error looks like this:
{"host":"configsvr.msf.3g.qq.com","method":"POST","uri":"\/configsvr\/serverlist.jsp","args":"[iYcp!\u0011dY#\u0002Yc\b000e012⚾!012`:\"\"Nu001fc3\u001f00160ˮ\"܀\u000e\u0016\u001au0003in=true&=true&i007f0001a\u001aҰKf\u0004\u0004`\nr\u001b1O3-=true&0017u001f=true","agent":"QQ\/6.3.5.437 CFNetwork\/758.4.3 Darwin\/15.5.0","nowtime":"07\/Jun\/2016:17:47:28 +0800","remote":"192.168.36.154","cookie":"uin=o1666666666; vkey=A1JloT4cYlb\/a5v2sVDYH+Kx+vNk59pG3bf8d2ea0201=="}
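The value is of type str and its encoding is unknown. Could it be that the JSON encoding and decoding in Lua and Python are not fully compatible?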
Most likely, yes. Try manually converting the log entry to UTF-8 first, then check whether it loads correctly.
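One way to try that suggestion is sketched below. It assumes the entry arrives from redis as a byte string and that losing the undecodable bytes is acceptable for analysis. The helper name `load_log_entry` is hypothetical and not part of the original project. It decodes as UTF-8 first and, if that fails, decodes again with invalid bytes replaced by U+FFFD before handing the text to `json.loads`.

```python
import json


def load_log_entry(raw):
    """Parse one JSON log entry whose bytes may not be valid UTF-8.

    `raw` is assumed to be the byte string read back from redis.
    """
    if isinstance(raw, bytes):
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption: lossy decoding is acceptable here. Each invalid
            # byte becomes U+FFFD instead of aborting the whole parse.
            text = raw.decode("utf-8", "replace")
    else:
        # Already a text (unicode) object, nothing to decode.
        text = raw
    return json.loads(text)


# A made-up entry with a stray 0xcf byte, similar to the failing case.
sample = b'{"method": "POST", "args": "\xcfabc"}'
entry = load_log_entry(sample)
print(entry["method"])
```

If the original bytes must be preserved exactly, decoding with `latin-1` instead of the lossy fallback is an alternative, since it maps every byte to a code point, at the cost of misrepresenting any genuine UTF-8 text in the same field.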