Processing JSON data with Python

Here is part of the raw data. Opened in Spyder, the JSON file looks like this:
{"msgid":"8280204259419051","msgpriority":0,"msgtext":"PTV-8698|8280204259419051|function () {}|1498722414739|257|!206!3041!0!0!1!CCTV-1 综合(高清)","receiverid":"data","senderid":"8280204259419051","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|599925|257|1!5!501!0!0!2!CCTV-2 财经","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"8280203295899516","msgpriority":0,"msgtext":"OTS_4K_SC|8280203295899516|00-23-b8-d6-9d-f1|139169|257|1!5!500!0!0!1!CCTV-1 综合","receiverid":"data","senderid":"8280203295899516","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|747643|49|影视/剧场&logos=/poster/201705261441546372.jpg!index.html/second.html!剧场&logos=/poster/201705261441546372.jpg","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|751253|771|1!2591697!剧场/年度新剧!1!10","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|753289|772|_A1004457072!欢乐颂!01!0!0!1004457072!剧场/年度新剧!2582!1!0","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"8280203295899516","msgpriority":0,"msgtext":"OTS_4K_SC|8280203295899516|00-23-b8-d6-9d-f1|182526|772|TVMA214976_A1003122726!回魂夜(香港 1995年!)!0!0!1003122726!!4734!1!0","receiverid":"data","senderid":"8280203295899516","subjectid":"data.stb.report"}
{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|835216|49|影视/!index.html/!","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}
{"msgid":"8510010615009789","msgpriority":0,"msgtext":"|8510010615009789|18-99-f5-ea-ed-61|1498722859743|774|1005273535!楚乔传:末路逢生(6)!湖南卫视高清!20170628!22:59:24!0!1","receiverid":"data","senderid":"8510010615009789","subjectid":"data.stb.report"}
{"msgid":"8280204143293241","msgpriority":0,"msgtext":"DVC-7078|8280204143293241||1498723024614|257|1!206!3041!0!0!1!CCTV-1 综合(高清)","receiverid":"data","senderid":"8280204143293241","subjectid":"data.stb.report"}

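How can I process this JSON file? I want to extract the msgtext field from each record, split it into tokens on '|' and '!', and then export the result to an Excel or CSV file.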

2 answers
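import json

# The input holds one JSON object per line. UTF-8 is an assumption; change it if the file uses another encoding.
with open('filename', 'r', encoding='utf-8') as f:
    with open('out.csv', 'w', encoding='utf-8') as out:
        # Iterate over the file itself; f.readline() returns only the first line,
        # so looping over it would yield single characters.
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            msgtext = json.loads(line)['msgtext']
            # Turn both delimiters into commas so each token becomes a CSV column.
            out.write(msgtext.replace('!',',').replace('|',',')+'\n')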
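
Replacing the delimiters with commas works for the sample rows, but a token that itself contains a comma or a quote would break the columns. A minimal sketch that avoids this, assuming one JSON object per line and a UTF-8 file, splits with re.split and lets the standard csv module handle quoting. The names split_msgtext and convert are made up for illustration.

```python
import csv
import json
import re


def split_msgtext(msgtext):
    """Split a msgtext value on both '|' and '!' and return the tokens."""
    return re.split(r'[|!]', msgtext)


def convert(in_path, out_path, encoding='utf-8'):
    """Read one JSON object per line and write the msgtext tokens as CSV rows.

    The encoding default is an assumption; pass another value if needed.
    """
    with open(in_path, 'r', encoding=encoding) as src, \
            open(out_path, 'w', encoding=encoding, newline='') as dst:
        writer = csv.writer(dst)  # quotes tokens that contain commas or quotes
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            writer.writerow(split_msgtext(json.loads(line)['msgtext']))
```

Calling convert('filename', 'out.csv') would then produce one CSV row per record. Empty tokens, such as the missing second field in the HC3100 rows, are kept as empty cells so the columns stay aligned.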
        

Here is a demo that uses pandas and json. The tokenizing part is left for you to do.

import pandas as pd
import json
a='{"msgid":"8280204259419051","msgpriority":0,"msgtext":"PTV-8698|8280204259419051|function () {}|1498722414739|257|!206!3041!0!0!1!CCTV-1 综合(高清)","receiverid":"data","senderid":"8280204259419051","subjectid":"data.stb.report"}\n{"msgid":"0","msgpriority":0,"msgtext":"HC3100||ec-f4-bb-da-4c-e4|599925|257|1!5!501!0!0!2!CCTV-2 财经","receiverid":"data","senderid":"0","subjectid":"data.stb.report"}'
array=a.split('\n')
# json.loads takes str input directly; its encoding argument was removed in Python 3.9.
data=[json.loads(each) for each in array]
# DataFrame.append was removed in pandas 2.0, so build the frame from the list of dicts in one call.
res=pd.DataFrame(data)
res.to_csv('res.csv')
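
The demo leaves the tokenizing step open. One possible way to finish it with pandas is sketched below, assuming each input string is one JSON record with a msgtext key. The function name msgtext_to_frame and the field_N column names are hypothetical; the source does not say what the individual tokens mean, so they are only numbered.

```python
import json
import re

import pandas as pd


def msgtext_to_frame(lines):
    """Parse JSON records and expand msgtext into one column per token."""
    records = [json.loads(s) for s in lines if s.strip()]
    df = pd.DataFrame(records)
    # Split every msgtext on '|' or '!'; rows with fewer tokens are padded
    # with missing values when the ragged lists become a DataFrame.
    tokens = pd.DataFrame([re.split(r'[|!]', t) for t in df['msgtext']])
    tokens = tokens.add_prefix('field_')  # hypothetical names: field_0, field_1, ...
    return df.drop(columns='msgtext').join(tokens)
```

With the string a from the demo, msgtext_to_frame(a.split('\n')).to_csv('res.csv', index=False, encoding='utf-8-sig') writes the result. The utf-8-sig encoding is a choice made here so that Excel detects the Chinese text as UTF-8; DataFrame.to_excel is the alternative for a real Excel file, and it needs an Excel writer package such as openpyxl installed.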