How can I speed up loading and reading JSON files in Python?

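I am running a script (with multiprocessing) that extracts a few fields from a bunch of JSON files, but at the moment it is very slow. Here is the script: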

from __future__ import print_function, division
import os
from glob import glob
from os import getpid
from time import time
from sys import stdout
import resource
from multiprocessing import Pool
import subprocess
try:
    import simplejson as json
except ImportError:
    import json

path = '/data/data//*.A.1'
print("Running with PID: %d" % getpid())

def process_file(file):
    start = time()
    filename = file.split('/')[-1]
    print(file)
    with open('/data/data/A.1/%s_DI' %filename, 'w') as w:
        with open(file, 'r') as f:
            for n, line in enumerate(f):
                d = json.loads(line)
                try:
                    domain = d['rrname']
                    ips = d['rdata']
                    for i in ips:
                        print("%s|%s" % (i, domain), file=w)
                except Exception:
                    # Dump records that are missing a field or are malformed.
                    print(d)

if __name__ == "__main__":
    files_list = glob(path)
    cores = 12
    print("Using %d cores" % cores)
    pp = Pool(processes=cores)
    pp.imap_unordered(process_file, files_list)
    pp.close()
    pp.join()

Does anyone know how to speed this up?
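
One direction that may be worth measuring is sketched below. It keeps the same per-line `json.loads` parsing but replaces the per-record `print(..., file=w)` with a single `writelines` call over a generator, and it silently skips bad records rather than printing them. The names `extract_lines` and `convert_file` are hypothetical and not part of the script above, and whether this helps at all depends on whether the time goes into parsing, output, or disk I/O, so treat it as a starting point for profiling rather than a confirmed fix.

```python
import json


def extract_lines(src_path):
    """Yield one 'ip|domain' line per rdata entry in a JSON-lines file."""
    with open(src_path, "r") as src:
        for raw in src:
            try:
                record = json.loads(raw)
                domain = record["rrname"]
                for ip in record["rdata"]:
                    yield "%s|%s\n" % (ip, domain)
            except (ValueError, KeyError, TypeError):
                # Assumption: malformed or incomplete records can be skipped
                # quietly instead of being printed to stdout.
                continue


def convert_file(src_path, dst_path):
    """Stream every extracted line into dst_path with one writelines call."""
    with open(dst_path, "w") as dst:
        dst.writelines(extract_lines(src_path))
    return dst_path
```

Timing a single call to `convert_file` on one representative file, outside the pool, should show whether parsing or output dominates before changing anything else.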

Originally posted by UserYmY; translation licensed under CC BY-SA 4.0.
