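Python multiprocessing program keeps running but never produces a result; how can I fix it?

Hello everyone. I use the code below to compute the similarity between texts (the similarity function itself is not included). The corpus is very large and every text has to be compared one by one, which is very slow, so I tried to speed it up with multiprocessing as shown below. The program then reports an error (screenshot: image.png).

I think the problem is in how I pass the arguments in this line:

`rl = pool.map(deal_many_data, 'tt', data_all)`

How should it be changed? The format of data_all is shown in a screenshot (image.png), and the format of data3, the list of texts to match, is shown in another screenshot (image.png).

How should I adjust this? Thanks in advance.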

def deal_many_data(threadName,data_all,list1):
    for key,v in data_all.items():
        sim_all = 0
        count = 0
        sim3 = 0  # check whether the first two are similar; break out if not
        if len(data_all[key]['content']) >= 10:
            newlist = random.sample(list(range(0,len(data_all[key]['content']))),10)
        else:
            newlist = list(range(0,len(data_all[key]['content'])))
        for index in newlist:
            data = data_all[key]['title'][index] + data_all[key]['content'][index]
            sim = sentence_similarity(data,new_data)
            sim_all += sim
            count += 1
            list1[int(key)] = sim_all / count
    print ("%s: %s" % ( threadName, key ))

import random
import time
from multiprocessing import Process, Manager
from multiprocessing import Pool

if __name__ == '__main__':
    db,data_all,data2 = load_data()
    data3 =data2[:10]
    start_time=time.time()
    for item in data3:
        title = item['autn:content']['DOCUMENT']['DRETITLE']['$'].strip().replace(' ','')
        content = item['autn:content']['DOCUMENT']['DRECONTENT']['$'].strip().replace('\n','').replace(' ','')
        list1 = [0] * (len(data_all)+1)
        new_data =    title + ' ' + content
        with Manager() as manager:

            #list1 = manager.list1

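            pool = Pool(5)       # create a pool of 5 worker processes (the machine is assumed to have 4 cores)
            rl = pool.map(deal_many_data,'tt',data_all)  # pass the worker function; the dict data_all is meant to be the iterable, list1 the shared data to modify
            pool.close()         # close the pool; no new tasks are accepted
            pool.join()          # block the main process until the workers exit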
    end = time.time()
    print ('finally cost time %ss'%(end-start_time))
1 answer

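You are passing the arguments incorrectly. `Pool.map` takes one function and one iterable, so in `pool.map(deal_many_data, 'tt', data_all)` the string 'tt' is treated as the iterable and data_all ends up as chunksize: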

def map(self, func, iterable, chunksize=None):
        '''
        Apply `func` to each element in `iterable`, collecting the results
        in a list that is returned.
        '''
        return self._map_async(func, iterable, mapstar, chunksize).get()
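
One possible way to restructure the call, as a rough sketch rather than a drop-in fix: let each worker call handle a single entry of data_all, bind the extra argument with `functools.partial`, and return the score instead of writing into list1 (a plain list is not shared between processes). Here `sentence_similarity` is a hypothetical stand-in for the function the question does not show, `score_one` and `score_all` are illustrative names, and the layout of data_all (a dict whose values hold parallel 'title' and 'content' lists) is assumed from the question's code.

```python
import random
from functools import partial
from multiprocessing import Pool


def sentence_similarity(a, b):
    # Hypothetical stand-in for the asker's function (not shown in the question):
    # Jaccard overlap of the two character sets.
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def score_one(new_data, item, sample_size=10):
    """Average similarity between new_data and up to sample_size texts of one entry.

    item is a single (key, value) pair taken from data_all.items().
    """
    key, value = item
    n = len(value['content'])
    # Same sampling rule as in the question: at most sample_size random texts.
    indices = random.sample(range(n), sample_size) if n >= sample_size else range(n)
    sims = [sentence_similarity(value['title'][i] + value['content'][i], new_data)
            for i in indices]
    return key, (sum(sims) / len(sims) if sims else 0.0)


def score_all(new_data, data_all, processes=5):
    # partial() fixes new_data, so pool.map only has to supply one item per call.
    worker = partial(score_one, new_data)
    with Pool(processes) as pool:
        # pool.map(func, iterable): the second argument is the only iterable.
        return dict(pool.map(worker, list(data_all.items())))


if __name__ == '__main__':
    # Tiny made-up data in the assumed format, just to show the call.
    data_all = {
        '1': {'title': ['ab'], 'content': ['cd']},
        '2': {'title': ['xy'], 'content': ['zw']},
    }
    print(score_all('abcd', data_all, processes=2))
```

In the original loop over data3, `score_all(new_data, data_all)` would be called once per item. `pool.starmap` with explicit argument tuples would be another option if more than one argument varies per call.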