pandas: how do I drop duplicate rows and merge their column values?


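The value chongqing appears twice in the city column,
but the two rows have different population values.
How can I merge them into a single row and store the two different population values together in one column?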

3 answers
df.astype(str).groupby(['year','city'], as_index=False).agg(list).eval("population = population.str.join(',')")

Like this?
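
For readers who cannot see the screenshot, here is a minimal self-contained sketch of the same idea. The sample rows are made up (the real values appear only in the question's screenshot), and instead of `agg(list)` followed by `eval` it uses `agg(",".join)`, which joins each group's strings directly. Treat it as one possible variant, not the answerer's exact method.

```python
import pandas as pd

# Hypothetical sample data; the real values are only in the question's screenshot.
df = pd.DataFrame({
    "year": [2016, 2016, 2016, 2016],
    "city": ["beijing", "shanghai", "chongqing", "chongqing"],
    "population": [2100, 2300, 100, 200],
})

# Cast population to str so the values of each group can be joined with commas.
# sort=False keeps the groups in order of first appearance.
merged = (
    df.assign(population=df["population"].astype(str))
      .groupby(["year", "city"], sort=False)["population"]
      .agg(",".join)
      .reset_index()
)
print(merged)
```

The duplicated chongqing rows collapse into one row whose population is a comma-separated string. If the values are needed as numbers later, collecting them with `agg(list)` may be a better fit than joining them into text.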

frame.groupby(['year', 'city'], sort=False)['population'].sum().reset_index()

   year       city  population
0  2016    Beijing        2100
1  2016   Shanghai        2300
2  2015  Guangzhou        1000
3  2017   Shenzhen         700
4  2016  Chongqing         300

import pandas as pd

data = {'year':[2012,2013,2014,2015,2015],
        'city':['A','A','B','C','C'],
        'pop':[1,2,3,4,5]}

frame = pd.DataFrame(data,columns=['year','city','pop'])
groups = frame["pop"].groupby([frame["year"],frame["city"]])

year = []
city = []
pop = []

# Each item is ((year, city), Series of pop values for that group).
for (y, c), values in groups:
    year.append(y)
    city.append(c)
    # Keep every pop value of the group together in one list.
    pop.append(values.tolist())

d = {'a':year,'b':city,'c':pop}
da = pd.DataFrame(d)

print(da)
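
The explicit loop above can probably be condensed into a single groupby call. The sketch below reuses the same sample data and keeps the original column names (`year`, `city`, `pop`) rather than renaming them to `a`, `b`, `c`; that naming is my choice, not part of the answer.

```python
import pandas as pd

frame = pd.DataFrame({
    "year": [2012, 2013, 2014, 2015, 2015],
    "city": ["A", "A", "B", "C", "C"],
    "pop": [1, 2, 3, 4, 5],
})

# Collect the pop values of each (year, city) group into a Python list,
# then turn the group keys back into ordinary columns.
da = frame.groupby(["year", "city"])["pop"].agg(list).reset_index()
print(da)
```

The two (2015, C) rows end up as one row whose `pop` cell holds both values in a list.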