I want to concatenate several strings in a dataframe, based on a groupby in Pandas.
This is my code so far:
import pandas as pd
from io import StringIO
data = StringIO("""
"name1","hej","2014-11-01"
"name1","du","2014-11-02"
"name1","aj","2014-12-01"
"name1","oj","2014-12-02"
"name2","fin","2014-11-01"
"name2","katt","2014-11-02"
"name2","mycket","2014-12-01"
"name2","lite","2014-12-01"
""")
# load string as stream into dataframe
# header=None: the data has no header row, so header=0 would consume the first record
df = pd.read_csv(data, header=None, names=["name", "text", "date"], parse_dates=[2])
# add column with month
df["month"] = df["date"].dt.month
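I want the end result to look like this:
I don't understand how to use groupby and apply some sort of string concatenation to the "text" column. Any help is appreciated!
Originally posted by mattiasostmar; translation licensed under CC BY-SA 4.0.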
You can groupby the 'name' and 'month' columns, then call transform, which returns data aligned to the original df, and apply a lambda that joins the text entries.
I then sub-select from the original df by passing a list of the columns of interest, df[['name','text','month']], and call drop_duplicates to collapse the repeated rows.
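The transform and drop_duplicates steps can be sketched roughly as follows. This is a minimal illustration rather than the exact code from the answer: it assumes the sample data from the question, rebuilt here directly as a DataFrame so the snippet runs on its own, and the variable name result is only illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt directly (assumption: same rows and dates).
df = pd.DataFrame({
    "name": ["name1"] * 4 + ["name2"] * 4,
    "text": ["hej", "du", "aj", "oj", "fin", "katt", "mycket", "lite"],
    "date": pd.to_datetime([
        "2014-11-01", "2014-11-02", "2014-12-01", "2014-12-02",
        "2014-11-01", "2014-11-02", "2014-12-01", "2014-12-01",
    ]),
})
df["month"] = df["date"].dt.month

# transform returns one value per original row, so every row in a
# (name, month) group receives the full joined string for that group.
df["text"] = df.groupby(["name", "month"])["text"].transform(lambda x: ",".join(x))

# Rows within a group are now identical on these three columns,
# so drop_duplicates keeps a single row per group.
result = df[["name", "text", "month"]].drop_duplicates()
print(result)
```

Note that drop_duplicates keeps the first row of each group, so the result retains the original index labels of those rows.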
Edit: actually, I can just call apply and then reset_index.
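A minimal sketch of the apply plus reset_index variant, again assuming the question's sample data (rebuilt here so the snippet is self-contained; result is an illustrative name):

```python
import pandas as pd

# Sample data from the question, rebuilt directly (assumption: same rows and dates).
df = pd.DataFrame({
    "name": ["name1"] * 4 + ["name2"] * 4,
    "text": ["hej", "du", "aj", "oj", "fin", "katt", "mycket", "lite"],
    "date": pd.to_datetime([
        "2014-11-01", "2014-11-02", "2014-12-01", "2014-12-02",
        "2014-11-01", "2014-11-02", "2014-12-01", "2014-12-01",
    ]),
})
df["month"] = df["date"].dt.month

# apply reduces each (name, month) group to a single joined string;
# reset_index turns the group keys back into ordinary columns.
result = (
    df.groupby(["name", "month"])["text"]
      .apply(lambda x: ",".join(x))
      .reset_index()
)
print(result)
```

Compared with the transform approach, this yields one row per group directly, so no drop_duplicates step is needed and the original df is left unmodified.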
Update: the lambda is unnecessary here.
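Since str.join already accepts an iterable of strings, the bound method ",".join can be passed to apply directly. A short sketch under the same assumptions as before (the question's sample data rebuilt inline, result as an illustrative name):

```python
import pandas as pd

# Sample data from the question, rebuilt directly (assumption: same rows and dates).
df = pd.DataFrame({
    "name": ["name1"] * 4 + ["name2"] * 4,
    "text": ["hej", "du", "aj", "oj", "fin", "katt", "mycket", "lite"],
    "date": pd.to_datetime([
        "2014-11-01", "2014-11-02", "2014-12-01", "2014-12-02",
        "2014-11-01", "2014-11-02", "2014-12-01", "2014-12-01",
    ]),
})
df["month"] = df["date"].dt.month

# ",".join is itself a callable taking an iterable of strings,
# so it can replace the lambda wrapper.
result = df.groupby(["name", "month"])["text"].apply(",".join).reset_index()
print(result)
```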