ValueError:必须仅使用布尔值传递 DataFrame

新手上路,请多包涵

问题

在此数据文件中,美国使用“REGION”列分为四个区域。

创建一个查询,查找属于区域 1 或 2、名称以“华盛顿”开头且 POPESTIMATE2015 大于其 POPESTIMATE 2014 的县。

_此函数应返回一个 5x2 DataFrame,其列 = [‘STNAME’, ‘CTYNAME’] 和与 censusdf 相同的索引 ID(按索引升序排列)。

代码

    def answer_eight():
    counties=census_df[census_df['SUMLEV']==50]
    regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
    washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
    grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]
    return grew[grew['STNAME'],grew['COUNTY']]

outcome = answer_eight()
assert outcome.shape == (5,2)
assert list (outcome.columns)== ['STNAME','CTYNAME']
print(tabulate(outcome, headers=["index"]+list(outcome.columns),tablefmt="orgtbl"))

错误

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-77-546e58ae1c85> in <module>()
      6     return grew[grew['STNAME'],grew['COUNTY']]
      7
----> 8 outcome = answer_eight()
      9 assert outcome.shape == (5,2)
     10 assert list (outcome.columns)== ['STNAME','CTYNAME']

<ipython-input-77-546e58ae1c85> in answer_eight()
      1 def answer_eight():
      2     counties=census_df[census_df['SUMLEV']==50]
----> 3     regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
      4     washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
      5     grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]

/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in __getitem__(self, key)
   1991             return self._getitem_array(key)
   1992         elif isinstance(key, DataFrame):
-> 1993             return self._getitem_frame(key)
   1994         elif is_mi_columns:
   1995             return self._getitem_multilevel(key)

/opt/conda/lib/python3.5/site-packages/pandas/core/frame.py in _getitem_frame(self, key)
   2066     def _getitem_frame(self, key):
   2067         if key.values.size and not com.is_bool_dtype(key.values):
-> 2068             raise ValueError('Must pass DataFrame with boolean values only')
   2069         return self.where(key)
   2070

ValueError: Must pass DataFrame with boolean values only

我一无所知。我哪里错了?

谢谢

原文由 Umang Mistry 发布,翻译遵循 CC BY-SA 4.0 许可协议

阅读 609
2 个回答

您正在尝试使用不同形状的 df 来掩盖您的 df,这是错误的,此外,您传递条件的方式使用不当。当您将 df 中的列或系列与标量进行比较以生成布尔掩码时,您应该只传递条件,而不是连续使用它。

 def answer_eight():
    counties=census_df[census_df['SUMLEV']==50]
    # this is wrong you're passing the df here multiple times
    regions = counties[(counties[counties['REGION']==1]) | (counties[counties['REGION']==2])]
    # here you're doing it again
    washingtons = regions[regions[regions['COUNTY']].str.startswith("Washington")]
    # here you're doing here again also
    grew = washingtons[washingtons[washingtons['POPESTIMATE2015']]>washingtons[washingtons['POPESTIMATES2014']]]
    return grew[grew['STNAME'],grew['COUNTY']]

你要:

 def answer_eight():
    counties=census_df[census_df['SUMLEV']==50]
    regions = counties[(counties['REGION']==1]) | (counties['REGION']==2])]
    washingtons = regions[regions['COUNTY'].str.startswith("Washington")]
    grew = washingtons[washingtons['POPESTIMATE2015']>washingtons['POPESTIMATES2014']]
    return grew[['STNAME','COUNTY']]

原文由 EdChum 发布,翻译遵循 CC BY-SA 3.0 许可协议

def answer_eight():
    df=census_df[census_df['SUMLEV']==50]
    #df=census_df
    df=df[(df['REGION']==1) | (df['REGION']==2)]
    df=df[df['CTYNAME'].str.startswith('Washington')]
    df=df[df['POPESTIMATE2015'] > df['POPESTIMATE2014']]
    df=df[['STNAME','CTYNAME']]
    print(df.shape)
    return df.head(5)

原文由 gourav chatterjee 发布,翻译遵循 CC BY-SA 4.0 许可协议

撰写回答
你尚未登录,登录后可以
  • 和开发者交流问题的细节
  • 关注并接收问题和回答的更新提醒
  • 参与内容的编辑和改进,让解决方法与时俱进
推荐问题