
Pandas is an easy-to-use data processing library written in Python. It also provides convenient functions for reading and writing Excel files, and can read all the data in an Excel file with a single statement:

import pandas as pd

dataframe = pd.read_excel(io=file_path_name, header=1)

The read_excel method has many parameters. The header parameter indicates which row of the Excel file is the header row: Pandas uses the values in that row as the column names of the dataframe. Row numbers are counted from 0. For example, suppose the Excel file looks like this:

(empty)        (empty)
Name           Gender
Zhang Dazhu    Male
Wang Cuihua    Female

With header=1, the code above treats the second row as the header row. The resulting dataframe has two columns, "Name" and "Gender", and dataframe['Name'] returns the data in the first column.
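The effect of header=1 described above can be imitated in plain Python, without an Excel file. This is only a rough sketch of the row selection, not how Pandas implements it; the names rows, columns and records are made up for the illustration.

```python
# Rows as they appear in the sheet, counted from 0.
rows = [
    [None, None],               # row 0: empty cells above the header
    ["Name", "Gender"],         # row 1: the header row
    ["Zhang Dazhu", "Male"],    # row 2: data
    ["Wang Cuihua", "Female"],  # row 3: data
]

header = 1  # same meaning as header=1 in read_excel

# The header row supplies the column names ...
columns = rows[header]
# ... rows above it are ignored, and rows below it become the data.
records = [dict(zip(columns, row)) for row in rows[header + 1:]]

first_column = [record["Name"] for record in records]
print(columns)
print(first_column)
```

Here first_column plays the role of dataframe['Name'] in the example above.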

But sometimes we don't know which row is the header row, and the Excel file is quite large, with tens of thousands of rows for example, so a single call to read_excel may take tens of seconds. In that case, we need to determine the position of the header row first, so that the data only has to be read once.

So we first look for a distinguishing feature of the header row. For example, if we know that the header of the first column is always "Name" (姓名), we can loop over the cells in the first column of the sheet until "Name" is found, which tells us which row is the header row.
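The search itself can be sketched independently of any Excel library, as a function over the values of the first column. The function name find_header_row and its parameters max_rows and default are hypothetical, chosen only for this illustration.

```python
from typing import Iterable, Optional


def find_header_row(first_column: Iterable[object], marker: str,
                    max_rows: int = 10,
                    default: Optional[int] = None) -> Optional[int]:
    """Return the 0-based index of the first cell equal to `marker`.

    Only the first `max_rows` cells are inspected; `default` is
    returned when the marker is not found among them.
    """
    for index, value in enumerate(first_column):
        if index >= max_rows:
            break
        # Strip surrounding spaces so padding does not hide the header.
        if isinstance(value, str) and value.strip() == marker:
            return index
    return default


column = ["", "Name", "Zhang Dazhu", "Wang Cuihua"]
header_row = find_header_row(column, "Name", default=1)
print(header_row)  # prints 1
```

Limiting the scan to the first few rows keeps the check cheap even for very large files, and the default gives a fallback when the marker is missing.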

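It so happens that Pandas uses xlrd under the hood to read Excel files, so in an environment where read_excel already works this way, xlrd is available and can be imported directly. Note that xlrd 2.0 and later reads only .xls files, and recent Pandas versions use openpyxl for .xlsx files instead. With xlrd, the following lines of code can be added: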

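import pandas as pd
import xlrd

workbook = xlrd.open_workbook(file_path_name)  # open the given Excel file
sheet = workbook.sheets()[0]  # read the sheet to inspect (here, the first one)
header_row = 1  # by default, assume the second row is the header row
for i in range(min(10, sheet.nrows)):  # check at most the first 10 rows
    value = sheet.cell(i, 0).value  # row and column indices both start at 0
    if value == '姓名':  # '姓名' ("Name") is the header of the first column
        header_row = i
        break
workbook.release_resources()

dataframe = pd.read_excel(io=file_path_name, header=header_row)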

This way, the file is parsed correctly no matter which of the first 10 rows is the header row.


songofhawk