Reading the head and tail of a text file, and reading it backward, line by line


How can I implement something like the "head" and "tail" commands in Python, and read a text file backward line by line?

Originally posted by user739650; translation licensed under CC BY-SA 4.0

2 answers

Here is my personal file class ;-)

class File(file):
    """A helper class for file reading (Python 2 only: subclasses the built-in file type)."""

    def __init__(self, *args, **kwargs):
        super(File, self).__init__(*args, **kwargs)
        self.BLOCKSIZE = 4096

    def head(self, lines_2find=1):
        self.seek(0)                            #Rewind file
        return [super(File, self).next() for x in xrange(lines_2find)]

    def tail(self, lines_2find=1):
        self.seek(0, 2)                         #Go to end of file
        bytes_in_file = self.tell()
        lines_found, total_bytes_scanned = 0, 0
        while (lines_2find + 1 > lines_found and
               bytes_in_file > total_bytes_scanned):
            byte_block = min(
                self.BLOCKSIZE,
                bytes_in_file - total_bytes_scanned)
            self.seek( -(byte_block + total_bytes_scanned), 2)
            total_bytes_scanned += byte_block
            lines_found += self.read(self.BLOCKSIZE).count('\n')
        self.seek(-total_bytes_scanned, 2)
        line_list = list(self.readlines())
        return line_list[-lines_2find:]

    def backward(self):
        self.seek(0, 2)                         #Go to end of file
        blocksize = self.BLOCKSIZE
        last_row = ''
        while self.tell() != 0:
            try:
                self.seek(-blocksize, 1)
            except IOError:
                blocksize = self.tell()
                self.seek(-blocksize, 1)
            block = self.read(blocksize)
            self.seek(-blocksize, 1)
            rows = block.split('\n')
            rows[-1] = rows[-1] + last_row
            while rows:
                last_row = rows.pop(-1)
                if rows and last_row:
                    yield last_row
        yield last_row

Usage example:

with File('file.name') as f:
    print f.head(5)
    print f.tail(5)
    for row in f.backward():
        print row
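
The class above relies on the Python 2 built-in file type, xrange and the print statement, none of which exist in Python 3. The following is a minimal sketch of the same block-wise idea for Python 3, not the author's code. The names reverse_lines and tail_lines are hypothetical, and it assumes the file uses "\n" line endings and UTF-8 (or another ASCII-compatible encoding). The file is opened in binary mode because Python 3 text files do not allow seeking relative to the end.

```python
import os
from contextlib import closing
from itertools import islice

BLOCK_SIZE = 4096  # assumed block size, same value as in the class above


def reverse_lines(path, block_size=BLOCK_SIZE, encoding="utf-8"):
    """Yield the lines of a file from last to first, without newlines."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = pos = f.tell()
        carry = b""  # start of a line whose beginning lies in an earlier block
        while pos > 0:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            chunk = f.read(step)
            if pos + step == size and chunk.endswith(b"\n"):
                # The final newline only terminates the last line.
                chunk = chunk[:-1]
            parts = (chunk + carry).split(b"\n")
            carry = parts[0]  # may still be incomplete, keep it for later
            for line in reversed(parts[1:]):
                yield line.decode(encoding)
        if size > 0:
            yield carry.decode(encoding)  # first line of the file


def tail_lines(path, n=1, **kwargs):
    """Return the last n lines in file order, reading only the blocks needed."""
    # closing() makes sure the generator, and with it the file, is closed
    # even though we stop iterating early.
    with closing(reverse_lines(path, **kwargs)) as lines:
        return list(islice(lines, n))[::-1]
```

Unlike the backward method above, this sketch also yields empty lines, so blank lines in the file are preserved. Building tail_lines on top of the reverse generator keeps the block arithmetic in one place.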

Originally posted by fdb; translation licensed under CC BY-SA 3.0

head is easy:

from itertools import islice

n = 10  # example value: number of lines to print
with open("file") as f:
    for line in islice(f, n):
        print(line, end="")  # each line already ends with a newline

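tail is harder if you don't want to keep the whole file in memory. If the input is a file, you can start reading blocks from the end of the file. The original tail also works when the input is a pipe, so a more general solution is to read the entire input and discard everything except the last few lines. A simple way to do this is collections.deque: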

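from collections import deque

n = 10  # example value: number of lines to print
with open("file") as f:
    for line in deque(f, maxlen=n):  # keeps only the last n lines in memory
        print(line, end="")  # each line already ends with a newline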

In both snippets, n is the number of lines to print.
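
To illustrate the point about pipes, here is a small sketch that wraps the two snippets in functions accepting any iterable of lines, so the same code should work for an open file, sys.stdin, or an in-memory stream. The function names head and tail are chosen for illustration only, and io.StringIO stands in for a pipe in this example.

```python
import io
from collections import deque
from itertools import islice


def head(lines, n):
    """Return the first n items of any iterable of lines."""
    return list(islice(lines, n))


def tail(lines, n):
    """Return the last n items; the whole iterable is consumed, but only
    n items are held in memory at any time."""
    return list(deque(lines, maxlen=n))


# io.StringIO is a stand-in for a stream that cannot seek, such as a pipe.
sample = "one\ntwo\nthree\nfour\n"
first_two = head(io.StringIO(sample), 2)
last_two = tail(io.StringIO(sample), 2)
```

For input coming from a pipe, passing sys.stdin as the lines argument should behave the same way, since neither function ever seeks.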

Originally posted by Sven Marnach; translation licensed under CC BY-SA 3.0
