Strange: why does my crawler fail to scrape the 100th movie, 猫鼠游戏 (Catch Me If You Can)?

My current code:

import requests
import re

path = 'F:豆瓣Top250.txt'
# Fetch the page
def getHTMLText(url):
    try:
        r = requests.get(url,timeout=30)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except:
        return ""

# Parse the page and extract the fields we need
def parseHTML(info,html):
    try:
        tlt = re.findall(r'\"title\"\>[\w\u4e00-\u9fa5].*[\w \u4e00-\u9fa5]*',html)
        dirlt = re.findall(r'导演\:\s[\w \u4e00-\u9fa5]+[\·\/\s]*[\w \u4e00-\u9fa5]*',html)
        yearlt = re.findall(r'[\d]{4}\&nbsp',html)
        coult = re.findall(r'\&nbsp\;[\s\u4e00-\u9fa5]+\&nbsp',html)
        comlt = re.findall(r'inq\"\>.+\<',html)
        rlt = re.findall(r'\"v:average\"\>[0-9]\.[0-9]',html)
        
        for i in range(len(tlt)):      # number of movies found on this page
            title = re.split('>|<',tlt[i])[1]   # split on > and <
            direct = dirlt[i].split(': ')[1]
            year = yearlt[i].split('&')[0]
            country = re.split(';|&',coult[i])[2]
            comment = re.split('>|<',comlt[i])[1]
            rank = rlt[i].split('>')[1]   # split on >
            info.append([title,year,direct,country,comment,rank])
    except:
        print("")

def printInfo(info):
    tplt = "{:\u3000>7}:{:<7}"
    count = 0
    for g in info:
        with open(path,'a',encoding='utf-8') as f:
            count = count + 1
            print(tplt.format('序号',count,chr(12288)))
            print(tplt.format('电影名称',g[0],chr(12288)))
            print(tplt.format('年份',g[1],chr(12288)))
            print(tplt.format('导演',g[2],chr(12288)))
            print(tplt.format('国家',g[3],chr(12288)))
            print(tplt.format('简短点评',g[4],chr(12288)))
            print(tplt.format('豆瓣评分',g[5],chr(12288)))
            print("-------------------------------------")
            f.write(tplt.format('序号',count,chr(12288))+'\n')
            f.write(tplt.format('电影名称',g[0],chr(12288))+'\n')
            f.write(tplt.format('年份',g[1],chr(12288))+'\n')
            f.write(tplt.format('导演',g[2],chr(12288))+'\n')
            f.write(tplt.format('国家',g[3],chr(12288))+'\n')
            f.write(tplt.format('简短点评',g[4],chr(12288))+'\n')
            f.write(tplt.format('豆瓣评分',g[5],chr(12288))+'\n')
            f.write("-------------------------------------"+'\n')
                
def main():
    start_url = "https://movie.douban.com/top250?start="
    depth = 10   # 10 pages in total
    infomation = [] # collects the extracted records
    
    for i in range(depth):
        try:
            url = start_url+str(25*i)
            html = getHTMLText(url)
            parseHTML(infomation,html)
        except:
            print("")
    printInfo(infomation)
main()
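
A note on the code above: `parseHTML` builds six lists independently and then indexes all of them with the same `i`, inside a bare `except` that only prints an empty line. If any pattern matches fewer times on a page than the title pattern does, the loop would stop early with no visible error. Whether that is what happens at movie 100 is not confirmed here; the sketch below is only a way to check it. It reuses the same patterns (with the redundant backslashes dropped), `count_matches` is a hypothetical helper name, and the sample HTML is made up for illustration.

```python
import re

# Same patterns as in parseHTML, keyed by field name.
PATTERNS = {
    "title": r'"title">[\w\u4e00-\u9fa5].*[\w \u4e00-\u9fa5]*',
    "director": r'导演:\s[\w \u4e00-\u9fa5]+[·/\s]*[\w \u4e00-\u9fa5]*',
    "year": r'\d{4}&nbsp',
    "country": r'&nbsp;[\s\u4e00-\u9fa5]+&nbsp',
    "comment": r'inq">.+<',
    "rating": r'"v:average">[0-9]\.[0-9]',
}

def count_matches(html):
    """Hypothetical helper: how many times each pattern matches in one page."""
    return {name: len(re.findall(p, html)) for name, p in PATTERNS.items()}

# Made-up sample with two entries; the second one has no inq span.
SAMPLE = '''
<span class="title">电影甲</span>
导演: 张三 Zhang San&nbsp;&nbsp;&nbsp;主演: 李四<br>
2001&nbsp;/&nbsp;美国&nbsp;/&nbsp;剧情
<span class="rating_num" property="v:average">9.0</span>
<span class="inq">一句话点评</span>
<span class="title">电影乙</span>
导演: 王五 Wang Wu&nbsp;&nbsp;&nbsp;主演: 赵六<br>
2002&nbsp;/&nbsp;美国 加拿大&nbsp;/&nbsp;犯罪
<span class="rating_num" property="v:average">8.9</span>
'''

counts = count_matches(SAMPLE)
# Fields that matched fewer times than the title pattern did.
short = [name for name, n in counts.items() if n < counts["title"]]
print(counts)
print("fields with fewer matches than titles:", short)
```

Running `count_matches` on the HTML of each of the ten pages would show which page, and which field, comes up short.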

The source of the Douban page that contains this entry looks like this:

  <li>
            <div class="item">
                <div class="pic">
                    <em class="">100</em>
                    <a href="https://movie.douban.com/subject/1305487/">
                        <img width="100" alt="猫鼠游戏" src="https://img3.doubanio.com/view/photo/s_ratio_poster/public/p453924541.webp" class="">
                    </a>
                </div>
                <div class="info">
                    <div class="hd">
                        <a href="https://movie.douban.com/subject/1305487/" class="">
                            <span class="title">猫鼠游戏</span>
                                    <span class="title">&nbsp;/&nbsp;Catch Me If You Can</span>
                                <span class="other">&nbsp;/&nbsp;逍遥法外  /  神鬼交锋(台)</span>
                        </a>


                            <span class="playable">[可播放]</span>
                    </div>
                    <div class="bd">
                        <p class="">
                            导演: 史蒂文·斯皮尔伯格 Steven Spielberg&nbsp;&nbsp;&nbsp;主演: 莱昂纳多·迪卡普里奥 L...<br>
                            2002&nbsp;/&nbsp;美国 加拿大&nbsp;/&nbsp;传记 犯罪 剧情
                        </p>

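In a regex testing tool my pattern does match 猫鼠游戏, so why does it fail here? Any help would be appreciated. I keep running into baffling problems like this while learning to write crawlers o(╯□╰)o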
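
One possible way to make the parsing less fragile, offered as a sketch and not as a confirmed fix: split the page into one chunk per movie first, then search for each field inside its own chunk. A movie that lacks a field then gets an empty string instead of shortening one list and breaking the shared index. The split marker `<div class="item">` comes from the page source quoted above; `parse_items` and `FIELD_PATTERNS` are hypothetical names, and the patterns are simplified assumptions based on that same snippet and on the `inq` and `v:average` markers used in my code.

```python
import re

ITEM_SPLIT = '<div class="item">'  # one such div per movie in the quoted source

# One capture group per field; these patterns are assumptions, not Douban's spec.
FIELD_PATTERNS = {
    "title": r'<span class="title">([^<&]+)</span>',
    "director": r'导演:\s*(.+?)(?:&nbsp;|<br>|\n)',
    "year": r'(\d{4})&nbsp;/',
    "country": r'\d{4}&nbsp;/&nbsp;(.+?)&nbsp;/',
    "comment": r'inq">(.+?)<',
    "rating": r'"v:average">([0-9]\.[0-9])',
}

def parse_items(html):
    """Hypothetical helper: return one dict per movie, '' for missing fields."""
    movies = []
    for chunk in html.split(ITEM_SPLIT)[1:]:  # text before the first item is skipped
        movie = {}
        for name, pattern in FIELD_PATTERNS.items():
            m = re.search(pattern, chunk)
            movie[name] = m.group(1).strip() if m else ""
        movies.append(movie)
    return movies

# Abbreviated copy of the entry quoted above (it has no inq or rating span).
SNIPPET = '''
<li>
  <div class="item">
    <span class="title">猫鼠游戏</span>
    <span class="title">&nbsp;/&nbsp;Catch Me If You Can</span>
    <p class="">
      导演: 史蒂文·斯皮尔伯格 Steven Spielberg&nbsp;&nbsp;&nbsp;主演: 莱昂纳多·迪卡普里奥 L...<br>
      2002&nbsp;/&nbsp;美国 加拿大&nbsp;/&nbsp;传记 犯罪 剧情
    </p>
  </div>
</li>
'''

movies = parse_items(SNIPPET)
print(movies[0])
```

With this layout every movie still produces a record, and an empty field points directly at the pattern that failed for that entry.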
