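Newbie here, please bear with me.

I'm trying to learn Python and to write code that downloads all the Bible mp3 files from my church's website, which has a list of mp3 hyperlinks such as:

Chapter 1, Chapter 2, 3, 4, 5 and so on… Reference link

After running my code I managed to get all the mp3 URL links printed in the shell, but I can't seem to download them at all.

Here is my code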

import requests
import urllib.request
import re
from bs4 import BeautifulSoup

r = requests.get('https://ghalliance.org/resource/bible-reading')
soup = BeautifulSoup(r.content, 'html.parser')

for a in soup.find_all('a', href=re.compile(r'http.*\.mp3')):
    print(a['href'])

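I did try using wget, but I can't seem to get wget working on my machine, which runs VSCode with Python 3.8.1 64-bit or conda 3.7.4. I checked both the conda cmd and cmd, and they show that wget is on my system. I even manually downloaded wget.exe into my system32 directory, but whenever I try to run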

wget.download(url)

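I always get an error message, something like wget has no attribute 'download' or the like.

I've read a few beginner tutorials on downloading simple images and so on with selenium, wget, and beautifulsoup, but I can't seem to combine their approaches to solve this particular problem of mine. I'm still too new to programming in general, so I apologize for asking such silly questions.

But now that I have all the MP3 URL links, my question is: how do I download them using Python?

Originally posted by iGamers; translation licensed under CC BY-SA 4.0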

python python-3.x python-requests download mp3


2 Answers


Community wiki

Posted on
2022-11-17

✓ Accepted

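Please note:

To download multiple files from the same host, you should use requests.Session() to keep the TCP connection alive instead of repeatedly opening and closing a socket.
You should use stream=True so that large files are fetched in chunks rather than loaded into memory all at once.
Before writing the content, you should check the response's .status_code.
Did you also notice that 2 file names are missing the dot? They are Chiv Keeb 22mp3 and Cov Thawjtswj 01mp3, where the extension should be .mp3.

Below is code that achieves your goal.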

import re

import requests
from bs4 import BeautifulSoup

with requests.Session() as req:
    # Reuse one session (and its connection) for the page and every file.
    r = req.get("https://ghalliance.org/resource/bible-reading/")
    soup = BeautifulSoup(r.text, 'html.parser')
    for item in soup.select("#playlist"):
        # href=True skips anchors that have no href attribute.
        for link in item.find_all("a", href=True):
            href = link["href"]
            # The file name is everything after the last slash.
            name = re.search(r"([^\/]+$)", href).group()
            # Two links end in "mp3" without the dot; restore the extension.
            if name.endswith("mp3") and not name.endswith(".mp3"):
                name = name[:-3] + '.mp3'
            print(f"Downloading File {name}")
            download = req.get(href)
            if download.status_code == 200:
                with open(name, 'wb') as f:
                    f.write(download.content)
            else:
                print(f"Download Failed For File {name}")
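The notes above recommend stream=True, but the code reads each file fully into memory through .content. A minimal sketch of a streamed variant follows. The helper name download_file and the 8192-byte chunk size are illustrative choices, not part of the original answer, and session is assumed to behave like requests.Session.

```python
def download_file(session, url, dest_path, chunk_size=8192):
    """Stream `url` to `dest_path`; return True on HTTP 200, else False.

    `session` is assumed to behave like requests.Session (hypothetical
    helper, not part of the original answer).
    """
    # stream=True defers the body download until iter_content() is consumed.
    with session.get(url, stream=True) as response:
        if response.status_code != 200:
            return False
        with open(dest_path, "wb") as f:
            # Write the body piece by piece instead of holding it all in memory.
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
    return True


# Example usage (needs the third-party `requests` package and network access):
#   import requests
#   with requests.Session() as session:
#       ok = download_file(session, href, name)
```

Inside the loop above, a call to download_file(req, href, name) could stand in for the req.get / status check / write sequence.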

Originally posted by αԋɱҽԃ αмєяιcαη; translation licensed under CC BY-SA 4.0

Community wiki

Posted on
2022-11-17

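Since you are already using the requests library, you can also use requests to download the mp3 (or any file).

For example, if you want to download the file at the URL https://test.ghalliance.org/resources//bible_reading/audio/Chiv Keeb 01.mp3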

doc = requests.get('https://test.ghalliance.org/resources//bible_reading/audio/Chiv%20Keeb%2001.mp3')
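The URL passed to requests.get is the percent-encoded form of the one with spaces. As a small illustration, the standard library can produce that form. requests normally applies this kind of encoding itself, so treat this as an explanation of where %20 comes from rather than a required step.

```python
from urllib.parse import quote

raw_url = "https://test.ghalliance.org/resources//bible_reading/audio/Chiv Keeb 01.mp3"

# Keep ':' and '/' as they are; spaces become %20.
encoded_url = quote(raw_url, safe=":/")
print(encoded_url)
```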

If the download succeeds, the mp3 content is stored in doc.content. You then need to open a file and write the data to it.

with open('myfile.mp3', 'wb') as f:
    f.write(doc.content)

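At this point you have the mp3 saved under the file name "myfile.mp3", but you probably want to save it under the same name that appears in the URL.

Let's extract the file name from the URL.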

filename = a['href'][a['href'].rfind("/")+1:]
with open(filename, 'wb') as f:
    f.write(doc.content)

Now let's put it all together.

import re

import requests
from bs4 import BeautifulSoup

r = requests.get('https://ghalliance.org/resource/bible-reading')
soup = BeautifulSoup(r.content, 'html.parser')

# Every <a> whose href looks like an http(s) link to an .mp3 file.
for a in soup.find_all('a', href=re.compile(r'http.*\.mp3')):
    # File name is the part of the URL after the last slash.
    filename = a['href'][a['href'].rfind("/")+1:]
    doc = requests.get(a['href'])
    with open(filename, 'wb') as f:
        f.write(doc.content)

Originally posted by Audy; translation licensed under CC BY-SA 4.0
