How can I scrape icons from iconfont?

Posted on March 15, Shandong

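The search endpoint of the iconfont site is https://www.iconfont.cn/api/icon/search.json. I send data to it the way the endpoint expects, with these request parameters:

but the response is always: Redirecting to /.

What do I need to do to get the search results?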

iconfont, network requests, web crawler

AI Bot (BETA)

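A redirect like this usually means the request is missing required authentication data or has triggered an anti-scraping check. Try the following steps:

Add complete request headers (the key fields):

headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
 "Referer": "https://www.iconfont.cn/",
 "Cookie": "your login cookie (required)"
}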

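Parameters must be URL-encoded. When a dict is passed as params, requests percent-encodes it automatically, so no manual .encode() is needed:

import time

params = {
 "q": "search term",  # non-ASCII keywords are percent-encoded by requests
 "page": "1",
 "pageSize": "100",
 "t": str(int(time.time() * 1000))  # millisecond timestamp
}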
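To make the encoding step concrete, here is a minimal sketch that builds the query string with the standard library only. The helper name `build_query` is hypothetical, and the parameter names (`q`, `page`, `pageSize`, `t`) are simply the ones used in this answer, not a confirmed description of the iconfont API.

```python
import time
from urllib.parse import urlencode


def build_query(keyword, page=1, page_size=100):
    """Return a percent-encoded query string for the search parameters."""
    params = {
        "q": keyword,                    # urlencode percent-encodes non-ASCII text as UTF-8
        "page": page,
        "pageSize": page_size,
        "t": int(time.time() * 1000),    # millisecond timestamp
    }
    return urlencode(params)


query = build_query("首页")
print(query)
```

The printed string shows the keyword as `%E9%A6%96%E9%A1%B5`, which is the same form requests produces when the dict is passed through `params=`.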

Complete request example:

import requests

url = "https://www.iconfont.cn/api/icon/search.json"
params = {
 "q": "首页",  # example keyword ("home page")
 "page": 1,
 "pageSize": 100,
 "t": 1689234567890  # millisecond timestamp; generate a fresh one per request
}

headers = {
 "User-Agent": "Mozilla/5.0...",
 "Referer": "https://www.iconfont.cn/",
 "Cookie": "your cookie after logging in"
}

response = requests.get(url, params=params, headers=headers, timeout=10)
print(response.json())
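Calling `response.json()` raises an error when the server answers with the "Redirecting to /." text instead of JSON, so it can help to classify the response first. The sketch below is one possible check; the function name `is_rejected` is hypothetical, and it assumes a successful reply carries a JSON `Content-Type`, which the thread does not confirm. It accepts any object exposing `status_code`, `headers`, and `text`, which is the shape of a requests response.

```python
from types import SimpleNamespace

REDIRECT_CODES = {301, 302, 303, 307, 308}


def is_rejected(resp):
    """Return True if resp looks like the redirect bounce rather than JSON data."""
    if resp.status_code in REDIRECT_CODES:
        return True
    # Assumption: a successful search reply is served with a JSON content type.
    content_type = resp.headers.get("Content-Type", "")
    if "json" not in content_type.lower():
        return True
    return resp.text.lstrip().startswith("Redirecting to")


# Stand-in objects so the sketch runs without network access.
bounced = SimpleNamespace(
    status_code=302,
    headers={"Location": "/", "Content-Type": "text/plain"},
    text="Redirecting to /.",
)
accepted = SimpleNamespace(
    status_code=200,
    headers={"Content-Type": "application/json"},
    text='{"code": 200}',
)
print(is_rejected(bounced), is_rejected(accepted))
```

With requests, passing `allow_redirects=False` to `requests.get` keeps the original 3xx response visible instead of silently following it to the home page.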

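Key points:
A valid cookie is required (log in through the browser first to obtain it)
The timestamp parameter is required
Non-ASCII search terms must be URL-encoded (requests does this when the keyword is passed through params)
Use a session to keep the login state
Mind the rate limit (a delay of 2-5 seconds between requests is recommended)
How to get the login cookie:
After logging in through the browser, press F12 to open the developer tools -> Application -> Cookies -> copy _uab_collina and the fields related to the login credentials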
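The cookie and delay advice can be sketched as two small helpers. Both function names are hypothetical, and `example_session` is a placeholder cookie name, since the thread only names `_uab_collina` explicitly.

```python
import random


def parse_cookie_header(raw):
    """Split a 'name=value; name2=value2' string copied from the browser into a dict."""
    cookies = {}
    for part in raw.split(";"):
        if "=" not in part:
            continue
        # Split on the first '=' only, because cookie values may contain '='.
        name, value = part.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


def polite_delay(low=2.0, high=5.0):
    """Return a random pause length in seconds within the suggested 2-5 s range."""
    return random.uniform(low, high)


# Placeholder cookie string; substitute the values copied from the developer tools.
raw = "_uab_collina=1234567890; example_session=abc=def"
cookies = parse_cookie_header(raw)
delay = polite_delay()
print(cookies)
```

With requests, the resulting dict could be loaded into a session with `session.cookies.update(cookies)`, and `time.sleep(polite_delay())` placed between consecutive calls.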

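If it still fails, one of the following protections may have been triggered:

CAPTCHA verification (appears after frequent access)
Request signature verification (requires analyzing the front-end encryption logic)
IP restriction (requires a pool of proxy IPs)

Start by performing the search normally in the browser, then capture the actual request parameters with the developer tools and compare them with your own.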
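The comparison can be done by hand, or with a small helper such as the sketch below. The function name `diff_params`, the captured URL, and the `example_token` parameter are all made up for illustration; the real request may carry different fields or send them in a POST body.

```python
from urllib.parse import parse_qsl, urlsplit


def diff_params(captured_url, own_params):
    """Compare the query of a URL copied from the browser with your own parameters."""
    captured = dict(parse_qsl(urlsplit(captured_url).query, keep_blank_values=True))
    own = {key: str(value) for key, value in own_params.items()}
    return {
        "missing": sorted(set(captured) - set(own)),   # sent by the browser, absent from yours
        "extra": sorted(set(own) - set(captured)),     # sent by you, absent from the browser
        "changed": sorted(k for k in set(captured) & set(own) if captured[k] != own[k]),
    }


# Hypothetical URL standing in for one copied from the Network tab.
captured_url = (
    "https://www.iconfont.cn/api/icon/search.json"
    "?q=%E9%A6%96%E9%A1%B5&page=1&pageSize=54&example_token=abc"
)
own = {"q": "首页", "page": 1, "pageSize": 100, "t": 1689234567890}
report = diff_params(captured_url, own)
print(report)
```

Any name listed under `missing` is a candidate for the field the server checks before it decides to redirect.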

1 Answer

Devlive Open Source Community

Posted on March 17, Beijing

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import os
import re
import time

# Create the folder the icons are saved to
os.makedirs('iconfont_icons', exist_ok=True)

# Configure Chrome options
chrome_options = Options()
# For headless mode (no browser window), uncomment the next line
# chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--window-size=1920,1080")
chrome_options.add_argument("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36")

# Initialize the WebDriver (make sure Chrome WebDriver is installed and on PATH)
driver = webdriver.Chrome(options=chrome_options)

try:
    # Open the collection page
    driver.get('https://www.iconfont.cn/collections/detail?spm=a313x.7781069.1998910419.d9df05512&cid=xxxx')

    # Wait for the page to load (up to 20 seconds)
    wait = WebDriverWait(driver, 20)
    wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'li[class*="J_icon_id_"]')))

    # Find all icon elements
    icon_elements = driver.find_elements(By.CSS_SELECTOR, 'li[class*="J_icon_id_"]')
    print(f"Found {len(icon_elements)} icons")

    # Iterate over the icons and save each one
    for index, icon in enumerate(icon_elements):
        try:
            # Get the SVG element
            svg_element = icon.find_element(By.CSS_SELECTOR, 'svg')

            # Get the icon name
            try:
                icon_name_element = icon.find_element(By.CSS_SELECTOR, '.icon-name')
                icon_name = icon_name_element.text.strip()
                if not icon_name:
                    # An empty name would produce ".svg"; use the fallback instead
                    raise ValueError("empty icon name")
                filename = f"{icon_name}.svg"
            except Exception:
                # Fall back to the ID embedded in the element's class
                icon_class = icon.get_attribute('class')
                match = re.search(r'J_icon_id_(\d+)', icon_class)
                if match:
                    icon_id = match.group(1)
                    filename = f"icon_{icon_id}.svg"
                else:
                    filename = f"icon_{index}.svg"

            # Strip characters that are illegal in file names
            filename = re.sub(r'[\\/*?:"<>|]', "", filename)

            # Get the SVG markup
            svg_html = svg_element.get_attribute('outerHTML')

            # Save the SVG
            with open(os.path.join('iconfont_icons', filename), 'w', encoding='utf-8') as f:
                f.write(svg_html)

            print(f"Saved: {filename}")

            # Pause between icons
            time.sleep(0.5)
        except Exception as e:
            print(f"Error while processing icon {index}: {e}")

    print('Icon scraping finished!')

except Exception as e:
    print(f"Error during scraping: {e}")

finally:
    # Close the browser
    driver.quit()

Replace xxxx with the cid of the collection you want to scrape.
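One thing the script above does not handle is two icons sharing the same name, in which case the later file overwrites the earlier one. A possible way to avoid that is sketched below; the helper `safe_filename` is hypothetical and is not part of the original answer.

```python
import re


def safe_filename(name, used, fallback):
    """Return a unique .svg file name, stripping characters that are illegal in paths."""
    cleaned = re.sub(r'[\\/*?:"<>|]', "", name).strip() or fallback
    candidate = cleaned
    counter = 1
    # Compare case-insensitively, since some file systems ignore case.
    while candidate.lower() in used:
        counter += 1
        candidate = f"{cleaned}_{counter}"
    used.add(candidate.lower())
    return f"{candidate}.svg"


used = set()
raw_names = ["home", "home", "a/b?", ""]
names = [safe_filename(n, used, f"icon_{i}") for i, n in enumerate(raw_names)]
print(names)
```

In the scraper, the `used` set would be created once before the loop and the helper called in place of the two lines that build and clean `filename`.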
