Web scraping: how do I scrape a span that changes in real time?


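Take this foreign-exchange quote as an example: the rate sits in a <span> that changes in real time. Can it be scraped? bs4 does not seem to work.
https://www.easymarkets.com/e...
The market is closed over the weekend, so the quote is not changing right now.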

2 answers

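After studying this page for quite a while, I noticed one thing:
the tag at //*[@id="em_div_1_0"]/div/div[2]/div[2]/div[1]/div/div[1]/div[7]/div[1]/div/button/div[2]/span does change, but that does not prevent scraping it.
Reason: the tag's class attribute does not change when its text changes, and the tag is unique within the page.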

The awkward part is that the fetched HTML contains no data in that tag:
<div class="ticket-open-deal-container-sell-current-rate"><span class="ticket-open-deal-label-sell-current-rate"></span></div>
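
To make the point concrete, here is a minimal sketch that parses the static snippet above and looks up the span by its class. It uses the standard-library html.parser instead of bs4 only to stay dependency-free; the helper names SpanTextExtractor and span_texts are hypothetical. Any parser that only sees the downloaded HTML would be expected to give the same empty result, because the text is filled in later by JavaScript.

```python
from html.parser import HTMLParser


class SpanTextExtractor(HTMLParser):
    """Collect the text inside every <span> that carries a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # > 0 while we are inside a matching span
        self.texts = []     # one entry per matching span

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self.depth:
            # Nested span inside a match: track depth so the end tags pair up.
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes:
            self.depth = 1
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.texts[-1] += data


def span_texts(html, target_class):
    """Return the text of each span with `target_class` found in `html`."""
    parser = SpanTextExtractor(target_class)
    parser.feed(html)
    parser.close()
    return parser.texts


# The snippet exactly as it appears in the downloaded page source.
STATIC_HTML = (
    '<div class="ticket-open-deal-container-sell-current-rate">'
    '<span class="ticket-open-deal-label-sell-current-rate"></span></div>'
)

if __name__ == "__main__":
    # The span is found, but its text is empty: the rate is not in the HTML.
    print(span_texts(STATIC_HTML, "ticket-open-deal-label-sell-current-rate"))
```

The span is located by its class without trouble; it simply has no text until the page's scripts run.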

At this point I realized that the page loads its data asynchronously via AJAX. Open the page in Chrome, then go to Developer Tools, Network -> XHR. One request stands out:
https://chn.easymarkets.com/api/jsapiservice.svc/?rq=%5B%7B%22action%22%3A%22GetMarketInfoRates%22%2C%22args%22%3A%7B%22fetchOnlyRequestedCps%22%3Atrue%2C%22symbolsSet%22%3A%5B%22EURUSD%22%2C%22GBPUSD%22%2C%22AUDUSD%22%2C%22USDJPY%22%2C%22OILUSD%22%2C%22BRTUSD%22%2C%22WHTUSD%22%2C%22NGSUSD%22%2C%22DAXEUR%22%2C%22CNXUSD%22%2C%22NDQUSD%22%2C%22HSXHKD%22%2C%22XAUUSD%22%2C%22XAGUSD%22%2C%22CPRUSD%22%2C%22XPTUSD%22%2C%22EURJPY%22%5D%7D%7D%5D&sid=3675_d3c94ed0-2500-4e73-b08c-193eac587be1&timestamp=1478434567412&appid=CA7D0F97-F865-4D89-9983-409E5EE5DDF3
Its response contains the rate data, and the timestamp field is the current timestamp (in milliseconds).
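
The rq query parameter is hard to read because it is percent-encoded JSON. A small sketch for decoding it with the standard library is shown below; decode_rq is a hypothetical helper name, and SAMPLE_URL is a shortened form of the request (single symbol, sid and appid left out for brevity).

```python
import json
from urllib.parse import parse_qs, urlsplit


def decode_rq(url):
    """Return the JSON payload carried in the `rq` query parameter."""
    query = parse_qs(urlsplit(url).query)  # parse_qs undoes the percent-encoding
    return json.loads(query["rq"][0])


# Shortened example: one symbol only, without the sid and appid parameters.
SAMPLE_URL = (
    "https://chn.easymarkets.com/api/jsapiservice.svc/"
    "?rq=%5B%7B%22action%22%3A%22GetMarketInfoRates%22%2C%22args%22%3A%7B"
    "%22fetchOnlyRequestedCps%22%3Atrue%2C%22symbolsSet%22%3A%5B%22EURUSD%22"
    "%5D%7D%7D%5D&timestamp=1478256948955"
)

if __name__ == "__main__":
    payload = decode_rq(SAMPLE_URL)
    print(payload[0]["action"])               # the API action being called
    print(payload[0]["args"]["symbolsSet"])   # the instruments requested
```

Decoded, the parameter is a one-element JSON array whose object names the action GetMarketInfoRates and lists the requested instruments under symbolsSet, which suggests that editing this list is how a different set of symbols would be requested.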

You can test this yourself, or request the endpoint that backs the chart instead.

Scrape the API endpoint directly.
Try this command:

curl 'https://chn.easymarkets.com/api/jsapiservice.svc/?rq=%5B%7B%22action%22%3A%22GetMarketInfoRates%22%2C%22args%22%3A%7B%22fetchOnlyRequestedCps%22%3Atrue%2C%22symbolsSet%22%3A%5B%22EURUSD%22%5D%7D%7D%5D&sid=3675_c3343fd9-6d69-4ace-bff1-1a415e32395d&timestamp=1478256948955&appid=0A121E99-D3A7-4E0C-8159-383BC90443B1' -H 'Pragma: no-cache' -H 'Accept-Encoding: gzip, deflate, sdch, br' -H 'Accept-Language: zh-CN,zh;q=0.8' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36' -H 'Accept: application/json, text/javascript, */*; q=0.01' -H 'Referer: https://chn.easymarkets.com/int/trade/forex/eur-usd/?internal_src=preview_platform' -H 'X-Requested-With: XMLHttpRequest'
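
For use inside a scraper, the curl command can be approximated in Python with the standard library. This is only a sketch under several assumptions: build_request and fetch_rates are hypothetical helper names; the sid and appid values look session-specific, so they are passed in by the caller instead of being hard-coded; the timestamp is assumed to be the current time in milliseconds; and nothing is assumed about the response beyond it being text. The Accept-Encoding header is deliberately left out because urllib does not decompress gzip bodies automatically.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://chn.easymarkets.com/api/jsapiservice.svc/"


def build_request(symbols, sid, appid, now_ms=None):
    """Build (but do not send) a GET request mirroring the curl command."""
    payload = [{
        "action": "GetMarketInfoRates",
        "args": {"fetchOnlyRequestedCps": True, "symbolsSet": list(symbols)},
    }]
    params = {
        # Compact separators reproduce the encoding seen in the captured URL.
        "rq": json.dumps(payload, separators=(",", ":")),
        "sid": sid,
        # Assumption: the server expects the current time in milliseconds.
        "timestamp": int(time.time() * 1000) if now_ms is None else now_ms,
        "appid": appid,
    }
    headers = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "X-Requested-With": "XMLHttpRequest",
        "Referer": "https://chn.easymarkets.com/int/trade/forex/eur-usd/"
                   "?internal_src=preview_platform",
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36",
    }
    return Request(API_URL + "?" + urlencode(params), headers=headers)


def fetch_rates(request, timeout=10):
    """Send the request and return the raw response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage (needs network access and valid session values copied from the browser):
#   req = build_request(["EURUSD"], sid="<your sid>", appid="<your appid>")
#   print(fetch_rates(req))
```

To follow the rate over time, calling fetch_rates in a loop with a pause between requests would be the simplest approach; whether old sid values keep working is something to verify against the live site.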