Reddit blocks the Internet Archive to end sneaky AI scraping

  • Reddit blocking IA: Reddit is now preventing the Internet Archive (IA) from indexing popular Reddit threads. It caught AI firms that were barred from scraping Reddit directly and were instead pulling Reddit data from IA's archived content.
  • Changes to archive: In the future, only screenshots of the Reddit homepage will be archived. The archive will therefore show only each day's popular posts and news headlines, and will no longer provide backups of deleted posts or insights into subcultures or user activity.
  • Unconfirmed AI firms: Reddit has not identified which AI firms were scraping its data from the Wayback Machine. Its spokesperson only confirmed that Reddit is aware of AI companies violating platform policies and scraping data.
  • Steps for IA: Reddit spokesperson Rathschmidt suggested that IA could take steps to better defend against AI scraping of archived Reddit content, which might lead Reddit to lift the restrictions.
  • Privacy concerns: Reddit is also addressing privacy concerns as the Wayback Machine archives deleted content. It is limiting IA's access to protect redditors until IA can defend its site and comply with policies.
  • Social media comments: Some Redditors have used the Wayback Machine to research deleted comments or threads, but there are other tools available. Redditors also turn to IA during platform changes and content removals.
  • IA's response: IA has not signaled any fixes that would lead Reddit to lift its restrictions and did not respond to Ars' request for comment on how the change affects the archive's utility.
  • Financial motivation: Reddit is likely financially motivated to stop AI firms from using Wayback Machine archives and may hope to land more lucrative licensing deals like those with OpenAI and Google. The Google deal was reportedly worth $60 million, and Reddit expects to make more than $200 million from such deals over three years.
  • IA's stance: The director of the Wayback Machine said IA has a longstanding relationship with Reddit and is having ongoing discussions about the matter.