How do I remove HTML tags from a string so that I can output clean text?
let str = string.stringByReplacingOccurrencesOfString("<[^>]+>", withString: "", options: .RegularExpressionSearch, range: nil)
print(str)
Original post by arled; translation licensed under CC BY-SA 4.0
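Because HTML is not a regular language (it is a context-free language), a regular expression cannot parse it reliably. See: Using regular expressions to parse HTML: why not?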
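To make the problem concrete, here is a small sketch that wraps the pattern from the question in a helper (the name `stripTagsWithRegex` is made up for this illustration) and runs it on two inputs. It uses the Swift 3+ spelling of the same Foundation call. The first input is stripped as expected; the second has a `>` inside an attribute value, so the match ends too early and part of the tag leaks into the output. HTML entities such as `&amp;` are also left undecoded.

```swift
import Foundation

// Hypothetical helper: the same pattern as in the question, Swift 3+ syntax.
func stripTagsWithRegex(_ html: String) -> String {
    return html.replacingOccurrences(of: "<[^>]+>",
                                     with: "",
                                     options: .regularExpression)
}

// Simple markup: the regex does what you would hope.
let simple = "<p>Hello <b>world</b></p>"
print(stripTagsWithRegex(simple))   // Hello world

// A ">" inside an attribute value ends the match too early,
// and the entity "&amp;" is not decoded.
let tricky = "<a title=\"a > b\">link</a> &amp; more"
print(stripTagsWithRegex(tricky))   //  b">link &amp; more
```

If you control the input and know it never contains such cases, the regex may be good enough; for arbitrary HTML it is not.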
I would consider using NSAttributedString instead.
let htmlString = "LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"
let htmlStringData = htmlString.dataUsingEncoding(NSUTF8StringEncoding)!
let options: [String: AnyObject] = [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding]
let attributedHTMLString = try! NSAttributedString(data: htmlStringData, options: options, documentAttributes: nil)
let string = attributedHTMLString.string
Or, as Irshad Mohamed did in the comments:
let attributed = try NSAttributedString(data: htmlString.data(using: .unicode)!, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType], documentAttributes: nil)
print(attributed.string)
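For newer toolchains, the same idea might look roughly like the sketch below. It assumes Swift 4 or later on an Apple platform (UIKit or AppKit is needed for the HTML importer), and the function name `plainText(fromHTML:)` is made up for this example. The option keys moved to `NSAttributedString.DocumentReadingOptionKey` in Swift 4, which is why the older snippets above no longer compile as written.

```swift
import Foundation
#if canImport(UIKit)
import UIKit
#elseif canImport(AppKit)
import AppKit
#endif

// Hypothetical helper: returns the text content of an HTML string,
// or nil if the string cannot be encoded or imported.
func plainText(fromHTML html: String) -> String? {
    guard let data = html.data(using: .utf8) else { return nil }
    let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
        .documentType: NSAttributedString.DocumentType.html,
        .characterEncoding: String.Encoding.utf8.rawValue
    ]
    // Note: the HTML importer is expected to be called on the main thread.
    return try? NSAttributedString(data: data,
                                   options: options,
                                   documentAttributes: nil).string
}

if let text = plainText(fromHTML: "Formed by <a href='#'>James Murphy</a> &amp; friends") {
    print(text)
}
```

Returning an optional instead of using `try!` keeps malformed input from crashing the app; whether that matters depends on where the HTML comes from.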
Original post by Joony; translation licensed under CC BY-SA 4.0
Hmm, I tried your function and it works on a small example:
Can you give an example where it fails?
Swift 4 and 5 version: