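Fathom is a JavaScript framework built on the DOM API for identifying and extracting parts of web pages such as pop-ups, buttons, forms, and text content.
Think of Fathom as a miniature programming language whose programs, called rulesets, recognize the significant parts of DOM trees.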
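// Assumes the fathom-web npm package with its 2.x-style API:
const {dom, out, rule, ruleset, score, type} = require('fathom-web');
// Hypothetical helper, not part of Fathom: true if the node's text has a colon or dash.
function containsColonsOrDashes(element) {
    const text = element.getAttribute('content') || element.textContent;
    return /[:-]/.test(text);
}
const rules = ruleset(
    // Give any title tag the (default) score of 1, and tag it as title-ish:
    rule(dom('title'), type('titley')),
    // Give any OpenGraph meta tag a score of 2, and tag it as title-ish as well:
    rule(dom('meta[property="og:title"]'), type('titley').score(2)),
    // Take all title-ish things, and punish them if they contain
    // navigational claptrap like colons or dashes:
    rule(type('titley'), score(fnode => containsColonsOrDashes(fnode.element) ? .5 : 1)),
    // Offer the max-scoring title-ish node under the output key "title":
    rule(type('titley').max(), out('title'))
);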
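To make the scoring in the ruleset above concrete, here is a minimal sketch in plain JavaScript that does not use Fathom at all. It assumes that scores combine by multiplication, which is what the 0.5 penalty in the comments suggests. The `candidates` array, its sample texts, and the functions `scoreCandidates` and `bestCandidate` are hypothetical stand-ins for illustration, not part of Fathom's API.

```javascript
// Plain-data stand-ins for DOM nodes; `text` plays the role of the title text.
const candidates = [
  { source: 'title', text: 'Fathom: Find meaning in the web', score: 1 },
  { source: 'og:title', text: 'Fathom', score: 2 },
];

// Hypothetical helper: true if the text contains a colon or a dash.
function containsColonsOrDashes(text) {
  return /[:-]/.test(text);
}

// Apply the penalty rule, assuming scores combine by multiplication.
function scoreCandidates(nodes) {
  return nodes.map(node => ({
    ...node,
    score: node.score * (containsColonsOrDashes(node.text) ? 0.5 : 1),
  }));
}

// Pick the highest-scoring candidate, analogous to type('titley').max().
function bestCandidate(nodes) {
  return nodes.reduce((best, node) => (node.score > best.score ? node : best));
}

const scored = scoreCandidates(candidates);
const best = bestCandidate(scored);
console.log(best.source, best.score); // prints "og:title 2"
```

In this sample the `title` candidate drops from 1 to 0.5 because its text contains a colon, so the OpenGraph candidate wins with its score of 2.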