Open
Labels
good first issue (good for newcomers), up for grabs (good for first contributors)
Description
I have mostly tested trafilatura on a set of English, German and French web pages that I came across while browsing or during web crawls. There are certainly other web pages, and cases in other languages, for which the extraction does not work yet.
Bug reports can be filed either as a list in an issue like this one, or directly in the code as XPath expressions in xpaths.py (see the BODY_XPATH and COMMENTS_XPATH lists).
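Before proposing a new expression, it can help to check that it actually selects the main text (or the comments) on the page that fails. The following is only a rough sketch of that check: the HTML snippet, the two candidate expressions and the helper matched_text are hypothetical and not taken from xpaths.py. It uses Python's standard-library ElementTree, which needs well-formed markup and supports only a small XPath subset; trafilatura itself relies on lxml, so expressions using functions such as contains() would have to be tried with lxml instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical page snippet; it must be well-formed XML for ElementTree.
SAMPLE = """<html><body>
<div class="nav">Home | About</div>
<div class="article-text"><p>First paragraph.</p><p>Second paragraph.</p></div>
<div class="comments"><p>Nice post!</p></div>
</body></html>"""

# Hypothetical candidate expressions (assumptions, not entries from xpaths.py).
CANDIDATE_BODY = ".//div[@class='article-text']"
CANDIDATE_COMMENTS = ".//div[@class='comments']"


def matched_text(markup, expression):
    """Return the text of the first element matched by expression, or None."""
    tree = ET.fromstring(markup)
    node = tree.find(expression)
    if node is None:
        return None
    # Collect all text fragments below the node, skipping whitespace-only ones.
    parts = [t.strip() for t in node.itertext() if t.strip()]
    return " ".join(parts)


body = matched_text(SAMPLE, CANDIDATE_BODY)
comments = matched_text(SAMPLE, CANDIDATE_COMMENTS)
print("body:", body)
print("comments:", comments)
```

An expression that returns None, or that picks up navigation text instead of the article, is probably not worth adding as it stands. One that isolates the right block is a reasonable candidate to report here or to propose for the relevant list.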
Thanks!