List of smaller extraction bugs (text & metadata) #4

I have mostly tested `trafilatura` on a set of English, German and French web pages that I came across while browsing or during web crawls. There are certainly other web pages, and cases in other languages, for which the extraction doesn't work yet.

Corresponding bug reports can be filed either as a list in an issue like this one or directly in the code, as XPath expressions in [xpaths.py](https://github.com/adbar/trafilatura/blob/master/trafilatura/xpaths.py) (see the `BODY_XPATH` and `COMMENTS_XPATH` lists).
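For anyone who wants to try out a candidate expression before reporting it, here is a minimal sketch of the idea. The page snippet, the two expressions and the `matched_text` helper are all made up for illustration and are not taken from `xpaths.py`. To stay self-contained it uses the standard library's `xml.etree.ElementTree`, which only supports a subset of XPath and needs well-formed markup; `trafilatura` itself relies on `lxml`, so real expressions may need adjusting.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for a real web page.
SAMPLE = """<html><body>
<div class="nav">Home | About</div>
<div class="story-text"><p>First paragraph.</p><p>Second paragraph.</p></div>
<div class="user-comments"><p>Nice article!</p></div>
</body></html>"""

# Hypothetical candidate expressions (assumptions, not entries from xpaths.py).
CANDIDATE_BODY = ".//div[@class='story-text']"
CANDIDATE_COMMENTS = ".//div[@class='user-comments']"


def matched_text(tree, expression):
    """Return the text of every <p> under the first node matching the expression."""
    node = tree.find(expression)
    if node is None:
        # The expression matched nothing: this is the case worth reporting.
        return []
    return [p.text for p in node.iter("p")]


tree = ET.fromstring(SAMPLE)
body = matched_text(tree, CANDIDATE_BODY)
comments = matched_text(tree, CANDIDATE_COMMENTS)
print(body)
print(comments)
```

If an expression returns an empty list on a page where the main text or the comments are clearly present, the page URL together with an expression that does match makes for a useful report.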

Thanks!
