This repository was archived by the owner on May 5, 2020. It is now read-only.

Different results based on allowed file formats #30

@siccovansas

When I search rijksoverheid.nl (the main website of the Dutch government) with the default set of allowed file formats (as specified in constants.py), I retrieve 384 files (60 xlsx, 323 xls, 1 unknown/xlsm). When I add 'ods' to the list of allowed file formats and search rijksoverheid.nl again, I retrieve only 298 files (24 xlsx, 149 xls, 124 ods, 1 unknown/xlsm). I ran both configurations 3 times and got the same results every time.

So, how can it be that I retrieve fewer files when I add a file format?
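
For reference, here is a small sketch that tallies the counts reported above and shows the change per format. The variable names and the `diff_counts` helper are made up for this illustration and are not part of the project; only the numbers come from my runs.

```python
from collections import Counter

# Counts reported in this issue; the keys are just labels for this sketch.
default_run = Counter({"xlsx": 60, "xls": 323, "unknown/xlsm": 1})
with_ods_run = Counter({"xlsx": 24, "xls": 149, "ods": 124, "unknown/xlsm": 1})


def diff_counts(before, after):
    """Return {format: after - before} for every format seen in either run."""
    return {
        fmt: after.get(fmt, 0) - before.get(fmt, 0)
        for fmt in sorted(set(before) | set(after))
    }


# Totals per run, then the per-format change after adding 'ods'.
print(sum(default_run.values()), sum(with_ods_run.values()))
print(diff_counts(default_run, with_ods_run))
```

The xls and xlsx counts both drop once 'ods' is added, and the 124 ods files do not make up for the loss, so the total goes down by 86.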
