Compare similar ads to check for duplicates #631

@dhowe

Some ads, like these two:

http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgKDj2__aTRABGAEyCG3qbJztYSV0

https://tpc.googlesyndication.com/pagead/imgad?id=CICAgKDj2__aTRABGAEyCG3qbJztYSV0

are actually the same image served from different sources (with different URLs).

We want them stacked (as duplicates) to prevent cases like this:
[Screenshot, 2015-01-09 10:17:45 AM]

We might compare them first by dimensions, then by file size, and possibly by their ID (in this example the subdomains differ but the ID is identical). If these checks are not reliable enough, we can finally compare the files byte for byte to make sure they're one and the same.
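
A rough sketch of that tiered check, in Python for brevity rather than in the extension's own code. The `Ad` record, its field names, and both helper functions are hypothetical, and treating a matching `id` parameter as sufficient (once dimensions and size already agree) is an assumption that would need to be verified against real ads:

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlparse


@dataclass
class Ad:
    # Hypothetical record; field names are illustrative, not from the project.
    url: str
    width: int
    height: int
    data: bytes  # raw image bytes, assumed to be already fetched


def image_id(url: str) -> Optional[str]:
    """Return the `id` query parameter of an ad URL, or None if absent."""
    values = parse_qs(urlparse(url).query).get("id")
    return values[0] if values else None


def is_duplicate(a: Ad, b: Ad) -> bool:
    """Run the cheap checks first and the byte comparison last."""
    # 1. Different dimensions: cannot be the same image.
    if (a.width, a.height) != (b.width, b.height):
        return False
    # 2. Different file size: cannot be byte-identical.
    if len(a.data) != len(b.data):
        return False
    # 3. Same ID on both URLs: assumed (not verified) to mean the same image.
    id_a, id_b = image_id(a.url), image_id(b.url)
    if id_a is not None and id_a == id_b:
        return True
    # 4. Last resort: compare the bytes themselves.
    return a.data == b.data
```

With the two URLs above, `image_id` returns the same value for both, so two ads with equal dimensions and file size would be stacked without reaching step 4.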
