New UniqueFilter for images associated with compromised accounts. #3519

Open
swfarnsworth wants to merge 6 commits into main from swfarnsworth/image-filter

@swfarnsworth

Contributor

Currently, the perceptual hashes of known images are hard-coded, but this is a temporary solution.

[image]

In response to only the second image:

[image]

Upon merging this change, someone with the requisite permissions must run `!filter add unique image` and then configure the desired behavior for positive identifications from this filter.

@swfarnsworth swfarnsworth requested a review from mbaruh as a code owner June 12, 2026 22:53
@python-discord-policy-bot python-discord-policy-bot Bot requested a review from a team June 12, 2026 22:53
@swfarnsworth swfarnsworth requested a review from jb3 June 12, 2026 22:53
@swfarnsworth swfarnsworth force-pushed the swfarnsworth/image-filter branch from 2b9c4f9 to 539ab1c Compare June 12, 2026 23:01

@wookie184 wookie184 left a comment

Contributor

Is it a requirement that this is a perceptual hash, or would a normal file hash be sufficient for our current use case?

The reasons I suggest this are:

  • Currently this brings in some chunky dependencies (numpy, scikit) - more than doubling the .venv size. Not a massive issue, but nice to avoid if possible.
  • Image processing is a bit slow. At the very least we would need to use asyncio.to_thread to avoid blocking the event loop (though it would potentially make sense to use that for a file hash too).
  • Image processing is complex and creates a larger attack surface for malicious files. See https://pillow.readthedocs.io/en/latest/handbook/security.html

Using a file hash like md5 is simple, fast, and can be used on any type of file.
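
For reference, a minimal sketch of the file-hash alternative combined with the asyncio.to_thread suggestion above. The names `file_digest`, `is_known_file`, and `known_digests` are hypothetical and not taken from this PR, and SHA-256 is swapped in for md5 here (hashlib.md5 would be used the same way).

```python
import asyncio
import hashlib


def file_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of the raw attachment bytes."""
    return hashlib.sha256(data).hexdigest()


async def is_known_file(data: bytes, known_digests: set[str]) -> bool:
    """Check the attachment bytes against a set of known digests."""
    # Hashing is CPU-bound, so it runs in a worker thread to keep
    # the event loop free for other tasks.
    digest = await asyncio.to_thread(file_digest, data)
    return digest in known_digests
```

An exact digest like this only matches byte-identical files, which is the trade-off against a perceptual hash.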

Comment thread bot/exts/filtering/_filters/unique/image.py Outdated
Comment thread bot/exts/filtering/_filters/unique/image.py
@swfarnsworth

Contributor Author

@wookie184 thank you for your review. I think I've addressed all your concerns.

As we discussed on Discord (restating here for others), the images used in the ongoing attack differ slightly each time.

It's disappointing that the new dependencies double the size of the venv. We could instead have this as a microservice, but that might not be worth the tradeoffs. In either case, I'm fine with ultimately removing this functionality and its dependencies when the ongoing attack ends.
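
To illustrate why slightly differing images defeat an exact file hash, here is a toy sketch of an average hash over already-decoded grayscale values. It is not the implementation in this PR: a real filter would decode and resize the image with an imaging library first, and all names below are hypothetical.

```python
import hashlib


def average_hash(pixels: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def exact_digest(pixels: list[list[int]]) -> str:
    """Byte-exact digest of the same pixel data, for comparison."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()


# Toy 4x4 grayscale "image" and a copy with one pixel nudged slightly.
original = [[10, 10, 240, 240] for _ in range(4)]
altered = [row[:] for row in original]
altered[0][0] = 12

exact_match = exact_digest(original) == exact_digest(altered)
distance = hamming_distance(average_hash(original), average_hash(altered))
print(exact_match, distance)  # prints: False 0
```

The exact digests differ after a single-pixel change, while the average hashes are identical. A perceptual filter would typically treat any Hamming distance under some small threshold as a match.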

@swfarnsworth

Contributor Author

Oh, I meant to also say that I don't have any logging statements in either of the except blocks, though I considered it. Let me know if you want me to do that. I think "info" would be the correct logging level here?
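
A rough sketch of what info-level logging in an except block could look like. The exception types, the `hash_or_none` helper, and the `hasher` parameter are assumptions for illustration only; the actual except blocks in this PR may catch different errors.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger(__name__)


def hash_or_none(data: bytes, hasher: Callable[[bytes], int]) -> Optional[int]:
    """Return the hash of `data`, or None if the hasher rejects the input."""
    try:
        return hasher(data)
    except (OSError, ValueError) as exc:
        # An attachment that cannot be hashed is treated here as an expected
        # event, so it is logged at info level rather than as an error.
        log.info("Skipping attachment that could not be hashed: %r", exc)
        return None
```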

@swfarnsworth

Contributor Author

@jb3 the two most recent commits address concerns that you raised in messages in Discord.

The filter will only consider attachments smaller than 30 MB, and I've added comments describing the image for each perceptual hash.

There were two instances of this attack while I typed this message 🫠
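
A minimal sketch of the size check described above, assuming "30 MB" means 30 MiB and that each attachment exposes its size in bytes. The `Attachment` dataclass is a hypothetical stand-in, and the constant and function names are not taken from this PR.

```python
from dataclasses import dataclass

# Assumption: the limit is interpreted as 30 MiB.
MAX_ATTACHMENT_BYTES = 30 * 1024 * 1024


@dataclass
class Attachment:
    """Hypothetical stand-in for a message attachment."""
    filename: str
    size: int  # size in bytes


def eligible_attachments(attachments: list[Attachment]) -> list[Attachment]:
    """Keep only attachments strictly smaller than the size limit."""
    return [a for a in attachments if a.size < MAX_ATTACHMENT_BYTES]
```

Checking the reported size before downloading or decoding means oversized files are never read into memory.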
