New UniqueFilter for images associated with compromised accounts. #3519
swfarnsworth wants to merge 6 commits into
Currently, the perceptual hashes of known images are hard-coded, but this is a temporary solution.
2b9c4f9 to 539ab1c
wookie184
left a comment
Is it a requirement that this is a perceptual hash, or would a normal file hash be sufficient for our current use case?
The reasons I suggest this are:
- Currently this brings in some chunky dependencies (numpy, scikit), more than doubling the .venv size. Not a massive issue, but nice to avoid if possible.
- Image processing is a bit slow. At the very least we would need to use `asyncio.to_thread` to avoid blocking the event loop (though it would potentially make sense to use that for a file hash too).
- Image processing is complex and creates a larger attack surface for malicious files. See https://pillow.readthedocs.io/en/latest/handbook/security.html
Using a file hash like md5 is simple, fast, and can be used on any type of file.
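For illustration, the file-hash alternative could look roughly like the sketch below. It is not code from this PR: `md5_hex` and `is_known_file` are hypothetical names, and the set of known digests is assumed to be supplied by the caller. The hash runs through `asyncio.to_thread` so the event loop is not blocked.

```python
import asyncio
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the raw file bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


async def is_known_file(data: bytes, known_digests: set[str]) -> bool:
    """Check raw bytes against known digests (hypothetical helper)."""
    # Hash in a worker thread so the event loop stays responsive.
    digest = await asyncio.to_thread(md5_hex, data)
    return digest in known_digests
```

Note that an exact hash like this only matches byte-identical files; a single changed byte produces a different digest.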
`Image.open` and `imagehash.phash` are now awaitable.
@wookie184 thank you for your review. I think I've addressed all your concerns. As we discussed on Discord (restating here for others), the images used in the ongoing attack differ slightly each time. It's disappointing that the new dependencies double the size of the venv. We could instead have this as a microservice, but that might not be worth the tradeoffs. In either case, I'm fine with ultimately removing this functionality and its dependencies when the ongoing attack ends.
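To illustrate why slightly different images call for a perceptual hash: two perceptual hashes of near-identical images typically differ in only a few bits, so they can be compared by Hamming distance rather than equality. The sketch below is not the PR's implementation; it works on hex-encoded hash strings, and `hamming_distance`, `matches_known`, and the `max_distance` threshold are assumptions chosen for the example.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hex-encoded hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def matches_known(candidate: str, known: list[str], max_distance: int = 8) -> bool:
    """Return True if the candidate is within max_distance bits of any known hash."""
    return any(hamming_distance(candidate, k) <= max_distance for k in known)
```

A lower `max_distance` reduces false positives but tolerates less variation between images; the right value would need tuning against real samples.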
Oh, I meant to also say that I don't have any logging statements in either of the except blocks, though I considered it. Let me know if you want me to do that. I think "info" would be the correct logging level here?
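For reference, info-level logging in an except block could look something like this. It is only a sketch: `try_hash` and the `hasher` callable are hypothetical, and the exception types caught here are assumptions, not the ones in the PR.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger(__name__)


def try_hash(data: bytes, hasher: Callable[[bytes], str]) -> Optional[str]:
    """Hash the data, returning None (and logging at info level) on failure."""
    try:
        return hasher(data)
    except (OSError, ValueError) as e:  # assumed exception types
        # An unreadable attachment is expected occasionally, hence info rather than warning.
        log.info("Skipping attachment that could not be hashed: %s", e)
        return None
```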
@jb3 the two most recent commits address concerns that you raised in messages on Discord. The filter will only consider attachments smaller than 30 MB, and I've added comments describing the image for each perceptual hash. There were two instances of this attack while I typed this message 🫠
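As a rough sketch of the size check (not the code in the commits): the `Attachment` dataclass below is a stand-in with a `size` field in bytes, `eligible_attachments` is a hypothetical helper, and reading "30 MB" as 30 MiB is an assumption.

```python
from dataclasses import dataclass

# Assumption: "30 MB" is interpreted as 30 MiB.
MAX_SIZE_BYTES = 30 * 1024 * 1024


@dataclass
class Attachment:
    """Stand-in for a message attachment; size is in bytes."""
    filename: str
    size: int


def eligible_attachments(attachments: list[Attachment]) -> list[Attachment]:
    """Keep only attachments small enough to be hashed."""
    return [a for a in attachments if a.size < MAX_SIZE_BYTES]
```

Skipping large files before any image processing keeps the cost of each check bounded.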
Currently, the perceptual hashes of known images are hard-coded, but this is a temporary solution.
In response to only the second image:
Upon merging this change, someone with the requisite permissions must run `!filter add unique image` and then configure the desired behavior for positive identifications from this filter.