Vinay Prabhu

Mar 15, 2021

13 min read

A study of “A Study of Face Obfuscation in ImageNet”

Table of Contents:

— — — — — — — — — — — — — — — — — — — — — — — -

· Background
· Main Issues
Issue-1: The curious case of Face obfuscation
Issue-2: NSFW analysis
Issue-3: Human co-occurrence analysis
👹FAQs of the ‘Devil’s advocacy’ kind: Our humble tribute to the cult of “Both-sideism”
👹1: Why not just contact them in lieu of public grandstanding?
👹2: Maybe your little emails slipped through the cracks?
👹 3: Well, OK. But PL is not the sole author. How do you know all the co-authors and the collaborators in the acknowledgment were even aware of your work?!
👹 4: In the Wired interview published on March 15th, when pressed by the reporter, one of the authors states that “a citation will appear in an updated version of the paper”. Doesn’t that solve the problem?
· Concluding thoughts: The real issues
a) Erasure of Black women’s scholarship:
b) Revisiting the horrors of ghost labor:
— — — — — — — — — — — — — — — — — — — — — — — -

Background

Left: Version of Brief History of Deep Learning from 1943–2019 [Timeline] on Apr 23, 2020. Right: Version today after the community uproar driven by Dr. Gebru’s tweet

Below, we bemoan this unfortunate departure from the norms of academic integrity by carefully disentangling the specific details that characterize the situation from our standpoint. In doing so, we share the exact snapshots of the conversation(s) that unfolded between the parties involved.
Pre-script: The authors of the paper Large image datasets: A pyrrhic win for computer vision? (and of this blog post you are reading) are abbreviated as VP (Vinay Prabhu) and AB (Abeba Birhane), respectively, in the rest of the material presented here. PL refers to the main visionary of the ImageNet dataset. Wherever relevant, ‘our paper’ refers to Large image datasets: A pyrrhic win for computer vision? and ‘their paper’ refers to A Study of Face Obfuscation in ImageNet.

Main Issues

Issue-1: The curious case of Face obfuscation

Screenshot of the email VP wrote to the WACV organizers. No reply to this email was ever received.

Our goal, simply put, was to engage directly with practitioners in the field and not just the ethics community. And in many ways it did culminate in a rather “lively discussion” when we presented the paper at WACV in a session chaired by Jordi Pont-Tuset, a Research Scientist at Google Zürich.
At this juncture, we’d like to share that AB, along with many of our colleagues and pre-reviewers, rightfully questioned the very need for such a ‘solutions’ section, as they felt it reeked of tech-solutionism. Nonetheless, anticipating the clamor for ‘possible solutions’ from the reviewers of this traditional Computer Vision conference (an assumption that eventually proved correct), the section persisted. In this regard, we’d like to draw the reader’s attention to Section 4.3 of our paper, literally titled “Differentially private obfuscation of the faces”, where we state: “This path entails harnessing techniques such as DP-Blur [36] with quantifiable privacy guarantees to obfuscate the identity of the humans in the image. The Inclusive images challenge [94], for example, already incorporated blurring during dataset curation and addressed the downstream effects surrounding change in predictive power of the models trained on the blurred versions of the dataset curated. We believe that replication of this template that also clearly included avenues for recourse in case of an erroneously non-blurred image being sighted by a researcher will be a step in the right direction for the community at large”.

As evinced by the papers we cited, privacy-preserving obfuscation of images is neither a novel idea nor, most certainly, our idea. But in the specific context of imagining a face-obfuscated version of ImageNet, it is reasonable to assume that anyone who authors a paper audaciously titled “A Study of Face Obfuscation in ImageNet” would pay at least lip service to citing either our work and/or [94] from our paper, which is:
[94] Shreya Shankar, Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, and D Sculley. No classification without representation: Assessing geodiversity issues in open data sets for the developing world. arXiv preprint arXiv:1711.08536, 2017.
But the authors chose not to cite this one either. Their paper begins with “Image obfuscation (blurring, mosaicing, etc.) is widely used for privacy protection. However, computer vision research often overlooks privacy by assuming access to original unobfuscated images” (like, really?! 🙄) and goes on to claim that they have discovered that “…the dataset exposes many people co-occurring with other objects in images, e.g., people sitting on chairs, walking their dogs, or drinking beer (Fig. 1). It is concerning since ILSVRC is publicly available and widely used.” 🥴
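As an aside, for readers who want a concrete picture of what the kind of blurring discussed in Section 4.3 amounts to in its simplest form, here is a minimal sketch: detect faces and apply a heavy Gaussian blur to the detected regions. To be clear, this is plain blurring with off-the-shelf OpenCV components, not DP-Blur with its quantifiable privacy guarantees, and not the pipeline used in either our paper or theirs; the cascade file and parameters below are purely illustrative assumptions.

```python
# Minimal face-obfuscation sketch: detect faces, then Gaussian-blur the
# detected regions. Plain blurring only; NOT DP-Blur and NOT the pipeline
# used by either paper. Paths and parameters are illustrative assumptions.
import cv2

def blur_faces(image_path: str, output_path: str) -> int:
    """Blur every detected face in image_path, write the result, return the face count."""
    # Haar cascade shipped with OpenCV; a real curation pipeline would use a
    # stronger detector and provide recourse for faces it misses.
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        # Heavy Gaussian blur over the detected face bounding box.
        img[y:y + h, x:x + w] = cv2.GaussianBlur(
            img[y:y + h, x:x + w], (51, 51), sigmaX=30
        )

    cv2.imwrite(output_path, img)
    return len(faces)

if __name__ == "__main__":
    n = blur_faces("example.jpg", "example_blurred.jpg")
    print(f"Blurred {n} face region(s)")
```

A curation effort worth the name would, as argued in Section 4.3, also need quantifiable privacy guarantees and a clear avenue for recourse when a non-blurred face slips through.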

Also, we’d like to ask readers to take two minutes to parse these FAQs from the associated Kaggle contest ([94] in our paper) from over two years ago, and then read their paper again 🤐

Source: https://www.kaggle.com/c/inclusive-images-challenge/overview/inclusive-images-faq#recognizable-faces

Issue-2: NSFW analysis

Again, much to our disappointment, the authors claim to have discovered that: ”The number of NSFW areas varies significantly across different ILSVRC categories. Bikini is likely to contain much more NSFW areas than the average.” 😒

Issue-3: Human co-occurrence analysis

👹FAQs of the ‘Devil’s advocacy’ kind: Our humble tribute to the cult of “Both-sideism”

👹1: Why not just contact them in lieu of public grandstanding? Have you even bothered to contact the curators of the ImageNet dataset?

Understanding the magnitude of the impact and being wary of any possible Streisand effect, we spent the entirety of the nearly 10-month period between Aug 2019 and Jun 2020 on various outreach efforts among journalists and the Computer Vision and Ethics communities and organizations. This also involved VP working with journalists such as Katyanna Quach at The Register, who then authored this article: Inside the 1TB ImageNet data set used to train the world’s AI: Naked kids, drunken frat parties, porno stars, and more

👹2: Oh come on! Stop with the self-aggrandizing and self-loathing. AI royalty tend to receive hundreds of emails a day. Maybe your little emails slipped through the cracks?

This was followed by the first-ever unprompted communication from PL to VP, a screenshot of which is below.

This was followed by a delayed reply on April 17th that read…

And lastly, here is the actual video of our Zoom face-to-face meeting 📹 https://youtu.be/hpA67iDxNGU
Q.E.D.!
👹 3: Well, OK. But PL is not the sole author. How do you know all the co-authors and the collaborators in the acknowledgment were even aware of your work?!

👹 4: In the Wired interview published on March 15th, when pressed by the reporter, one of the authors states that “a citation will appear in an updated version of the paper”. Doesn’t that solve the problem?

Concluding thoughts: The real issues

a) Erasure of Black women’s scholarship:

b) Revisiting the horrors of ghost labor:

  • Resort to using the unethical “SoTA” tools from companies like Amazon, Face++ or Clarifai to perform face detection and filter the problematic images
  • Resort to exploiting the ghost labor markets of AMT to hand-annotate the NSFW facet of the dataset.

As it turns out, on the very same day that the Turkopticon fundraising campaign was announced, and a mere few hours later, we see the efforts of this paper falling prey to both of these ills. In fact, the description of the gamified HIT (Human Intelligence Task) reads 🤢: “These images have verified ground truth faces, but we intentionally show incorrect annotations for the workers to fix. The entire HIT resembles an action game. Starting with 2 lives, the worker will lose a life when making a mistake on gold standard images. In that case, they will see the ground truth faces (Fig. B Right) and the remaining lives. If they lose both 2 lives, the game is over, and they have to start from scratch at the first image. We found this strategy to effectively retain workers’ attention and improve annotation quality.”

(Also see https://www.vice.com/en/article/88apnv/underpaid-workers-are-being-forced-to-train-biased-ai-on-mechanical-turk )
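For readers trying to picture the mechanic being described, below is a minimal sketch of the “lives” gold-standard loop as we read it from the quoted passage. To be clear, this is our paraphrase in code and not the authors’ actual HIT implementation; annotate, is_gold and check_against_gold are hypothetical stand-ins for whatever the real task used.

```python
# Illustrative sketch of the "lives" quality-control mechanic quoted above.
# NOT the authors' HIT code: `annotate`, `is_gold`, and `check_against_gold`
# are hypothetical stand-ins for the real task's components.
from typing import Callable, Dict, List, Sequence

def run_hit(images: Sequence[str],
            is_gold: Callable[[str], bool],
            annotate: Callable[[str], Dict],
            check_against_gold: Callable[[str, Dict], bool],
            lives: int = 2) -> List[Dict]:
    """Collect annotations; restart from the first image once all lives are lost."""
    while True:
        remaining = lives
        results: List[Dict] = []
        game_over = False
        for img in images:
            ann = annotate(img)
            # A mistake on a gold-standard image costs a life (per the quote,
            # the worker is also shown the ground truth and remaining lives).
            if is_gold(img) and not check_against_gold(img, ann):
                remaining -= 1
                if remaining == 0:
                    game_over = True  # "game over": start from scratch
                    break
            results.append(ann)
        if not game_over:
            return results
```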

To conclude, we say:
- This is NOT us desperately hoping to drum up some antics to garner more attention
- This is NOT us trying to eke out one more citation
- This is NOT us assuming the proverbial higher pedestal and judging anyone
- This is NOT an ad hominem attack on any member of the ImageNet team.
- This IS us calling out a pattern of citation erasure (with specific, verifiable proofs) and highlighting the ethical shortcomings of a paper that will probably be extremely well cited in the near future and, much worse, celebrated (wrongly, IMHO) as a template for stop-gap fixes.
We call upon the curators of the dataset to pay heed to the issues raised and take corrective measures.

Kindest regards,

Abeba Birhane and Vinay Prabhu

PS: If all of this is confusing, here is the VERIFIABLE timeline of events to summarize what happened.
1: 19 Aug 2019: Contacted the ImageNet curators via email. No response.
2: Sep 2019: Chatted with Katyanna Quach at The Register in order to research the specific details regarding ImageNet for an impending article.
3: 23 Oct 2019: Register article comes out: https://www.theregister.com/2019/10/23/ai_dataset_imagenet_consent/
4: 12 Apr 2020: Second email contact with the ImageNet curators. No response.
5: 15 Apr 2020: PL contacts VP via email.
6: 25 Apr 2020: Talk at Stanford that PL attends, titled “Ethical Malice in Peer-Reviewed Machine Learning Literature” (video link included).
7: Jun 2020: The first version of our paper appears on arXiv: https://arxiv.org/abs/2006.16923
8: Mar 2021: PL et al. publish “A Study of Face Obfuscation in ImageNet” sans any citation or acknowledgment.