A study of “A Study of Face Obfuscation in ImageNet”

Table of Contents:

— — — — — — — — — — — — — — — — — — — — — — — -

· Background
· Main Issues
Issue-1: The curious case of Face obfuscation
Issue-2: NSFW analysis
Issue-3: Human co-occurrence analysis
👹FAQs of the ‘Devil’s advocacy’ kind: Our humble tribute to the cult of “Both-sideism”
👹1: Why not just contact them in lieu of public grandstanding?
👹2: Maybe your little emails just slipped through the cracks?
👹 3: Well, OK. But PL is not the sole author. How do you know all the co-authors and the collaborators in the acknowledgment were even aware of your work?!
👹 4: In the Wired interview published on March 15th, when pressed by the reporter, one of the authors states that “a citation will appear in an updated version of the paper”. Doesn’t that solve the problem?
· Concluding thoughts: The real issues
a) Erasure of black-women-scholarship:
b) Revisiting the horrors of ghost labor:
— — — — — — — — — — — — — — — — — — — — — — — -


On June 24, 2020, Abeba Birhane and I released our paper “Large image datasets: A pyrrhic win for computer vision?”, critiquing the culture of large-scale datasets in Computer Vision. In the paper, we performed an audit using the ImageNet dataset as a template. The nature and the expanse of the transgressions attracted quite some media attention. In Section 2.3 of our paper, we revisited the downstream effects of inheriting labels from the WordNet taxonomy and showed how this affects not just the ImageNet dataset but also other datasets, such as the Tiny Images dataset and the latest Tencent-ML-Images dataset, that either directly or indirectly inherited their label-space from WordNet. On June 29th 2020, we learnt that the curators of the Tiny Images dataset had apologized and withdrawn the dataset.
In Jan 2021, the paper was formally presented at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2021) and has since been cited in more than two dozen papers.
Against the backdrop of all of this work, this Wednesday, the 10th of March 2021, we encountered a paper titled “A Study of Face Obfuscation in ImageNet” from the ImageNet curators that has left us disappointed and flummoxed. By indulging in what appears to be a calculated and systematic erasure of the entire body of critique that our work was a part of, the authors have sent out a wide range of wrong signals. This erasure is doubly disappointing given how the community had recently rallied behind the main visionary of the ImageNet project when her contributions were being erased in an online compendium, and given the sheer clout she enjoys in the field.

Left: Version of Brief History of Deep Learning from 1943–2019 [Timeline] on Apr 23, 2020. Right: Version today after the community uproar driven by Dr. Gebru’s tweet

Below, we bemoan this unfortunate departure from the norms of academic integrity by carefully disentangling the specific details. In doing so, we share exact snapshots of the conversation(s) that unfolded between the parties involved.
Pre-script: The authors of the paper “Large image datasets: A pyrrhic win for computer vision?” (and of this blog-post you are reading) are abbreviated as VP (Vinay Prabhu) and AB (Abeba Birhane) respectively in the rest of the material presented here. PL refers to the main visionary of the ImageNet dataset.

Main Issues

Issue-1: The curious case of Face obfuscation

Section 4 of our paper was perhaps the most difficult for us to author. We knew we were submitting to WACV (on the stubborn insistence of VP), an unlikely venue for a dataset-ethics critique, whose submission portal did not even have a primary or secondary topic for ethics, though ethics did receive a mention in the CFP. (See screenshot below.)

Screenshot of the email VP wrote to WACV organizers. There was no reply received to this email.

Our goal, simply put, was to reach the mainstream Computer Vision community, and not just the ethics community. And, in many ways, it did culminate in exactly that when we presented the paper at WACV in a session chaired by Jordi Pont-Tuset, a Research Scientist at Google Zürich.
At this juncture, we’d like to share that AB, along with many of our other colleagues and pre-reviewers, rightfully questioned the very need for the section, as they felt it reeked of tech-solutionism. Nonetheless, predicting the clamor for ‘possible solutions’ from the reviewers of this traditional Computer Vision conference (an assumption eventually proven correct), the section persisted. In this regard, we’d like to draw the reader’s attention to Section 4.3 in our paper, where we explicitly flag the limitations of such stop-gap fixes. As evinced by the papers we cited there, privacy-preserving obfuscation of images is well-trodden territory. But, in the specific context of imagining a face-obfuscated version of ImageNet, it is reasonable to assume that anyone who authors a paper audaciously titled “A Study of Face Obfuscation in ImageNet” would pay at least lip-service towards citing either our work and/or reference [94] in our paper (the Inclusive Images Kaggle challenge).

But the authors chose not to cite this one either. 🙄 Their paper instead goes on to claim, as novel discoveries, the very privacy issues this body of work had already documented. 🥴

Also, we’d like to ask readers to take two minutes to parse these FAQs from the associated Kaggle contest ([94] in our paper) from more than two years ago, and then read the authors’ novelty claims again. 🤐

Source: https://www.kaggle.com/c/inclusive-images-challenge/overview/inclusive-images-faq#recognizable-faces

Issue-2: NSFW analysis

The term NSFW appears 38 times in our paper. We not only curated a class-wise meta-dataset (df_nsfw.csv | Size: (1000, 5)) consisting of the mean and std of the NSFW scores of the train and validation images arranged per-class, but also dedicated Appendix B.2 to this analysis. In Table 5, we specifically focus on classes 445, 638, 639, 655 and 459, mapping to bikini/two-piece, maillot, miniskirt and brassiere/bra/bandeau, which we found to be NSFW-dense classes in the dataset.
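As an aside, a per-class mean/std summary table of this kind is straightforward to assemble. The sketch below is a minimal, hypothetical reconstruction: the `(class_id, split, score)` tuple layout and the upstream NSFW scorer are our assumptions for illustration, not the exact pipeline behind df_nsfw.csv.

```python
# Minimal sketch: aggregating per-image NSFW scores into a per-class
# summary (mean/std over train and validation splits). The input
# layout is an assumption, not the paper's actual pipeline.
import statistics
from collections import defaultdict

def summarize_nsfw(scored_images):
    """scored_images: iterable of (class_id, split, score) tuples,
    where split is "train" or "val" and score is in [0, 1].

    Returns {class_id: {"train_mean": ..., "train_std": ...,
                        "val_mean": ..., "val_std": ...}} (keys present
    only for splits that actually have scores)."""
    buckets = defaultdict(lambda: defaultdict(list))
    for class_id, split, score in scored_images:
        buckets[class_id][split].append(score)

    summary = {}
    for class_id, splits in buckets.items():
        row = {}
        for split, scores in splits.items():
            row[f"{split}_mean"] = statistics.mean(scores)
            row[f"{split}_std"] = statistics.pstdev(scores)
        summary[class_id] = row
    return summary

# Toy usage with made-up scores for class 445 ("bikini, two-piece"):
demo = [(445, "train", 0.9), (445, "train", 0.7), (445, "val", 0.8)]
print(summarize_nsfw(demo)[445]["train_mean"])  # ≈ 0.8
```

Sorting the resulting rows by `train_mean` is then enough to surface NSFW-dense classes like the five listed above.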

Again, much to our disappointment, the authors claim to have discovered this very NSFW contamination. 😒

Issue-3: Human co-occurrence analysis

In our paper, we dedicated a section to human co-occurrence biases, specifically with regard to dog-breed and musical-instrument classes that have a high density of incidentally co-occurring humans. Their new paper presents the same analysis as if it were novel. 🤦

👹FAQs of the ‘Devil’s advocacy’ kind: Our humble tribute to the cult of “Both-sideism”

Given the attention this might elicit, we pre-emptively anticipate the exact flavor of attacks and cover the following FAQs in the section below:

👹1: Why not just contact them in lieu of public grandstanding? Have you bothered to even contact the curators of the ImageNet dataset?

Yes! Glad you asked. Here are the screenshots of our emails dating all the way back to Aug 19th 2019 and later, on Apr 12, 2020 to which we received no replies whatsoever:

Understanding the magnitude of the impact and being wary of any possible Streisand effect, we spent the entirety of the near-10-month period between Aug 2019 and Jun 2020 in various outreach efforts amongst journalists, Computer Vision and Ethics communities, and organizations. This also involved VP working with journalists such as Katyanna Quach at The Register, who then authored this article: Inside the 1TB ImageNet data set used to train the world’s AI: Naked kids, drunken frat parties, porno stars, and more

👹2: Oh come on! Stop with the self-aggrandizing and self-loathing. AI royalty tend to receive hundreds of emails a day. Maybe your little emails just slipped through the cracks?

Again, glad you asked!
Lemma 1: PL was *extremely* well aware of the work.
Proof: The paper that we published heavily draws from my talk “On the four horsemen of ethical malice in peer reviewed machine learning literature”, given under the aegis of the Stanford-HAI weekly seminars (thanks to an invite from Colin Kelley Garvey, an AI ethicist) on April 17, 2020, 11:00am–12:00pm. On Apr 15th, I received an email from a HAI co-ordinator regarding the talk.

This was followed by the first ever communication I received voluntarily from PL whose screen-shot is below.

This was followed by a delayed reply on April 17th, that read …

And lastly, here is the actual video of our Zoom face-to-face meeting 📹 https://youtu.be/hpA67iDxNGU
Q.E.D.!
👹 3: Well, OK. But PL is not the sole author. How do you know all the co-authors and the collaborators in the acknowledgment were even aware of your work?!

Because they have literally cited us just recently. In a recent paper, the authors contextualize our work with a brief citation. A reductionist take on our work, but proof-of-awareness nonetheless!

👹 4: In the Wired interview published on March 15th, when pressed by the reporter, one of the authors states that “a citation will appear in an updated version of the paper”. Doesn’t that solve the problem?

Again: this blog is not about citation-seeking. We’d like to clearly point out that the biggest shortcomings are the tactical abdication of responsibility for all the mess in ImageNet, combined with the systematic erasure of the related critical work that may well have prompted these corrective measures in the first place.
The authors tactically left out an entire body of literature that critiqued ImageNet, beginning with the ImageNet audits by Chris Dulhanty and Alexander Wong (not to mention Chris’ entire thesis) and more recent data-archeological expeditions such as Lines of Sight by Alex Hanna et al. This shouldn’t come as a surprise to anybody: in their last inquisition into the person subtree of ImageNet (where they admitted that of the 2832 people categories annotated within the subtree, 1593 were potentially offensive labels and only 158 were visual), they made no mention of the hugely influential ImageNet Roulette project (which went viral on September 19, 2019, while their paper only hit the ArXiv servers on 16 Dec 2019!). Also, lest we forget, these fixes are being ushered in a good 12 years after the dataset release. T-W-E-L-V-E YEARS!

Concluding thoughts: The real issues

a) Erasure of black-women-scholarship:

AB’s central role in turning a rag-tag set of empirical results and revelations into a cogent, peer-review-worthy publication, and later investing all the effort to champion its cause via talks, interviews and presentations, is one of the main reasons why the paper is even being cited now. The primacy of her contributions is also reflected in the paper’s official citation.
But, unfortunately for her, the undervaluing of her scholarship is not an aberration but a trend. Black women’s intellectual production has historically been ignored and systemically erased. The hierarchical academic structure that devalues Black women’s intellectual contributions makes contesting such injustice a tiresome endeavor, discouraging Black women scholars from coming forward. Black feminist theory scholars such as Jennifer Nash have extensively explored the citational desires of scholars whose contributions have been systematically under-emphasized. Initiatives such as the Cite Black Women collective (https://www.citeblackwomencollective.org/) work towards dismantling precisely this behavior in academia, and it is unfortunate to see it reinforced by highly esteemed scholars who are supposed to be the torchbearers of hope.

b) Revisiting the horrors of ghost labor:

During our draft revisions, specifically of Section 4, AB and I were in the midst of a conversation when we realized that in order to truly ‘fix’ the dataset, we’d be forced to make two massive compromises:

  • Resort to using the unethical “SoTA” tools from companies like Amazon, Face++ or Clarifai to perform face detection and filter the problematic images
  • Resort to exploiting the ghost labor markets of AMT to hand-annotate the NSFW facet of the dataset.

As it turns out, on the very same day that the Turkopticon fundraising campaign was announced, mere hours later, we saw the efforts of this paper fall prey to both of these ills. In fact, the gamified HIT (Human Intelligence Task) details read 🤢:

(Also see https://www.vice.com/en/article/88apnv/underpaid-workers-are-being-forced-to-train-biased-ai-on-mechanical-turk )

To conclude, we say:
- This is NOT us desperately hoping to drum up some antics to garner more attention
- This is NOT us trying to eke out one more citation
- This is NOT us assuming the proverbial higher pedestal and judging anyone
- This is NOT an ad hominem attack on any member of the ImageNet team.
- This IS us calling out a systematic erasure (with specific, verifiable proofs) and highlighting the ethical shortcomings in a paper that will probably be extremely well cited in the near future and, much worse, celebrated (wrongly, IMHO) as a template for stop-gap fixes.
We call upon the curators of the dataset to pay heed to the issues raised and take corrective measures.

Kindest regards,

  • Abeba Birhane and Vinay Prabhu

PS: If all of this is confusing, here is the VERIFIABLE timeline of events to summarize what happened.

PhD, Carnegie Mellon University. Chief Scientist at UnifyID Inc