Thursday, November 18, 2010

Is Facebook Data Mining Human Subjects Research?

Recent law-school graduate Lauren Solberg finds that "data mining on Facebook likely does not constitute research with human subjects, and therefore does not require IRB review, because a researcher who collects data from Facebook pages does not 'interact' with the individual users, and the information on Facebook that researchers mine from individual users' pages is not 'private information.'"

[Lauren Solberg, "Data Mining on Facebook: A Free Space for Researchers or an IRB Nightmare?" article under review, University of Illinois Journal of Law, Technology & Policy 2010 (2). The article has been accepted for publication, but the journal is still soliciting comments.]

Solberg challenges policies now in place at Indiana University and the University of Massachusetts Boston, where researchers must get Facebook's written permission or the written permission of every individual who is studied. These policies, she argues, impose unnecessary burdens on researchers and IRBs alike. (The two policies are identical, but it's not clear which university borrowed from the other.)

She argues that most data mining projects do not meet the regulatory definition of human subjects research. Reading existing profiles is not interaction with an individual. Nor is a Facebook profile that is open to strangers private information, i.e., "information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (for example, a medical record)." If a college admissions officer or a potential employer can read your profile, you've lost little by having an anthropologist read it as well.

This analysis seems sound, but it's not clear to me that anyone disagrees. In particular, the third university Solberg mentions, Washington University in St. Louis, applies its policy only to "Any activity meeting the definition of 'human subject research' which is designed to recruit participants or collect data via the Internet." It then lists several examples, most of which involve interaction with living individuals. Thus, I doubt Solberg's claim that "researchers at Washington University need only inform Facebook users that they are recording information that is posted on their pages." Rather, if the project does not meet the definition of human subject research, then Wash U. researchers need not do even that much.

Solberg's article skirts some interesting questions. One concerns the boundaries of a reasonable expectation of privacy. For example, Michael Zimmer describes a study by Harvard graduate students of the Facebook profiles of Harvard undergraduates. If an undergraduate made some information visible only to other Harvard students (a choice Facebook's software allows), and a Harvard student-researcher saw it, would that change Solberg's analysis?

A second question concerns the authority of university research offices and IRBs to insist that researchers abide by website terms of service. Notably, the Indiana and UMass Boston policies do not cite federal human subjects regulations as their authority. Rather, they claim that Facebook and Myspace "explicitly state that their sites are not intended for research but for social networking only."

Solberg writes that evaluating such claims is "outside the scope of this article," but they are interesting in three ways. First, they may be factually false; I could find no such explicit statements in the Facebook or Myspace terms of service. Second, they are divorced from federal regulation. For example, the Facebook terms of service do not distinguish between living and dead Facebook members, whereas federal human subjects protections apply only to the living. Finally, they are internally inconsistent. If Facebook and Myspace did prohibit the use of their sites for research, would not researchers still be violating the terms of service even if they got signed consent from individual members, as allowed by the policies? Just who are these two universities trying to protect?

Solberg concludes that "Unfortunately, and somewhat surprisingly, the OHRP has issued no guidance pertaining to Internet research in general, let alone guidance specifically relating to the issue of data mining on the Internet." To give the feds some credit, in summer 2010 (after Solberg wrote her article), SACHRP did sponsor a panel on the Internet in Human Subjects Research. It can take a long time from a SACHRP presentation to OHRP guidance, but the wheels may be moving on this one.


Note, 19 November 2010: The original version of this post identified Ms. Solberg as a law student. She has in fact graduated. I have also changed the link about Michael Zimmer's work from his SACHRP presentation to his article, "'But the data is already public': on the ethics of research in Facebook," Ethics and Information Technology 12 (2010): 313-325.
