22.11.15 - 27.11.15, Seminar 15481

Evaluation in the Crowd: Crowdsourcing and Human-Centred Experiments

Human-centred empirical evaluations play an important role in the fields of HCI, visualization, and graphics in testing the effectiveness of visual representations. The advent of crowdsourcing platforms (such as Amazon Mechanical Turk) has provided a revolutionary methodology for conducting human-centred experiments. Through such platforms, experiments can now collect data from hundreds, even thousands, of participants from a diverse user community over a matter of weeks, greatly increasing the ease with which we can collect data as well as the power and generalizability of experimental results. However, the use of such experimental platforms is not without problems: ensuring participant investment in the task, defining experimental controls, collecting qualitative data, and understanding the ethics of deploying such experiments en masse.

The focus of this Dagstuhl seminar is to discuss experiences and methodological considerations when using crowdsourcing platforms to run human-centred experiments. We aim to bring together researchers from areas that use crowdsourcing to run human-centred experiments, with a high degree of interdisciplinarity. We target members of the human-computer interaction, visualization, psychology, and applied perception research communities as typical users of crowdsourcing platforms. We also wish to engage researchers who develop the technology that makes crowdsourcing possible and researchers who have studied the crowdsourcing community. The following topics will be discussed:

  • Crowdsourcing Platforms vs. The Laboratory. The laboratory setting for human-centred experiments has been employed for decades and has a well-understood methodology with known advantages and limitations. Studies performed on crowdsourcing platforms provide new opportunities and new challenges. A cross-community discussion of the nature of these technologies, as well as their advantages and limitations, is needed. When should we use crowdsourcing? More importantly, when should we not?
  • Scientifically Rigorous Methodologies. Understanding the strengths and limitations of a crowdsourcing platform can help us refine our human-centred experimental methodologies. When running between-subjects experiments, what do we need to consider when allocating the participant pools that will be compared? Are within-subjects experiments too taxing for crowdsourced participants? How do we effectively collect qualitative information?
  • Crowdsourcing Experiments in Human-Computer Interaction, Visualization, and Applied Perception/Graphics. Each of our fields has unique challenges when designing, deploying, and analysing the results of crowdsourced evaluations. We are especially interested in the experiences and best-practice findings of our communities with regard to these methodologies.
  • Getting to Know the Crowd. Much of this seminar examines the ways that our research communities can use the technology to evaluate the software systems and techniques that we design. However, it is important to consider the people who accept and perform the jobs that we post on these platforms. What are they like?
  • Ethics in Experiments. Even though the participants of a crowdsourcing study never walk into the laboratory, ethical considerations behind this new platform need to be discussed. What additional considerations are needed beyond standard ethical procedures when running crowdsourcing experiments? How do we ensure that we are compensating our participants adequately for their work, while considering the nature of microtasks?

The intended output of this seminar is an edited volume of articles that will become a primer text on the use of crowdsourcing in our diverse research communities. We also expect that, with the range of researchers invited to this seminar, new collaborative and interdisciplinary projects will be fostered.