Dagstuhl Seminar 15481: Evaluation in the Crowd: Crowdsourcing and Human-Centred Experiments

(Nov 22 – Nov 27, 2015)

Permalink

Please use the following short url to reference this page: https://www.dagstuhl.de/15481

Organizers

Daniel Archambault (Swansea University, GB)
Tobias Hoßfeld (Universität Duisburg-Essen, DE)
Helen C. Purchase (University of Glasgow, GB)

Contact

Susanne Bach-Bernhard (for administrative matters)

Publications

Crowdsourcing and Human-Centred Experiments (Dagstuhl Seminar 15481). Daniel Archambault, Tobias Hoßfeld, and Helen C. Purchase. In Dagstuhl Reports, Volume 5, Issue 11, pp. 103-126, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2016)

Motivation

Human-centred empirical evaluations play an important role in the fields of HCI, visualization, and graphics in testing the effectiveness of visual representations. The advent of crowdsourcing platforms (such as Amazon Mechanical Turk) has provided a revolutionary methodology to conduct human-centred experiments. Through such platforms, experiments can now collect data from hundreds, even thousands, of participants from a diverse user community over a matter of weeks, greatly increasing the ease with which we can collect data as well as the power and generalizability of experimental results. However, the use of such experimental platforms does not come without its problems: ensuring participant investment in the task, defining experimental controls, collecting qualitative data, and understanding the ethics behind deploying such experiments en masse.

The focus of this Dagstuhl seminar is to discuss experiences and methodological considerations when using crowdsourcing platforms to run human-centred experiments. We aim to bring together researchers in areas that use crowdsourcing to run human-centred experiments and aim to have a high degree of interdisciplinarity. We target members of the human-computer interaction, visualization, psychology, and applied perception research communities as typical users of crowdsourcing platforms. We also wish to engage researchers who develop the technology that makes crowdsourcing possible and researchers who have studied the crowdsourcing community. The following topics will be discussed:

Crowdsourcing Platforms vs. The Laboratory. The laboratory setting for human-centred experiments has been employed for decades and has a well understood methodology with known advantages and limitations. Studies performed on crowdsourcing platforms provide new opportunities and new challenges. A cross community discussion over the nature of these technologies as well as their advantages and limitations is needed. When should we use crowdsourcing? More importantly, when should we not?
Scientifically Rigorous Methodologies. Understanding the strengths and limitations of a crowdsourcing platform can help us refine our human-centred experimental methodologies. When running between-subjects experiments, what considerations do we need to make when allocating our participant pools that will be compared? Are within-subjects experiments too taxing for crowdsourced participants? How do we effectively collect qualitative information?
Crowdsourcing Experiments in Human-Computer Interaction, Visualization, and Applied Perception/Graphics. Each of our fields has unique challenges when designing, deploying, and analysing the results of crowdsourcing evaluation. We are especially interested in the experiences and best practice findings of our communities with regard to these methodologies.
Getting to Know the Crowd. Much of this seminar examines the ways that our research communities can use the technology in order to evaluate the software systems and techniques that we design. However, it is important to consider the people that accept and perform the jobs that we post on these platforms. What are they like?
Ethics in Experiments. Even though the participants of a crowdsourcing study never walk into the laboratory, ethical considerations behind this new platform need to be discussed. What additional considerations are needed beyond standard ethical procedures when running crowdsourcing experiments? How do we ensure that we are compensating our participants adequately for their work, while considering the nature of microtasks?

The intended output of this seminar is an edited volume of articles that will become a primer text on the use of crowdsourcing in our diverse research communities. We also expect that with the range of researchers invited to this seminar new collaborative and interdisciplinary projects will be fostered.

Summary

In various areas of computer science, such as visualization, graphics, and multimedia, it is often necessary to involve users, e.g. to measure the performance of a system with respect to its users or to measure the user-perceived quality or usability of a system. A popular and scientifically rigorous method for assessing this performance or subjective quality is formal experimentation, where participants are asked to perform tasks on visual representations and their performance is measured quantitatively (often through response time and errors). To evaluate user-perceived quality, users conduct experiments with the system under investigation or complete user surveys. Other scientific areas, such as psychology, also require such subjective tests and user surveys. One approach is to conduct such empirical evaluations in the laboratory, often with the experimenter present, allowing for the controlled collection of quantitative and qualitative data. Crowdsourcing platforms can address the limitations of this approach by providing an infrastructure for deploying experiments and collecting data over diverse user populations, often allowing hundreds, sometimes even thousands, of participants to be run in parallel over one or two weeks. However, when running experiments on these platforms, it is hard to ensure that participants are actively engaging with the experiment, and experimental controls are difficult to implement. Often, qualitative data is difficult, if not impossible, to collect, as the experimenter is not present in the room to conduct an exit survey. Finally, and most importantly, the ethics behind running such experiments require further consideration. When we post a job on a crowdsourcing platform, it is often easy to forget that people are completing the job for us on the other side of the machine.

The focus of this Dagstuhl seminar was to discuss experiences and methodological considerations when using crowdsourcing platforms to run human-centred experiments to test the effectiveness of visual representations in these fields. We primarily targeted members of the human-computer interaction, visualization, and applied perception research communities, as these communities often use human-centred experimental methodologies to evaluate the technologies they develop and have deployed such technologies on crowdsourcing platforms in the past. We also engaged researchers who study the technology that makes crowdsourcing possible. Finally, researchers from psychology, social science, and computer science who study the crowdsourcing community participated and brought another perspective on this topic. In total, 40 researchers from 13 different countries participated in the seminar. The seminar was held over one week and included topic talks, stimulus talks, and flash ('late breaking') talks. In a fast-paced 'madness' session, all participants introduced themselves, each within one minute, stating their areas of interest, their expectations from the seminar, and their view on crowdsourcing science. The major interests of the participants were addressed in different working groups:

Technology to support Crowdsourcing
Crowdworkers and the Crowdsourcing Community
Crowdsourcing experiments vs laboratory experiments
The use of Crowdsourcing in Psychology research
The use of Crowdsourcing in Visualisation research
Using Crowdsourcing to assess Quality of Experience

The abstracts of the talks, as well as the summaries of the working groups, can be found on the seminar homepage and in this Dagstuhl report. Apart from the report, we will produce an edited volume of articles that will become a primer text on (1) the crowdsourcing technology and methodology, (2) a comparison between crowdsourcing and lab experiments, (3) the use of crowdsourcing for visualization, psychology, and applied perception empirical studies, and (4) the nature of crowdworkers and their work, their motivation and demographic background, as well as the relationships among people forming the crowdsourcing community.

Creative Commons BY 3.0 Unported license

Daniel Archambault, Tobias Hoßfeld, and Helen C. Purchase

Participants

Daniel Archambault (Swansea University, GB)
Benjamin Bach (Microsoft Research - Inria Joint Centre, FR)
Kathrin Ballweg (TU Darmstadt, DE)
Rita Borgo (Swansea University, GB)
Alessandro Bozzon (TU Delft, NL)
Sheelagh Carpendale (University of Calgary, CA)
Remco Chang (Tufts University - Medford, US)
Min Chen (University of Oxford, GB)
Stephan Diehl (Universität Trier, DE)
Darren J. Edwards (Swansea University, GB)
Sebastian Egger-Lampl (AIT Austrian Institute of Technology - Wien, AT)
Sara Irina Fabrikant (Universität Zürich, CH)
Brian D. Fisher (Simon Fraser University - Surrey, CA)
Ujwal Gadiraju (Leibniz Universität Hannover, DE)
Neha Gupta (University of Nottingham, GB)
Matthias Hirth (Universität Würzburg, DE)
Tobias Hoßfeld (Universität Duisburg-Essen, DE)
Jason Jacques (University of Cambridge, GB)
Radu Jianu (Florida International University - Miami, US)
Christian Keimel (IRT - München, DE)
Andreas Kerren (Linnaeus University - Växjö, SE)
Stephen Kobourov (University of Arizona - Tucson, US)
Bongshin Lee (Microsoft Research - Redmond, US)
David Martin (Xerox Research Centre Europe - Grenoble, FR)
Andrea Mauri (Polytechnic University of Milan, IT)
Fintan McGee (Luxembourg Inst. of Science & Technology, LU)
Luana Micallef (HIIT - Helsinki, FI)
Sebastian Möller (TU Berlin, DE)
Babak Naderi (TU Berlin, DE)
Martin Nöllenburg (TU Wien, AT)
Helen C. Purchase (University of Glasgow, GB)
Judith Redi (TU Delft, NL)
Peter Rodgers (University of Kent, GB)
Dietmar Saupe (Universität Konstanz, DE)
Ognjen Scekic (TU Wien, AT)
Paolo Simonetto (Romano d'Ezzelino (VI), IT)
Tatiana von Landesberger (TU Darmstadt, DE)
Ina Wechsung (TU Berlin, DE)
Michael Wybrow (Monash University - Caulfield, AU)
Michelle X. Zhou (Juji Inc. - Saratoga, US)

Classification

computer graphics / computer vision
society / human-computer interaction
world wide web / internet

Keywords

Information Visualization
Data Visualization
Visualization
Graphics
Applied Perception
Human-Computer Interaction
Empirical Evaluations
Crowdsourcing
