Quest for the Gold Par: Minimizing the Number of Gold Questions to distinguish between the Good and the Bad

Title: Quest for the Gold Par: Minimizing the Number of Gold Questions to distinguish between the Good and the Bad
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Maarry, K. E., and W.-T. Balke
Conference Name: 10th ACM Conference on Web Science
Publisher: ACM
Conference Location: Amsterdam, The Netherlands
Abstract

The benefits of crowdsourcing for data science have furthered its widespread use over the past decade. Yet fraudulent workers undermine the emerging crowdsourcing economy: requestors face the choice of either risking low-quality results or paying extra for quality safeguards such as gold questions or majority voting. Obviously, the more safeguards injected into the workload, the lower the risks posed by fraudulent workers, yet the higher the costs. So, how many of them are really needed? Is there such a ‘one size fits all’ number? The aim of this paper is to identify custom-tailored numbers of gold questions per worker for managing the cost/quality balance. Our new method follows real-life experience: the more we know about a worker before assigning a task, the clearer our belief or disbelief in this worker’s reliability gets. Employing probabilistic models, namely Bayesian belief networks and certainty factor models, our method creates worker profiles reflecting different a-priori belief values, and we prove that the actual number of gold questions per worker can indeed be assessed. Our evaluation on real-world crowdsourcing datasets demonstrates our method's efficiency in saving money while maintaining high-quality results. Moreover, our method performs well despite the quite limited information known about workers on today's platforms.
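To illustrate the kind of certainty-factor reasoning the abstract refers to, the sketch below combines an a-priori belief in a worker's reliability with evidence from answered gold questions using the classic MYCIN-style combination rule. This is only an assumed illustration: the evidence weights `cf_correct` and `cf_wrong`, and the function names, are hypothetical choices for the example, not values or APIs taken from the paper.

```python
def combine_cf(cf1, cf2):
    """MYCIN-style combination of two certainty factors, each in [-1, 1].

    Positive values express belief, negative values disbelief; the result
    stays in [-1, 1] and is order-independent for same-sign evidence.
    """
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))


def update_worker_cf(prior_cf, gold_outcomes, cf_correct=0.3, cf_wrong=-0.4):
    """Fold gold-question outcomes (True = answered correctly) into a belief.

    prior_cf encodes the a-priori (dis)belief from the worker's profile;
    cf_correct and cf_wrong are illustrative evidence weights.
    """
    cf = prior_cf
    for correct in gold_outcomes:
        cf = combine_cf(cf, cf_correct if correct else cf_wrong)
    return cf
```

Under such a scheme, a worker whose combined certainty factor crosses a chosen threshold early would need fewer further gold questions, which is the cost/quality trade-off the paper targets.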

Attachment: WebSci-Quest for the Gold Par.pdf (1.73 MB)