Assistant Professor of Cyber Studies Weiping Pei is on a quest to enhance the security, privacy, and efficiency of crowdsourcing systems. In support of this research, Pei was recently awarded a National Science Foundation grant.
“Crowdsourcing is an increasingly popular way to gather information from potentially vast numbers of people,” explained Pei. “These systems typically have two groups of users: requesters and crowd workers – or workers, for short. Requesters post tasks in crowdsourcing systems, such as answering a survey or labeling a set of data to train an AI (artificial intelligence) program. Workers then complete those tasks by submitting responses or annotations, for which they get paid.”
Researchers in several disciplines, including psychology and sociology, use crowdsourcing systems such as Amazon Mechanical Turk and Prolific to conduct human and behavioral studies. Outside academia, businesses and nonprofit organizations deploy these systems to enrich their understanding of customers, improve machine-learning/AI performance, and accelerate product and service development. One recent example is the Be My Eyes app, which delivers AI-powered aid to people with visual impairments.
Pei notes that, despite its growing use, crowdsourcing is beset by three major problems:
- Inadequate data quality: For example, workers can be careless or even malicious when providing responses.
- Privacy violations: Recent experience has shown that leakage of workers’ and organizations’ private information is pervasive, but few strategies are in place to prevent it.
- Inefficient task completion: Workers lack the support they need to identify and navigate assigned tasks efficiently, which leads to task abandonment and a large amount of unpaid labor.
To tackle the first issue, Pei and her team of one doctoral student and one undergraduate student plan to develop a “subtask-aware” quality control framework that incorporates the intrinsic characteristics of crowdsourcing subtasks. One major benefit of this approach, she underscored, will be stronger system defenses against attacks.
To detect and prevent privacy violations, Pei proposes to develop a client-side browser extension that will safeguard workers’ privacy. “In addition, because tasks posted by requesters can contain sensitive information from other people, we plan to develop a server-side model that’s able to help requesters identify tasks associated with privacy disclosure,” she said.
Finally, Pei intends to develop approaches that will help workers identify and prioritize tasks efficiently. She foresees implementing these approaches as a client-side browser extension capable of analyzing tasks and pinpointing the ones that best match workers’ preferences.
In mid-October, Pei presented her team’s findings concerning third-party privacy disclosure at the ACM Conference on Computer-Supported Cooperative Work, a leading gathering of experts in the field. Now back in Tulsa, she said, “we are focused on gauging the feasibility of using human-AI collaboration as a quality control in crowdsourcing. Following that, we will begin research on ways to improve those systems’ efficiency.” Ultimately, Pei intends to draw these threads together to integrate the proposed models, techniques, and mechanisms into a client-side browser extension for workers and a server-side module for requesters and crowdsourcing providers.