A generalized approach for producing, quantifying, and validating citizen science data from wildlife images. Issue 3 (25th April 2016)
- Record Type:
- Journal Article
- Title:
- A generalized approach for producing, quantifying, and validating citizen science data from wildlife images. Issue 3 (25th April 2016)
- Main Title:
- A generalized approach for producing, quantifying, and validating citizen science data from wildlife images
- Authors:
- Swanson, Alexandra
Kosmala, Margaret
Lintott, Chris
Packer, Craig
- Abstract:
- Citizen science has the potential to expand the scope and scale of research in ecology and conservation, but many professional researchers remain skeptical of data produced by nonexperts. We devised an approach for producing accurate, reliable data from untrained, nonexpert volunteers. On the citizen science website www.snapshotserengeti.org, more than 28,000 volunteers classified 1.51 million images taken in a large-scale camera-trap survey in Serengeti National Park, Tanzania. Each image was circulated to, on average, 27 volunteers, and their classifications were aggregated using a simple plurality algorithm. We validated the aggregated answers against a data set of 3829 images verified by experts and, to measure confidence that an aggregated answer was correct, calculated 3 certainty metrics: level of agreement among classifications (evenness), fraction of classifications supporting the aggregated answer (fraction support), and fraction of classifiers who reported "nothing here" for an image that was ultimately classified as containing an animal (fraction blank). Overall, aggregated volunteer answers agreed with the expert-verified data on 98% of images, but accuracy differed by species commonness such that rare species had higher rates of false positives and false negatives. Easily calculated analysis of variance and post-hoc Tukey tests indicated that the certainty metrics were significant indicators of whether each image was correctly classified or classifiable. Thus, the certainty metrics can be used to identify images for expert review. Bootstrapping analyses further indicated that 90% of images were correctly classified with just 5 volunteers per image. Species classifications based on the plurality vote of multiple citizen scientists can provide a reliable foundation for large-scale monitoring of African wildlife.
- Is Part Of:
- Conservation biology. Volume 30:Issue 3(2016:Jun.)
- Journal:
- Conservation biology
- Issue:
- Volume 30:Issue 3(2016:Jun.)
- Issue Display:
- Volume 30, Issue 3 (2016)
- Year:
- 2016
- Volume:
- 30
- Issue:
- 3
- Issue Sort Value:
- 2016-0030-0003-0000
- Page Start:
- 520
- Page End:
- 531
- Publication Date:
- 2016-04-25
- Subjects:
- big data -- camera traps -- crowdsourcing -- data aggregation -- data validation -- image processing -- Snapshot Serengeti -- Zooniverse -- cámaras trampa -- conjunto de datos -- crowdsourcing -- datos grandes -- procesamiento de imágenes -- Snapshot Serengeti -- validación de datos -- Zooniverse
Conservation biology -- Periodicals
333.9516
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1523-1739
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/cobi.12695
- Languages:
- English
- ISSNs:
- 0888-8892
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3417.999000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 1117.xml
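
Note on the abstract: the plurality aggregation and the three certainty metrics it names can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `aggregate`, the `"nothing here"` label constant, the tie-breaking rule (first label seen wins), and the use of Pielou's evenness index for "evenness" are all assumptions made for illustration; the abstract itself only says the metric measures the level of agreement among classifications.

```python
import math
from collections import Counter

BLANK = "nothing here"  # hypothetical label for a "no animal" vote


def aggregate(votes):
    """Plurality answer and three certainty metrics for one image.

    Assumptions (not taken from the record): the plurality is computed
    over all votes, blank votes included; ties go to the label seen
    first; evenness is Pielou's index (Shannon entropy divided by the
    log of the number of distinct labels), set to 0.0 when all
    volunteers agree.
    """
    if not votes:
        raise ValueError("need at least one classification")

    n = len(votes)
    counts = Counter(votes)

    # Plurality: the label with the most votes wins, no majority needed.
    answer, top_count = counts.most_common(1)[0]

    # Fraction of classifications supporting the aggregated answer.
    fraction_support = top_count / n

    # Fraction of volunteers who reported a blank image.
    fraction_blank = counts.get(BLANK, 0) / n

    # Evenness: 0.0 means full agreement, 1.0 means votes are spread
    # equally over every label that received a vote.
    if len(counts) == 1:
        evenness = 0.0
    else:
        entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
        evenness = entropy / math.log(len(counts))

    return {
        "answer": answer,
        "evenness": evenness,
        "fraction_support": fraction_support,
        "fraction_blank": fraction_blank,
    }


if __name__ == "__main__":
    example = ["zebra"] * 7 + ["wildebeest"] * 2 + [BLANK]
    print(aggregate(example))
```

Under these assumptions, an image with high evenness, low fraction support, or high fraction blank would be a candidate for expert review, which is the use the abstract describes for the metrics.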