Model-based Approaches for Addressing False Positive Detection Errors in Fish and Wildlife Studies

Most fish and wildlife surveys are subject to imperfect detection, including both false negative errors (not detecting a species when it is present) and false positive errors (detecting a species when it is not present). Models that account for false negative detections have been widely adopted, but most of these assume that false positives do not occur. However, false positive errors are likely to occur in data collected with a variety of methods, including visual, acoustic, or molecular tools, and in data collected by citizen scientists, non‐experts, and scientists of all experience levels. Monte Carlo simulations have demonstrated that if models fail to account for false positive errors, inference will be biased. Because false positive detections are unlikely to be eliminated through study design alone, it is important to develop and employ methods that account for both types of imperfect detection. The symposium will focus on methodological advances in accounting for false positive detections in fish and wildlife studies. Talks will include presentations of novel methodological advances and demonstrations of applications for a wide range of species and data types. The talks will cover estimation of occupancy, abundance, and survival with applications to single species and multi-species communities.

1:10PM An Introduction to the False Positive Detection Problem and Model-Based Approaches to Deal with It
  Paige Ferguson, Brittany Mosher
Methods that account for false negative detections (not detecting a species when it is present) are well established. However, these methods typically assume data do not have false positive detections (saying a species is present when it is not). False positive detection errors can occur when species or individuals are misidentified due to lack of experience, imperfect detection methods, or when interfering sights, sounds, or signals are present. Evidence is increasing that false positive errors are pervasive in a variety of studies, and simulations have shown that failing to account for these errors results in estimation biases. Therefore, using design- or model-based approaches to address false positive detection errors in fish and wildlife studies is critical. In this talk, we introduce the problem of false positive detection errors and trace the development of models that account for these errors. We describe occupancy models that account for false positive errors through model assumptions or confirmatory datasets and also introduce models developed to estimate abundance or survival. Finally, we provide an overview of the contributions the subsequent talks in the symposium will make, ranging from study design considerations to model applications for a variety of data types. This symposium will expose practitioners to the variety of approaches that now exist to address false positive detection errors, in addition to false negative detection errors, in fish and wildlife studies.
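The difference between a standard occupancy model and one that allows false positives can be sketched with a single-site likelihood. The block below is a minimal illustration, not the formulation used by any particular talk: the notation `psi` (occupancy probability), `p11` (per-survey detection probability at an occupied site), and `p10` (per-survey false positive probability at an unoccupied site) is assumed here, as is the simplification that surveys are independent and share constant probabilities.

```python
from math import comb


def binom_pmf(y, k, p):
    """Probability of y detections in k independent surveys, each with probability p."""
    return comb(k, y) * p**y * (1 - p) ** (k - y)


def site_likelihood(y, k, psi, p11, p10=0.0):
    """Likelihood of y detections in k surveys at one site.

    The site is a mixture of two states: occupied (detections occur with
    probability p11) and unoccupied (detections are false positives that
    occur with probability p10). Setting p10 = 0 gives the standard
    occupancy likelihood that assumes no false positives.
    """
    occupied = psi * binom_pmf(y, k, p11)
    unoccupied = (1 - psi) * binom_pmf(y, k, p10)
    return occupied + unoccupied


def prob_occupied(y, k, psi, p11, p10=0.0):
    """Probability the site is occupied, given its y detections in k surveys."""
    occupied = psi * binom_pmf(y, k, p11)
    return occupied / site_likelihood(y, k, psi, p11, p10)


# One detection in five surveys, under both assumptions (illustrative values).
no_fp = prob_occupied(1, 5, psi=0.3, p11=0.5, p10=0.0)
with_fp = prob_occupied(1, 5, psi=0.3, p11=0.5, p10=0.05)
```

Under the no-false-positive assumption a single detection makes occupancy certain, whereas allowing even a small `p10` leaves substantial doubt about a site with only one detection. This is the mechanism by which ignoring false positives inflates occupancy estimates.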
1:30PM Designing False-Positive Occupancy Surveys
  Matthew Clement
False-positive occupancy models can improve parameter estimates when false-positive detections occur during presence-absence surveys. However, false positives greatly complicate study design because popular estimators combine different types of survey data to improve estimates of occupancy. Accordingly, there is a tradeoff between the number of sample units surveyed, and the number and type of surveys at each sample unit. To assist with study design, I identified survey designs that minimized the mean square error of the estimate of occupancy, across a range of scenarios. I considered a false-positive estimator that uses one survey method and two observation states and a false-positive estimator that uses two survey methods. For each approach, I used numerical methods to identify optimal survey designs 1) when model assumptions were met and parameter values were correctly anticipated, 2) when parameter values were not correctly anticipated, and 3) when the assumption of no un-modeled detection heterogeneity was violated. Under the approach with two observation states, false-positive detections increased the number of recommended surveys, relative to standard occupancy models. If parameter values could not be anticipated, pessimism about detection probabilities avoided poor designs. Detection heterogeneity could require more or fewer repeat surveys, depending on parameter values. If model assumptions were met, the approach with two survey methods was inefficient. However, with poor anticipation of parameter values, with detection heterogeneity, or with removal sampling schemes, combining two survey methods could improve occupancy estimates. The guidance provided here can be used to design efficient monitoring programs and studies of species occurrence, species distribution, or habitat selection, when false positives occur during surveys.
1:50PM Accounting for Incomplete Detection and Species Misidentification in Aerial Surveys of Ice-Associated Seals: A Confused Approach
  Paul Conn
Ice-associated seals in the Arctic are of conservation concern because of declines in spring-summer sea ice extent, but are difficult to study given their remoteness and expansive range. In 2012 and 2013, we used a combination of thermal imaging and digital photography to study the abundance and distribution of 4 species of seals (bearded seals, Erignathus barbatus; ribbon seals, Histriophoca fasciata; ringed seals, Phoca hispida; and spotted seals, Phoca largha) from aircraft in the eastern (U.S.) Bering Sea. Species identification was performed by trained observers, but confidence varied depending on photo quality and seal position; observers also sometimes came to different conclusions about species identity. In order to account for false positive and false negative errors, we used a double observer experiment to estimate a confusion matrix, and embedded this observation process within a larger species distribution model for seal counts that also accounted for incomplete detection. Estimates of seal abundance suggest a possible decline of spotted and ribbon seals between 2012 and 2013 surveys.
2:10PM Modeling Species Classification Errors When Estimating Occupancy and Relative Activity Rates with Acoustic Data
  Wilson Wright, Kathryn Irvine, Emily Almberg, Andrea Litt
Monitoring wildlife communities provides data for making informed conservation and management decisions that will affect multiple species. Autonomous recording units (ARUs) are used to gather community data for a variety of taxa, but current statistical approaches for analyzing these data do not explicitly model the species classification process or fully utilize the information available in counts of call recordings. In ARU data, false positive detections can result specifically from species misclassification but erroneous detections are generally attributed to an omnibus source in analyses. Motivated by acoustic data for bats, we developed a model to account for these nuances when analyzing ARU data. Our model is related to other false positive occupancy models, but these alternative approaches differ because they analyze binary observations instead of counts and/or are unable to explicitly incorporate data from multiple species. The species classification process is incorporated in the observation component of our model using the multinomial distribution and describes how individual call recordings may be misclassified as another species. We applied our model to acoustic data for eight bat species in Montana. This analysis illustrates the flexibility in our model while highlighting the assumptions and data needed to use this approach. Specifically, we describe options and considerations for informing the species classification probabilities. We used simulations to compare our model to other false positive occupancy models for an example scenario where ARU data are collected for two species. Single-species models resulted in biased estimates of occupancy and analyzing binary observations ignored available information on relative activity rates. Directly modeling the species classification process allows for reliable ecological inferences for both occupancy and relative activity rates using community ARU data. 
Our statistical framework helps address the challenges posed by acoustic data, allowing ecologists to better utilize this technology when monitoring wildlife communities.
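The idea of modeling species classification at the level of individual call recordings can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' model: the function name, the two-species setup, and all numeric values are hypothetical, and only expected counts are computed rather than a full multinomial likelihood.

```python
def expected_classified_counts(present, call_rate, classify):
    """Expected number of recordings assigned to each species label at one site.

    present[k]     : 1 if species k truly occupies the site, else 0
    call_rate[k]   : expected number of recorded calls from species k when present
    classify[k][j] : probability a call from species k is labeled as species j
                     (each row sums to 1)
    """
    n = len(present)
    expected = [0.0] * n
    for k in range(n):
        true_calls = present[k] * call_rate[k]
        for j in range(n):
            expected[j] += true_calls * classify[k][j]
    return expected


# Hypothetical two-species example: species 0 is present, species 1 is absent.
present = [1, 0]
call_rate = [20.0, 8.0]
classify = [
    [0.90, 0.10],  # calls from species 0: mostly labeled correctly
    [0.05, 0.95],  # calls from species 1
]
counts = expected_classified_counts(present, call_rate, classify)
```

In this example roughly two recordings are expected to carry the label of the absent species, all of them misclassified calls from the species that is present. Treating each species separately would read those recordings as evidence of occupancy, which is why the classification step is modeled jointly across species.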
2:30PM Robust Inference on Large-Scale Species Habitat Use with Interview Data Prone to False Positives
  Lisanne Petracca, Jacqueline Frair
Evaluating range-wide habitat use by a target species requires information on species occurrence over broad geographic regions, a process made difficult by species rarity, large spatiotemporal sampling domains, and imperfect detection. We address these challenges in an assessment of habitat use for jaguars (Panthera onca) outside protected areas in Central America. Occurrence records were acquired within 12 putative corridors using interviews with knowledgeable corridor residents. We developed a Bayesian hierarchical occupancy model to gain robust inference, allowing for heterogeneity introduced in the sampling process over space and time, using records of jaguar occurrence prone to false positives and false negatives. Probability of false detection of jaguars increased with the number of interviews conducted per unit (from 5.42% to 7.74% given <4 and ≥4 observers per unit). True probability of detection (mean=0.58) increased with the number of days interviewees spent in a survey unit per year. Failing to account for false positives biased predicted habitat use high (~1.8x), especially where occurrence records were sparse. Probability of site use by jaguars increased with greater forest cover, prey richness, and distance from human settlements, and decreased with greater agricultural cover, elevation, and distance from protected areas. Site use probabilities averaged 0.15-0.97 by corridor, providing relatively fine-scale resolution of predicted jaguar occurrence consistent with known patterns of jaguar gene flow across Central America. Model validation, accounting for both false positives and negatives in the observation process, indicated moderate correspondence between model-predicted observations and actual observations for withheld data (0.65, 95% CRI 0.59–0.71). These results demonstrate that reliable predictions can be achieved despite the complexity of large-scale, interview-based analyses of species occurrence. 
Our approach is applicable to any wide-ranging and readily identifiable species and has particular utility for rare species in human-dominated landscapes where traditional survey techniques (e.g., camera traps) may be impractical.
2:50PM Refreshment Break
3:20PM Addressing False-Positive Error in Citizen-Science: A Case Study Using eBird Data and False-Positive Occupancy Models
  Viviana Ruiz-Gutierrez
The increasing popularity of occupancy models as a tool for understanding changes in patterns of species abundance and distribution has prompted model development to address significant sources of error, such as species misidentification. However, the types of information that are required to address false-positive errors are often hard to collect. Most recent applications require data sources that are either known or assumed to not be subject to false-positive errors, which can require additional validation steps after data collection. Another approach is to get these error rates from independent data sources through experimental trials in the field or via an online quiz or test. Here, we apply a recently developed occupancy model that can account for heterogeneity in false-positive and false-negative errors to draw inference on habitat relationships of bird species in the Northeastern US. We used independent, auxiliary data on false-positive errors collected via an online bird identification quiz that used both auditory trials and verified images of birds. To date, quiz results have generated false-positive error rates for 584 bird species worldwide, ranging from p0=0.008 for black swan (Cygnus atratus) to p0=0.129 for eastern wood-pewee (Contopus virens), with a mean value of p0=0.048. On average, we have about 27,000 independent trials for each species, which has been shown to meet requirements on the quantity of auxiliary data needed for improved inference. We combined auxiliary data on false-positive errors from the eBird Quiz with citizen science data in eBird collected in the Northeastern US. The ability to correct for false-positive errors allowed us to make more accurate inferences on the habitat associations and related dynamics of forest-associated bird species.
3:40PM When Occupancy Isn’t the (only) End-Goal: Generalized Confirmation Models for Dealing with False Positives in Detection-Nondetection Data
  John Clare, Benjamin Zuckerberg, Philip Townsend
Some level of false positive error is common within detection-nondetection data, and several model-based approaches to account for false positive error when estimating species distribution have been developed. Detection-nondetection data is also commonly used to estimate other state variables like species abundance, density, or the timing of migratory arrival or emergence from hibernation; false positive error presumably also biases these estimators, but model extension has been slow. I reconceptualize the observation confirmation model described by Chambert et al. (2015) to describe a flexible false positive extension to a variety of site-structured estimators using detection-nondetection data. I use simulation to compare the performance of the Royle-Nichols model, its spatially-explicit extension, and a phenological occupancy model to the extended models across varying levels of false positive error and confirmation effort. Consistent with previous simulation studies focused on occupancy models, results confirm that estimators ignoring false positive error are biased when false positives are randomly present at low prevalence (e.g., 5% of detections are false positives), although bias was sensitive to true and false positive covariance and less pronounced when true and false positive detection probability were more likely to occur at the same place and time. The proposed extensions exhibited less bias but more estimate uncertainty under all scenarios considered. Furthermore, results suggest that sub-models for or estimates of false positive error may be transferable across distinct model types, as there was little difference between estimates produced by Royle-Nichols or phenological models incorporating confirmation data within the likelihood and models using an informed prior for false positive parameters derived from estimates from an occupancy model. 
On the whole, results indicate that although false positive error is problematic when ignored for many estimators reliant upon detection-nondetection data, model-based solutions may also be easier to conceptualize and quicker to fit than previously recognized.
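To see why false positives matter for abundance-type estimators built on detection-nondetection data, consider the Royle-Nichols link between local abundance and per-survey detection probability, p = 1 - (1 - r)^N. The sketch below adds a false positive process in the simplest possible way, as an independent per-survey probability `fp`. That independence assumption and the parameter names are hypothetical choices made for illustration; they are not the confirmation-based extension described in the abstract.

```python
def rn_detection_prob(n_individuals, r):
    """Royle-Nichols per-survey detection probability for a site with
    n_individuals animals, each detected independently with probability r."""
    return 1.0 - (1.0 - r) ** n_individuals


def observed_detection_prob(n_individuals, r, fp):
    """Per-survey probability that a detection is recorded when a false
    positive can also occur, independently, with probability fp (assumed)."""
    p_true = rn_detection_prob(n_individuals, r)
    return 1.0 - (1.0 - p_true) * (1.0 - fp)


# At an empty site every recorded detection is a false positive.
empty_site = observed_detection_prob(0, r=0.2, fp=0.05)
# At a site with three animals the false positive process adds comparatively little.
busy_site_true = rn_detection_prob(3, r=0.2)
busy_site_observed = observed_detection_prob(3, r=0.2, fp=0.05)
```

Because an estimator that ignores `fp` can only explain detections through abundance, detections at empty or sparsely used sites are attributed to animals that are not there, and abundance is overestimated.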
4:00PM Dealing with Misidentification Errors When Two Sister Species Co-Occur: Applications to Occupancy and Abundance Estimation
  Thierry Chambert
A common goal of many wildlife-monitoring studies is to estimate species presence and/or local abundance at various sites of suitable habitat. To obtain unbiased estimates, it is important to account for the imperfect detectability of free-ranging animals, but another issue, which has received much less attention, concerns the occurrence of false detections. This is especially a concern when two closely related species live in sympatry. One species can often be mistaken for the other, thus creating false positives that will cause important biases in species-specific estimates of site occupancy and local abundance. In this talk, I will present some recent methodological developments that allow accounting for species misidentification in Occupancy and N-mixture models, two approaches that rely on unmarked animal data. Although they each focus on a different state variable (occupancy vs. local abundance), the framework used to accommodate species misidentification has interesting similarities. I will illustrate the N-mixture model extension using a case study on two sister species of freshwater fish that co-occur in wetland complexes of Northern Manitoba, Canada. For the occupancy model, I will use a study on two species of terrestrial salamanders that live in sympatry in the Blue Ridge Mountains of Virginia, USA. I will discuss the costs (added model complexity, data requirements) and benefits (improved estimation accuracy) of these new developments and provide some guidance for dealing with the risks of species misidentification in these types of studies.
4:20PM Counting on Data Challenges: Estimating Species Abundance without Consistent Species Identification
  Alison Johnston
Many forms of ecological data collection have individuals that cannot be identified to species and these individuals are often assumed to be ‘undetected’. However, this issue is more common and problematic with technology-based passive ecological monitoring. In order to take advantage of the huge power of passive monitoring, it is important to consider unidentified species. Here we use the example of monitoring marine birds with aerial photos, in which 75% of birds in photos were not identified to species. We combined these aerial photos with data from boat surveys, which had high species identification. We estimated species abundance from the aerial photos, using the boat data to estimate the likely species in the aerial photos. Combining both boat data and aerial photos enabled us to produce population estimates for all species, even those with low species identification rates. This novel approach combined the strengths of aerial photos and boat surveys, propagating the uncertainties through each stage. Final species abundance distributions and population estimates accounted for several sources of uncertainty, including uncertainty in species identification. This approach has potential to be used with other forms of passive monitoring with uncertain species identification.
4:40PM Encounter Type and Frequency Determine the Effects of Individual Misidentification on Survival Estimation
  Anna Tucker
Wildlife ecologists have many methods for marking individual animals to estimate demographic parameters including survival, movement, or reproductive output. Many mark-recapture methods rely on noninvasive encounters of individuals for all or some of the capture events. Examples of these encounter types include camera trapping, noninvasive genetic sampling, and resighting of field-readable tags. These types of noninvasive encounters may be more prone to identification errors than physical captures, due to field conditions or limited time to confirm ID. Such errors could be problematic for analysis of these data, but the effects of these errors on estimation of survival depend on both the type and frequency of encounter. For encounter types in which the potential for misidentification occurs on the first encounter (e.g. noninvasive genetic sampling), misidentification leads to a constant negative bias in survival estimates. For encounter types in which the potential for misidentification occurs after the first capture (e.g. mark-resight studies), misidentification leads to positively biased survival estimates and apparent negative trends in survival over time. The frequency of encounters (which is related to detection probability) during the sampling period determines the magnitude of the bias due to these errors. If detection probability is high and individuals are encountered several times within the sampling period, the opportunity for errors to enter the dataset is greater. Here I discuss these different capture-recapture encounter types and use simulations to illustrate the biases we expect to see as a result of misidentification. I also present options for filtering or conditioning data before analysis that could mitigate these biases and discuss the trade-offs involved in implementing them.
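The two error mechanisms described above can be made concrete with toy encounter histories. The sketch below is an assumed, simplified illustration rather than the simulation used in the talk: the individuals, function names, and four-occasion histories are hypothetical, and no survival model is fitted.

```python
def copy_histories(histories):
    """Return a copy so the true histories are left unmodified."""
    return {ind: list(h) for ind, h in histories.items()}


def misread_at_resight(histories, seen_id, recorded_id, occasion):
    """Mark-resight error: the tag on seen_id is read as the tag of recorded_id."""
    out = copy_histories(histories)
    out[seen_id][occasion] = 0      # the real sighting is lost
    out[recorded_id][occasion] = 1  # another marked animal is credited with it
    return out


def misread_creates_ghost(histories, true_id, ghost_id, occasion):
    """Genetic-sampling error: a sample from true_id yields a genotype that
    matches no real animal, creating a 'ghost' individual."""
    out = copy_histories(histories)
    out[true_id][occasion] = 0
    ghost = [0] * len(out[true_id])
    ghost[occasion] = 1
    out[ghost_id] = ghost
    return out


def last_seen(history):
    """Index of the final occasion with a detection, or None if never detected."""
    seen = [t for t, y in enumerate(history) if y == 1]
    return seen[-1] if seen else None


# Hypothetical truth: A dies after the second occasion, B survives throughout.
true_histories = {"A": [1, 1, 0, 0], "B": [1, 1, 1, 1]}
resight_error = misread_at_resight(true_histories, "B", "A", occasion=3)
ghost_error = misread_creates_ghost(true_histories, "B", "ghost", occasion=1)
```

In the resight case, animal A now appears alive on the last occasion even though it died earlier, which pushes survival estimates upward. In the ghost case, a new individual is encountered once and never again, which looks like a death and pushes survival estimates downward.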

Organizers: Paige Ferguson, Brittany Mosher, David Miller
Supported by: TWS Biometric Working Group

Location: Reno-Sparks CC
Date: October 3, 2019
Time: 1:10 pm - 5:00 pm