Leveraging Multiple Data Sources with Hierarchical Models

Symposium
ROOM: RSCC, D10
SESSION NUMBER: 8130
 
Hierarchical models generally permit incorporating multiple data sets in a cohesive modeling framework. Integrated population models (IPMs) are a specific type of hierarchical model that uses demographic data and abundance data, and they have gained momentum among population ecologists. IPMs have now been used in many diverse applications, and more recently, several studies have evaluated their behavior and statistical properties. The first half of this symposium will feature a series of presentations that discuss wildlife case studies that have used IPMs, the added insight these models can provide, and simulation-based approaches that provide insight into the behavior of IPMs. Although IPMs are the most common models that leverage multiple data types, Bayesian hierarchical models more generally permit leveraging multiple data sets. The second half of the symposium will feature presentations on applications of hierarchical models other than IPMs that use multiple data sets, and on the resulting inference obtained from those models. Applications include, for example, state-of-the-art developments in spatio-temporal point process models, landscape genetics, and distance sampling.

1:10PM Estimating Juvenile Survival of Emperor Penguins Using Bayesian Integrated Population Modeling
  Fitsum Abadi, Christophe Barbraud, Olivier Gimenez
Survival of immature individuals is an important component of the population dynamics of seabird species. However, obtaining reliable estimates of immature survival remains challenging, as immatures stay at sea and are undetectable for several years. We developed a Bayesian integrated population model (IPM) to estimate the survival of juvenile emperor penguins (Aptenodytes forsteri) from long-term capture-recapture data of adults, population counts, and fecundity data collected at Dumont d’Urville, Antarctica. We also assessed the effects of environmental covariates on juvenile survival using our Bayesian IPM approach. Our findings showed that the posterior mean of juvenile survival was 0.401 (SD = 0.126). The southern annular mode during the rearing period was positively correlated with juvenile survival (β = 1.639; 95% credible interval: [0.489, 2.664]). Estimating the influence of environmental covariates on immature survival is an important step in understanding the impact of climate change on the population dynamics of emperor penguins and seabirds in general. Our study also demonstrated the potential of IPMs for estimating juvenile survival of emperor penguins without direct data on this parameter.
1:30PM Spatio-Temporal Dynamics of Sage-Grouse Populations in Nevada
  Cheyenne Acevedo, Perry Williams
Greater sage-grouse (Centrocercus urophasianus) are symbolic of western landscapes and an umbrella species indicative of sagebrush ecosystem health. The Nevada Department of Wildlife and other state and federal agencies regularly monitor greater sage-grouse leks, visiting each lek multiple times per season and counting the number of birds present. Lek counts have provided important information on relative abundance and population trends across the species' range. Lek count data contain spatial and temporal information critical to understanding greater sage-grouse ecology, as well as information required for examining and predicting future population trends. Statistical analyses capable of harnessing this information will improve sage-grouse conservation and management. Our objective was to leverage existing data and methods to develop a spatio-temporal statistical model of sage-grouse lek attendance across Nevada, and to fit our model to the lek data to make state-wide inference on sage-grouse lek dynamics. Specifically, we used point-referenced spatio-temporal models to examine lek dynamics. By employing these spatio-temporal statistical models for sage-grouse, we address spatial and temporal correlations among lek counts, reveal potential mechanistic drivers of lek dynamics, and account for multiple levels of statistical uncertainty. Our results indicate the importance of leveraging data to inform statistical uncertainty.
1:50PM Building Integrated Population Models with Harvest Data
  Todd Arnold
For birds and mammals, integrated population models (IPMs) are typically constructed from population counts and mark-recapture data. By contrast, integrated stock assessment models in fisheries are typically based on age-at-harvest and catch-per-unit-effort (CPUE) data. For harvested wildlife, there is a wealth of population information in harvest data, even for birds that can typically only be segregated into two age classes. When combined with modest amounts of tag-recovery data, IPMs can be constructed to estimate abundance, survival, fecundity, and age- and sex-composition of the population. By tagging and releasing animals during two or more seasons, survival and abundance can be estimated during more than one season, providing opportunities to assess harvest versus natural mortality, or breeding versus non-breeding season survival. I illustrate these concepts using harvest and band-recovery data from northern pintails (Anas acuta) and American woodcock (Scolopax minor), and explore the relative value of different quality data streams on total harvest, harvest composition, and tag recoveries to provide guidance for future monitoring efforts.
2:10PM Using an Integrated Population Model to Inform Sage Grouse Management
  Peter Coates, Brian Prochazka, Mark Ricca, Scott Gardner, Shawn Espinosa, David Delehanty
Identifying drivers responsible for temporal variation in population abundance (N) is often challenging, especially for species that experience natural population cycles over relatively short time periods. Understanding whether such variation is associated with natural environmental stochasticity that drives cycling or deterministic factors that can drive declines (e.g., anthropogenic disturbance) is crucial to inform conservation actions. Additionally, disentangling demographic processes helps further inform management efforts. We provide examples of using integrated population models (IPMs), which unify multiple sources of data (e.g., count surveys and demographic data), to estimate N and population rate of change (λ) across multiple spatial and temporal scales for greater sage-grouse (Centrocercus urophasianus; hereafter, sage-grouse) within the Great Basin. First, we used N-mixture models with repeated counts at lek sites (traditional breeding grounds) to improve accuracy of N within an IPM framework, which accounts for detection probability. Second, we investigated effects of climatic conditions on population dynamics at multiple spatial extents by extending IPMs to include climatic covariates. At relatively large extents, λ was driven by changes in precipitation (1-yr lag effect), but substantial variation existed across subpopulations. Third, we compared trends across spatial extents to identify when local λ decoupled from λ estimated across larger population aggregations. This comparison signaled where local conservation actions were most needed and helped avoid implementing actions where declines were caused by factors operating at larger scales, such as widespread drought. Lastly, we compared trends between a geographically isolated population of sage-grouse on the border of California and Nevada (Bi-State) and other sage-grouse populations within the Great Basin using 24 years of data. We discuss the perils of estimating trends in N and λ for cycling populations over relatively short time periods, especially if data are missing and not uniformly distributed through time. This information is preliminary and is subject to revision.
2:30PM Integrating Distance Sampling and Presence-Only Data to Estimate Abundance
  Matthew Farr, David Green, Kay Holekamp, Elise Zipkin
Many species are monitored by more than one program, leading to multiple but disparate data types, which can be combined to estimate abundance. The use of multiple sampling schemes can provide a wealth of information, but it also creates difficulties for analysis. Integrated distribution models (IDMs) were developed to combine multiple data types to estimate species abundance and relationships to landscape variables. These models combine presence-only and structured data (e.g., detection-nondetection data, count data) while accounting for sampling biases in each observation process. We extend the current IDM framework to include distance sampling data by describing species spatial abundance patterns with a thinned point process model. We combine distance sampling data with presence-only data by specifying a joint likelihood for the latent abundance process and connecting the data hierarchically with separate observation processes for each data type. We develop our IDM using a Bayesian framework and evaluate the data requirements and utility of our model through a simulation study. We then apply our model to a case study of black-backed jackals in the Masai Mara National Reserve, Kenya. A portion of the Reserve is exposed to anthropogenic disturbances due to passive management enforcement, which likely affects jackals. Distance sampling was conducted to estimate jackal abundance within a limited range of the Reserve, while presence-only data are available from opportunistic observations of jackals across most of the Reserve. Thus, structured distance sampling data on jackals are sparse and spatially limited, but unstructured presence-only data exist across a large spatial extent. By combining distance sampling and presence-only data, we were able to precisely estimate the spatial abundance patterns of black-backed jackals in the Reserve, including the effects of management enforcement, distance to Reserve border, lion density, NDVI, and distance to water sources.
2:50PM Refreshment Break
3:20PM Stitching the Pieces Together: Combining Multiple Sources of Data When Spatial Scale and Location Are Uncertain
  Trevor Hefley
Building a multi-source data model requires defining the support of the spatial process and desired scale of inference. For example, integrated species distribution models are used to predict abundance and occupancy with higher accuracy by combining multiple sources of data. A common theme among these approaches is that the location of the individual is conceptualized as a point in geographic space where environmental conditions and habitat characteristics are measured. Those location-specific conditions and characteristics are used to specify an intensity function, which enables statistical inference regarding species-habitat relationships and provides estimates of abundance and occupancy that can be mapped at any spatial resolution. In many studies, however, the exact locations of individuals are not available due to either aggregation or location error. For example, the exact locations of individuals are unrecoverable from traditional distance sampling data. Regardless of the mechanism that obscures an individual’s location, uncertainty limits the usefulness of the data and hinders efforts to combine multiple sources of data with variable location accuracy and spatial resolution. I will highlight recent work that demonstrates complications resulting from combining multiple sources of data and outline a general hierarchical modeling approach for implementing commonly used statistical models that account for location uncertainty and variable spatial resolution. This will enable multiple sources of data to be combined regardless of the spatial scale or location accuracy with which the data were collected.
3:40PM Integrated Population Models: Model Assumptions and Inference
  Thomas Riecke, Perry Williams, Tessa Behnke, Dan Gibson, Alan Leach, Benjamin Sedinger, Phillip Street, James Sedinger
Integrated population models (hereafter, IPMs) have become increasingly popular for the modeling of populations, as investigators seek to combine survey and demographic data to understand processes governing population dynamics. These models are particularly useful for identifying and exploring knowledge gaps within life histories, because they allow investigators to estimate biologically meaningful parameters, such as immigration or reproduction, that were previously unidentifiable without additional data. As IPMs have been developed relatively recently, there is much to learn about model behavior. The behavior of parameters, such as estimates near boundaries, and the consequences of varying degrees of dependency among datasets have been explored. However, the reliability of parameter estimates remains underexamined, particularly when models include parameters that are not identifiable from one data source but are indirectly identifiable from multiple datasets and a presumed model structure, such as the estimation of immigration using capture-recapture, fecundity, and count data, combined with a life-history model. We simulated two scenarios that might induce error into survival estimates: marker-induced bias in the capture-mark-recapture data, and heterogeneity in the mortality process. We subsequently fit capture-mark-recapture, state-space, and fecundity models, as well as IPMs that estimated additional parameters. Simulation results suggested that when model assumptions are violated, estimation of additional, previously unidentifiable parameters using IPMs may be extremely sensitive to these violations. For example, when annual marker loss was simulated, estimates of survival rates were low, and estimates of immigration rate from an IPM were high. Our results have important implications for biological inference when using IPMs. Specifically, when multiple datasets were used to identify additional parameters in integrated modeling frameworks, the posterior distributions of those parameters directly reflected the effects of the violations of model assumptions. We suggest that investigators interpret posterior distributions of these parameters as a combination of biological process and systematic error.
4:00PM Multi-Scale Population Assessment of White Tailed Ptarmigan
  Phillip Street
Often the primary interest of wildlife managers is the current status of wildlife populations. One challenge for researchers addressing this question is defining what constitutes a population and how best to collect data on it. Colorado’s high concentration of alpine habitats supports a subspecies of white-tailed ptarmigan that is genetically and geographically distinct from the rest of the species, but delineating this population into smaller spatial scales is not as straightforward. We split this population into two subpopulations that showed little genetic mixing and were spatially associated with two major mountain ranges. At the finest spatial scale, we split these two subpopulations into 6 survey sites delineated by drainage basins separated by high ridges. These alpine basins are important habitat for breeding ptarmigan. We collected mark-recapture data at each of these sites to estimate the number of breeding birds associated with a basin. Concomitantly, we independently collected radio telemetry data to inform demographic rates and movement on and off our sites. By combining these data in a hierarchical framework, we were able to evaluate trends in abundance as well as the processes driving them. We observed strong site fidelity from year to year. At the population scale, there was little evidence of an increase or decrease during our study. At the subpopulation scale, we again found evidence for stability within these two geographic regions. At the site level, five sites were relatively stable, while at one site we observed declines in the number of breeding birds. We believe this observed decline to be the result of localized increases in human recreation. Given this observation, we suggest managers develop practices to monitor and address negative impacts of increased human recreation.
4:20PM Combining Multiple Data Sets to Improve Inferences on the Abundance and Dynamics of Populations: Overview and Synthesis
  Elise Zipkin
Data integration is a statistical approach that simultaneously analyzes multiple data sources within a single, unified modeling framework. Ecological research increasingly employs data integration techniques to expand the spatiotemporal scope of studies, increase precision of parameter estimates, account for multiple sources of uncertainty, and produce reliable predictions about future ecosystem states and processes. In this talk, I will provide an overview of current research on data integration to estimate the abundance and dynamics of animal populations, including background on data integration approaches and why they are useful in ecological research. I will highlight several studies that have used data integration to understand factors influencing species trends, estimate parameters for which data are unavailable, determine how processes change through space and time, and predict future vulnerabilities. I will also review a number of ongoing challenges to the integration of multiple data sources, including spatio-temporal mismatch of data, nonstationarity of parameter estimates, and discrepancies in the quantity and quality of data available from various sources. I will conclude by discussing integration opportunities and potential avenues of future research. It can be difficult to determine how multiple, disparate sources of data can contribute to the ecological understanding of populations and their habitats. Despite this, data integration techniques have expanded the scope of questions that can be asked about the dynamics of populations. Thus, the development of integrated analyses is likely to continue and grow. Further use and easier implementation have the potential to allow data integration approaches to greatly enhance contemporary ecological research.
4:40PM Integrating Data Sources across the Annual Cycle to Understand Population Dynamics of the Eastern Monarch Butterfly
  Erin Zylstra, Leslie Ries, Naresh Neupane, Sarah Saunders, Elise Zipkin
Monarch butterflies (Danaus plexippus) in eastern North America are well known for their multi-generational, long-distance migration between wintering locations in Mexico and breeding locations in the northern U.S. and southern Canada. The overwintering population has declined severely since monitoring began in 1993; the causes of this decline have been the subject of substantial debate. Understanding the relative contributions of abiotic factors across the annual cycle to population declines is critical to designing effective conservation strategies. To assess the relative importance of each threat, we used multiple datasets on monarch abundance collected throughout the annual cycle and modeled changes in population size as a function of environmental and anthropogenic factors in a Bayesian hierarchical framework. First, we integrated monarch data from several volunteer-based monitoring programs across the summer breeding grounds and modeled counts as a function of land use, spring and summer climate, and population size during the prior overwintering period. We incorporated random effects to account for differences in sampling methodologies and effort among data sources. We used weighted means of predicted counts from the summer component of the model as an annual index of peak summer abundance. We then modeled winter population sizes in 19 colonies relative to the summer population index, nectar availability along the fall migratory corridor, and local forest cover. Our results indicate a strong link between butterfly abundance in summer and size of overwintering colonies, but also suggest that population dynamics are governed by multiple factors operating in different seasons and at different spatial scales. Our study highlights the importance of modeling population dynamics of migratory species throughout the annual cycle and will allow for spatially explicit predictions of how management actions are likely to affect future population dynamics of monarchs in eastern North America.

 
Organizers: Perry Williams
 
Supported by: TWS Biometrics Working Group

Location: Reno-Sparks CC Date: October 2, 2019 Time: 1:10 pm - 5:00 pm