New spatial abundance models inform distribution, population, and trends for forest birds in Canada

Oct 8, 2020 21:47 · 5219 words · 25 minute read trend estimates distance local habitat

Hi, my name is Diana Stralberg and I’m with the Boreal Avian Modelling Project, or BAM. My colleague Peter Solymos and I will be talking to you about our work to develop new spatial abundance models that inform distribution, population, and trends for forest birds in Canada. I’d like to start by acknowledging that I’m located in Edmonton, Alberta, Canada, which is on Treaty 6 territory, the traditional lands of First Nations and Métis people. Peter and I will talk to you today about a new set of models that builds on our previous work. Our models are always a work in progress, so we’ve set up a data portal on GitHub that should facilitate feedback and accommodate frequent updates.

00:39 - You’ll hear more about the site from Peter later in the presentation. As we produce new versions, we will update the site accordingly. Older versions will be archived, and metadata can be found on Zenodo at the link shown here. The Boreal Avian Modelling Project is focused on research, but one of the core goals is to inform population assessment. With our new models, we aimed to develop an integrated approach that could be used to achieve a range of objectives at once. Specifically, our models are intended to produce population estimates, habitat-specific density estimates, and Canada-wide distribution maps that capture both broad-scale climate gradients and local habitat differences. Finally, the intention is to use our abundance models to improve regional trend estimates.

01:25 - Speaking of trends, everyone has probably heard about the paper in Science last year, led by Ken Rosenberg and other Partners in Flight members. There’s nothing new about bird declines, but what was new about this paper was that the authors combined abundance and trend estimates to put a number on bird loss, which is all of a sudden very powerful. The paper estimated that 3 billion birds have been lost over the last 50 years, with the boreal region, shown here in dark green, contributing about half a billion birds to the overall loss. That’s a really big responsibility, but we also have a lot more uncertainty in the north, as you can see from the error bars here. These error bars are focused on the trend, but we also have a lot of uncertainty about the underlying population numbers. The Science paper used the best available population estimates from Partners in Flight, but for landbirds the numbers were based on BBS [Breeding Bird Survey] data, which are very sparse in the north due to the limited road network.

02:19 - At the same time, the vast boreal forest region is recognized as providing extremely important habitat for birds, with over 300 regularly breeding bird species, the large majority of which are migratory. So, due to this lack of coordinated monitoring, data gaps are first on our list of data challenges. I’ve highlighted five central challenges here, which I will address one by one. This lack of coordinated monitoring is what prompted the creation of the Boreal Avian Modelling Project, or BAM, in 2004. The project was initiated to address the significant gaps in knowledge required to effectively manage and conserve boreal birds.

03:00 - The first step in this was to assemble a data set representing a compilation of point count survey data. The data set, which spans the boreal and hemiboreal regions of North America, or most of Canada, is a compilation of publicly available data, such as breeding bird atlases and BBS, as well as individual data sets that were generously contributed by numerous independent researchers and partners. By assembling enough individual point count data sets, shown in dark red on this map, we’ve been able to cover most of the environmental space in Canada south of the Arctic, where we’ve not collected survey data. This map shows percentiles of predicted survey effort based on over 100 environmental covariates and a single boosted regression tree model that was based on a random sample of pixels drawn from the study area. You can see that the environmental conditions of northern regions are not very well sampled relative to southern Canada, but according to a different metric based on environmental similarity, there are just a few areas that really lie outside of the environmental space that we’ve sampled, which are shown in dark blue on this map. This environmental similarity surface, which emphasizes the magnitude of difference rather than sampling effort, shows us that the environmental conditions found in the far north and in the west coast mountains are most different from what we have sampled in our data set.

We’re currently working on filling in these gaps in the west, since the data do exist. Some of our earlier efforts took advantage of this environmental coverage to develop species distribution models based on climate and land cover using a method called MaxEnt, but these were habitat suitability models and did not incorporate abundance information. By modeling abundance we can compare habitat value more accurately, but there are still many challenges involved. Because we’re working with an ad hoc data set, our data were collected with a variety of different protocols under a range of conditions, so this standardization problem is one that BAM spent much of its early years working on. Most of this work was led by Peter Solymos, and there are now several papers you can refer to for more information about how we’ve standardized our data set. Although refinements are ongoing, the basic approach is summarized in a paper published in Methods in Ecology and Evolution in 2013.

05:14 - What we’ve done is develop a method to convert count data from disparate sources into density using detectability-based correction factors. Our approach separates and models the two primary aspects of detectability. The first is p(t): the probability of an individual bird singing within a given time interval. Obviously, the longer you count, the more birds you detect, and we used a removal model to estimate the singing rate parameter phi and the shape of the curve shown here. The second is q(r): the probability of detecting a singing bird within a given distance or radius. Of course, the farther away from the observer a bird is, the lower the probability that it will be heard and counted.
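A minimal sketch of the two detectability components described above, not BAM’s actual code. It assumes the functional forms commonly used for this kind of model: an exponential time-to-first-detection curve for p(t) and a half-normal distance function for q(r), with tau standing for the effective detection radius. The function names and the parameter values (phi = 0.3 per minute, tau = 80 m) are made up for illustration.

```python
import math


def p_singing(t_minutes: float, phi: float) -> float:
    """Probability that a bird sings at least once within t minutes.

    Assumes a constant singing rate phi (per minute), as in a simple
    removal model.
    """
    return 1.0 - math.exp(-phi * t_minutes)


def q_detect(r: float, tau: float) -> float:
    """Average probability of detecting a singing bird within radius r.

    Assumes a half-normal distance function exp(-d^2 / tau^2), averaged
    over the area of a circle of radius r. r and tau share the same units.
    """
    return (tau ** 2) * (1.0 - math.exp(-(r ** 2) / (tau ** 2))) / (r ** 2)


# Hypothetical parameter values, for illustration only.
PHI = 0.3   # singing rate per minute
TAU = 80.0  # effective detection radius in meters

for minutes in (3, 5, 10):
    print(f"p({minutes} min) = {p_singing(minutes, PHI):.3f}")

for radius in (50, 100, 200):
    print(f"q({radius} m) = {q_detect(radius, TAU):.3f}")
```

Longer counts push p toward 1, while larger count radii pull q down, which matches the two curves described in the talk.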

05:55 - We use distance sampling to estimate the shape of this curve based on the effective detection radius, or the distance for which the probability of missing a bird within that distance is equal to the probability of detecting a bird outside that distance. For a given survey, the resulting values p and q can then be combined with sampling area A to generate a correction factor that, when multiplied by density, yields the expected count. In other words, density can be estimated as a function of survey count N and these correction factors. The logarithm of the correction factor can be used as an offset in count regression models used to estimate density, as we have done.

So, with the standardization problem addressed, the next challenge we had to overcome was the fact that birds exhibit complex responses to environmental factors, especially over the large ranges occupied by most northern forest species. We found that complex generalized linear models based on hierarchical variable selection approaches can work well for single species or smaller regions, as demonstrated by this application for the Canada Warbler in Alberta, but such models can be time-consuming to parameterize properly for multiple species. What we’ve concluded is that machine learning is required to automate spatial predictions for multiple species over large areas. Specifically, we’re using boosted regression trees, a type of ensemble modeling based on developing a sequence of regression trees, each of which is focused on capturing the unexplained variation from the previous tree.
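To make the correction factor from the start of this passage concrete, here is a small sketch under stated assumptions. It takes the expected count to be density times A times p times q, so density is the count divided by that product and the model offset is the sum of the logs. The function names and the example values of p and q are hypothetical, and area is expressed in hectares so that density comes out per hectare.

```python
import math


def circle_area_ha(radius_m: float) -> float:
    """Area of a circular point count station, in hectares."""
    return math.pi * radius_m ** 2 / 10_000.0


def log_offset(area_ha: float, p: float, q: float) -> float:
    """Log of the correction factor A * p * q, usable as a regression offset."""
    return math.log(area_ha) + math.log(p) + math.log(q)


def density_from_count(count: float, area_ha: float, p: float, q: float) -> float:
    """Density (birds per hectare) implied by a raw count and its correction factor."""
    return count / (area_ha * p * q)


# Hypothetical survey: 2 singing males counted within 100 m,
# with illustrative detectability values p = 0.78 and q = 0.55.
area = circle_area_ha(100.0)
offset = log_offset(area, p=0.78, q=0.55)
density = density_from_count(2, area, p=0.78, q=0.55)

print(f"area = {area:.3f} ha, offset = {offset:.3f}")
print(f"estimated density = {density:.3f} males per ha")
```

In a count regression with a log link, adding this offset to the linear predictor means the remaining terms model density rather than the raw count.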

07:31 - Shown here is an earlier example from my PhD work, which was focused on climate change projection. Now we’re looking at a much larger suite of predictors to capture more environmental complexity. By incorporating this environmental complexity in our models, we can minimize the influence of sample bias on density estimates; in effect, we’re doing an informed interpolation of our data across the study area. Another challenge is that landscape change can be pretty rapid and extensive, especially in the boreal region. The boreal forest is particularly dynamic, with very active natural disturbance regimes as well as extensive industrial development, including forestry, oil and gas, and mining.

08:14 - These maps show the extent of forest disturbance over a 25-year period and the resulting change in mapped land cover types, so we wanted to capture this rather than assuming static vegetation types. Finally, we recognize that there is regional variation in species-habitat relationships. Boreal birds generally have large ranges, and the boreal region is quite diverse climatically and physiographically, so naturally there are differences in habitat associations. These figures from a paper by Andy Crosby show differences in the densities of six boreal species across boreal regions on the left, and the relatively low level of niche overlap between Quebec and Alberta for the Canada Warbler on the right. It’s not clear if these differences are related to differences in habitat preference or differences in habitat availability, but they can be different enough that it’s inappropriate to assume constant habitat relationships across the country.

09:08 - Model interactions can address this, but it’s difficult to capture everything, so we opted for a regional approach to modeling. To address the various challenges that I’ve outlined, we’ve developed a generalized national model approach focused on these key components. First, we use machine learning to deal with complex variable interactions and non-linear habitat responses in an automated fashion. We include many continuous covariates to capture more nuanced habitat associations and improve the temporal correspondence between avian and environmental data. And we use regional submodels to accommodate differential habitat selection, reduce out-of-range predictions, and achieve better sampling balance. The key elements of our methods are listed here.

09:56 - Additional methods and code are available on GitHub at the link below. We built separate models for each Bird Conservation Region, or BCR, subregion. These consisted primarily of the intersections between BCRs and provinces, with some aggregation of smaller units. Each of these units was buffered by 100 kilometers so that we had regions of overlap among multiple models along the edges. We used primarily human point count survey data, with a few ARU [automated recording unit] data sets included. The ARU data were treated in the same way as regular point counts.

10:33 - To be able to quantify prediction uncertainty, we developed models for 32 different bootstrap samples of the data within each sampling unit. Each of these samples was stratified by year and spatial cluster to improve balance. Avian data were matched with the corresponding vegetation data from one of two time periods, either 2001 or 2011; in future iterations we’ll include annual inputs. For each of these data samples we built boosted regression trees for the counts, specifying a Poisson distribution and incorporating the detectability offsets that I described earlier. Prediction diagnostics were calculated based on 10-fold cross-validation.
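The bootstrap idea above can be illustrated with a short sketch. This is not the BAM workflow: it assumes one plausible reading of "stratified by year and spatial cluster" (resampling with replacement within each year-by-cluster group), uses made-up survey records, and replaces the boosted regression tree fit with a simple mean count as a stand-in. All names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_bootstrap(records, key, rng):
    """Resample records with replacement within each stratum defined by key(record)."""
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for group in strata.values():
        sample.extend(rng.choices(group, k=len(group)))
    return sample


def percentile_interval(values, lower=0.05, upper=0.95):
    """Crude percentile interval taken from the sorted replicate values."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(lower * (n - 1))], ordered[int(upper * (n - 1))]


rng = random.Random(42)

# Made-up survey records: two vegetation years, five spatial clusters.
surveys = [
    {"year": year, "cluster": cluster, "count": rng.randint(0, 4)}
    for year in (2001, 2011)
    for cluster in range(5)
    for _ in range(4)
]

N_REPLICATES = 32  # number of bootstrap samples mentioned in the talk
estimates = []
for _ in range(N_REPLICATES):
    boot = stratified_bootstrap(
        surveys, key=lambda s: (s["year"], s["cluster"]), rng=rng
    )
    # Stand-in for fitting a model to the bootstrap sample and predicting.
    estimates.append(mean(s["count"] for s in boot))

low, high = percentile_interval(estimates)
print(f"mean = {mean(estimates):.2f}, interval = ({low:.2f}, {high:.2f})")
```

The spread of the 32 replicate estimates is what supplies the lower and upper bounds; the mean across replicates plays the role of the final prediction.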

11:12 - To predict density, we averaged across bootstrap replicates and smoothed across BCR subregion boundaries. We considered a total of 216 potential covariates in our models, automatically eliminating those that were most highly correlated in each subsample of the data. In all models we included effects for year and survey type, either ARU or human point count. We then considered 21 climate variables; 92 stand-level vegetation covariates, which consisted of things like tree species biomass and age; and the same 92 covariates averaged across the surrounding landscape using what’s referred to as a Gaussian filter, which is essentially a moving window weighted by proximity based on a Normal distribution. We also included three simple land cover variables, five terrain variables at 100-meter resolution, and a coarse-scale road layer at one-kilometer resolution. Each individual model contained a subset of approximately half of these covariates, many of which had little or no predictive power.
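As a rough illustration of the Gaussian filter mentioned above, the following sketch averages a small raster with weights that fall off with distance according to a Normal curve. The window radius, the sigma value, the edge handling (weights are renormalized over the cells that exist), and the toy grid are all assumptions for illustration; the talk does not give the settings actually used.

```python
import math


def gaussian_filter_2d(grid, sigma, radius):
    """Smooth a 2-D list of numbers with a Gaussian-weighted moving window.

    Cells outside the grid are ignored and the weights are renormalized,
    so a constant surface stays constant after filtering.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            weight_sum = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        w = math.exp(-(di * di + dj * dj) / (2.0 * sigma * sigma))
                        total += w * grid[ii][jj]
                        weight_sum += w
            out[i][j] = total / weight_sum
    return out


# Toy raster: a single patch of deciduous cover in an otherwise empty landscape.
cover = [[0.0] * 5 for _ in range(5)]
cover[2][2] = 1.0

smoothed = gaussian_filter_2d(cover, sigma=1.0, radius=2)
for row in smoothed:
    print(" ".join(f"{value:.3f}" for value in row))
```

Cells next to the patch pick up part of its value, which is the same mechanism by which a landscape-level covariate can carry a forest signal into an adjacent non-forest pixel.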

12:19 - Variable importance scores are available online or as a download file. The primary outputs that we’ve produced and shared are one-kilometer pixel-level density predictions, expressed as males or pairs per hectare, for a snapshot in time, in our case 2011. These are accompanied by static maps, which you will see in a bit. We’re also generating 250-meter predictions for each BCR subregion, which are not yet posted on the website but can be requested from us.

In addition, we have habitat-specific density estimates produced by what we refer to as post hoc binning, which involves overlaying land cover classes of interest with the raster predictions to calculate mean densities for each class. Because most sources of environmental variation are captured in the models, these mean densities reflect the full range of conditions across the landscape rather than those of a biased sample. We’ve done this for a 2005 MODIS-based North American land cover layer, but a user could also do this with any other categorical land cover layer. Finally, we have produced population estimates that were calculated by summing up individual pixel-level densities within each spatial unit. We’re still working on annual predictions from which trends can be estimated, and we’re doing this in collaboration with Adam Smith and Dave Iles at CWS. You’ll hear more about Dave’s work later in this presentation. And with that, I’ll turn it over to Peter.

Thank you, Diana. Hi everyone, I’m Peter Solymos with the Boreal Avian Modelling Project, and after the introduction I’m going to show you how to view and navigate the results that Diana has just introduced. Our first example species is going to be Canada Warbler. In this map you can see the results from 16 regional models put together into a map for the study area. You can see this pale yellow region, which is outside of the species’ range, and within the range, different shades of green representing different levels of population density. What’s interesting to note here is that although the predictions are stitched together from these regional models, the thresholds are based on the whole study area, so they represent how density varies across the whole region. You can see here this sharp change along the Manitoba-Ontario border, with the western population represented by considerably smaller average densities compared to the eastern parts of the population within Canada.

We can overlay the species’ range on top of this density map, and the dots here represent the known detections for Canada Warbler. This gives us an idea of how the known range compares to our predictions, and here we would expect the species’ range to extend farther west.

Because these results are based on one-square-kilometer pixel-level predictions, we can summarize them across regions or within regions. For example, if we overlay some kind of land cover classification, we can calculate mean density within those land cover classes. Here, for example, for the whole study area, Canada, you can see that density reaches its highest levels in mixedwood and deciduous forests. This takes into account the whole study area, eastern and western parts, for Canada Warbler. What’s interesting here is that you can see intermediate levels of density in wetlands, cropland, and conifer forests. We wouldn’t expect Canada Warbler in croplands, so this is a result of the post hoc binning procedure and possibly of how our environmental covariates represent a wider area. For example, if there is some deciduous forest adjacent to a cropland, then our Gaussian filter at the landscape level might pick it up.

In the same way that we can calculate mean density over the whole study area, we can look at smaller regions within it. For example, for Bird Conservation Region 6 we can see how density varies across land cover types, with deciduous forest having the highest densities. If we compare this with the eastern part of the species’ range, within Bird Conservation Region 12, we can see that there’s slightly higher density in mixedwood forest. This just highlights how useful this approach is for showing regional differences in habitat selection. We can also look at smaller subregions, for example, within BCR 6, how the southern part, where population density is much higher, compares to the northern part, where you can see very uniform and low density levels across land cover types, with deciduous somewhat higher. In the southern portion of BCR 6 you can see mixedwood and deciduous forest topping the chart.

In the same way that we summarized densities across land cover types, we can look at how those numbers add up within these BCR subunits. If we add up all the pixels for the whole study area, then for Canada Warbler we get 4.81 million males, with lower and upper bounds in parentheses. We can do the same exercise for the different Bird Conservation Regions or subregions where we have these estimates, and if we divide these numbers by the area of the region, we get the average population density for that unit.

18:57 - Now let’s have a look at the website. To view the website you need to go to borealbirds.github.io. Once you land on this page, you can browse the results by species or read our methods in the top navigation. The most important part is this search bar: start typing the name of your favorite species, then click on it. Now we are looking at the Ovenbird page. If you know the AOU code of the species, you can also just go to species slash code, in this case slash OVEN, to view a well-known species. You can see here the same kind of national distribution and density map as we saw before for Canada Warbler. For Ovenbird, you can see the gradient of varying levels of density. To overlay the range map and the detections, there is a Show detections tab in the right corner; clicking on it toggles the detections and the range map. You can see there is much better correspondence between the range map, the detections, and our estimates than for the previous species.

If you scroll down, you see the land cover based mean density values for Canada or for any specific Bird Conservation Region unit or subunit within it. If you scroll even further down, you see the population size table: for Canada, 38.8 million males for Ovenbird, with lower and upper bounds based on the bootstrap distribution. Underneath this table you will find some links where you can grab, for example, the raster layers shown in the map in GeoTIFF format. This link should take you to the Google Drive where you can find the species-specific distribution maps. If you want to download the summarized results (population size estimates, densities, and other useful information such as variable importance, validation metrics, and the list of predictors), you can download this Excel file. The various sheets will help you browse the results by species. This file contains the results for all the species; the TIFFs are for individual species. If you want to access these results programmatically, then I really recommend checking out the JSON API, which is described in detail in this technical report. If you click on this DOI link, it should take you to the Zenodo website, where we have a PDF outlining the methods, and if you read through, there are appendices with some worked examples using R showing how to manipulate these results.

Now, going back to the Ovenbird page, at the very bottom you can see a discussion area where you can leave comments and have a discussion about these results. If I go back to the top navigation and click on Methods, this is a brief outline of how we created these results, with a description of the subregions and our methodology. The last link you can see at the top is Contact, which takes you to the BAM website, where you can learn more about the contributors and the data set itself, and under Communications you can find the papers that we have mentioned as part of the talk.

Now back to the slides. We’ve talked about these pixel-based estimates. They are called pixel-based because we make predictions for one-square-kilometer units of the land base, then we add these up in larger units to get population sizes. We did a similar study in northern Alberta, which was published recently in The Condor, where we looked at how pixel-based estimates within BCR 6 in Alberta compared to population size estimates from Partners in Flight. Their approach uses roadside BBS data, so we expected to find differences between the two approaches. Once we took the ratio of the population size estimates for more than 90 species (those are the dots in this graph), we found that the pixel-based estimates were much higher than the Partners in Flight estimates. This was to some extent expected, because we know that these were driven largely by the differences between the maximum detection distance used by Partners in Flight and our effective detection radius for the species, which is consistently lower than the maximum detection distance, and this produces the difference between the population sizes.

There were, however, species where the PIF estimates were higher than the pixel-based ones, and we can see huge variation across species, which is mostly attributed to other sources of bias. In this case these are due to species reacting differently to roads, liking or disliking their presence, and their behavior and detection distances might change along roads. Roads also sample different habitats, and if we have a south-heavy sample of BBS routes, then northern habitats might be less well represented in our roadside sample, which contributes to this difference.

We’ve looked at this across species in a single region. Given our current results, where we have 16 such regions (and we can even make smaller ones) and 140 species, we can look at how the PIF estimates and the pixel-based ones compare to each other across all of these. You can see similar violin plots here. On average across Canada, the pixel-based estimate is roughly two times the PIF estimate, although there is huge variation across regions. For example, Bird Conservation Region 11 shows somewhat higher values for certain species, which might be grassland specialists. In other regions you see lower numbers, and there are even species where the PIF estimates are higher than the pixel-based ones. We are looking into this now, and we are also in the process of updating our estimates of effective detection distances, which in turn will be used for updating the PIF estimates. We are part of a collaboration with Adam Smith and his colleagues, who are working diligently to update these numbers using fresh new data from not just Canada but various parts of the US, so if you happen to have distance sampling data that we could use, please get in touch.

This leads us to some limitations and trade-offs that we wanted to mention regarding our national models. First off, the density offsets that we’ve used were developed with passerines in mind, and as a result population numbers may be overestimated for other species due to their different territorial behavior, their aggregation patterns (for example, along water edges), or overlapping home ranges, which might lead to double counting. We can address these concerns using independent estimates to calibrate our approach and get better population sizes for these non-passerine species.

Another issue might arise as a result of the regional modeling approach that we took to address certain spatial data gaps. This approach can lead to hard boundaries between BCR subunits, and in some cases it is difficult to capture range limits. Let’s go back to the Canada Warbler example. In this map we can see the results from the 16 regional models put together, and there is the hard edge that we identified along the Manitoba-Ontario border. If, however, we put together a data set that involves all the data from Canada and, instead of taking the regional approach, just fit a single boosted regression tree model, then this hard boundary disappears, but what we get instead is overprediction outside of the species’ range, here in the western mountains, the Northwest Territories, and Newfoundland. So there’s a trade-off between having hard edges and overprediction. The national model results are heavily dominated by the regions where we have a lot more data, so strong regional influences are mitigated by taking the regional approach.

What you can also see in our maps here for Black-throated Green Warbler, besides these edges along the Manitoba-Ontario border, is that density is a lot lower in the western part of the species’ range than in the eastern part. Now look at the map on the left, which is a distribution map based on detections only, so there is no count or abundance information involved in this MaxEnt map of the kind that Diana described before. It is very similar to just showing the detections from, for example, eBird: you can see what’s inside the species’ range and what’s outside, where the species occurs versus not. Our models, as opposed to that, indicate the different levels of density inside the range. So in the west we are still inside the species’ range for Black-throated Green Warbler, but densities are much lower than in the east. The next example, for Tennessee Warbler, highlights how distribution and abundance compare in the northern parts of the species’ range, where our regional density model approach is having a hard time finding the northern range edge, which is clearly visible in the MaxEnt map on the left. Maybe this is a result of the species’ range extending into the subarctic regions, or maybe we just have very few samples to support that.

Our density predictions are meant to be used in various conservation applications. For example, they can be used to generate various indices of landbird diversity and intactness across Canada. One application of this work, led by WCS Canada, is to identify Key Biodiversity Areas according to what’s called Criterion C, ecological integrity. We are combining our bird maps with human footprint maps to generate an index of biotic intactness. This work in progress should eventually help inform the identification of Key Biodiversity Areas in Canada. The layers can also be included in a variety of systematic conservation planning exercises at scales ranging from regional to national. It will be particularly valuable to compare areas of landbird diversity with areas of importance for priority species like caribou. A new project initiated by ECCC, led by Becky Stewart and Erin Campfield, will look at these synergies and gaps explicitly. Similarly, we are working with the Prairie Habitat Joint Venture to compare waterfowl and landbird priorities for the western boreal region, in order to understand how areas of importance for waterfowl coincide, or not, with priority areas for landbirds. These analyses are led by Barry Robinson, based on models produced by Nicole Barker et al.

Our models were not intended to be projected into the future, but habitat-specific density estimates from our predictions can be applied to land use change simulations to anticipate future habitat value for birds. We are collaborating with the Western Boreal Initiative, a partnership between ECCC, NRCan, the SpaDES team, and academic researchers, to apply our models to future change scenarios. This project is led by Samuel Haché, Eliot McIntire, and Tati Micheletti. We also modeled the impacts of forest management and natural disturbances using boosted regression trees for Canada Warbler, and we applied these population size estimates to simulated future conditions to quantify the likelihood of regional population persistence under the different scenarios, with an applied site selection algorithm that maximizes positive future trends for the species in each of the regions. This work, led by Francisco Dénes, is going to continue to inform species-at-risk critical habitat identification.

As Diana mentioned at the beginning of the talk, we are working towards integrating our population estimates into the official population trend estimates produced by the Canadian Wildlife Service. This slide shows some exploratory work by Dave Iles at CWS to combine migration monitoring data with stable isotope maps to develop regional trend estimates. Our density models can be used to weight regional trends in a national analysis. The example shown here is for Blackpoll Warbler.

As we work on our models, we are anticipating the next versions, based on a new set of data that is going to include more years of BBS, additional regional data, and automated recording unit based detections. We are also going to incorporate annual climate and land cover coverages to better match predictors and survey years, and we want to capture trends in landscape change. We are also planning to extend our modeling into neighboring US regions, to incorporate data from the US that we already have, and we want to extend to the full boreal and hemiboreal region so that we can cover the breeding ranges of these boreal species. Future versions of the models could also look at smaller subregions, but data gaps might still dictate what size of regions we can handle. We are also thinking about using unclassified spectral data inputs, which would minimize the classification error that we can have in our data input layers.

Key take-home messages are that we use these pixel-based population estimates, which we think are an improvement over sample-based methods when the sample is biased with respect to habitat, as we can see with roadside samples. The BAM density models generate predictions from disparate data sets that can be rolled up into population estimates, because we are using detectability offsets to standardize point counts across different methodologies. We employed machine learning algorithms to predict in unsampled areas, and many applications are currently underway, with probably more coming in the future.

We want to thank the BAM members, partners, and funders, including Environment and Climate Change Canada and the U.S. Fish and Wildlife Service. See the full list of funders and contributors on our website. You can see the BAM team here, with core members, contributing scientists, and grad students. Thank you all for listening to this talk.