Title: | Access and download data on plant and animal populations from NatureCounts |
---|---|
Description: | Access and download data on plant and animal populations from various databases through NatureCounts, a service managed by Bird Studies Canada. |
Authors: | Steffi LaZerte [aut], Denis Lepage [aut, cre] |
Maintainer: | Denis Lepage <[email protected]> |
License: | GPL-3 |
Version: | 0.4.1 |
Built: | 2024-11-10 06:03:59 UTC |
Source: | https://github.com/birdscanada/naturecounts |
naturecounts is an R package for accessing and downloading data on plant and animal populations from various databases through NatureCounts, a service managed by Bird Studies Canada.
See the vignettes (vignette(package = "naturecounts")) or the package website at https://birdscanada.github.io/naturecounts to get started!
Maintainer: Denis Lepage [email protected]
Authors:
Steffi LaZerte [email protected]
An example of black-capped chickadee data downloaded from NatureCounts
bcch
A data frame with 160 rows and 57 variables:
Creates a plot of COSEWIC ranges for illustration and checking.
cosewic_plot( ranges, points = NULL, grid = NULL, map = NULL, species = "species_id", title = "" )
ranges |
List. Output of |
points |
Data frame. Optional naturecounts data used to compute ranges. Raw data points will be added to the plot if provided. |
grid |
sf data frame. Optional grid over which to summarize IAO values (useful for species with many points over a broad distribution). |
map |
sf data frame. Optional base map over which to plot the values. |
species |
Character. Name of the column containing species identification. |
title |
Character. Optional title to add to the map. Can be a vector named by species to supply different titles for different species. |
ggplot2 map
r <- cosewic_ranges(bcch)
cosewic_plot(r)
cosewic_plot(r, points = bcch)
cosewic_plot(r, grid = grid_canada(50), map = map_canada(),
             title = "Black-capped chickadees")

m <- rbind(bcch, hofi)
r <- cosewic_ranges(m)
cosewic_plot(r)
cosewic_plot(r, points = m)
p <- cosewic_plot(r, grid = grid_canada(50), map = map_canada(),
                  title = c("14280" = "Black-capped chickadees",
                            "20350" = "House Finches"))
p[[1]]
p[[2]]
The COSEWIC Index of Area of Occupancy (IAO; also called Area of Occupancy, AOO by the IUCN) and Extent of Occurrence (EOO; IUCN as well) are metrics used to support status assessments for potentially endangered species.
cosewic_ranges( df_db, record = "record_id", coord_lon = "longitude", coord_lat = "latitude", species = "species_id", iao_grid_size_km = 2, eoo_p = 0.95, filter_unique = FALSE, spatial = TRUE )
df_db |
Either data frame or a connection to database with
|
record |
Character. Name of the column containing record identification. |
coord_lon |
Character. Name of the column containing longitude. |
coord_lat |
Character. Name of the column containing latitude. |
species |
Character. Name of the column containing species identification. |
iao_grid_size_km |
Numeric. Size of grid (km) to use when calculating IAO. Default is COSEWIC requirement (2). Use caution if changing. |
eoo_p |
Numeric. The percentile to calculate the convex hull over. Defaults to 0.95 for a 95% convex hull to ensure outlier points do not artificially inflate the EOO. Note that for a final COSEWIC report, this may not be appropriate. Set to 1 to include all points. |
filter_unique |
Logical. Whether to filter observations to unique
locations. Use this only if there are too many data points to work with.
This changes the nature of what an observation is, and also may bias
EOO calculations if using less than 100% of points (see |
spatial |
Logical. Whether to return sf spatial objects showing
calculations. If |
Note that while the IUCN calls this metric AOO, in COSEWIC, AOO is actually a different measure, the biological area of occupancy. See the Distribution section in 'Instructions for preparing COSEWIC status reports' for more details.
By default the EOO is calculated only using the inner 95% of points (based on
distance to the centroid). This is to ensure that a first-pass of the EOO
does not reject a species from consideration if there are any outlier
observations. However, for a final COSEWIC assessment report, it is likely
better to carefully explore the data to ensure there are no outliers and then
use the full data set (i.e. set eoo_p = 1).
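The effect of eoo_p can be illustrated with a small base R sketch. This is not the package's implementation: it assumes planar coordinates already projected to km, and eoo_sketch is a hypothetical helper name. It keeps the points within the eoo_p quantile of distance to the centroid, then measures the area of their convex hull.

```r
# Hypothetical helper (not part of naturecounts): planar EOO-style area.
# x, y are projected coordinates in km; p is the proportion of points to keep.
eoo_sketch <- function(x, y, p = 0.95) {
  # Distance of every point to the centroid of all points
  d <- sqrt((x - mean(x))^2 + (y - mean(y))^2)
  # Keep only points within the p-th quantile of distance
  keep <- d <= quantile(d, probs = p)
  x <- x[keep]
  y <- y[keep]
  # Indices of the convex hull vertices, in order around the hull
  h <- chull(x, y)
  hx <- x[h]
  hy <- y[h]
  # Shoelace formula for the area of the hull polygon
  nxt <- c(2:length(h), 1)
  abs(sum(hx * hy[nxt] - hx[nxt] * hy)) / 2
}

# Four corners of a 10 km square, one interior point, and one distant outlier
x <- c(0, 10, 10, 0, 5, 100)
y <- c(0, 0, 10, 10, 5, 100)

eoo_sketch(x, y, p = 1)    # outlier included: 1000 km^2
eoo_sketch(x, y, p = 0.8)  # outlier dropped: 100 km^2
```

A single outlier inflates the area tenfold in this toy case, which is why it is worth checking the data for outliers before setting eoo_p = 1.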
The IAO is calculated by first assessing large grids (10x larger than the specified size). Only then are smaller grids created within large grid cells containing observations. This speeds up the process by avoiding the creation of grids in areas where there are no observations. This means that the plots and spatial objects may not have grids over large areas lacking observations. See examples.
Details on how IAO and EOO are calculated and used
COSEWIC - Guidelines for use of the Index of Area of Occupancy in COSEWIC Assessments
COSEWIC - Table 2 COSEWIC quantitative criteria and guidelines for the status assessment of Wildlife Species
Summarized data frame (ranges) or list containing ranges, a summarized data frame, and spatial, a list of two spatial data frames.
ranges contains columns
n_records_total - Total number of records used to create ranges
min_record - Minimum number of records within IAO cells
max_record - Maximum number of records within IAO cells
median_record - Median number of records within IAO cells
grid_size_km - IAO cell size (area is this squared)
n_occupied - Number of IAO cells with at least one record
iao - IAO value (grid_size_km^2 * n_occupied)
eoo_pXX - EOO area calculated with a convex hull at percentile eoo_p (e.g., 95%)
spatial contains spatial data frames
iao_sf - Polygons of the IAO grids with the n_records per cell
eoo_sf - Polygon of the Convex Hull at percentile eoo_p
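As a rough illustration of how the IAO columns relate to each other, the base R sketch below bins toy records into square cells and applies iao = grid_size_km^2 * n_occupied. It assumes coordinates already projected to km and a grid anchored at the origin; iao_sketch is a hypothetical helper, not the method used by cosewic_ranges().

```r
# Hypothetical helper (not part of naturecounts): count occupied grid cells.
# x_km, y_km are projected coordinates in km.
iao_sketch <- function(x_km, y_km, grid_size_km = 2) {
  # Assign each record to a square cell by flooring its scaled coordinates
  cell <- paste(floor(x_km / grid_size_km), floor(y_km / grid_size_km))
  # Number of records per occupied cell
  n_records <- as.vector(table(cell))
  data.frame(
    n_records_total = length(cell),
    min_record = min(n_records),
    max_record = max(n_records),
    median_record = median(n_records),
    grid_size_km = grid_size_km,
    n_occupied = length(n_records),
    iao = grid_size_km^2 * length(n_records)
  )
}

# Five toy records falling into three 2 km x 2 km cells
x_km <- c(0.5, 1.5, 2.5, 7.0, 7.5)
y_km <- c(0.5, 1.0, 0.5, 9.0, 9.5)
res <- iao_sketch(x_km, y_km)
res
```

Here three cells are occupied, so the IAO is 2^2 * 3 = 12 km^2.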
# Using the included test data on black-capped chickadees
bcch # look at the data
r <- cosewic_ranges(bcch)
r <- cosewic_ranges(bcch, spatial = FALSE)

# Calculate for multiple species
mult <- rbind(bcch, hofi)
r <- cosewic_ranges(mult)
r <- cosewic_ranges(mult, spatial = FALSE)
Creates and adds columns date and doy (day-of-year) to the data source (either data frame or database table naturecounts).
format_dates(df_db, overwrite = FALSE)
df_db |
Either data frame or a connection to database with
|
overwrite |
Logical. Overwrite existing columns |
If df_db was a data frame, return a data frame with new columns date and doy. Otherwise return database connection.
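For intuition about what the date and doy columns hold, here is a base R sketch on toy data. The input column names survey_year, survey_month and survey_day are an assumption made for illustration, and this is not how format_dates() is implemented.

```r
# Toy data; the column names are an assumption for illustration
obs <- data.frame(survey_year = c(2019, 2019),
                  survey_month = c(1, 12),
                  survey_day = c(15, 31))

# Build a Date column from the year, month and day parts
obs$date <- as.Date(ISOdate(obs$survey_year, obs$survey_month, obs$survey_day))

# Day of year: "%j" formats a date as its day of the year
obs$doy <- as.integer(format(obs$date, "%j"))
obs
```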
bcch_with_dates <- format_dates(bcch)
Zero-fill species presence data by adding zero observation counts (absences) to an existing naturecounts dataset.
format_zero_fill( df_db, by = "SamplingEventIdentifier", species = "all", fill = "ObservationCount", extra_species = NULL, extra_event = NULL, warn = TRUE, verbose = TRUE )
df_db |
Either data frame or a connection to database with
|
by |
Character vector. By default, "SamplingEventIdentifier" or a vector of specific column names to fill by (see details) |
species |
Character vector. Either "all", for species in the data, or a vector of species ID codes to fill in. |
fill |
Character. The column name to fill in. Defaults to "ObservationCount". |
extra_species |
Character vector. Extra columns/fields uniquely
associated with |
extra_event |
Character vector. Extra columns/fields uniquely associated
with the Sampling Event (the field defined by |
warn |
Logical. If TRUE, stop zero-filling if >100 species and >1000 unique sampling events. If FALSE, ignore and proceed. |
verbose |
Logical. Show messages? |
by refers to the combination of columns which are used to detect missing values. By default SamplingEventIdentifier is used. Otherwise users can specify their own combination of columns.
If species is supplied, all records will be used to determine observation events, but only records (zero-filled or otherwise) which correspond to a species in species will be returned (all others will be omitted). Note that records where species_id is NA (generally for 0 counts for presence/absence), will be converted to a list of 0's for the individual species.
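The idea behind zero-filling can be sketched in base R on a toy data frame: build every sampling event by species combination, join the observed counts, and treat the missing combinations as zeros. This is only a conceptual illustration using expand.grid() and merge(), not the implementation of format_zero_fill(), and the toy values are made up.

```r
# Toy presence-only data: two sampling events, two species
obs <- data.frame(
  SamplingEventIdentifier = c("E1", "E1", "E2"),
  species_id = c(230, 14280, 230),
  ObservationCount = c(4, 2, 1)
)

# Every event x species combination that could have been recorded
full <- expand.grid(
  SamplingEventIdentifier = unique(obs$SamplingEventIdentifier),
  species_id = unique(obs$species_id),
  stringsAsFactors = FALSE
)

# Left join the observations, then treat missing counts as zeros
filled <- merge(full, obs, all.x = TRUE)
filled$ObservationCount[is.na(filled$ObservationCount)] <- 0
filled
```

Species 14280 was not recorded during event E2, so that combination appears in the result with a count of 0.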
Data frame
# Download data (with "core" fields to include 'CommonName')
sample <- nc_data_dl(collections = c("SAMPLE1", "SAMPLE2"), fields_set = "core",
                     username = "sample", info = "nc_example")

# Remove casual observations (i.e. 'AllSpeciesReported' = "No")
library(dplyr) # For filter function
sample <- filter(sample, AllSpeciesReported == "Yes")

# Remove data with "X" ObservationCount (only keep numeric obs)
sample <- filter(sample, ObservationCount != "X")

# Zero fill by all species present
sample_all_zeros <- format_zero_fill(sample)

# Zero fill only for Canada Goose
goose <- format_zero_fill(sample, species = "230")

# Keep species-specific variables
goose <- format_zero_fill(sample, species = "230", extra_species = "CommonName")

# Keep sampling-event-specific variables
coords <- format_zero_fill(sample, extra_event = c("latitude", "longitude"))

# By species, keeping extra species variables and event variables
goose_coords <- format_zero_fill(sample, species = "230",
                                 extra_species = "CommonName",
                                 extra_event = c("latitude", "longitude"))

# Only return event information
events <- format_zero_fill(sample, fill = NA,
                           extra_event = c("latitude", "longitude"))
Create grid across Canada
grid_canada(cell_size = 200, buffer = 500)
cell_size |
Numeric. Size of grid (km) to use when creating grid.
If using this grid as input to |
buffer |
Numeric. Extra buffer (km) to add around the outline of Canada before calculating grid. |
sf data frame with polygon grid
gc <- grid_canada(200)
gc_buff <- grid_canada(200, buffer = 0)

# Plot to illustrate
library(ggplot2)
ggplot() +
  geom_sf(data = map_canada()) +
  geom_sf(data = gc, fill = NA) +
  labs(caption = "500km buffer (default)")

ggplot() +
  geom_sf(data = map_canada()) +
  geom_sf(data = gc_buff, fill = NA) +
  labs(caption = "No buffer")
An example of house finch data downloaded from NatureCounts
hofi
A data frame with 19 rows and 57 variables:
Wrapper around rnaturalearth::ne_countries() to create a simple features basic map of Canada with CRS 3347 (Statistics Canada Lambert).
map_canada()
sf data frame
map_canada()
plot(map_canada())

library(ggplot2)
ggplot(data = map_canada()) + geom_sf()
These functions return metadata codes, names, descriptions, and information associated with the data downloaded from NatureCounts.
meta_country_codes()
meta_statprov_codes()
meta_subnational2_codes()
meta_iba_codes()
meta_bcr_codes()
meta_utm_squares()
meta_species_authority()
meta_species_codes()
meta_species_taxonomy()
meta_collections()
meta_breeding_codes()
meta_project_protocols()
meta_projects()
meta_protocol_types()
meta_bmde_versions()
meta_bmde_fields(version = "minimum")
version |
Character. BMDE version for which to return fields. NULL returns all versions |
Some of these metadata are stored locally and can be updated with the nc_metadata() function. Others are downloaded as requested.
Data frame
meta_country_codes(): Country codes
meta_statprov_codes(): State/Province codes
meta_subnational2_codes(): Subnational2 codes
meta_iba_codes(): Important Bird Area (IBA) codes
meta_bcr_codes(): Bird Conservation Region (BCR) codes
meta_utm_squares(): UTM Square codes
meta_species_authority(): Species taxonomic authorities
meta_species_codes(): Alpha-numeric codes for avian species
meta_species_taxonomy(): Codes and taxonomic information for all species
meta_collections(): Collection names and descriptions
meta_breeding_codes(): Breeding codes and descriptions
meta_project_protocols(): Project protocols
meta_projects(): Project ids, names, websites, and descriptions
meta_protocol_types(): Protocol types and descriptions
meta_bmde_versions(): Names and descriptions of the available versions of BMDE (Bird Monitoring Data Exchange). These refer to sets of fields/columns which can be downloaded for a given group of data. See nc_data_dl() for more details.
meta_bmde_fields(): Fields/columns associated with a particular BMDE (Bird Monitoring Data Exchange) version. See meta_bmde_versions() for the different versions available, meta_collections() for which version is used by which project, and nc_data_dl() for more details on downloading data with a given set of fields/columns.
# Return fields/columns in the 'minimum' version
meta_bmde_fields()

# Return fields/columns in the 'core' version
meta_bmde_fields(version = "core")

# Return all possible fields
meta_bmde_fields(version = "extended")
Download the number of records available for different collections filtered by location (if provided). If authorization is provided, the collections are filtered to only those available to the user (unless using show = "all"). Without authorization all collections are returned.
nc_count( collections = NULL, project_ids = NULL, species = NULL, years = NULL, doy = NULL, region = NULL, site_type = NULL, show = "available", username = NULL, timeout = 120, verbose = TRUE )
collections |
Character vector. The collection codes from which to download data. NULL (default) downloads data from all available collections |
project_ids |
Character/Numeric vector. The |
species |
Numeric vector. Numeric species ids (see details) |
years |
Numeric vector. The start/end years of data to download. Can use NA for either start or end, or a single value to return data from a single year. |
doy |
Character/Numeric vector. The start/end day-of-year to download (1-366 or dates that can be converted to day of year). Can use NA for either start or end |
region |
List. Named list with one of the following options:
|
site_type |
Character vector. The type of site to return (e.g., |
show |
Character. Either "all" or "available". "all" returns counts from all data sources. "available" only returns counts for data available for the username provided. If no username is provided, defaults to "all". |
username |
Character vector. Username for http://naturecounts.ca. If provided, the user will be prompted for a password. If left NULL, only public collections will be returned. |
timeout |
Numeric. Number of seconds before connecting to the server times out. |
verbose |
Logical. Show messages? |
The akn_level column describes the level of data access for that collection (see descriptions online).
The access column describes the accessibility of a collection for a given username (or no access if no username supplied). See the section on Access and request_ids for more details.
Data frame
All public data is available with a username/password (sign up for a free NatureCounts account). However, to access private/semi-public projects/collections you must request access. See the Access and request_ids section for more information.
species
Numeric species id codes can be determined from the functions search_species() or search_species_code(). See also the article on species codes for more information.
doy
The format for day of year (doy) is fairly flexible and can be a whole number between 1 and 366 or anything recognized by the lubridate package's ymd() function. However, it must have the order of year, month, day. Note that year is ignored when converting to day of year, except that it will result in a 1 day offset for leap years.
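The leap-year offset can be seen with base R alone. This sketch uses as.Date() and format() rather than lubridate, and doy_from_ymd is a hypothetical helper name used only for illustration.

```r
# Hypothetical helper: day of year from a "YYYY-MM-DD" string (base R only)
doy_from_ymd <- function(x) as.integer(format(as.Date(x), "%j"))

doy_from_ymd("2019-07-19")  # non-leap year: 200
doy_from_ymd("2020-07-19")  # leap year: 201, one day later
```

If exact calendar dates matter more than day numbers, it may be safer to filter on the date after download.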
region
Regions are defined by codes reflecting the country, state/province, subnational (level 2), Important Bird Areas (IBA), and Bird Conservation Regions (BCR) (see search_region() for codes). They can also be defined by providing specific UTM squares to download or a bounding box area which specifies the min/max longitude and min/max latitude (bbox). See the article on regional filters for more information.
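To make the bounding box form concrete, the sketch below builds a bbox vector with the left/bottom/right/top names used in the nc_data_dl() examples and checks which toy points fall inside it. The in_bbox helper is hypothetical and only mimics the idea locally; the actual filtering happens on the server.

```r
# Bounding box as min/max longitude (left, right) and latitude (bottom, top)
bbox <- c(left = -100, bottom = 45, right = -80, top = 60)

# The region argument takes a named list, here with a single bbox element
region <- list(bbox = bbox)

# Hypothetical helper: which points fall inside the box?
in_bbox <- function(lon, lat, bbox) {
  lon >= bbox[["left"]] & lon <= bbox[["right"]] &
    lat >= bbox[["bottom"]] & lat <= bbox[["top"]]
}

# A point near Winnipeg is inside this box, one near Vancouver is west of it
inside <- in_bbox(lon = c(-97.1, -123.1), lat = c(49.9, 49.3), region$bbox)
inside
```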
Access and request_ids
Access to a data collection is either available as "full" or "by request". Use nc_count(username = "USER", show = "all"), to see the accessibility of collections.
"Full" access means that data can be immediately requested directly through the naturecounts R package. "By request" means that a request must be submitted online and approved before the data can be downloaded through naturecounts.
This means that there are two types of data requests: ones made through this naturecounts R package (API requests) and those made through the online Web Request Form (Web requests). Every request (from either method) generates a request_id which identifies the filter set and collections requested. Details of all requests can be reviewed with the nc_requests() function.
To download data with "full" access, users can either specify filters, or if they are repeating a download, can use the request_id from nc_requests(). Otherwise, if the user doesn't have "full" access, they must supply an approved request_id to the nc_data_dl() function (e.g., nc_data_dl(request_id = 152000, username = "USER")). Use nc_requests() to see request_ids, filters, and approval status.
Requests for "full" access to additional collections can be made online through the Web Request Form by checking the "Full access?" box in Step 2 of the form.
# Count all publicly available records:
nc_count()

# Count publicly available records for Manitoba, Canada
nc_count(region = list(statprov = "MB"))

# Count all records for all collections user "sample" has access to
## Not run:
nc_count(username = "sample")
## End(Not run)

# Count records with house finches in Ontario
search_species("house finch")
nc_count(species = 20350, region = list(statprov = "ON"), username = "sample")

# Count all records available in the Christmas Bird Count and Breeding Bird
# Survey collections (regardless of user permissions)
nc_count(collections = c("CBC", "BBS"), show = "all", username = "sample")
Download data records from various collections filtered by various options. In order to ease the load on the server, note that only three of collections/project_ids, species, years, doy, region, and site_type can be used in any one request. See the vignette for filtering your data after download for more options: vignette("filtering_data", package = "naturecounts").
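A script that builds requests programmatically could check the three-filter limit before contacting the server. The helper below is hypothetical and not part of naturecounts; it also assumes, based on the "collections/project_ids" wording, that those two arguments count as a single filter.

```r
# Hypothetical pre-flight check (not part of naturecounts)
count_filters <- function(collections = NULL, project_ids = NULL,
                          species = NULL, years = NULL, doy = NULL,
                          region = NULL, site_type = NULL) {
  # Assumption: collections and project_ids are treated as one filter
  used <- c(
    !is.null(collections) || !is.null(project_ids),
    !is.null(species), !is.null(years), !is.null(doy),
    !is.null(region), !is.null(site_type)
  )
  sum(used)
}

n <- count_filters(species = 14280, years = c(2015, NA),
                   region = list(statprov = "ON"), doy = c(200, 300))
if (n > 3) {
  message("Too many filters (", n, "): drop some and filter after download")
}
```

In this case four filters are set, so one of them (for example doy) would need to be applied after download instead.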
nc_data_dl( collections = NULL, project_ids = NULL, species = NULL, years = NULL, doy = NULL, region = NULL, site_type = NULL, fields_set = "minimum", fields = NULL, username, info = NULL, request_id = NULL, sql_db = NULL, warn = TRUE, timeout = 120, verbose = TRUE )
collections |
Character vector. The collection codes from which to download data. NULL (default) downloads data from all available collections |
project_ids |
Character/Numeric vector. The |
species |
Numeric vector. Numeric species ids (see details) |
years |
Numeric vector. The start/end years of data to download. Can use NA for either start or end, or a single value to return data from a single year. |
doy |
Character/Numeric vector. The start/end day-of-year to download (1-366 or dates that can be converted to day of year). Can use NA for either start or end |
region |
List. Named list with one of the following options:
|
site_type |
Character vector. The type of site to return (e.g., |
fields_set |
Character. Set of fields/columns to download. See details. |
fields |
Character vector. If |
username |
Character vector. Username for http://naturecounts.ca. If provided, the user will be prompted for a password. If left NULL, only public collections will be returned. |
info |
Character vector. Short description of reason for the download.
E.g., "COSEWIC report", "Impact Assessment Study", "School project", etc.
This kind of information helps NatureCounts.ca justify the utility of the
database. Required unless resuming/re-downloaded with a |
request_id |
Numeric. Specific request id to check or download. |
sql_db |
Character vector. Name and location of SQLite database to either create or add to |
warn |
Logical. Interactive warning if requesting more than 1,000,000 records to download. |
timeout |
Numeric. Number of seconds before connecting to the server times out. |
verbose |
Logical. Show messages? |
Data frame or connection to SQLite database
All public data is available with a username/password (sign up for a free NatureCounts account). However, to access private/semi-public projects/collections you must request access. See the Access and request_ids section for more information.
species
Numeric species id codes can be determined from the functions search_species() or search_species_code(). See also the article on species codes for more information.
doy
The format for day of year (doy) is fairly flexible and can be a whole number between 1 and 366 or anything recognized by the lubridate package's ymd() function. However, it must have the order of year, month, day. Note that year is ignored when converting to day of year, except that it will result in a 1 day offset for leap years.
region
Regions are defined by codes reflecting the country, state/province, subnational (level 2), Important Bird Areas (IBA), and Bird Conservation Regions (BCR) (see search_region() for codes). They can also be defined by providing specific UTM squares to download or a bounding box area which specifies the min/max longitude and min/max latitude (bbox). See the article on regional filters for more information.
fields_set and fields
By default data is downloaded with the minimum set of fields/columns. However, for more advanced applications, users may wish to specify which fields/columns to return. The Bird Monitoring Data Exchange (BMDE) schema keeps track of variables used to augment observation data. There are different versions reflecting different collections of variables which can be specified for download in one of four ways:
fields_set can be a specific shorthand reflecting a BMDE version: core, extended or minimum (default). See meta_bmde_versions() to see which BMDE version the shorthand refers to.
fields_set can be default which uses the default BMDE version for a particular collection (note that if you download more than one collection, the field sets will expand to cover all fields/columns in the combined collections).
fields_set can be the exact BMDE version. See meta_bmde_versions() for options.
fields_set can be custom and the fields argument can be a character vector specifying the exact fields/columns to return. See meta_bmde_fields() for potential fields values.
Note that in all cases there is a set of fields/columns that are always returned, no matter what fields_set is used.
Access and request_ids
Access to a data collection is either available as "full" or "by request". Use nc_count(username = "USER", show = "all"), to see the accessibility of collections.
"Full" access means that data can be immediately requested directly through the naturecounts R package. "By request" means that a request must be submitted online and approved before the data can be downloaded through naturecounts.
This means that there are two types of data requests: ones made through this naturecounts R package (API requests) and those made through the online Web Request Form (Web requests). Every request (from either method) generates a request_id which identifies the filter set and collections requested. Details of all requests can be reviewed with the nc_requests() function.
To download data with "full" access, users can either specify filters, or if they are repeating a download, can use the request_id from nc_requests(). Otherwise, if the user doesn't have "full" access, they must supply an approved request_id to the nc_data_dl() function (e.g., nc_data_dl(request_id = 152000, username = "USER")). Use nc_requests() to see request_ids, filters, and approval status.
Requests for "full" access to additional collections can be made online through the Web Request Form by checking the "Full access?" box in Step 2 of the form.
# All observations part of the SAMPLE1 and SAMPLE2 collections
sample <- nc_data_dl(collections = c("SAMPLE1", "SAMPLE2"),
                     username = "sample", info = "nc_example")

# All observations part of project_id 1042 accessible by "testuser"
p1042 <- nc_data_dl(project_ids = 1042, username = "testuser",
                    info = "nc_example")

# Black-capped Chickadees (BCCH) in SAMPLE2 collection in 2013
search_species("black-capped chickadee") # Find the species_id
bcch <- nc_data_dl(collections = "SAMPLE2", species = 14280, years = 2013,
                   username = "sample", info = "nc_example")

# All BCCH observations since 2015 accessible to user "sample"
bcch <- nc_data_dl(species = 14280, years = c(2015, NA),
                   username = "sample", info = "nc_example")

# All BCCH observations from mid-July to late October in all years for user "sample"
bcch <- nc_data_dl(species = 14280, doy = c(200, 300),
                   username = "sample", info = "nc_example")

# All BCCH observations from a specific bounding box for user "sample"
bcch <- nc_data_dl(species = 14280, username = "sample",
                   region = list(bbox = c(left = -100, bottom = 45,
                                          right = -80, top = 60)),
                   info = "nc_example")

# All American Bittern observations from user "sample"
search_species("american bittern")
bittern <- nc_data_dl(species = 2490, username = "sample", info = "nc_example")

# Different fields/columns
bittern <- nc_data_dl(species = 2490, fields_set = "core",
                      username = "sample", info = "nc_example")

bittern <- nc_data_dl(species = 2490, fields_set = "custom",
                      fields = c("Locality", "AllSpeciesReported"),
                      username = "sample", info = "nc_example")

## Not run:
# All collections by request id
# Specific collection by request id
my_data <- nc_data_dl(collections = "ABATLAS1", request_id = 000000,
                      username = "USER", info = "MY REASON")
## End(Not run)
Updates the local copies of metadata used by the package.
nc_metadata(force = FALSE, utm = FALSE, verbose = TRUE)
force |
Logical. Force update even if the remote version matches local? |
utm |
Logical. Update |
verbose |
Logical. Show progress messages? |
nc_metadata()
Returns a list of collections accessible by 'username'.
nc_permissions(username = NULL, timeout = 60)
username |
Character vector. Username for http://naturecounts.ca. If provided, the user will be prompted for a password. If left NULL, only public collections will be returned. |
timeout |
Numeric. Number of seconds before connecting to the server times out. |
All public data is available with a username/password (sign up for a free NatureCounts account). However, to access private/semi-public projects/collections you must request access. See the Access and request_ids section for more information.
Access and request_ids
Access to a data collection is either available as "full" or "by request". Use nc_count(username = "USER", show = "all") to see the accessibility of collections.
"Full" access means that data can be immediately requested directly through the naturecounts R package. "By request" means that a request must be submitted online and approved before the data can be downloaded through naturecounts.
This means that there are two types of data requests: ones made through this naturecounts R package (API requests) and those made through the online Web Request Form (Web requests). Every request (from either method) generates a request_id which identifies the filter set and collections requested. Details of all requests can be reviewed with the nc_requests() function.
To download data with "full" access, users can either specify filters or, if they are repeating a download, use the request_id from nc_requests(). Otherwise, if the user doesn't have "full" access, they must supply an approved request_id to the nc_data_dl() function (e.g., nc_data_dl(request_id = 152000, username = "USER")). Use nc_requests() to see request_ids, filters, and approval status.
Requests for "full" access to additional collections can be made online through the Web Request Form by checking the "Full access?" box in Step 2 of the form.
nc_permissions()
nc_permissions(username = "sample")
Generate custom table queries with the table name and filter arguments.
nc_query_table( table = NULL, ..., username = NULL, timeout = 120, verbose = FALSE )
table |
Character. Table to query (see details) |
... |
Name/value pairs for custom queries/filters (see details) |
username |
Character vector. Username for http://naturecounts.ca. If provided, the user will be prompted for a password. If left NULL, only public collections will be returned. |
timeout |
Numeric. Number of seconds before connecting to the server times out. |
verbose |
Logical. Show messages? |
Use nc_query_table(username = "sample") to list the available options.
data.frame()
# What tables are available? What 'filters' do they take? Are any 'required'?
nc_query_table(username = "sample")

# Query the bmde_filter_bad_dates table
d <- nc_query_table(table = "bmde_filter_bad_dates", username = "sample")
head(d)

# Filter our query by site
d <- nc_query_table(table = "bmde_filter_bad_dates", SiteCode = "DMBO",
                    username = "sample")
d

# Filter our query by species
d <- nc_query_table(table = "bmde_filter_bad_dates", species_id = 15770,
                    username = "sample")

# Want more than one species? Either filter after, or combine two queries

# Filter after
library(dplyr)
d <- nc_query_table(table = "bmde_filter_bad_dates", username = "sample")
d <- filter(d, species_id %in% c(15770, 9750))

# Combine two queries
d1 <- nc_query_table(table = "bmde_filter_bad_dates", species_id = 15770,
                     username = "sample")
d2 <- nc_query_table(table = "bmde_filter_bad_dates", species_id = 9750,
                     username = "sample")
d <- rbind(d1, d2)
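The "combine two queries" pattern above can be generalized to any number of species ids. The following is a minimal sketch: the helper query_each_species and the stand-in fake_query are hypothetical names, not part of naturecounts, and the stand-in is used only so the sketch runs without a server connection.

```r
# Hypothetical helper (not part of naturecounts): run one query per species id
# and stack the results. `query_fun` is any function that takes a single
# species id and returns a data frame.
query_each_species <- function(species_ids, query_fun) {
  results <- lapply(species_ids, query_fun)
  do.call(rbind, results)
}

# Stand-in query function so the sketch runs offline
fake_query <- function(id) {
  data.frame(species_id = id, n_records = 1L)
}

d <- query_each_species(c(15770, 9750), fake_query)
d
```

With the package loaded, query_fun could instead be function(id) nc_query_table(table = "bmde_filter_bad_dates", species_id = id, username = "sample"), which mirrors the calls shown in the examples.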
All server queries are cached for four hours to reduce server load. You can reset the cache at any time by either restarting your R session or running nc_remove_cache().
nc_remove_cache()
TRUE if it worked
nc_remove_cache()
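To illustrate the idea of a time-limited cache with a manual reset, here is a minimal base R sketch. It is not the package's implementation; make_cached, fetch, reset, and slow_lookup are hypothetical names chosen for this example, and the four-hour default simply mirrors the duration stated above.

```r
# Illustration only: a time-limited cache with a manual reset.
# This is NOT how naturecounts implements its cache.
make_cached <- function(fun, max_age_secs = 4 * 60 * 60) {
  cache <- new.env(parent = emptyenv())

  fetch <- function(key) {
    entry <- cache[[key]]
    age <- if (is.null(entry)) {
      Inf
    } else {
      as.numeric(difftime(Sys.time(), entry$time, units = "secs"))
    }
    # Recompute when the entry is missing or older than the limit
    if (age >= max_age_secs) {
      entry <- list(value = fun(key), time = Sys.time())
      cache[[key]] <- entry
    }
    entry$value
  }

  reset <- function() {
    rm(list = ls(cache, all.names = TRUE), envir = cache)
    invisible(TRUE)
  }

  list(fetch = fetch, reset = reset)
}

# Count how often the underlying (pretend) server call actually runs
calls <- 0
slow_lookup <- function(key) {
  calls <<- calls + 1
  toupper(key)
}

lookup <- make_cached(slow_lookup)
first  <- lookup$fetch("bcch")  # computed
second <- lookup$fetch("bcch")  # served from the cache
lookup$reset()                  # analogous in spirit to nc_remove_cache()
third  <- lookup$fetch("bcch")  # computed again after the reset
```

After these calls the underlying function has run twice, once before and once after the reset.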
List pending or completed data requests for a given user.
nc_requests(request_id = NULL, type = "web", username)
request_id |
Numeric. Specific request id to check or download. |
type |
Character. One of "web", "api", or "all" specifying which types of request to return (defaults to "web"). |
username |
Character vector. Username for http://naturecounts.ca. If provided, the user will be prompted for a password. If left NULL, only public collections will be returned. |
data frame
Access and request_ids
Access to a data collection is either available as "full" or "by request". Use nc_count(username = "USER", show = "all") to see the accessibility of collections.
"Full" access means that data can be immediately requested directly through the naturecounts R package. "By request" means that a request must be submitted online and approved before the data can be downloaded through naturecounts.
This means that there are two types of data requests: ones made through this naturecounts R package (API requests) and those made through the online Web Request Form (Web requests). Every request (from either method) generates a request_id which identifies the filter set and collections requested. Details of all requests can be reviewed with the nc_requests() function.
To download data with "full" access, users can either specify filters or, if they are repeating a download, use the request_id from nc_requests(). Otherwise, if the user doesn't have "full" access, they must supply an approved request_id to the nc_data_dl() function (e.g., nc_data_dl(request_id = 152000, username = "USER")). Use nc_requests() to see request_ids, filters, and approval status.
Requests for "full" access to additional collections can be made online through the Web Request Form by checking the "Full access?" box in Step 2 of the form.
nc_requests(username = "sample")
nc_requests(request_id = 152446, username = "sample")
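As a rough sketch of the "by request" workflow, the approved request ids can be picked out of a table of requests before passing one to nc_data_dl(). The column names request_id and status, the value "approved", and the helper approved_ids are all assumptions made for illustration. Check the actual output of nc_requests() before relying on them. The data frame below is a made-up stand-in so the sketch runs offline.

```r
# Hypothetical helper (not part of naturecounts). The column names and the
# "approved" value are ASSUMPTIONS; inspect nc_requests() output to confirm.
approved_ids <- function(requests, status_col = "status",
                         approved_value = "approved") {
  status <- requests[[status_col]]
  ok <- !is.na(status) & status == approved_value
  unique(requests$request_id[ok])
}

# Made-up stand-in for a table of requests (one row per collection)
requests <- data.frame(
  request_id = c(101, 101, 102),
  collection = c("A", "B", "C"),
  status = c("approved", "approved", "pending")
)

ids <- approved_ids(requests)
ids
```

An id selected this way would then be supplied as nc_data_dl(request_id = ..., username = "USER"), as described above.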
Search for the correct codes to identify countries, states/provinces, subnational2 areas, Important Bird Areas (IBA), or Bird Conservation Regions (BCR). These are then used in the nc_data_dl() and nc_count() functions.
search_region(name = NULL, type = "country")
name |
Character. The location name to search for |
type |
Character. One of "country", "statprov", "subnational2", "iba", or "bcr". The type of information to return. |
region_search() is deprecated in favour of search_region()
A data frame with the relevant codes and other information
search_region("Mexico", type = "country") # MX
search_region("Yucatan", type = "statprov") # Yucatán
search_region("Alberta", type = "statprov") # AB
search_region("Edmonton", type = "subnational2") # CA.AB.11
search_region("Brandon", type = "subnational2") # CA.MB.07
search_region("hays reservoir", type = "iba") # AB075
search_region("rainforest", type = "bcr") # 5

# Show all codes
search_region(type = "country")
search_region(type = "statprov")
search_region(type = "subnational2")
search_region(type = "iba")
search_region(type = "bcr")

# Using the codes
nc_count(region = list(statprov = "AB"), years = 2010)
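The last example passes a code as a named list, region = list(statprov = "AB"). A small sketch of building such a list from a search type and a code follows. The helper make_region is hypothetical, and it assumes that each search_region() type is also accepted as a region name, which the examples here confirm only for "statprov".

```r
# Hypothetical helper (not part of naturecounts): build a `region` list from a
# region type and code. ASSUMPTION: every search_region() type is a valid
# region name; only "statprov" is shown in the examples.
make_region <- function(type = c("country", "statprov", "subnational2",
                                 "iba", "bcr"),
                        code) {
  type <- match.arg(type)
  stats::setNames(list(code), type)
}

region <- make_region("statprov", "AB")
str(region)
```

The result is equivalent to list(statprov = "AB") and could be supplied as the region argument of nc_count() or nc_data_dl().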
Find species id codes by searching for scientific, English and French species names.
search_species(name = NULL, show = "names", authority = NULL)
name |
Character. The species name to search for |
show |
Character. Either "all" or "names" (default). Whether to return all taxonomic information or only a subset with species names |
authority |
Character. If not NULL (default), return the alphanumeric code associated with avian species for this taxonomic authority. |
species_search() is deprecated in favour of search_species()
Data frame of species ids and taxonomic information
# Show all ids
search_species()
search_species("chickadee")
search_species("black-capped chickadee")

# Add alphanumeric code for BSCDATA authority
search_species("black-capped chickadee", authority = "BSCDATA")

# Show all taxonomic information
search_species("black-capped chickadee", show = "all")

# Using the codes
nc_count(species = 14280)
This is an advanced function for returning all bird-related species id codes based on the various alphanumeric codes used by different authorities.
search_species_code(code = NULL, authority = "BSCDATA", results = "all")
code |
Vector. Character or numeric code indicating a species for a given authority. |
authority |
Character. The authority to compare codes against (defaults to "BSCDATA") |
results |
Character. "all" returns codes for all related species (including subspecies and main species). "exact" returns only the code for exact species indicated by the code. |
species_code_search() is deprecated in favour of search_species_code()
Species ids returned reflect both species and sub-species levels.
A data frame of numeric species id codes and names
# Show all ids
search_species_code()

# Get all species ids for house finches
search_species_code("HOFI")

# Get all species ids for Dark-eyed Juncos
search_species_code("DEJU")

# Get all species ids related to Yellow-rumped Warbler (Myrtle)
# NOTE! This includes Audubon's and the main, Yellow-rumped Warbler species
search_species_code("MYWA")

# Get ONLY specific id related to Yellow-rumped Warbler (Myrtle)
search_species_code("MYWA", results = "exact")

# Use the Christmas Bird Count authority
search_species_code(11609, authority = "CBC")

# Look in more than one authority (note that the code only needs to match one
# of the authorities)
search_species_code("MYWA", authority = c("BCMA", "CBC"))