Title: | Continuous Mapping of Genetic Diversity |
---|---|
Description: | Generate continuous maps of genetic diversity using moving windows with options for rarefaction, interpolation, and masking as described in Bishop et al. (2023) <doi:10.1111/2041-210X.14090>. |
Authors: | Anusha Bishop [aut, cre] |
Maintainer: | Anusha Bishop <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2.1.2 |
Built: | 2025-02-22 05:59:50 UTC |
Source: | https://github.com/anushapb/wingen |
Generate a continuous raster map of genetic diversity using circle moving windows
circle_gd( gen, coords, lyr, maxdist, distmat = NULL, stat = "pi", fact = 0, rarify = FALSE, rarify_n = 2, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05 )
circle_gd( gen, coords, lyr, maxdist, distmat = NULL, stat = "pi", fact = 0, rarify = FALSE, rarify_n = 2, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05 )
gen |
genetic data either as an object of type vcf or a path to a vcf file (note: order matters! The coordinate and genetic data should be in the same order; there are currently no checks for this) |
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
lyr |
SpatRaster or RasterLayer to slide the window across (see Details for important information about projections) |
maxdist |
maximum geographic distance used to define neighborhood; any samples further than this distance will not be included (this can be thought of as the neighborhood radius)
Can either be (1) a single numeric value or (2) a SpatRaster where each pixel is the maximum distance to be used for that cell on the landscape (must be the same spatial scale as |
distmat |
distance matrix output from get_geodist (optional; can be used to save time on distance calculations) |
stat |
genetic diversity statistic(s) to calculate (see Details, defaults to |
fact |
aggregation factor to apply to |
rarify |
if rarify = TRUE, rarefaction is performed (defaults to FALSE) |
rarify_n |
if rarify = TRUE, number of points to use for rarefaction (defaults to min_n) |
rarify_nit |
if rarify = TRUE, number of iterations to use for rarefaction (defaults to 5). Can also be set to |
min_n |
minimum number of samples to use in calculations (any focal cell with a window containing less than this number of samples will be assigned a value of NA; defaults to 2) |
fun |
function to use to summarize rarefaction results (defaults to mean, must take |
L |
for calculating |
rarify_alleles |
for calculating |
sig |
for calculating |
Coordinates and rasters should be in a Euclidean coordinate system (i.e., UTM coordinates) such that raster cell width and height are equal distances. As such, longitude-latitude systems should be transformed before using dist_gd. Transformation can be performed using st_set_crs for coordinates or project for rasters (see vignette for more details).
Coordinates and rasters should be in a projected (planar) coordinate system such that raster cells are of equal sizes. Therefore, spherical systems (including latitute-longitude coordinate systems) should be projected prior to use. Transformation can be performed using st_set_crs for coordinates or project for rasters (see vignette for more details).
Current genetic diversity metrics that can be specified with stat
include:
"pi"
for nucleotide diversity (default) calculated using hierfstat
pi.dosage. Use the L
argument to set the sequence length (defaults to dividing by the number of variants).
"Ho"
for average observed heterozygosity across all sites
"allelic_richness"
for average number of alleles across all sites
"biallelic_richness"
for average allelic richness across all sites for a biallelic dataset (this option is faster than "allelic_richness"
)
"hwe"
for the proportion of sites that are not in Hardy–Weinberg equilibrium, calculated using pegas
hw.test at the 0.05 level (other alpha levels can be specified by adding the sig argument; e.g., sig = 0.10
).
"basic_stats"
for a series of statistics produced by hierfstat
basic.stats including
mean observed heterozygosity (same as Ho), mean gene diversities within population (Hs),
Gene diversities overall (Ht), and Fis following Nei (1987). Population-based statistics (e.g., FST) normally reported by basic.stats
are not included as they are not meaningful within the individual-based moving windows.
SpatRaster that includes a raster layer of genetic diversity and a raster layer of the number of samples within the window for each cell
load_mini_ex() cpi <- circle_gd(mini_vcf, mini_coords, mini_lyr, fact = 2, maxdist = 5)
load_mini_ex() cpi <- circle_gd(mini_vcf, mini_coords, mini_lyr, fact = 2, maxdist = 5)
Generate a continuous raster map using circular moving windows.
While resist_gd is built specifically for making maps
of genetic diversity from vcfs,circle_general
can be used to make maps
from different data inputs. Unlike resist_gd
, resist_general
will not convert your data into the correct format for calculations of different
diversity metrics. See details for how to format data inputs for different statistics.
circle_general( x, coords, lyr, maxdist, distmat = NULL, stat, fact = 0, rarify = FALSE, rarify_n = 2, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, ... )
circle_general( x, coords, lyr, maxdist, distmat = NULL, stat, fact = 0, rarify = FALSE, rarify_n = 2, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, ... )
x |
data to be summarized by the moving window (note: order matters! |
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
lyr |
SpatRaster or RasterLayer to slide the window across (see Details for important information about projections) |
maxdist |
maximum geographic distance used to define neighborhood; any samples further than this distance will not be included (this can be thought of as the neighborhood radius)
Can either be (1) a single numeric value or (2) a SpatRaster where each pixel is the maximum distance to be used for that cell on the landscape (must be the same spatial scale as |
distmat |
distance matrix output from get_geodist (optional; can be used to save time on distance calculations) |
stat |
moving window statistic to calculate (see details). |
fact |
aggregation factor to apply to |
rarify |
if rarify = TRUE, rarefaction is performed (defaults to FALSE) |
rarify_n |
if rarify = TRUE, number of points to use for rarefaction (defaults to min_n) |
rarify_nit |
if rarify = TRUE, number of iterations to use for rarefaction (defaults to 5). Can also be set to |
min_n |
minimum number of samples to use in calculations (any focal cell with a window containing less than this number of samples will be assigned a value of NA; defaults to 2) |
fun |
function to use to summarize rarefaction results (defaults to mean, must take |
L |
for calculating |
rarify_alleles |
for calculating |
sig |
for calculating |
... |
if a function is provided for |
To calculate genetic diversity statistics with the built in wingen functions, data must be formatted as such:
for "pi"
or "biallelic_richness"
, x
must be a dosage matrix with values of 0, 1, or 2
for "Ho"
, x
must be a heterozygosity matrix where values of 0 = homozygosity and values of 1 = heterozygosity
for "allelic_richness"
or "hwe
, x
must be a genind
type object
for "basic_stats"
, x
must be a hierfstat
type object
Otherwise, stat
can be any function that takes a matrix or data frame and outputs a
single numeric value (e.g., a function that produces a custom diversity index);
however, this should be attempted with caution since this functionality has
not have been tested extensively and may produce errors.
SpatRaster that includes a raster layer of genetic diversity and a raster layer of the number of samples within the window for each cell
Generate a raster layer from coordinates which can be used in window_gd as the RasterLayer to move the window across
coords_to_raster( coords, buffer = 0, res = 1, agg = NULL, disagg = NULL, plot = FALSE )
coords_to_raster( coords, buffer = 0, res = 1, agg = NULL, disagg = NULL, plot = FALSE )
coords |
coordinates of samples as sf points, a SpatVector, a two-column matrix, or a data.frame with x and y coordinates |
buffer |
size of buffer to add to edge of raster (defaults to 0) |
res |
desired resolution of raster (defaults to 1). Can be a single value for square cells or a vector with two values representing x and y resolutions |
agg |
aggregation factor to apply to raster (defaults to NULL) |
disagg |
disaggregation factor to apply to raster (defaults to NULL) |
plot |
whether to plot resulting raster with coords (defaults to FALSE) |
RasterLayer
load_mini_ex() coords_to_raster(mini_coords, buffer = 1, plot = TRUE)
load_mini_ex() coords_to_raster(mini_coords, buffer = 1, plot = TRUE)
Create a distance matrix based on coordinates and a raster layer. The output is a distance matrix where rows represent cells on the landscape and columns represent individual locations on the landscape. Each value is the geographic distance between each individual and each cell calculated using st_distance. This matrix is used by circle_gd. If coords_only = TRUE, the result is a distance matrix for the sample coordinates only.
get_geodist(coords, lyr = NULL, fact = 0, coords_only = FALSE)
get_geodist(coords, lyr = NULL, fact = 0, coords_only = FALSE)
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
lyr |
SpatRaster or RasterLayer for generating distances (not required if coords_only = TRUE) |
fact |
aggregation factor to apply to |
coords_only |
whether to return distances only for sample coordinates |
a distance matrix used by circle_gd
load_mini_ex() distmat <- get_geodist(mini_coords, mini_lyr)
load_mini_ex() distmat <- get_geodist(mini_coords, mini_lyr)
Create a distance matrix based on coordinates and a connectivity layer. The output is a distance matrix where rows represent cells on the landscape and columns represent individual locations on the landscape. Each value is the resistance distance between each sample and each cell calculated using the gdistance package. This matrix is used by resist_gd. If coords_only = TRUE, the result is a distance matrix for the sample coordinates only.
get_resdist( coords, lyr, fact = 0, transitionFunction = mean, directions = 8, geoCorrection = TRUE, coords_only = FALSE )
get_resdist( coords, lyr, fact = 0, transitionFunction = mean, directions = 8, geoCorrection = TRUE, coords_only = FALSE )
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
lyr |
conductivity layer (higher values should mean greater conductivity) for generating distances. Can be either a SpatRaster or RasterLayer. |
fact |
aggregation factor to apply to |
transitionFunction |
function to calculate transition values from grid values (defaults to mean) |
directions |
directions in which cells are connected (4, 8, 16, or other), see adjacent (defaults to 8) |
geoCorrection |
whether to apply correction to account for local distances (defaults to TRUE). Geographic correction is necessary for all objects of the class Transition that are either: (1) based on a grid in a geographic (lonlat) projection and covering a large area; (2) made with directions > 4 (see geoCorrection for more details). |
coords_only |
whether to return distances only for sample coordinates |
a distance matrix used by resist_gd
load_mini_ex() distmat <- get_resdist(mini_coords, mini_lyr)
load_mini_ex() distmat <- get_resdist(mini_coords, mini_lyr)
Plot sample counts layer produced by window_gd or krig_gd
ggplot_count(x, index = NULL, col = viridis::mako(100))
ggplot_count(x, index = NULL, col = viridis::mako(100))
x |
single SpatRaster of counts or SpatRaster where indexed layer is sample counts |
index |
index of raster layers to plot (defaults to plotting the one called "sample_count", if more than one layer is provided) |
col |
color palette to use for plotting (defaults to viridis::mako palette) |
list of ggplots
data("mini_lyr") ggplot_count(mini_lyr)
data("mini_lyr") ggplot_count(mini_lyr)
Plot genetic diversity layer produced by window_gd or krig_gd
ggplot_gd(x, bkg = NULL, index = NULL, col = viridis::magma(100))
ggplot_gd(x, bkg = NULL, index = NULL, col = viridis::magma(100))
x |
output from window_gd or krig_gd (RasterStack where first layer is genetic diversity) |
bkg |
optional raster or sf polygon |
index |
index of raster layers to plot (defaults to plotting all of the layers except the one called "sample_count", if more than one layer is provided) |
col |
color palette to use for plotting (defaults to magma palette) |
list of ggplots
data("mini_lyr") ggplot_gd(mini_lyr)
data("mini_lyr") ggplot_gd(mini_lyr)
Perform interpolation of the raster(s) produced by window_gd using autoKrige
krig_gd( r, grd = NULL, index = 1, coords = NULL, agg_grd = NULL, disagg_grd = NULL, agg_r = NULL, disagg_r = NULL, autoKrige_output = FALSE, lower_bound = TRUE, upper_bound = TRUE, krig_method = "ordinary", resample = FALSE, resample_first = TRUE )
krig_gd( r, grd = NULL, index = 1, coords = NULL, agg_grd = NULL, disagg_grd = NULL, agg_r = NULL, disagg_r = NULL, autoKrige_output = FALSE, lower_bound = TRUE, upper_bound = TRUE, krig_method = "ordinary", resample = FALSE, resample_first = TRUE )
r |
SpatRaster produced by window_gd |
grd |
object to create grid for kriging; can be a SpatRaster or RasterLayer. If undefined, will use |
index |
integer indices of layers in raster stack to krige (defaults to 1; i.e., the first layer) |
coords |
if provided, kriging will occur based only on values at these coordinates. Can be provided as an sf points, a two-column matrix, or a data.frame representing x and y coordinates |
agg_grd |
factor to use for aggregation of |
disagg_grd |
factor to use for disaggregation of |
agg_r |
factor to use for aggregation of |
disagg_r |
factor to use for disaggregation, of |
autoKrige_output |
whether to return full output from autoKrige including uncertainty rasters (defaults to FALSE). If TRUE, returns a list with the kriged input raster layer ("raster"), kriged variance ("var"), kriged standard deviation ("stdev"), and full autoKrige output ("autoKrige_output"). |
lower_bound |
if TRUE (default), converts all values in the kriged raster less than the minimum value of the input raster, to that minimum |
upper_bound |
if TRUE (default), converts all values in the kriged raster greater than the maximum value of the input raster, to that maximum |
krig_method |
method to use for kriging. If |
resample |
whether to resample |
resample_first |
if aggregation or disaggregation is used in addition to resampling, specifies whether to resample before (resample_first = TRUE) or after (resample_first = FALSE) aggregation/disaggregation (defaults to TRUE) |
a SpatRaster object or a list of autoKrige outputs (if autoKrige_output = TRUE)
load_mini_ex() wpi <- window_gd(mini_vcf, mini_coords, mini_lyr, L = 10, rarify = TRUE) kpi <- krig_gd(wpi, mini_lyr) plot_gd(kpi, main = "Kriged Pi")
load_mini_ex() wpi <- window_gd(mini_vcf, mini_coords, mini_lyr, L = 10, rarify = TRUE) kpi <- krig_gd(wpi, mini_lyr) plot_gd(kpi, main = "Kriged Pi")
Loads middle earth example data
load_middle_earth_ex(quiet = FALSE)
load_middle_earth_ex(quiet = FALSE)
quiet |
whether to hide message (defaults to FALSE) |
three objects are loaded (lotr_vcf, lotr_coords, and lotr_lyrs)
load_middle_earth_ex()
load_middle_earth_ex()
Loads mini middle earth example data
load_mini_ex(quiet = FALSE)
load_mini_ex(quiet = FALSE)
quiet |
whether to hide message (defaults to FALSE) |
three objects are assigned in the GlobalEnv (vcf, coords, and lyr)
load_mini_ex()
load_mini_ex()
Middle earth example coordinates
lotr_coords
lotr_coords
A data frame with 100 rows and 2 columns
x coordinate
y coordinate
created from simulations in Bishop et al. (2023)
RasterLayer of middle earth based on an example digital elevation model of Tolkien’s Middle Earth produced by the Center for Geospatial Analysis at William & Mary (Robert, 2020).
lotr_lyr
lotr_lyr
RasterLayer
created from simulations in Bishop et al. (2023) based on Rose, Robert A. (2020) GIS & Middle Earth Presentation & Data Set. William & Mary. https://doi.org/10.21220/RKEZ-X707
sf polygon of range map
lotr_range
lotr_range
sf
created from simulations in Bishop et al. (2023)
A Variant Call Format data set
lotr_vcf
lotr_vcf
Object of class vcfR with 100 individuals and 1000 loci
created from simulations in Bishop et al. (2023)
Mask genetic diversity layer produced by window_gd or krig_gd
mask_gd(x, y, minval = NULL, maxval = NULL)
mask_gd(x, y, minval = NULL, maxval = NULL)
x |
Raster object to mask |
y |
Raster object or Spatial object to use as mask |
minval |
if y is a Raster object, value of y below which to mask |
maxval |
if y is a Raster object, value of y above which to mask |
RasterLayer
data("mini_lyr") mpi <- mask_gd(mini_lyr, mini_lyr, minval = 0.01)
data("mini_lyr") mpi <- mask_gd(mini_lyr, mini_lyr, minval = 0.01)
Mini middle earth example coordinates
mini_coords
mini_coords
A data frame with 10 rows and 2 columns
x coordinate
y coordinate
created from simulations in Bishop et al. (2023)
Small RasterLayer of middle earth based on an example digital elevation model of Tolkien's Middle Earth produced by the Center for Geospatial Analysis at William & Mary (Robert, 2020).
mini_lyr
mini_lyr
A RasterLayer of middle earth
created from simulations in Bishop et al. (2023)
A Variant Call Format data set
mini_vcf
mini_vcf
Object of class vcfR with 10 individuals and 10 loci
created from simulations in Bishop et al. (2023)
A Variant Call Format data set with NA values
mini_vcf_NA
mini_vcf_NA
Object of class vcfR with 10 individuals and 10 loci
created from simulations in Bishop et al. (2023)
Plot sample counts layer produced by window_gd or krig_gd
plot_count( x, index = NULL, breaks = 100, col = viridis::mako(breaks), main = NULL, box = FALSE, range = NULL, legend = TRUE, ... )
plot_count( x, index = NULL, breaks = 100, col = viridis::mako(breaks), main = NULL, box = FALSE, range = NULL, legend = TRUE, ... )
x |
single SpatRaster of counts or SpatRaster where indexed layer is sample counts |
index |
if a raster stack is provided, index of the sample count layer to plot (defaults to plotting the layer named "sample_count" or the last layer of the stack) |
breaks |
number of breaks to use in color scale (defaults to 10) |
col |
color palette to use for plotting (defaults to viridis::magma palette) |
main |
character. Main plot titles (one for each layer to be plotted). You can use arguments |
box |
whether to include a box around the raster plot (defaults to FALSE) |
range |
numeric. minimum and maximum values to be used for the continuous legend |
legend |
whether to include legend |
... |
arguments passed to |
plot of sample counts
data("mini_lyr") plot_count(mini_lyr)
data("mini_lyr") plot_count(mini_lyr)
Plot genetic diversity layer produced by window_gd or krig_gd
plot_gd( x, bkg = NULL, index = NULL, col = viridis::magma(breaks), breaks = 100, main = NULL, box = FALSE, range = NULL, legend = TRUE, ... )
plot_gd( x, bkg = NULL, index = NULL, col = viridis::magma(breaks), breaks = 100, main = NULL, box = FALSE, range = NULL, legend = TRUE, ... )
x |
output from window_gd or krig_gd (SpatRaster where first layer is genetic diversity) |
bkg |
optional SpatRaster or other spatial object that will be plotted as the "background" in gray |
index |
if a raster stack is provided, index of the layer to plot (defaults to plotting all layers except layers named "sample_count") |
col |
color palette to use for plotting (defaults to magma palette) |
breaks |
number of breaks to use in color scale (defaults to 100) |
main |
character. Main plot titles (one for each layer to be plotted). You can use arguments |
box |
whether to include a box around the Raster plot (defaults to FALSE) |
range |
numeric. minimum and maximum values to be used for the continuous legend |
legend |
whether to include legend |
... |
arguments passed to |
plot of genetic diversity
data("mini_lyr") plot_gd(mini_lyr)
data("mini_lyr") plot_gd(mini_lyr)
Generate a preview of moving window size and sample counts based on the coordinates and
parameters to be supplied to window_gd, circle_gd, or resist_gd.
The method to be used should be specified with method = "window"
, "circle"
, or "resist"
. For method = "window"
,
wdim
must be specified. For method = "circle"
or "resist"
, maxdist
must be specified and
distmat
can also optionally be specified.
preview_gd( lyr, coords, method = "window", wdim = 3, maxdist = NULL, distmat = NULL, fact = 0, sample_count = TRUE, min_n = 0, plot = TRUE )
preview_gd( lyr, coords, method = "window", wdim = 3, maxdist = NULL, distmat = NULL, fact = 0, sample_count = TRUE, min_n = 0, plot = TRUE )
lyr |
SpatRaster or RasterLayer to slide the window across (see Details for important information about projections). For |
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
method |
which method to use to create preview ( |
wdim |
if |
maxdist |
if |
distmat |
if |
fact |
aggregation factor to apply to |
sample_count |
whether to create plot of sample counts for each cell (defaults to TRUE) |
min_n |
minimum number of samples to use in calculations (any focal cell with a window containing less than this number of samples will be assigned a value of NA) |
plot |
whether to plot results (default = TRUE) |
Coordinates and rasters should be in a projected (planar) coordinate system such that raster cells are of equal sizes. Therefore, spherical systems (including latitute-longitude coordinate systems) should be projected prior to use. Transformation can be performed using st_set_crs for coordinates or project for rasters (see vignette for more details).
Plots preview of window and returns SpatRaster with sample counts layer (if sample_count = TRUE)
load_mini_ex() preview_gd(mini_lyr, mini_coords, wdim = 3, fact = 3, sample_count = TRUE, min_n = 2)
load_mini_ex() preview_gd(mini_lyr, mini_coords, wdim = 3, fact = 3, sample_count = TRUE, min_n = 2)
Generate a continuous raster map of genetic diversity using resistance distances calculated with a conductivity surface
resist_gd( gen, coords, lyr, maxdist, distmat = NULL, stat = "pi", fact = 0, rarify = FALSE, rarify_n = 2, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, transitionFunction = mean, directions = 8, geoCorrection = TRUE )
resist_gd( gen, coords, lyr, maxdist, distmat = NULL, stat = "pi", fact = 0, rarify = FALSE, rarify_n = 2, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, transitionFunction = mean, directions = 8, geoCorrection = TRUE )
gen |
genetic data either as an object of type vcf or a path to a vcf file (note: order matters! The coordinate and genetic data should be in the same order; there are currently no checks for this) |
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
lyr |
conductivity layer (higher values should mean greater conductivity) to move window across. Can be either a SpatRaster or RasterLayer. |
maxdist |
maximum cost distance used to define neighborhood; any samples further than this cost distance will not be included (this can be thought of as the neighborhood radius, but in terms of cost distance).
Can either be (1) a single numeric value or (2) a SpatRaster where each pixel is the maximum distance to be used for that cell on the landscape (must be the same spatial scale as |
distmat |
distance matrix output from get_resdist (optional; can be used to save time on distance calculations) |
stat |
genetic diversity statistic(s) to calculate (see Details, defaults to |
fact |
aggregation factor to apply to |
rarify |
if rarify = TRUE, rarefaction is performed (defaults to FALSE) |
rarify_n |
if rarify = TRUE, number of points to use for rarefaction (defaults to min_n) |
rarify_nit |
if rarify = TRUE, number of iterations to use for rarefaction (defaults to 5). Can also be set to |
min_n |
minimum number of samples to use in calculations (any focal cell with a window containing less than this number of samples will be assigned a value of NA; defaults to 2) |
fun |
function to use to summarize rarefaction results (defaults to mean, must take |
L |
for calculating |
rarify_alleles |
for calculating |
sig |
for calculating |
transitionFunction |
function to calculate transition values from grid values (defaults to mean) |
directions |
directions in which cells are connected (4, 8, 16, or other), see adjacent (defaults to 8) |
geoCorrection |
whether to apply correction to account for local distances (defaults to TRUE). Geographic correction is necessary for all objects of the class Transition that are either: (1) based on a grid in a geographic (lonlat) projection and covering a large area; (2) made with directions > 4 (see geoCorrection for more details). |
Coordinates and rasters should be in a Euclidean coordinate system (i.e., UTM coordinates) such that raster cell width and height are equal distances. As such, longitude-latitude systems should be transformed before using dist_gd. Transformation can be performed using st_set_crs for coordinates or project for rasters (see vignette for more details).
Coordinates and rasters should be in a projected (planar) coordinate system such that raster cells are of equal sizes. Therefore, spherical systems (including latitute-longitude coordinate systems) should be projected prior to use. Transformation can be performed using st_set_crs for coordinates or project for rasters (see vignette for more details).
Current genetic diversity metrics that can be specified with stat
include:
"pi"
for nucleotide diversity (default) calculated using hierfstat
pi.dosage. Use the L
argument to set the sequence length (defaults to dividing by the number of variants).
"Ho"
for average observed heterozygosity across all sites
"allelic_richness"
for average number of alleles across all sites
"biallelic_richness"
for average allelic richness across all sites for a biallelic dataset (this option is faster than "allelic_richness"
)
"hwe"
for the proportion of sites that are not in Hardy–Weinberg equilibrium, calculated using pegas
hw.test at the 0.05 level (other alpha levels can be specified by adding the sig argument; e.g., sig = 0.10
).
"basic_stats"
for a series of statistics produced by hierfstat
basic.stats including
mean observed heterozygosity (same as Ho), mean gene diversities within population (Hs),
Gene diversities overall (Ht), and Fis following Nei (1987). Population-based statistics (e.g., FST) normally reported by basic.stats
are not included as they are not meaningful within the individual-based moving windows.
SpatRaster that includes a raster layer of genetic diversity and a raster layer of the number of samples within the window for each cell
load_mini_ex() rpi <- resist_gd(mini_vcf, mini_coords, mini_lyr, maxdist = 50)
load_mini_ex() rpi <- resist_gd(mini_vcf, mini_coords, mini_lyr, maxdist = 50)
Generate a continuous raster map using resistance distances.
While resist_gd is built specifically for making maps
of genetic diversity from vcfs,resist_general
can be used to make maps
from different data inputs. Unlike resist_gd
, resist_general
will not convert your data into the correct format for calculations of different
diversity metrics. See details for how to format data inputs for different statistics.
resist_general( x, coords, lyr, maxdist, distmat = NULL, stat, fact = 0, rarify = FALSE, rarify_n = 2, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, transitionFunction = mean, directions = 8, geoCorrection = TRUE, ... )
resist_general( x, coords, lyr, maxdist, distmat = NULL, stat, fact = 0, rarify = FALSE, rarify_n = 2, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, transitionFunction = mean, directions = 8, geoCorrection = TRUE, ... )
x |
data to be summarized by the moving window (note: order matters! |
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
lyr |
SpatRaster or RasterLayer to slide the window across (see Details for important information about projections) |
maxdist |
maximum cost distance used to define neighborhood; any samples further than this cost distance will not be included (this can be thought of as the neighborhood radius, but in terms of cost distance).
Can either be (1) a single numeric value or (2) a SpatRaster where each pixel is the maximum distance to be used for that cell on the landscape (must be the same spatial scale as |
distmat |
distance matrix output from get_resdist (optional; can be used to save time on distance calculations) |
stat |
moving window statistic to calculate (see details). |
fact |
aggregation factor to apply to |
rarify |
if rarify = TRUE, rarefaction is performed (defaults to FALSE) |
rarify_n |
if rarify = TRUE, number of points to use for rarefaction (defaults to min_n) |
rarify_nit |
if rarify = TRUE, number of iterations to use for rarefaction (defaults to 5). Can also be set to |
min_n |
minimum number of samples to use in calculations (any focal cell with a window containing less than this number of samples will be assigned a value of NA; defaults to 2) |
fun |
function to use to summarize rarefaction results (defaults to mean, must take |
L |
for calculating |
rarify_alleles |
for calculating |
sig |
for calculating |
transitionFunction |
function to calculate transition values from grid values (defaults to mean) |
directions |
directions in which cells are connected (4, 8, 16, or other), see adjacent (defaults to 8) |
geoCorrection |
whether to apply correction to account for local distances (defaults to TRUE). Geographic correction is necessary for all objects of the class Transition that are either: (1) based on a grid in a geographic (lonlat) projection and covering a large area; (2) made with directions > 4 (see geoCorrection for more details). |
... |
if a function is provided for |
To calculate genetic diversity statistics with the built in wingen functions, data must be formatted as such:
for "pi"
or "biallelic_richness"
, x
must be a dosage matrix with values of 0, 1, or 2
for "Ho"
, x
must be a heterozygosity matrix where values of 0 = homozygosity and values of 1 = heterozygosity
for "allelic_richness"
or "hwe
, x
must be a genind
type object
for "basic_stats"
, x
must be a hierfstat
type object
Otherwise, stat
can be any function that takes a matrix or data frame and outputs a
single numeric value (e.g., a function that produces a custom diversity index);
however, this should be attempted with caution since this functionality has
not have been tested extensively and may produce errors.
SpatRaster that includes a raster layer of genetic diversity and a raster layer of the number of samples within the window for each cell
Convert a vcf to a dosage matrix
vcf_to_dosage(x)
vcf_to_dosage(x)
x |
can either be an object of class 'vcfR' or a path to a .vcf file |
dosage matrix
Generate a continuous raster map of genetic diversity using moving windows.
window_gd( gen, coords, lyr, stat = "pi", wdim = 3, fact = 0, rarify = FALSE, rarify_n = NULL, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, crop_edges = FALSE, ... )
window_gd( gen, coords, lyr, stat = "pi", wdim = 3, fact = 0, rarify = FALSE, rarify_n = NULL, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, crop_edges = FALSE, ... )
gen |
genetic data either as an object of type vcf or a path to a vcf file (note: order matters! The coordinate and genetic data should be in the same order; there are currently no checks for this) |
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
lyr |
SpatRaster or RasterLayer to slide the window across (see Details for important information about projections) |
stat |
genetic diversity statistic(s) to calculate (see Details, defaults to |
wdim |
dimensions (height x width) of window; if only one value is provided, a square window is created (defaults to 3 x 3 window) |
fact |
aggregation factor to apply to |
rarify |
if rarify = TRUE, rarefaction is performed (defaults to FALSE) |
rarify_n |
if rarify = TRUE, number of points to use for rarefaction (defaults to min_n) |
rarify_nit |
if rarify = TRUE, number of iterations to use for rarefaction (defaults to 5). Can also be set to |
min_n |
minimum number of samples to use in calculations (any focal cell with a window containing less than this number of samples will be assigned a value of NA; defaults to 2) |
fun |
function to use to summarize rarefaction results (defaults to mean, must take |
L |
for calculating |
rarify_alleles |
for calculating |
sig |
for calculating |
crop_edges |
whether to remove cells on the edge of the raster where the window is incomplete (defaults to FALSE) |
... |
deprecated this was intended to be used to pass additional arguments to the |
Coordinates and rasters should be in a projected (planar) coordinate system such that raster cells are of equal sizes. Therefore, spherical systems (including latitute-longitude coordinate systems) should be projected prior to use. Transformation can be performed using st_set_crs for coordinates or project for rasters (see vignette for more details).
Current genetic diversity metrics that can be specified with stat
include:
"pi"
for nucleotide diversity (default) calculated using hierfstat
pi.dosage. Use the L
argument to set the sequence length (defaults to dividing by the number of variants).
"Ho"
for average observed heterozygosity across all sites
"allelic_richness"
for average number of alleles across all sites
"biallelic_richness"
for average allelic richness across all sites for a biallelic dataset (this option is faster than "allelic_richness"
)
"hwe"
for the proportion of sites that are not in Hardy–Weinberg equilibrium, calculated using pegas
hw.test at the 0.05 level (other alpha levels can be specified by adding the sig argument; e.g., sig = 0.10
).
"basic_stats"
for a series of statistics produced by hierfstat
basic.stats including
mean observed heterozygosity (same as Ho), mean gene diversities within population (Hs),
Gene diversities overall (Ht), and Fis following Nei (1987). Population-based statistics (e.g., FST) normally reported by basic.stats
are not included as they are not meaningful within the individual-based moving windows.
SpatRaster that includes raster layers of genetic diversity and a raster layer of the number of samples within the window for each cell
load_mini_ex() wpi <- window_gd(mini_vcf, mini_coords, mini_lyr, rarify = TRUE)
load_mini_ex() wpi <- window_gd(mini_vcf, mini_coords, mini_lyr, rarify = TRUE)
Generate a continuous raster map using moving windows. While window_gd is
built specifically for making moving window maps of genetic diversity from vcfs,
window_general
can be used to make moving window maps from different data inputs.
See details for how to format data inputs for different statistics.
window_general( x, coords, lyr, stat, wdim = 3, fact = 0, rarify = FALSE, rarify_n = NULL, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, crop_edges = FALSE, ... )
window_general( x, coords, lyr, stat, wdim = 3, fact = 0, rarify = FALSE, rarify_n = NULL, rarify_nit = 5, min_n = 2, fun = mean, L = "nvariants", rarify_alleles = TRUE, sig = 0.05, crop_edges = FALSE, ... )
x |
data to be summarized by the moving window (note: order matters! |
coords |
coordinates of samples as sf points, a two-column matrix, or a data.frame representing x and y coordinates (see Details for important information about projections) |
lyr |
SpatRaster or RasterLayer to slide the window across (see Details for important information about projections) |
stat |
moving window statistic to calculate (can either be |
wdim |
dimensions (height x width) of window; if only one value is provided, a square window is created (defaults to 3 x 3 window) |
fact |
aggregation factor to apply to |
rarify |
if rarify = TRUE, rarefaction is performed (defaults to FALSE) |
rarify_n |
if rarify = TRUE, number of points to use for rarefaction (defaults to min_n) |
rarify_nit |
if rarify = TRUE, number of iterations to use for rarefaction (defaults to 5). Can also be set to |
min_n |
minimum number of samples to use in calculations (any focal cell with a window containing less than this number of samples will be assigned a value of NA; defaults to 2) |
fun |
function to use to summarize rarefaction results (defaults to mean, must take |
L |
for calculating |
rarify_alleles |
for calculating |
sig |
for calculating |
crop_edges |
whether to remove cells on the edge of the raster where the window is incomplete (defaults to FALSE) |
... |
if a function is provided for |
To calculate genetic diversity statistics with the built in wingen functions, data must be formatted as such:
for "pi"
or "biallelic_richness"
, x
must be a dosage matrix with values of 0, 1, or 2
for "Ho"
, x
must be a heterozygosity matrix where values of 0 = homozygosity and values of 1 = heterozygosity
for "allelic_richness"
or "hwe
, x
must be a genind
type object
for "basic_stats"
, x
must be a hierfstat
type object
Otherwise, stat
can be any function that takes a matrix or data frame and outputs a
single numeric value (e.g., a function that produces a custom diversity index);
however, this should be attempted with caution since this functionality has
not have been tested extensively and may produce errors.
SpatRaster that includes a raster layer of genetic diversity and a raster layer of the number of samples within the window for each cell