ldsc_rg() uses LD score regression to estimate the pairwise genetic correlations between traits. The function relies on named lists of traits, sample prevalences, and population prevalences. The name of each trait must be the same in every one of these lists.
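Because the list arguments are matched by trait name, it can help to check the names before calling the function. The following is a minimal base-R sketch. The trait names, file paths, and prevalence values are hypothetical placeholders, and names_match() is an illustrative helper rather than part of the package.

```r
# Hypothetical inputs: trait names, paths, and prevalences are placeholders.
munged_sumstats <- list(
  T2D = "path/to/T2D_munged.txt.gz",
  CAD = "path/to/CAD_munged.txt.gz"
)
sample_prev <- list(T2D = 0.30, CAD = 0.40)
population_prev <- list(T2D = 0.10, CAD = 0.05)

# Check that every additional list uses exactly the same trait names
# as the reference list (order does not matter).
names_match <- function(reference, ...) {
  all(vapply(
    list(...),
    function(x) setequal(names(x), names(reference)),
    logical(1)
  ))
}

names_ok <- names_match(munged_sumstats, sample_prev, population_prev)
```

If names_ok is FALSE, at least one prevalence list is missing a trait or uses a different spelling than munged_sumstats.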
Usage
ldsc_rg(
  munged_sumstats,
  ancestry,
  sample_prev = NA,
  population_prev = NA,
  ld,
  wld,
  n_blocks = 200,
  chisq_max = NA,
  chr_filter = seq(1, 22, 1)
)
Arguments
- munged_sumstats
(list) A named list of data frames, or paths to files, containing munged summary statistics. Each set of munged summary statistics must contain at least columns named SNP (rsid), A1 (effect allele), A2 (non-effect allele), N (total sample size), and Z (Z-score).
- ancestry
(character) One of "AFR", "AMR", "CSA", "EAS", "EUR", or "MID", which will utilize the appropriate built-in ld and wld files from Pan-UK Biobank. If empty or NULL, the user must specify paths to ld and wld files.
- sample_prev
(list) A named list containing the prevalence of cases in the current sample, used for conversion from observed heritability to liability-scale heritability. The default is NA, which is appropriate for quantitative traits or estimating heritability on the observed scale.
- population_prev
(list) A named list containing the population prevalence of the trait, used for conversion from observed heritability to liability-scale heritability. The default is NA, which is appropriate for quantitative traits or estimating heritability on the observed scale.
- ld
(character) Path to directory containing ld score files, ending in *.l2.ldscore.gz. Default is NA, which will utilize the built-in ld score files from Pan-UK Biobank for the ancestry specified in ancestry.
- wld
(character) Path to directory containing weight files. Default is NA, which will utilize the built-in weight files from Pan-UK Biobank for the ancestry specified in ancestry.
- n_blocks
(numeric) Number of blocks used to produce block jackknife standard errors. Default is 200.
- chisq_max
(numeric) Maximum value of Z^2 for SNPs to be included in LD-score regression. Default is to set chisq_max to the maximum of 80 and N*0.001.
- chr_filter
(numeric vector) Chromosomes to include in analysis. Separating even/odd chromosomes may be useful for exploratory/confirmatory factor analysis.
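Two of the arguments above take values that are easy to build in base R. The sketch below constructs odd and even chromosome sets for chr_filter and reproduces the documented chisq_max default. default_chisq_max() is a hypothetical helper, and treating N as the largest per-SNP sample size is an assumption, since the text above only says "N*0.001".

```r
# Split the autosomes into odd and even sets, e.g. to run separate
# exploratory and confirmatory analyses via chr_filter.
autosomes <- seq(1, 22, 1)
chr_odd <- autosomes[autosomes %% 2 == 1]
chr_even <- autosomes[autosomes %% 2 == 0]

# Documented default for chisq_max: the maximum of 80 and N * 0.001.
# Assumption: N is taken here as the largest per-SNP sample size.
default_chisq_max <- function(n) {
  max(80, 0.001 * max(n, na.rm = TRUE))
}

# Small sample: 0.001 * 60000 = 60, so the floor of 80 applies.
small_n_cutoff <- default_chisq_max(c(50000, 60000))

# Large sample: 0.001 * 450000 exceeds 80, so the cutoff scales with N.
large_n_cutoff <- default_chisq_max(c(400000, 450000))
```

Either chromosome vector could then be passed as chr_filter, for example chr_filter = chr_odd.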
Value
A list of class ldscr_list containing heritability and genetic correlation information:
- h2 = tibble containing heritability information for each trait. If sample_prev and population_prev were provided, the heritability estimates will also be returned on the liability scale.
- rg = tibble containing pairwise genetic correlation information.
- raw = a list of correlation/covariance matrices.
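For orientation, the conversion from observed-scale to liability-scale heritability is commonly written as h2_liab = h2_obs * K^2 (1 - K)^2 / (P (1 - P) z^2), where K is the population prevalence, P is the sample prevalence, and z is the standard normal density at the liability threshold. The sketch below implements that standard formula in base R. h2_to_liability() is a hypothetical helper, and it is an assumption that the package applies exactly this expression internally.

```r
# Standard observed-to-liability scale conversion (hypothetical helper;
# not part of the package's documented interface).
#   h2_obs:          heritability on the observed scale
#   sample_prev:     proportion of cases in the sample (P)
#   population_prev: prevalence of the trait in the population (K)
h2_to_liability <- function(h2_obs, sample_prev, population_prev) {
  # Liability threshold with a fraction K of the population above it
  threshold <- qnorm(1 - population_prev)
  # Height of the standard normal density at that threshold
  z <- dnorm(threshold)
  conversion <- (population_prev^2 * (1 - population_prev)^2) /
    (sample_prev * (1 - sample_prev) * z^2)
  h2_obs * conversion
}

# Illustrative values only (not real estimates)
h2_liab <- h2_to_liability(
  h2_obs = 0.10,
  sample_prev = 0.5,
  population_prev = 0.05
)
```

With sample_prev and population_prev left as NA there is nothing to convert, which matches the documented behaviour of reporting heritability on the observed scale only.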
Details
This function estimates the pairwise genetic correlations between an arbitrary number of traits. The function also estimates heritability for each individual trait. There is a ggplot2::autoplot() method for visualizing a heatmap of the results.
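To get a feel for how the number of pairwise estimates grows with the number of traits, the unique pairs can be enumerated in base R. This is only a sketch: "HDL" and the column names trait1 and trait2 are hypothetical, and the actual layout of the rg tibble may differ.

```r
# Trait names used only to enumerate the pairs ("HDL" is hypothetical).
traits <- c("APOB", "LDL", "HDL")

# One row per unordered pair of distinct traits.
trait_pairs <- t(combn(traits, 2))
colnames(trait_pairs) <- c("trait1", "trait2")

# k traits give choose(k, 2) unique pairs.
n_pairs <- nrow(trait_pairs)
```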
Examples
if (FALSE) {
  # Estimate genetic correlations between "APOB" and "LDL"
  ldsc_res <- ldsc_rg(
    munged_sumstats = list(
      "APOB" = sumstats_munged_example(example = "APOB"),
      "LDL" = sumstats_munged_example(example = "LDL")
    ),
    ancestry = "EUR"
  )
  # Plot heatmap of results
  autoplot(ldsc_res)
}