Projection test for query interval overlap.

bed_projection(x, y, genome, by_chrom = FALSE)

Arguments

x

tbl_interval()

y

tbl_interval()

genome

tbl_genome()

by_chrom

if TRUE, compute the test separately for each chromosome

Value

tbl_interval() with the following columns:

  • chrom the name of the chromosome tested if by_chrom = TRUE; otherwise the value is whole_genome

  • p.value p-value from a binomial test. p-values greater than 0.5 are reported as 1 - p-value, and lower_tail is then FALSE

  • obs_exp_ratio ratio of observed to expected overlap frequency

  • lower_tail TRUE indicates the observed overlaps are in the lower tail of the distribution (i.e., less overlap than expected). FALSE indicates that the observed overlaps are in the upper tail of the distribution (i.e., more overlap than expected)
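
How these columns relate to one another can be illustrated with a minimal base R sketch. It is not valr's implementation: projection_summary() and its arguments (n_query, n_overlap, covered_bp, genome_bp) are hypothetical names, and the expected overlap probability is assumed here to be the fraction of the genome covered by the reference intervals.

```r
# Hypothetical helper, not part of valr: build a one-row summary with the
# same column names as the Value section, starting from raw counts.
projection_summary <- function(n_query, n_overlap, covered_bp, genome_bp) {
  # Assumption: the chance that one query point falls in a reference
  # interval equals reference coverage divided by genome size.
  exp_prob <- covered_bp / genome_bp

  # Lower-tail binomial probability of observing at most n_overlap hits.
  p <- stats::pbinom(n_overlap, size = n_query, prob = exp_prob)

  # Values above 0.5 are folded to 1 - p and flagged as upper tail.
  lower_tail <- p <= 0.5
  p_value <- if (lower_tail) p else 1 - p

  data.frame(
    chrom = "whole_genome",
    p.value = p_value,
    obs_exp_ratio = (n_overlap / n_query) / exp_prob,
    lower_tail = lower_tail
  )
}

# Fewer overlaps than expected (80 observed, 100 expected).
res <- projection_summary(n_query = 1000, n_overlap = 80,
                          covered_bp = 1e8, genome_bp = 1e9)
print(res)
```

With fewer overlaps than expected, the sketch reports lower_tail as TRUE and an obs_exp_ratio below 1. With more overlaps than expected, the raw lower-tail probability exceeds 0.5, so it is folded and lower_tail becomes FALSE.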

Details

Interval statistics can be used in combination with dplyr::group_by() and dplyr::do() to calculate statistics for subsets of data. See vignette('interval-stats') for examples.

Examples

genome <- read_genome(valr_example('hg19.chrom.sizes.gz'))

x <- bed_random(genome, seed = 1010486)
y <- bed_random(genome, seed = 9203911)

bed_projection(x, y, genome)
#> # A tibble: 1 x 4
#>   chrom        p.value obs_exp_ratio lower_tail
#>   <chr>          <dbl>         <dbl> <chr>
#> 1 whole_genome   0.481         1.000 TRUE

bed_projection(x, y, genome, by_chrom = TRUE)
#> # A tibble: 25 x 4
#>    chrom p.value obs_exp_ratio lower_tail
#>    <chr>   <dbl>         <dbl> <chr>
#>  1 chr1  0.425           1.00  FALSE
#>  2 chr10 0.0303          0.986 TRUE
#>  3 chr11 0.0718          1.01  FALSE
#>  4 chr12 0.0600          1.01  FALSE
#>  5 chr13 0.00708         1.02  FALSE
#>  6 chr14 0.457           0.999 TRUE
#>  7 chr15 0.345           1.00  FALSE
#>  8 chr16 0.0979          0.988 TRUE
#>  9 chr17 0.167           1.01  FALSE
#> 10 chr18 0.423           0.998 TRUE
#> # ... with 15 more rows