Computes the absolute distance between the midpoint of each x interval and the midpoint of its closest y interval.

bed_absdist(x, y, genome)

Arguments

x

tbl_interval()

y

tbl_interval()

genome

tbl_genome()

Value

tbl_interval() with .absdist and .absdist_scaled columns.

Details

Absolute distances are scaled by the inter-reference gap for each chromosome. For Q query points and R reference points on a chromosome, the distance from each query point i to its closest reference point is divided by the average inter-reference gap, which is the chromosome length divided by R. If an x interval has no matching y chromosome, .absdist is NA.

$$d_i(x,y) = \min_k(|q_i - r_k|)\,\frac{R}{\text{Length of chromosome}}$$

Both absolute and scaled distances are reported as .absdist and .absdist_scaled.
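
The scaling can be illustrated with a small base R sketch for a single chromosome. This is not valr's implementation: absdist_sketch() and its arguments are hypothetical names introduced here, and the use of unrounded midpoints is an assumption, since this page does not say how midpoints are rounded.

```r
# Hypothetical helper, not part of valr: absolute and scaled distances
# for the intervals of a single chromosome.
absdist_sketch <- function(x_start, x_end, y_start, y_end, chrom_size) {
  # Midpoints of the query (x) and reference (y) intervals
  q <- (x_start + x_end) / 2
  r <- (y_start + y_end) / 2

  # Distance from each query midpoint to the closest reference midpoint;
  # NA when there are no reference intervals on this chromosome
  absdist <- vapply(
    q,
    function(qi) if (length(r) == 0) NA_real_ else min(abs(qi - r)),
    numeric(1)
  )

  # Multiply by R / chromosome length, i.e. divide by the inter-reference gap
  scaled <- absdist * length(r) / chrom_size

  data.frame(start = x_start, end = x_end,
             .absdist = absdist, .absdist_scaled = scaled)
}

res <- absdist_sketch(
  x_start = c(100, 500), x_end = c(200, 600),
  y_start = c(150, 900), y_end = c(250, 1000),
  chrom_size = 1000
)
print(res)
```

In this toy input the query midpoints are 150 and 550 and the reference midpoints are 200 and 950, so the absolute distances are 50 and 350. With R = 2 reference points on a chromosome of length 1000, the scaled distances are 0.1 and 0.7.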

Interval statistics can be used in combination with dplyr::group_by() and dplyr::do() to calculate statistics for subsets of data. See vignette('interval-stats') for examples.

Examples

genome <- read_genome(valr_example('hg19.chrom.sizes.gz'))

x <- bed_random(genome, seed = 1010486)
y <- bed_random(genome, seed = 9203911)

bed_absdist(x, y, genome)
#> # A tibble: 1,000,000 x 5
#>    chrom start   end .absdist .absdist_scaled
#>    <chr> <int> <int>    <dbl>           <dbl>
#>  1 chr1   5255  6255    16723           5.39
#>  2 chr1   7381  8381    14597           4.70
#>  3 chr1  13993 14993     7985           2.57
#>  4 chr1  14102 15102     7876           2.54
#>  5 chr1  14273 15273     7705           2.48
#>  6 chr1  16621 17621     5357           1.73
#>  7 chr1  17724 18724     4254           1.37
#>  8 chr1  22801 23801      823           0.265
#>  9 chr1  30996 31996     1379           0.444
#> 10 chr1  41948 42948      954           0.307
#> # ... with 999,990 more rows