Applies a permutation-based jackstraw test to identify which features in
each data block have joint loadings that differ significantly from zero
in a Rajive decomposition.
Usage
jackstraw_rajive(
  ajive_output,
  blocks,
  alpha = 0.05,
  n_null = 10,
  correction = c("bonferroni", "BH", "none")
)
Arguments
- ajive_output
List returned by Rajive.
- blocks
List of data matrices (same list passed to Rajive).
- alpha
Numeric scalar; desired significance level. Default 0.05.
- n_null
Positive integer; number of null F-statistics generated per feature per joint component. Larger values give more stable p-values at the cost of computation time. Default 10; recommended 50–100 for publication-quality results.
- correction
Character string controlling multiple-testing correction. One of "bonferroni" (default), "BH" (Benjamini–Hochberg FDR), or "none". "bonferroni" divides alpha by \(d_k \times \text{joint\_rank}\) for each block, matching the original Python implementation.
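To make the "bonferroni" rule concrete, the sketch below works through the arithmetic for one hypothetical block. The values of d_k, joint_rank, and the raw p-values are invented for illustration, and the package's internal code may be organised differently.

```r
# Hypothetical values, chosen only for illustration.
alpha      <- 0.05
d_k        <- 100   # number of features in block k
joint_rank <- 2     # number of joint components

# Bonferroni: each raw p-value is compared to alpha / (d_k * joint_rank).
n_tests   <- d_k * joint_rank
threshold <- alpha / n_tests

p_raw <- c(1e-05, 2e-04, 3e-04, 0.01)   # made-up empirical p-values
sig_threshold <- p_raw <= threshold

# Equivalent view: inflate the p-values by n_tests and compare to alpha.
p_adj <- p.adjust(p_raw, method = "bonferroni", n = n_tests)
sig_adjusted <- p_adj <= alpha
```

Both views flag the same features; the second one corresponds to reporting adjusted p-values alongside the raw ones.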
Value
An object of class "jackstraw_rajive": a named list with one
element per block (block1, block2, ...). Each element
is itself a list with one element per joint component
(comp1, comp2, ...) containing:
- f_obs
Length-\(d_k\) numeric vector of observed F-statistics.
- f_null
\(d_k \times\) n_null matrix of null F-statistics.
- p_values
Empirical p-values (length \(d_k\)).
- p_adj
Multiple-testing adjusted p-values (length \(d_k\)).
- significant
Named logical vector (length \(d_k\)) indicating significance.
- significant_vars
Integer indices (or column names when available) of significant features.
The object also carries attributes alpha, correction,
joint_rank, and n_blocks.
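As a rough illustration of this layout, the sketch below builds a small toy object by hand and pulls results out of it. The object is a stand-in that mimics the documented structure (only a few of the elements are filled in); it is not produced by jackstraw_rajive(), and the feature names g1 to g3 are hypothetical.

```r
# Toy stand-in with the documented shape: block -> component -> results.
toy <- structure(
  list(
    block1 = list(
      comp1 = list(
        p_adj            = c(g1 = 0.001, g2 = 0.40, g3 = 0.02),
        significant      = c(g1 = TRUE, g2 = FALSE, g3 = TRUE),
        significant_vars = c("g1", "g3")
      )
    )
  ),
  class = "jackstraw_rajive",
  alpha = 0.05, correction = "bonferroni", joint_rank = 1L, n_blocks = 1L
)

# Results for the first joint component of the first block.
comp <- toy$block1$comp1
comp$significant_vars              # names of the significant features
names(which(comp$significant))     # the same set, derived from the logical vector

# Settings used for the test are stored as attributes.
attr(toy, "alpha")
attr(toy, "correction")
```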
Details
For each data block \(k\) and each joint component \(j\), the observed F-statistic for the regression feature ~ joint_score_j + 1 is compared to a null distribution generated by permuting randomly sampled feature values, thereby breaking the association with the joint scores. Empirical p-values are computed and optionally corrected for multiple testing.
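The procedure described above can be sketched in base R for a single block and a single joint component. This is a simplified illustration on simulated data, not the package's implementation: the joint score is held fixed, the null statistics are pooled across features when computing p-values, and all names (score, X, f_stat) are assumptions of the sketch.

```r
set.seed(1)
n <- 50; d <- 20; n_null <- 10

# Simulated data: one joint score, and a block whose first 5 features load on it.
score <- rnorm(n)
X <- matrix(rnorm(n * d), n, d)
X[, 1:5] <- X[, 1:5] + 2 * outer(score, rep(1, 5))

# F-statistic of the simple regression feature ~ score + 1,
# computed from the squared correlation.
f_stat <- function(y, s) {
  r2 <- cor(y, s)^2
  r2 / (1 - r2) * (length(y) - 2)
}

# Observed F-statistic for every feature.
f_obs <- apply(X, 2, f_stat, s = score)

# Null F-statistics (d x n_null): permute each feature's values to break
# its association with the score, then recompute the statistic.
f_null <- t(sapply(seq_len(d), function(j) {
  replicate(n_null, f_stat(sample(X[, j]), score))
}))

# Empirical p-values against the pooled null (pooling is an assumption here).
p_values <- sapply(f_obs, function(f) {
  (1 + sum(f_null >= f)) / (1 + length(f_null))
})
```

With only n_null permutations per feature the null distribution is coarse, which is why larger n_null values give more stable p-values.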
References
Yang X, Hoadley KA, Hannig J, Marron JS (2021). Statistical inference for data integration. arXiv:2109.12272.
Chung NC, Storey JD (2015). Statistical significance of variables driving systematic variation in high-dimensional data. Bioinformatics, 31(4):545–554.
Examples
# \donttest{
set.seed(42)
n <- 50
pks <- c(100, 80)
Y <- ajive.data.sim(K = 2, rankJ = 2, rankA = c(5, 4), n = n,
                    pks = pks, dist.type = 1)
data.ajive <- Y$sim_data
initial_signal_ranks <- c(5, 4)
ajive_result <- Rajive(data.ajive, initial_signal_ranks)
js <- jackstraw_rajive(ajive_result, data.ajive, alpha = 0.05, n_null = 10)
print(js)
#> JIVE Jackstraw Significance Test
#> Joint rank: 1   Alpha: 0.05   Correction: bonferroni
#>
#> Block      Component    N features   N significant
#> ----------------------------------------------------
#> block1     comp1        100          60
#> block2     comp1        80           51
summary(js)
#>  block component n_features n_significant alpha correction
#> block1     comp1        100            60  0.05 bonferroni
#> block2     comp1         80            51  0.05 bonferroni
# }