Title: | Latent Space-Based Transfer Learning |
---|---|
Description: | Implements transfer learning methods for low-rank matrix estimation. These methods leverage similarity in the latent row and column spaces between the source and target populations to improve estimation in the target population. The methods include the LatEnt spAce-based tRaNsfer lEaRning (LEARNER) method and the direct projection LEARNER (D-LEARNER) method described by McGrath et al. (2024) <doi:10.48550/arXiv.2412.20605>. |
Authors: | Sean McGrath [aut, cre], Cenhao Zhu [aut], Rui Duan [aut] |
Maintainer: | Sean McGrath <[email protected]> |
License: | GPL (>=3) |
Version: | 0.2.0 |
Built: | 2025-01-12 20:27:01 UTC |
Source: | https://github.com/stmcg/learner |
This function performs k-fold cross-validation to select the nuisance parameters $(\lambda_1, \lambda_2)$ for learner.
cv.learner(Y_source, Y_target, r, lambda_1_all, lambda_2_all, step_size, n_folds = 4, n_cores = 1, control = list())
Y_source | matrix containing the source population data, as in learner |
Y_target | matrix containing the target population data, as in learner |
r | (optional) integer specifying the rank of the knowledge graphs, as in learner |
lambda_1_all | vector of numerics specifying the candidate values of $\lambda_1$ |
lambda_2_all | vector of numerics specifying the candidate values of $\lambda_2$ |
step_size | numeric scalar specifying the step size for the Newton steps in the numerical optimization algorithm, as in learner |
n_folds | an integer specifying the number of cross-validation folds. The default is 4. |
n_cores | an integer specifying the number of CPU cores used for parallelization. Parallelization is performed across the different candidate $(\lambda_1, \lambda_2)$ pairs. The default is 1. |
control | a list of parameters for controlling the stopping criteria for the numerical optimization algorithm, as in learner |
Given sets of candidate values of $\lambda_1$ and $\lambda_2$, this function performs k-fold cross-validation to select the pair $(\lambda_1, \lambda_2)$ with the smallest held-out error. The function randomly partitions the entries of Y_target into n_folds (approximately) equally sized subsamples. Each training data set is obtained by removing one of the n_folds subsamples, and the corresponding test data set consists of the held-out subsample. The learner function is applied to each training data set. The held-out error is the mean squared error comparing the entries in the test data sets with those imputed from the LEARNER estimates. See McGrath et al. (2024) for further details.
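The entry-wise partitioning and held-out error described above can be sketched in base R. This is only an illustration under stated assumptions: the data are simulated, and `fit_stub()` is a hypothetical stand-in estimator (mean imputation followed by a truncated SVD) used in place of learner so that the block runs without the package.

```r
# Illustrative sketch only: fit_stub() is a hypothetical stand-in for learner().
set.seed(1)
p <- 20; q <- 10; r <- 2; n_folds <- 4

# Made-up low-rank target matrix plus noise
Theta <- matrix(rnorm(p * r), p, r) %*% t(matrix(rnorm(q * r), q, r))
Y_target <- Theta + matrix(rnorm(p * q, sd = 0.1), p, q)

# Randomly assign every entry of Y_target to one of n_folds subsamples
fold_id <- matrix(sample(rep_len(seq_len(n_folds), p * q)), p, q)

# Stand-in estimator: fill NAs with column means, then take a rank-r truncated SVD
fit_stub <- function(Y, r) {
  col_means <- colMeans(Y, na.rm = TRUE)
  idx <- which(is.na(Y), arr.ind = TRUE)
  Y[idx] <- col_means[idx[, "col"]]
  s <- svd(Y, nu = r, nv = r)
  s$u %*% diag(s$d[seq_len(r)], nrow = r) %*% t(s$v)
}

fold_mse <- numeric(n_folds)
for (k in seq_len(n_folds)) {
  Y_train <- Y_target
  Y_train[fold_id == k] <- NA          # remove the k-th subsample from the training data
  est <- fit_stub(Y_train, r)          # estimate from the remaining entries
  # Held-out error: compare test entries with the values imputed by the estimate
  fold_mse[k] <- mean((Y_target[fold_id == k] - est[fold_id == k])^2)
}
cv_error <- mean(fold_mse)
```

In a cross-validation over a grid, this loop would be repeated for each candidate $(\lambda_1, \lambda_2)$ pair and the pair with the smallest average held-out error would be kept.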
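A list with the following elements:
lambda_1_min | value of $\lambda_1$ with the smallest MSE |
lambda_2_min | value of $\lambda_2$ with the smallest MSE |
mse_all | matrix containing the MSE value for each $(\lambda_1, \lambda_2)$ pair |
r | rank value used |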
McGrath, S., Zhu, C., Guo, M. and Duan, R. (2024). LEARNER: A transfer learning method for low-rank matrix estimation. arXiv preprint arXiv:2412.20605.
res <- cv.learner(Y_source = dat_highsim$Y_source, Y_target = dat_highsim$Y_target, lambda_1_all = c(1, 10, 100), lambda_2_all = c(1, 10, 100), step_size = 0.003)
This data set contains simulated data in the source and target populations where there is a high degree of similarity in the underlying latent spaces between these populations.
dat_highsim
A list containing the observed and true matrices in the source and target populations. The list contains the following components:
Y_source
A matrix representing the observed source population matrix.
Y_target
A matrix representing the observed target population matrix.
Theta_source
A rank 3 matrix representing the true source population matrix.
Theta_target
A rank 3 matrix representing the true target population matrix.
In this data set, there is a high degree of similarity in the latent spaces between the source and target populations. Specifically, the true source population matrix was obtained by reversing the order of the singular values of the true target population matrix. The observed target population matrix was obtained by adding independent and identically distributed noise to the entries of the true target population matrix. The noise was generated from a normal distribution with mean 0 and standard deviation 1. The observed source population matrix was generated analogously from the true source population matrix, with noise of standard deviation 0.5.
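A data set with this structure could be simulated roughly as follows. This is a sketch, not the script used to build dat_highsim: the dimensions, singular values, and seed are made up, and noise is added to each true matrix to obtain its observed counterpart.

```r
# Illustrative sketch; dimensions and singular values are assumptions.
set.seed(1)
p <- 40; q <- 25; r <- 3

# Made-up rank-3 target matrix built from orthonormal singular vectors
U <- qr.Q(qr(matrix(rnorm(p * r), p, r)))
V <- qr.Q(qr(matrix(rnorm(q * r), q, r)))
d <- c(12, 8, 4)
Theta_target <- U %*% diag(d) %*% t(V)

# Source matrix: same singular vectors, singular values in reversed order
Theta_source <- U %*% diag(rev(d)) %*% t(V)

# Observed matrices: add i.i.d. normal noise (sd 1 for target, 0.5 for source)
Y_target <- Theta_target + matrix(rnorm(p * q, sd = 1), p, q)
Y_source <- Theta_source + matrix(rnorm(p * q, sd = 0.5), p, q)
```

Because both true matrices share the same singular vectors, their row and column spaces coincide exactly, which is what makes the latent-space similarity high in this construction.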
This data set contains simulated data in the source and target populations where there is a moderate degree of similarity in the underlying latent spaces between these populations.
dat_modsim
A list containing the observed and true matrices in the source and target populations. The list contains the following components:
Y_source
A matrix representing the observed source population matrix.
Y_target
A matrix representing the observed target population matrix.
Theta_source
A rank 3 matrix representing the true source population matrix.
Theta_target
A rank 3 matrix representing the true target population matrix.
In this data set, there is a moderate degree of similarity in the latent spaces between the source and target populations. Specifically, the true source population matrix was obtained by (i) reversing the order of the singular values of the true target population matrix and (ii) adding perturbations to the left and right singular vectors of the true target population matrix. The observed target population matrix was obtained by adding independent and identically distributed noise to the entries of the true target population matrix. The noise was generated from a normal distribution with mean 0 and standard deviation 1. The observed source population matrix was generated analogously from the true source population matrix, with noise of standard deviation 0.5.
This function applies the direct projection LatEnt spAce-based tRaNsfer lEaRning (D-LEARNER) method (McGrath et al. 2024) to leverage data from a source population to improve estimation of a low-rank matrix in an underrepresented target population.
dlearner(Y_source, Y_target, r)
Y_source | matrix containing the source population data |
Y_target | matrix containing the target population data |
r | (optional) integer specifying the rank of the knowledge graphs. By default, ScreeNOT (Donoho et al. 2023) is applied to the source population knowledge graph to select the rank. |
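Data and notation:
The data consist of a matrix in the target population, $Y_0 \in \mathbb{R}^{p \times q}$, and a matrix in the source population, $Y_1 \in \mathbb{R}^{p \times q}$. Let $\hat{U}_k \hat{\Lambda}_k \hat{V}_k^{\top}$ denote the rank-$r$ truncated singular value decomposition (SVD) of $Y_k$, $k = 0, 1$.
For $k = 0, 1$, one can view $Y_k$ as a noisy version of $\Theta_k$, referred to as the knowledge graph. The target of inference is the target population knowledge graph, $\Theta_0$.
Estimation:
This method estimates $\Theta_0$ by $\hat{U}_1 \hat{U}_1^{\top} Y_0 \hat{V}_1 \hat{V}_1^{\top}$.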
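As a rough illustration of the direct projection idea, the sketch below projects a target matrix onto the row and column spaces estimated from a rank-r truncated SVD of the source matrix. It assumes the projection form given in McGrath et al. (2024); the simulated data and all variable names are illustrative and do not come from the package source.

```r
# Minimal sketch; variable names are illustrative, not the package internals.
set.seed(1)
p <- 30; q <- 20; r <- 3

# Made-up knowledge graphs that share the same latent row and column spaces
U <- qr.Q(qr(matrix(rnorm(p * r), p, r)))
V <- qr.Q(qr(matrix(rnorm(q * r), q, r)))
Theta_1 <- U %*% diag(c(30, 20, 10)) %*% t(V)   # source
Theta_0 <- U %*% diag(c(10, 20, 30)) %*% t(V)   # target
Y_1 <- Theta_1 + matrix(rnorm(p * q, sd = 0.1), p, q)
Y_0 <- Theta_0 + matrix(rnorm(p * q, sd = 1), p, q)

# Rank-r truncated SVD of the source matrix
svd_1 <- svd(Y_1, nu = r, nv = r)
U_1 <- svd_1$u
V_1 <- svd_1$v

# Project the target data onto the estimated source latent spaces
Theta_0_hat <- U_1 %*% t(U_1) %*% Y_0 %*% V_1 %*% t(V_1)

# Frobenius errors of the raw target data and of the projected estimate
err_raw <- norm(Y_0 - Theta_0, "F")
err_proj <- norm(Theta_0_hat - Theta_0, "F")
```

The projection discards the part of the target noise that lies outside the source latent spaces, so it helps most when those spaces are close to the target's own.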
A list with the following components:
dlearner_estimate | matrix containing the D-LEARNER estimate of the target population knowledge graph |
r | rank value used |
Donoho, D., Gavish, M. and Romanov, E. (2023). ScreeNOT: Exact MSE-optimal singular value thresholding in correlated noise. The Annals of Statistics, 51(1), pp.122-148.
res <- dlearner(Y_source = dat_highsim$Y_source, Y_target = dat_highsim$Y_target)
This function applies the LatEnt spAce-based tRaNsfer lEaRning (LEARNER) method (McGrath et al. 2024) to leverage data from a source population to improve estimation of a low-rank matrix in an underrepresented target population.
learner(Y_source, Y_target, r, lambda_1, lambda_2, step_size, control = list())
Y_source | matrix containing the source population data |
Y_target | matrix containing the target population data |
r | (optional) integer specifying the rank of the knowledge graphs. By default, ScreeNOT (Donoho et al. 2023) is applied to the source population knowledge graph to select the rank. |
lambda_1 | numeric scalar specifying the value of $\lambda_1$ |
lambda_2 | numeric scalar specifying the value of $\lambda_2$ |
step_size | numeric scalar specifying the step size for the Newton steps in the numerical optimization algorithm |
control | a list of parameters for controlling the stopping criteria for the numerical optimization algorithm, namely the convergence threshold, the maximum number of iterations, and the maximum value of the objective function |
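Data and notation:
The data consist of a matrix in the target population, $Y_0 \in \mathbb{R}^{p \times q}$, and a matrix in the source population, $Y_1 \in \mathbb{R}^{p \times q}$. Let $\hat{U}_k \hat{\Lambda}_k \hat{V}_k^{\top}$ denote the rank-$r$ truncated singular value decomposition (SVD) of $Y_k$, $k = 0, 1$.
For $k = 0, 1$, one can view $Y_k$ as a noisy version of $\Theta_k$, referred to as the knowledge graph. The target of inference is the target population knowledge graph, $\Theta_0$.
Estimation:
This method estimates $\Theta_0$ by $\tilde{U} \tilde{V}^{\top}$, where $(\tilde{U}, \tilde{V})$ is the solution to the following optimization problem
$$\underset{U \in \mathbb{R}^{p \times r},\, V \in \mathbb{R}^{q \times r}}{\arg\min} \left\{ \| U V^{\top} - Y_0 \|_F^2 + \lambda_1 \| \mathcal{P}_{\perp}(\hat{U}_1) U \|_F^2 + \lambda_1 \| \mathcal{P}_{\perp}(\hat{V}_1) V \|_F^2 + \lambda_2 \| U^{\top} U - V^{\top} V \|_F^2 \right\}$$
where $\mathcal{P}_{\perp}(\hat{U}_1) = I - \hat{U}_1 \hat{U}_1^{\top}$ and $\mathcal{P}_{\perp}(\hat{V}_1) = I - \hat{V}_1 \hat{V}_1^{\top}$.
This function uses an alternating minimization strategy to solve the optimization problem. That is, this approach updates $U$ by taking a gradient descent step on the objective function with $V$ treated as fixed. Then, $V$ is updated with $U$ treated as fixed. These updates of $U$ and $V$ are repeated until convergence.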
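The alternating updates can be sketched in base R as below. This is an illustration, not the package implementation. It assumes the objective from McGrath et al. (2024): a squared-error fit to the target data, $\lambda_1$-weighted penalties on the parts of $U$ and $V$ lying outside the source latent spaces, and a $\lambda_2$-weighted penalty on $U^{\top} U - V^{\top} V$. The simulated data, the initialization from the source SVD, the iteration cap, and the stopping threshold are all assumptions of this sketch.

```r
# Illustrative sketch of alternating gradient steps; not the package source.
set.seed(1)
p <- 30; q <- 20; r <- 3
lambda_1 <- 1; lambda_2 <- 1; step_size <- 0.003

# Made-up data: shared latent spaces, singular values in reversed order
U_true <- qr.Q(qr(matrix(rnorm(p * r), p, r)))
V_true <- qr.Q(qr(matrix(rnorm(q * r), q, r)))
Y_1 <- U_true %*% diag(c(10, 6, 3)) %*% t(V_true) + matrix(rnorm(p * q, sd = 0.1), p, q)
Y_0 <- U_true %*% diag(c(3, 6, 10)) %*% t(V_true) + matrix(rnorm(p * q, sd = 0.5), p, q)

# Projections onto the orthogonal complements of the source latent spaces
svd_1 <- svd(Y_1, nu = r, nv = r)
P_U <- diag(p) - svd_1$u %*% t(svd_1$u)
P_V <- diag(q) - svd_1$v %*% t(svd_1$v)

objective <- function(U, V) {
  sum((U %*% t(V) - Y_0)^2) +
    lambda_1 * sum((P_U %*% U)^2) + lambda_1 * sum((P_V %*% V)^2) +
    lambda_2 * sum((crossprod(U) - crossprod(V))^2)
}

# Assumed initialization: balanced factors from the source SVD
U <- svd_1$u %*% diag(sqrt(svd_1$d[1:r]))
V <- svd_1$v %*% diag(sqrt(svd_1$d[1:r]))

obj <- objective(U, V)
for (iter in 1:200) {
  # Gradient step in U with V held fixed
  D <- crossprod(U) - crossprod(V)
  grad_U <- 2 * (U %*% t(V) - Y_0) %*% V + 2 * lambda_1 * P_U %*% U + 4 * lambda_2 * U %*% D
  U <- U - step_size * grad_U
  # Gradient step in V with the updated U held fixed
  D <- crossprod(U) - crossprod(V)
  grad_V <- 2 * t(U %*% t(V) - Y_0) %*% U + 2 * lambda_1 * P_V %*% V - 4 * lambda_2 * V %*% D
  V <- V - step_size * grad_V
  obj <- c(obj, objective(U, V))
  if (abs(obj[iter + 1] - obj[iter]) < 1e-3) break  # illustrative threshold
}
Theta_0_hat <- U %*% t(V)
```

Tracking the objective at each iteration, as `obj` does here, makes it easy to check that the chosen step size is small enough for the objective to decrease.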
A list with the following elements:
learner_estimate | matrix containing the LEARNER estimate of the target population knowledge graph |
objective_values | numeric vector containing the values of the objective function at each iteration |
convergence_criterion | integer specifying the criterion that was satisfied for terminating the numerical optimization algorithm. A value of 1 indicates that the convergence threshold was satisfied; a value of 2 indicates that the maximum number of iterations was reached; a value of 3 indicates that the maximum value of the objective function was reached. |
r | rank value used |
McGrath, S., Zhu, C., Guo, M. and Duan, R. (2024). LEARNER: A transfer learning method for low-rank matrix estimation. arXiv preprint arXiv:2412.20605.
Donoho, D., Gavish, M. and Romanov, E. (2023). ScreeNOT: Exact MSE-optimal singular value thresholding in correlated noise. The Annals of Statistics, 51(1), pp.122-148.
res <- learner(Y_source = dat_highsim$Y_source, Y_target = dat_highsim$Y_target, lambda_1 = 1, lambda_2 = 1, step_size = 0.003)