% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/mpp_CV.R
\name{mpp_CV}
\alias{mpp_CV}
\title{MPP cross-validation}
\usage{
mpp_CV(
  pop.name = "MPP_CV",
  trait.name = "trait1",
  mppData,
  trait = 1,
  her = 1,
  Rep = 10,
  k = 5,
  Q.eff = "cr",
  thre.cof = 3,
  win.cof = 50,
  N.cim = 1,
  window = 20,
  thre.QTL = 3,
  win.QTL = 20,
  backward = TRUE,
  alpha.bk = 0.05,
  n.cores = 1,
  verbose = TRUE,
  output.loc
)
}
\arguments{
\item{pop.name}{\code{Character} name of the studied population.
Default = "MPP_CV".}

\item{trait.name}{\code{Character} name of the studied trait.
Default = "trait1".}

\item{mppData}{An object of class \code{mppData}.}

\item{trait}{\code{Numerical} or \code{character} indicator to specify which
trait of the \code{mppData} object should be used. Default = 1.}

\item{her}{\code{Numeric} value between 0 and 1 representing the heritability
of the trait. \code{her} can be a single value or a vector specifying each
within cross heritability. Default = 1.}

\item{Rep}{\code{Numeric} value representing the number of repetitions of the
k-fold procedure. Default = 10.}

\item{k}{\code{Numeric} value representing the number of folds for the within
cross partition of the population. Default = 5.}

\item{Q.eff}{\code{Character} expression indicating the assumption concerning
the QTL effects: 1) "cr" for cross-specific; 2) "par" for parental; 3) "anc"
for ancestral; 4) "biall" for bi-allelic. For more details see
\code{\link{mpp_SIM}}. Default = "cr".}

\item{thre.cof}{\code{Numeric} value representing the -log10(p-value)
threshold above which a position can be selected as a cofactor. Default = 3.}

\item{win.cof}{\code{Numeric} value in centi-Morgan representing the minimum
distance between two selected cofactors. Default = 50.}

\item{N.cim}{\code{Numeric} value specifying the number of times the CIM
analysis is repeated. Default = 1.}

\item{window}{\code{Numeric} distance (cM) on the left and the right of a
cofactor position where it is not included in the model. Default = 20.}

\item{thre.QTL}{\code{Numeric} value representing the -log10(p-value)
threshold above which a position can be selected as QTL. Default = 3.}

\item{win.QTL}{\code{Numeric} value in centi-Morgan representing the minimum
distance between two selected QTLs. Default = 20.}

\item{backward}{\code{Logical} value. If \code{backward = TRUE},
the function performs a backward elimination on the list of selected QTLs.
Default = TRUE.}

\item{alpha.bk}{\code{Numeric} value indicating the significance level for
the backward elimination. Terms with p-values above this value will
iteratively be removed. Default = 0.05.}

\item{n.cores}{\code{Numeric} value specifying the number of cores to use.
Default = 1.}

\item{verbose}{\code{Logical} value indicating whether the progress of the CV
should be printed. Default = TRUE.}

\item{output.loc}{Path where a folder will be created to save the results.}
}
\value{
\code{List} containing the following result items:

\item{CV_res}{\code{Data.frame} containing for each CV run: 1) the number
of detected QTLs; 2) the proportion of explained genetic variance in the TS
(p.ts); 3) the proportion of predicted genetic variance in the VS (p.vs) at
the population level (average of within cross prediction); 4) the bias between
p.ts and p.vs (bias = 1-(p.vs/p.ts)).}

\item{p.vs.cr}{\code{Matrix} containing the within cross p.vs for each CV run.}

\item{QTL}{\code{Data.frame} containing: 1) the list of QTL positions detected
at least once during the entire CV process; 2) the number of times
each position has been detected; 3) the average partial p.ts of the QTL
position; 4) the average partial p.vs of the QTL position; 5) the average
partial bias of the QTL position.}

\item{QTL.profiles}{\code{Data.frame} containing the -log10(p-value) QTL
profiles of the different CV runs.}


The result elements returned as R objects are also saved as text
files at the specified output location (\code{output.loc}). A transparency
plot of the CV results (plot.pdf) is also saved.
}
\description{
Evaluation of the MPP QTL detection procedure by cross-validation (CV).
}
\details{
For details on the MPP QTL detection models see the \code{\link{mpp_SIM}}
documentation. The CV scheme is adapted from Utz et al. (2000) to the MPP
context. A single CV run works as follows:

\enumerate{

\item{Generation of a k-fold partition of the data. The partition is done
within crosses: each cross is divided into k subsets. Then, for the ith
repetition (i = 1, ..., k), the ith subset is used as the validation set and
the remaining subsets form the training set.}

\item{For the ith repetition, use of the training set for cofactor
selection and multi-QTL model determination (\code{\link{mpp_SIM}} and
\code{\link{mpp_CIM}}). If \code{backward = TRUE}, the final list of QTLs is
tested simultaneously using a backward elimination
(\code{\link{mpp_back_elim}}).}

\item{Use the list of detected QTLs in the training set to calculate
the proportion of genetic variance explained by all detected QTLs in the
training set (p.ts = R2.ts/h2), where R2.ts is the adjusted
R squared and h2 is the average within cross heritability (\code{her}). By
default, \code{her = 1}, which means that p.ts represents the proportion of
phenotypic variance explained in the training set.

For each single QTL effect, a partial (difference) R squared is also
calculated as the difference between the R squared of a model with all QTLs
and that of a model without the ith position. For details about R
squared computation and adjustment see \code{\link{QTL_R2}}.}

\item{Use the estimates of the QTL effects in the training set (B.ts) to
predict the phenotypic values of the validation set: y.pred.vs = X.vs*B.ts.
The predicted R squared in the validation set is computed as the squared
Pearson correlation coefficient between the observed values (y.vs) and the
predicted values (y.pred.vs): R2.vs = cor(y.vs, y.pred.vs)^2. Then
the predicted genetic variance in the validation set (p.vs) is equal to
p.vs = R2.vs/h2. For heritability correction, the user can provide a single
value for the average within cross heritability or a vector specifying each
within cross heritability. By default, \code{her = 1}, which means that the
results represent the proportion of phenotypic variance explained (predicted)
in the training (validation) sets.

The predicted R squared is computed per cross and then averaged
at the population level (p.vs). Both results are returned. Partial QTL
predicted R squared are also calculated using the difference between the
predicted R squared using all QTL and the predicted R squared without QTL i.
The bias between p.ts and p.vs is calculated as bias = 1 - (p.vs/p.ts).

  }

}
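
As an illustrative calculation with hypothetical values (chosen only to show
the arithmetic, not taken from any data set), suppose R2.ts = 0.30,
R2.vs = 0.20 and h2 = 0.4. The quantities defined above would then be:

\deqn{p.ts = 0.30 / 0.4 = 0.75, \quad p.vs = 0.20 / 0.4 = 0.50, \quad bias = 1 - 0.50 / 0.75 \approx 0.33}{p.ts = 0.30/0.4 = 0.75; p.vs = 0.20/0.4 = 0.50; bias = 1 - 0.50/0.75 = 0.33 (approx.)}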
}
\examples{

\dontrun{

data(mppData)

# Specify a location where your results will be saved
my.loc <- tempdir()

CV <- mpp_CV(pop.name = "USNAM", trait.name = "ULA", mppData = mppData,
her = .4, Rep = 1, k = 3, verbose = FALSE, output.loc = my.loc)
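
# A hedged follow-up sketch: the element names below are taken from the Value
# section of this page. Their exact column layout is not assumed here, hence
# the generic str(), dim() and head() calls.
str(CV$CV_res)       # per CV run: number of QTLs, p.ts, p.vs and bias
dim(CV$p.vs.cr)      # within cross p.vs for each CV run
head(CV$QTL)         # QTL positions detected at least once
list.files(my.loc)   # the results folder is created under output.loc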

}
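
# The CV steps described in Details can be sketched with base R only. This is
# a minimal illustration on simulated data, NOT the internal code of mpp_CV():
# the QTL detection step is replaced by a fixed two-marker linear model, and
# all object names (cross, geno, dat, folds, res) are hypothetical.

set.seed(1)
n.cr <- 3; n.ind <- 40                         # 3 crosses of 40 genotypes
cross <- rep(seq_len(n.cr), each = n.ind)
geno <- data.frame(m1 = rbinom(n.cr * n.ind, 1, 0.5),
                   m2 = rbinom(n.cr * n.ind, 1, 0.5))
y <- 2 * geno$m1 - geno$m2 + rnorm(n.cr * n.ind)
dat <- cbind(y = y, geno)

k <- 4; h2 <- 0.4                              # folds and assumed heritability

# Step 1: k-fold partition done within each cross
folds <- ave(seq_along(cross), cross,
             FUN = function(i) sample(rep_len(seq_len(k), length(i))))

res <- t(sapply(seq_len(k), function(i) {
  ts <- folds != i                             # training set
  vs <- !ts                                    # validation set

  # Steps 2-3: fit the model in the training set, p.ts = adjusted R2 / h2
  fit <- lm(y ~ m1 + m2, data = dat[ts, ])
  p.ts <- summary(fit)$adj.r.squared / h2

  # Step 4: predict the validation set, compute the squared correlation
  # per cross, then average over crosses and divide by h2
  y.pred.vs <- predict(fit, newdata = dat[vs, ])
  R2.vs.cr <- sapply(split(seq_along(y.pred.vs), cross[vs]), function(j) {
    cor(dat$y[vs][j], y.pred.vs[j])^2
  })
  p.vs <- mean(R2.vs.cr) / h2

  c(p.ts = p.ts, p.vs = p.vs, bias = 1 - p.vs / p.ts)
}))

res   # one row per fold, columns p.ts, p.vs and bias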

}
\references{
Utz, H. F., Melchinger, A. E., & Schon, C. C. (2000). Bias and sampling error
of the estimated proportion of genotypic variance explained by quantitative
trait loci determined from experimental data in maize using cross validation
and validation with independent samples. Genetics, 154(4), 1839-1849.
}
\seealso{
\code{\link{mpp_back_elim}},
\code{\link{mpp_CIM}},
\code{\link{mpp_perm}},
\code{\link{mpp_SIM}},
\code{\link{QTL_R2}}
}
\author{
Vincent Garin
}
