% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/selectDVforEV.R
\name{selectDVforEV}
\alias{selectDVforEV}
\title{Select parsimonious sets of derived variables.}
\usage{
selectDVforEV(data, dvdata, alpha = 0.01, dir = NULL, trainmax = NULL)
}
\arguments{
\item{data}{Data frame containing the response variable in the first column
and explanatory variables in subsequent columns. The response variable
should represent presence/background data, coded as 1 (presence) and NA
(background). See \code{\link{readData}}.}

\item{dvdata}{List of data frames, with each data frame containing derived
variables for a given explanatory variable (e.g. the first item in the list
returned by \code{\link{deriveVars}}).}

\item{alpha}{Alpha level used in the F-test comparison of nested models.
Default is 0.01.}

\item{dir}{Directory to which files will be written during subset selection
of derived variables. Defaults to the working directory.}

\item{trainmax}{Integer. Maximum number of uninformed background points to be
used to train the models. May be used to reduce computation time for data
sets with very large numbers of points. Default is no maximum. See Details
for more information.}
}
\value{
List of 2 items, or 3 if \code{trainmax} reduces the number of
  uninformed background points: \enumerate{ \item A list of data frames, with
  each data frame containing the \emph{selected} DVs for a given EV. This
  item is recommended as input for \code{dvdata} in \code{\link{selectEV}}.
  \item A list of data frames, where each data frame shows the trail of
  forward selection of DVs for a given EV. \item (Only if \code{trainmax}
  reduces the number of uninformed background points) A new \code{data}
  object. See Details. }
}
\description{
For each explanatory variable (EV), \code{selectDVforEV} selects the
parsimonious set of derived variables (DVs) that best explains variation in a
given response variable. The function uses forward selection based on
comparison of nested models by the F-test. A DV is selected for inclusion
when, during nested model comparison, it accounts for a significant amount of
the remaining variation at the alpha level specified by the user.
}
\details{
The F-statistic that \code{selectDVforEV} uses for nested model comparison is
calculated using equation 59 in Halvorsen (2013). See Halvorsen et al. (2015)
for a more detailed explanation of the forward selection procedure.

If the derived variables were created using \code{\link{deriveVars}}, the
same response variable (RV) should be used in \code{selectDVforEV}, because
the deviation and spline transformations produced by \code{deriveVars} are
specific to the RV.

If \code{trainmax} reduces the number of uninformed background points in the
training data, a new \code{data} object is returned as part of the function
output. This \code{data} object shows which of the uninformed background
points were randomly selected, and should be used together with the selected
DVs in \code{\link{selectEV}} during continued model selection.

Explanatory variables should be uniquely named, and the names must not
contain spaces, underscores, or colons. Underscores and colons are reserved
to denote derived variables and interaction terms, respectively.
}
\examples{
\dontrun{
selecteddvs <- selectDVforEV(dat, deriveddat, alpha = 0.0001,
   dir = "D:/path/to/modeling/directory")

# From vignette:
grasslandDVselect <- selectDVforEV(grasslandPO, grasslandDVs[[1]], alpha = 0.001)
summary(grasslandDVs$EVDV)
sum(sapply(grasslandDVs$EVDV, length))
summary(grasslandDVselect$selectedDV)
sum(sapply(grasslandDVselect$selectedDV, length))
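
# Sketch of using trainmax, following the Details section. The value 2000 is
# an arbitrary illustration, and this assumes that grasslandPO contains more
# than 2000 uninformed background points, so that a third item is returned.
grasslandDVsub <- selectDVforEV(grasslandPO, grasslandDVs[[1]], alpha = 0.001,
   trainmax = 2000)
length(grasslandDVsub)
# The third item is the new data object showing which background points were
# randomly selected. It is accessed by position here because its name is not
# stated in this help page. Use it together with the selected DVs (the first
# item) in selectEV during continued model selection.
grasslandPOsub <- grasslandDVsub[[3]]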
}

}
\references{
Halvorsen, R. (2013). A strict maximum likelihood explanation of
  MaxEnt, and some implications for distribution modelling. Sommerfeltia, 36,
  1-132.

Halvorsen, R., Mazzoni, S., Bryn, A., & Bakkestuen, V. (2015).
  Opportunities for improved distribution modelling practice via a strict
  maximum likelihood interpretation of MaxEnt. Ecography, 38(2), 172-183.
}

