% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/physignal.z.r
\name{physignal.z}
\alias{physignal.z}
\title{Assessing phylogenetic signal effect size in Procrustes shape variables}
\usage{
physignal.z(
  A,
  phy,
  lambda = c("burn", "mean", "front", "all"),
  iter = 999,
  seed = NULL,
  tol = 1e-04,
  PAC.no = NULL,
  print.progress = FALSE,
  verbose = FALSE
)
}
\arguments{
\item{A}{A matrix (n x [p x k]) or 3D array (p x k x n) containing Procrustes shape variables for a set of specimens}

\item{phy}{A phylogenetic tree of class = "phylo" - see \code{\link[ape]{read.tree}} in library ape}

\item{lambda}{An indication for how lambda should be optimized.  This can be a numeric value between 0 and 1 to override 
optimization.  Alternatively, it can be one of four methods: burn-in ("burn"); mean lambda across all data dimensions
("mean"); use a subset of all data dimensions where phylogenetic signal is possible, based on a front-loading
of phylogenetic signal in the lowest phylogenetically aligned 
components ("front"); or use of all data dimensions to calculate the log-likelihood in the optimization process ("all").  See Details
for more explanation.}

\item{iter}{Number of iterations for significance testing}

\item{seed}{An optional argument for setting the seed for random permutations of the resampling procedure.
If left NULL (the default), the exact same P-values will be found for repeated runs of the analysis (with the same number of iterations).
If seed = "random", a random seed will be used, and P-values will vary.  One can also specify an integer for specific seed values,
which might be of interest for advanced users.}

\item{tol}{A value indicating the magnitude below which phylogenetically aligned components should 
be omitted. (Components are omitted if their standard deviations are less than or equal to tol times 
the standard deviation of the first component.)  See \code{\link[RRPP]{ordinate}} for more details.}

\item{PAC.no}{The number of phylogenetically aligned components (PAC) on which to project the data.  Users can
choose fewer PACs than would be found, naturally.}

\item{print.progress}{A logical value to indicate whether a progress bar should be printed to the screen.
This is helpful for long-running analyses.}

\item{verbose}{Whether to include verbose output, including phylogenetically-aligned
component analysis (PACA), plus K, lambda,
and the log-likelihood profiles across PAC dimensions.}
}
\value{
Function returns a list with the following components: 
  \item{Z}{Phylogenetic signal effect size.}
  \item{pvalue}{The significance level of the observed signal.}
  \item{random.detR}{The determinants of the rates matrix for all random permutations.}
  \item{random.logL}{The log-likelihoods for all random permutations.}
  \item{lambda}{The optimized lambda value.}
  \item{K}{The multivariate K value.}  
  \item{PACA}{A phylogenetically aligned component analysis, based on OLS residuals.}
  \item{K.by.p}{The multivariate K value in 1, 1:2, 1:3, ..., 1:p dimensions, for the 
  p components from PACA.}   
  \item{lambda.by.p}{The optimized lambda in 1, 1:2, 1:3, ..., 1:p dimensions, for the 
  p components from PACA.}   
  \item{logL.by.p}{The log-likelihood in 1, 1:2, 1:3, ..., 1:p dimensions, for the 
  p components from PACA.}
  \item{permutations}{The number of random permutations used in the resampling procedure.}
  \item{opt.method}{Method of optimization used for lambda.}
  \item{opt.dim}{Number of data dimensions for which optimization was performed.}
  \item{call}{The matched call}
}
\description{
Function calculates the degree of phylogenetic signal from Procrustes shape variables, using
effect size estimated from likelihood.
}
\details{
The function estimates the degree of phylogenetic signal present in Procrustes shape variables for a given
phylogeny, based on the standardized location of the log-likelihood in a distribution generated from
randomization of residuals in a permutation procedure (RRPP). Unlike the analysis performed in
\code{\link{physignal}}, this function finds the optimal branch-length transformation, lambda, which maximizes 
likelihood.  This step results in better statistical properties (Collyer et al. 2022), and the effect
size (Z) -- the standardized location of the observed log-likelihood in its sampling distribution -- has a linear 
relationship with lambda.  Although, Pagel's lambda (Pagel et al. 1999) and Blomberg's K (Blomberg et al. 2003;
Adams 2014) are frequently used as measures of the amount of phylogenetic signal (and both are also reported), 
the Z-score is an effect size that can be compared among different traits 
(see \code{\link{compare.physignal.z}}), and it more precisely describes the fit of data to a tree 
(Collyer et al. 2022).

It is assumed that the landmarks have previously been aligned 
using Generalized Procrustes Analysis (GPA) [e.g., with \code{\link{gpagen}}].  Multivariate data are subjected
to a phylogenetically aligned component analysis to assure that data dimensions do not exceed the number of 
observations, making estimation of multivariate log-likelihoods possible.  The tolerance argument can be used to assure
an appropriate number of data dimensions are used.  The number of phylogenetically aligned components (PAC.no) 
can also be used to estimate log-likelihood in fewer dimensions.  This might be necessary for some data sets,
as residual covariance matrices can be ill-conditioned, especially if the number of variables exceeds the number
of observations.
It is assumed that lambda is constant for all shape variables.  It would be possible to consider, for example,
whether phylogenetic signal is consistent among different modules via separate analyses, followed by
analysis with \code{\link{compare.physignal.z}}.  This could be facilitated by using 
\code{\link{coords.subset}} to partition shape data. 
This function can also be used with univariate data (i.e. centroid size) if imported as matrix with rownames
giving the taxa names. 

Optimization of the tree branch scaling parameter, lambda, is not a process with a single, resolved solution.  For
univariate data, optimization is fairly simple: lambda is optimized at the value that yields maximum likelihood.  For
multivariate data, generalization is not straightforward.  Although this function currently assumes all variables 
in multivariate data have the same lambda, maximizing likelihood over all variables can ignore strong phylogenetic signal
restricted to just some of the variables.  Therefore, optimization can can consider the latent strength of phylogenetic 
signal in different ways.  Below is a summary of optimization methods, including advantages or disadvantages that might be
incurred.  Future versions of this function might update optimization methods, as more research affirms better methods.

\describe{
\item{\bold{burn}}{  A burn-in optimization samples a spectrum of lambda from 0 to 1, finding the effect
size, Z, from several distributions of log-likelihoods, in order to find the lambda value that maximizes the
effect size rather than the log-likelihood.  Once this value of lambda is obtained, the requested number
of permutations are run and the final effect size is estimated from the distribution of random log-likelihoods
generated.  This is perhaps the best optimization method to assure that the result effect size is 
maximized, but it requires considerably more time than the other methods.}
\item{\bold{mean}}{  Lambda is optimized for every variable and the mean of the lambdas is used as the 
optimized lambda for estimating multivariate log-likelihoods.  This method will guard against a tendency
for multivariate optimization to be biased toward 0 or 1, but will also tend to find intermediate values
of lambda, even if there is strong phylogenetic signal for some variables.}
\item{\bold{front}}{  This method performs phylogenetically aligned component analysis (PACA) and
determines the relevant number of components for which covariation between data and phylogeny is 
possible (as many or fewer than the number of variables).  PACA front-loads phylogenetic signal in first 
components. Thus, lambda is optimized with multivariate likelihood
but only for a portion of the full data space where phylogenetic signal might be found.  This method 
will tend to bias lambda toward 1, making analysis more similar to using multivariate K, as in
\code{\link{physignal}}.}
\item{\bold{all}}{  This method simply optimizes lambda using a multivariate likelihood for optimization.  
This is the simplest generalization of univariate lambda optimization, but it will tend to optimize lambda
at 0 or 1.}
\item{\bold{numeric override}}{  Users can override optimization, by choosing a fixed value.  Choosing
lambda = 1 will be fundamentally the same as performing the analysis on multivariate K, as in
\code{\link{physignal}}.}

}

 The generic functions, \code{\link{print}}, \code{\link{summary}}, and \code{\link{plot}} all work with \code{\link{physignal}}.
 The generic function, \code{\link{plot}}, produces a histogram of random log-likelihoods, 
 associated with the resampling procedure.
}
\examples{
\dontrun{
data(plethspecies) 
Y.gpa <- gpagen(plethspecies$land)    #GPA-alignment    

# Test for phylogenetic signal in shape
PS.shape <- physignal.z(A = Y.gpa$coords, phy = plethspecies$phy, 
lambda = "front")
summary(PS.shape)

# Problem with ill-conditioned residual covariance matrix; try shaving one dimension

PS.shape <- physignal.z(A = Y.gpa$coords, phy = plethspecies$phy, 
lambda = "front", PAC.no = 7)
summary(PS.shape)
plot(PS.shape)
plot(PS.shape$PACA, phylo = TRUE)

# Test for phylogenetic signal in size
PS.size <- physignal.z(A = Y.gpa$Csize, phy = plethspecies$phy, 
lambda = "front")
summary(PS.size)
plot(PS.size)
}
}
\references{
Collyer,  M.L., E.K. Baken, & D.C. Adams.  2022. A standardized effect size for evaluating
and comparing the strength of phylogenetic signal. Methods in Ecology and Evolution. 13:367-382.

Blomberg SP, Garland T, Ives AR. 2003. Testing for phylogenetic signal in comparative 
data: behavioral traits are more labile. Evolution, 57:717-745.

Adams, D.C. 2014. A generalized K statistic for estimating phylogenetic signal from shape and 
other high-dimensional multivariate data. Systematic Biology.  63:685-697.

Adams, D.C. and M.L. Collyer. 2019. Comparing the strength of modular signal, and evaluating 
alternative modular hypotheses, using covariance ratio effect sizes with morphometric data. 
Evolution. 73:2352-2367.

Pagel, M. D. (1999). Inferring the historical patterns of biological evolution. Nature. 401:877-884.
}
\seealso{
\code{\link{gm.prcomp}}, \code{\link{physignal}}
}
\author{
Michael Collyer
}
\keyword{analysis}
