% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/LSTM.R
\name{LSTM}
\alias{LSTM}
\title{A pre-trained Long Short Term Memory (LSTM) Network for Determining the Number of Factors}
\usage{
LSTM(
  response,
  cor.type = "pearson",
  use = "pairwise.complete.obs",
  vis = TRUE,
  plot = TRUE
)
}
\arguments{
\item{response}{A required \code{N} × \code{I} matrix or data.frame consisting of the responses of \code{N} individuals
to \code{I} items.}

\item{cor.type}{A character string indicating which correlation coefficient (or covariance) is to be computed. One of "pearson" (default),
"kendall", or "spearman". See \code{\link[stats]{cor}}.}

\item{use}{An optional character string giving a method for computing covariances in the presence of missing values. This
must be one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs" (default).
See \code{\link[stats]{cor}}.}

\item{vis}{A Boolean variable. If TRUE, the factor retention results are printed; if FALSE, they are not.
(default = TRUE)}

\item{plot}{A Boolean variable. If TRUE, the plot for the \code{LSTM} result is drawn; if FALSE, it is not.
(default = TRUE)}
}
\value{
An object of class \code{LSTM}, which is a \code{list} containing the following components:
\item{nfact}{The number of factors to be retained.}
\item{features}{A matrix (1×20) containing all the features used by the LSTM to determine the number of
      factors.}
\item{probability}{A matrix (1×10) containing the probabilities for factor numbers ranging from 1
                   to 10, where the number in the \eqn{f}-th column represents the probability
                   that the number of factors for the response is \eqn{f}.}
}
\description{
This function invokes a pre-trained Long Short Term Memory (LSTM) network to determine
the number of factors. The network can identify at most 10 factors. The LSTM model is
implemented in Python and was trained with PyTorch (https://pytorch.org/), using
CUDA 12.6 for acceleration. After training, the LSTM was saved as an \code{LSTM.onnx}
file. The \code{LSTM} function performs inference by loading the \code{LSTM.onnx}
file in both Python and R environments.

To run this function, Python (suggested >= 3.11) is required, along with the \code{numpy} and
\code{onnxruntime} libraries. See \code{\link[LSTMfactors]{check_python_libraries}}, and
see Details and Note for more information.
}
\details{
A total of 1,000,000 datasets (\code{\link[LSTMfactors]{data.datasets.LSTM}}) were simulated
to extract features for training the LSTM. Each dataset was generated following the methods described
by Auerswald & Moshagen (2019) and Goretzko & Buhner (2020),
with the following specifications:

\itemize{
  \item Factor number: \emph{F} ~ U[1,10]
  \item Sample size: \emph{N} ~ U[100,1000]
  \item Number of variables per factor: \emph{vpf} ~ U[3,10]
  \item Factor correlation: \emph{fc} ~ U[0.0,0.5]
  \item Primary loadings: \emph{pl} ~ U[0.35,0.80]
  \item Cross-loadings: \emph{cl} ~ U[-0.2,0.2]
}
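
As a rough illustration of the design above, one simulation condition could be drawn in R as follows.
This is only a sketch under stated assumptions, not the code used to build the training data: all object
names are hypothetical, \emph{F}, \emph{N}, and \emph{vpf} are assumed to be integers, \emph{vpf} is assumed
to be drawn separately for each factor, and \emph{fc} is assumed to be shared by all factor pairs.

\preformatted{
set.seed(1)
n.fact <- sample(1:10, 1)                       # F: number of factors
n.obs  <- sample(100:1000, 1)                   # N: sample size
vpf    <- sample(3:10, n.fact, replace = TRUE)  # variables per factor (assumed: one draw per factor)
fc     <- runif(1, 0.0, 0.5)                    # factor correlation (assumed: one shared value)
n.item <- sum(vpf)                              # I: total number of items
pl     <- runif(n.item, 0.35, 0.80)             # one primary loading per item
cl     <- runif(n.item * (n.fact - 1), -0.2, 0.2)  # cross-loadings for the remaining cells
}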

A population correlation matrix was created for each dataset based on the following decomposition:
\deqn{\mathbf{\Sigma} = \mathbf{\Lambda} \mathbf{\Phi} \mathbf{\Lambda}^T + \mathbf{\Delta}}
where \eqn{\mathbf{\Lambda}} is the loading matrix, \eqn{\mathbf{\Phi}} is the factor correlation
matrix, and \eqn{\mathbf{\Delta}} is a diagonal matrix,
with \eqn{\mathbf{\Delta} = \mathbf{I} - \text{diag}(\mathbf{\Lambda} \mathbf{\Phi} \mathbf{\Lambda}^T)}.
The purpose of \eqn{\mathbf{\Delta}} is to ensure that the diagonal elements of \eqn{\mathbf{\Sigma}} are 1.
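
The decomposition can be sketched in base R as shown below. The dimensions, the object names, and the
simple block structure of the loading matrix are assumptions made for illustration. They are not taken
from the package's simulation code.

\preformatted{
set.seed(1)
n.fact <- 2                     # hypothetical number of factors
vpf    <- 4                     # hypothetical number of variables per factor
n.item <- n.fact * vpf

# Loading matrix: cross-loadings everywhere, then primary loadings on each factor's own items
Lambda <- matrix(runif(n.item * n.fact, -0.2, 0.2), n.item, n.fact)
for (f in seq_len(n.fact)) {
  rows <- ((f - 1) * vpf + 1):(f * vpf)
  Lambda[rows, f] <- runif(vpf, 0.35, 0.80)
}

# Factor correlation matrix with a common off-diagonal value
Phi <- matrix(0.3, n.fact, n.fact)
diag(Phi) <- 1

common <- Lambda \%*\% Phi \%*\% t(Lambda)   # Lambda Phi Lambda^T
Delta  <- diag(1 - diag(common))           # diagonal matrix of unique variances
Sigma  <- common + Delta

all.equal(diag(Sigma), rep(1, n.item))     # checks that Sigma has a unit diagonal
}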

The response data for each subject was simulated using the following formula:
\deqn{X_i = L_i + \epsilon_i, \quad 1 \leq i \leq I}
where \eqn{L_i} follows a normal distribution \eqn{N(0, \sigma)}, representing the contribution of latent factors,
and \eqn{\epsilon_i} is the residual term following a standard normal distribution. \eqn{L_i} and \eqn{\epsilon_i}
are uncorrelated, and \eqn{\epsilon_i} and \eqn{\epsilon_j} are also uncorrelated.
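
One common way to obtain responses whose population correlation matrix equals a given
\eqn{\mathbf{\Sigma}} is to transform independent standard normal draws with a Cholesky factor.
The sketch below uses that technique as a stand-in. It is not necessarily the exact procedure used to
generate the training data, and the matrix \code{Sigma} here is a hypothetical placeholder.

\preformatted{
set.seed(1)
n.obs <- 500
Sigma <- matrix(0.3, 6, 6)                        # hypothetical population correlation matrix
diag(Sigma) <- 1

Z <- matrix(rnorm(n.obs * ncol(Sigma)), n.obs)    # independent standard normal draws
response <- Z \%*\% chol(Sigma)                     # rows now have population covariance Sigma

round(cor(response), 2)                           # sample correlations, close to Sigma
}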

For each simulated dataset, two types of features were extracted (see \code{\link[LSTMfactors]{extractor.feature}}).
These features are as follows:
\describe{
  \item{(1)}{The 10 largest eigenvalues.}
  \item{(2)}{The differences between the 10 largest eigenvalues and the corresponding reference eigenvalues from
             Parallel Analysis (PA). See \code{\link[EFAfactors]{PA}}.}
}
The two types of features above were treated as sequence data with a time step of 10 to train the LSTM model,
resulting in a final classification accuracy of 0.847.
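
The two feature types can be approximated in base R as sketched below. This is an illustration only, and
\code{\link[LSTMfactors]{extractor.feature}} remains the reference implementation. The placeholder data,
the PA variant (mean eigenvalues of random normal data over 100 replications), and the ordering of the
20 values are all assumptions.

\preformatted{
set.seed(1)
response <- matrix(rnorm(300 * 12), 300, 12)   # placeholder data with at least 10 items
n.eig <- 10

# (1) The 10 largest eigenvalues of the correlation matrix
R   <- cor(response, method = "pearson", use = "pairwise.complete.obs")
eig <- eigen(R, symmetric = TRUE, only.values = TRUE)$values[1:n.eig]

# (2) Differences from reference eigenvalues (assumed PA variant: mean over random datasets)
ref <- rowMeans(replicate(100, {
  Z <- matrix(rnorm(nrow(response) * ncol(response)), nrow(response))
  eigen(cor(Z), symmetric = TRUE, only.values = TRUE)$values[1:n.eig]
}))

features <- matrix(c(eig, eig - ref), nrow = 1)   # 1 x 20 matrix (ordering is an assumption)
dim(features)
}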

The LSTM model is implemented in Python and was trained with PyTorch (https://download.pytorch.org/whl/cu126), using
CUDA 12.6 for acceleration. After training, the LSTM was saved as an \code{LSTM.onnx} file. The \code{LSTM} function
performs inference by loading the \code{LSTM.onnx} file in both Python and R environments.
}
\note{
Note that Python (suggested >= 3.11) and the libraries \code{numpy} and \code{onnxruntime} are required.

First, please ensure that Python is installed on your computer and that Python is
included in the system's PATH environment variable. If not,
please download and install it from the official website (https://www.python.org/).

If you encounter one of the following errors when running this function:

 \code{Error in py_module_import(module, convert = convert) :}

   \code{ModuleNotFoundError: No module named 'numpy'}

or

 \code{Error in py_module_import(module, convert = convert) :}

   \code{ModuleNotFoundError: No module named 'onnxruntime'}

this means that the \code{numpy} or \code{onnxruntime} library is missing from your Python environment.
The \code{\link[LSTMfactors]{check_python_libraries}} function can help you install these two dependency libraries.

Alternatively, you can install the missing libraries yourself by running \code{pip install numpy} or
\code{pip install onnxruntime} in Command Prompt or Windows PowerShell (Windows) or in Terminal (macOS or Linux).
On Linux, please ensure that \code{pip} is installed first.
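
The error messages above come from \code{py_module_import}, which suggests that the Python modules are
loaded through the \pkg{reticulate} package. Assuming that is the case, the following sketch checks from
within R whether the two modules are visible to the Python installation that \pkg{reticulate} selects.
It is not a replacement for \code{\link[LSTMfactors]{check_python_libraries}}.

\preformatted{
if (requireNamespace("reticulate", quietly = TRUE)) {
  for (module in c("numpy", "onnxruntime")) {
    if (!reticulate::py_module_available(module)) {
      message("Missing Python module: ", module)
      # reticulate::py_install(module, pip = TRUE)   # uncomment to install with pip
    }
  }
}
}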
}
\references{
Auerswald, M., & Moshagen, M. (2019). How to determine the number of factors to retain in exploratory factor analysis: A comparison of extraction methods under realistic conditions. Psychological Methods, 24(4), 468-491. https://doi.org/10.1037/met0000200.

Goretzko, D., & Buhner, M. (2020). One model to rule them all? Using machine learning algorithms to determine the number of factors in exploratory factor analysis. Psychological Methods, 25(6), 776-786. https://doi.org/10.1037/met0000262.
}
\author{
Haijiang Qin <Haijiang133@outlook.com>
}
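\examples{
\dontrun{
## A minimal usage sketch based only on the usage and value sections of this page.
## The simulated responses are placeholders (uncorrelated normal data), not package data.
## The call is not run automatically because it needs Python with numpy and onnxruntime.
set.seed(123)
response <- matrix(rnorm(500 * 12), nrow = 500, ncol = 12)

## Suppress printing and plotting, then inspect the returned components
LSTM.obj <- LSTM(response, vis = FALSE, plot = FALSE)
LSTM.obj$nfact            # number of factors to be retained
dim(LSTM.obj$features)    # documented as a 1 x 20 matrix
LSTM.obj$probability      # documented as a 1 x 10 matrix of probabilities
}
}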
