% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/vs_optimize_truncqual.R
\name{vs_optimize_truncqual}
\alias{vs_optimize_truncqual}
\alias{optimize_truncqual}
\title{Optimize read truncation with truncqual}
\usage{
vs_optimize_truncqual(
  fastq_input,
  reverse = NULL,
  minovlen = 10,
  truncqual_range = 1:20,
  minlen = 1,
  min_size = 2,
  maxee_rate = 0.01,
  threads = 1,
  plot_title = TRUE,
  tmpdir = NULL
)
}
\arguments{
\item{fastq_input}{(Required). A FASTQ file path, FASTQ tibble (forward
reads), or a paired-end tibble of class \code{"pe_df"}. See \emph{Details}.}

\item{reverse}{(Optional). A FASTQ file path or FASTQ tibble (reverse reads).
Optional if \code{fastq_input} is a \code{"pe_df"} object.}

\item{minovlen}{(Optional). Minimum overlap between the merged reads. Must be
at least 5. Defaults to \code{10}.}

\item{truncqual_range}{(Optional). A numeric vector of \code{truncqual}
values to test. Sequences are truncated starting from the first base with the
specified base quality score or lower. Defaults to \code{1:20}.}

\item{minlen}{(Optional). Minimum number of bases a sequence must have to be
retained. Defaults to \code{1}. See \emph{Details}.}

\item{min_size}{(Optional). Minimum copy number (size) for a merged read to
be included in the results. Defaults to \code{2}.}

\item{maxee_rate}{(Optional). Threshold for average expected error. Must
range from \code{0.0} to \code{1.0}. Defaults to \code{0.01}. See
\emph{Details}.}

\item{threads}{(Optional). Number of computational threads to be used by
\code{VSEARCH}. Defaults to \code{1}.}

\item{plot_title}{(Optional). If \code{TRUE} (default), a summary title will
be displayed in the plot. Set to \code{FALSE} for no title.}

\item{tmpdir}{(Optional). Path to the directory where temporary files should
be written when tables are used as input or output. Defaults to
\code{NULL}, which resolves to the session-specific temporary directory
(\code{tempdir()}).}
}
\value{
A data frame with the following columns:
\itemize{
  \item \code{truncqual_value}: Tested \code{truncqual} value.
  \item \code{merged_read_pairs}: Count of merged read-pairs with a copy
  number above \code{min_size} after dereplication.
  \item \code{R1_length}: Average length of R1-reads after trimming.
  \item \code{R2_length}: Average length of R2-reads after trimming.
}

The returned data frame has an attribute named \code{"plot"} containing a
\code{\link[ggplot2]{ggplot2}} object based on the returned data frame. The
plot visualizes \code{truncqual} values against \code{merged_read_pairs},
\code{R1_length}, and \code{R2_length}, with the optimal \code{truncqual}
value marked by a red dashed line.

Additionally, the returned data frame has an attribute named
\code{"optimal_truncqual"} containing the optimal \code{truncqual} value.
}
\description{
\code{vs_optimize_truncqual} optimizes the truncation parameter
\code{truncqual} to achieve the best possible merging results. The function
iterates through a specified range of \code{truncqual} values and identifies
the value that maximizes the number of high-quality merged read pairs.
}
\details{
The function uses \code{\link{vs_fastq_mergepairs}},
\code{\link{vs_fastx_trim_filt}}, and \code{\link{vs_fastx_uniques}}. The
arguments passed to these functions are described in detail in their
documentation.

If \code{fastq_input} has class \code{"pe_df"}, the reverse reads will be
automatically extracted from the \code{"reverse"} attribute unless
explicitly provided in the \code{reverse} argument.

The best possible truncation option (\code{truncqual}) for merging is
measured by the number of merged read-pairs with a copy number above the
number specified by \code{min_size} after dereplication.

Changing \code{min_size} will affect the results. A lower \code{min_size}
includes merged sequences with a lower copy number after dereplication, while
a higher \code{min_size} filters out more reads and counts only
high-frequency merged sequences.
}
\examples{
\dontrun{
# Define arguments
R1.file <- file.path(file.path(path.package("Rsearch"), "extdata"),
                     "small_R1.fq")
R2.file <- file.path(file.path(path.package("Rsearch"), "extdata"),
                     "small_R2.fq")

# Run optimizing function
optimize.tbl <- vs_optimize_truncqual(fastq_input = R1.file,
                                      reverse = R2.file)

# Display plot
print(attr(optimize.tbl, "plot"))
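
# Retrieve the optimal truncqual value, stored as the attribute
# "optimal_truncqual" on the returned data frame (see the Value section)
optimal.truncqual <- attr(optimize.tbl, "optimal_truncqual")
print(optimal.truncqual)

# A minimal sketch for inspecting the result by hand. It assumes the optimum
# is the row with the most merged read pairs, as the Details section
# suggests; how the function itself breaks ties is not documented here.
optimize.tbl[which.max(optimize.tbl$merged_read_pairs), ]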

}

}
\references{
\url{https://github.com/torognes/vsearch}
}
\seealso{
\code{\link{vs_fastq_mergepairs}}, \code{\link{vs_fastx_trim_filt}},
\code{\link{vs_fastx_uniques}}
}
