% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/align_lineage.R
\name{repAlignLineage}
\alias{repAlignLineage}
\title{Aligns all sequences (including germline) that belong to one clonal lineage and one cluster.
After clustering and building the clonal lineage and germline, the next step is to analyze the degree of mutation
and maturity of each clonal lineage. This makes it possible to find highly mature cells and cells with a large
number of offspring. The phylogenetic analysis can then find mutations that increase the affinity of the BCR.
Aligning the sequences is the first step of this sequence analysis.}
\usage{
repAlignLineage(.data,
.min_lineage_sequences, .prepare_threads, .align_threads, .verbose_output, .nofail)
}
\arguments{
\item{.data}{The data to be processed. Can be \link{data.frame}, \link{data.table}
or a list of these objects.

It must have columns in the immunarch compatible format \link{immunarch_data_format}, and also
must contain the 'Cluster' column, which is added by the seqCluster() function, and the 'Sequence.germline'
column, which is added by the repGermline() function.}

\item{.min_lineage_sequences}{If the number of sequences in the same clonal lineage and the same
cluster (not including germline) is lower than this threshold, this group of sequences
will not be aligned and will not be used in the next steps of the BCR pipeline
(it will be saved in the output table only if the .verbose_output parameter is set to TRUE).}

\item{.prepare_threads}{Number of threads used to prepare the results table.
Please note that a high number can cause heavy memory usage!}

\item{.align_threads}{Number of threads for lineage alignment.}

\item{.verbose_output}{If TRUE, all output dataframe columns will be included (see the Value section),
and unaligned clusters will be included in the output. Setting this to TRUE significantly
increases memory usage. If FALSE, only aligned clusters and the columns required for the repClonalFamily()
calculation will be included in the output.}

\item{.nofail}{If TRUE, returns NA instead of stopping when Clustal W is not installed.
Used to avoid raising errors in examples on computers where Clustal W is not installed.}
}
\value{
Dataframe or list of dataframes (if input is a list with multiple samples).
The dataframe has these columns:
\itemize{
  \item Cluster: cluster name
  \item Germline: germline sequence
  \item Aligned (included if .verbose_output=TRUE): FALSE if this group of sequences was not aligned
  (the number of sequences is below the .min_lineage_sequences threshold); TRUE if it was aligned
  \item Alignment: DNAbin object with alignment or DNAbin object with unaligned sequences (if Aligned=FALSE)
  \item V.length (included if .verbose_output=TRUE): shortest length of the V gene part outside of the CDR3 region
  in this group of sequences; longer V genes (including germline) are trimmed to this length before alignment
  \item J.length (included if .verbose_output=TRUE): shortest length of the J gene part outside of the CDR3 region
  in this group of sequences; longer J genes (including germline) are trimmed to this length before alignment
  \item Sequences (included if .verbose_output=TRUE): nested dataframe containing all sequences for this
  combination of cluster and germline; it has columns
  Sequence, V.end, J.start, CDR3.start, CDR3.end; all values are taken from the input dataframe
}
}
\description{
Aligns all sequences, including germline, within each clonal lineage and each cluster
}
\examples{

data(bcrdata)
bcr_data <- bcrdata$data

bcr_data \%>\%
  seqCluster(seqDist(bcr_data), .fixed_threshold = 3) \%>\%
  repGermline() \%>\%
  repAlignLineage(.min_lineage_sequences = 2, .align_threads = 2, .nofail = TRUE)
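
# A minimal sketch, not part of the original example: request the verbose output
# and list the columns that come back. The name bcr_aligned is chosen here for
# illustration only. With .nofail = TRUE the call returns NA when Clustal W is
# not installed, so the result is inspected only if it is a dataframe or a list
# of dataframes.
bcr_aligned <- bcr_data \%>\%
  seqCluster(seqDist(bcr_data), .fixed_threshold = 3) \%>\%
  repGermline() \%>\%
  repAlignLineage(
    .min_lineage_sequences = 2, .align_threads = 2,
    .verbose_output = TRUE, .nofail = TRUE
  )

if (is.data.frame(bcr_aligned)) {
  # Single sample: one dataframe
  print(colnames(bcr_aligned))
} else if (is.list(bcr_aligned)) {
  # Several samples: one dataframe per sample
  print(lapply(bcr_aligned, colnames))
}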
}
\concept{align_lineage}
