% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/regCNN.R
\name{regCNN}
\alias{regCNN}
\alias{regCNN.default}
\alias{regCNN.formula}
\title{Condensed Nearest Neighbors for Regression}
\usage{
\method{regCNN}{default}(x, y, t = 0.2, ...)

\method{regCNN}{formula}(formula, data, ...)
}
\arguments{
\item{x}{a data frame of input attributes.}

\item{y}{a double vector with the output regressand of each sample.}

\item{t}{a double in [0,1] with the \emph{threshold} used by the regression noise filter (default: 0.2).}

\item{...}{other options to pass to the function.}

\item{formula}{a formula with the output regressand and, at least, one input attribute.}

\item{data}{a data frame in which to interpret the variables in the formula.}
}
\value{
The result of applying the regression filter is a reduced dataset containing only the clean samples, that is, the original data with the noisy samples (those with errors) removed.
This function returns an object of class \code{rfdata}, a list that describes the noise filtering process through the following elements:
\item{xclean}{a data frame with the input attributes of clean samples (without errors).}
\item{yclean}{a double vector with the output regressand of clean samples (without errors).}
\item{numclean}{an integer vector with the number of clean samples.}
\item{idclean}{an integer vector with the indices of clean samples.}
\item{xnoise}{a data frame with the input attributes of noisy samples (with errors).}
\item{ynoise}{a double vector with the output regressand of noisy samples (with errors).}
\item{numnoise}{an integer vector with the number of noisy samples.}
\item{idnoise}{an integer vector with the indices of noisy samples.}
\item{filter}{the full name of the noise filter used.}
\item{param}{a list of the argument values.}
\item{call}{the function call.}

Note that objects of class \code{rfdata} support the \code{\link{print.rfdata}} and \code{\link{summary.rfdata}} methods.
}
\description{
Application of the regCNN noise filtering method to a regression dataset.
}
\details{
\emph{Condensed Nearest Neighbors} (CNN) seeks a subset of the data that improves the quality of the original dataset.
In classification problems, CNN performs a first classification and stores all the misclassified samples.
The stored samples are then used as the training set, and the process stops when all the unstored samples are correctly classified.
The adaptation of this noise filter to regression problems follows the proposal of Martín \emph{et al.} (2021),
which uses a noise threshold (\code{t}) to determine the similarity between the output values of the samples.
}
\examples{
# load the dataset
data(rock)

# usage of the default method
set.seed(9)
out.def <- regCNN(x = rock[,-ncol(rock)], y = rock[,ncol(rock)])

# show results
summary(out.def, showid = TRUE)
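
# inspect individual elements of the returned rfdata object
# (a short sketch; element names are those listed in the Value section)
out.def$numclean      # number of samples kept as clean
out.def$numnoise      # number of samples flagged as noisy
head(out.def$xclean)  # first rows of the clean input attributes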

# usage of the method for class formula
set.seed(9)
out.frm <- regCNN(formula = perm ~ ., data = rock)

# check the match of noisy indices
identical(out.def$idnoise, out.frm$idnoise)
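
# usage of the default method with a non-default threshold
# (t = 0.1 is an arbitrary illustrative value, not a recommendation)
set.seed(9)
out.t <- regCNN(x = rock[,-ncol(rock)], y = rock[,ncol(rock)], t = 0.1)

# number of samples flagged as noisy with this threshold
out.t$numnoise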

}
\references{
L. Devroye, L. Györfi and G. Lugosi,
\strong{Condensed and edited nearest neighbor rules.}
\emph{in: A Probabilistic Theory of Pattern Recognition}, 303-313, 1996.
\doi{10.1007/978-1-4612-0711-5_19}.

J. Martín, J. A. Sáez and E. Corchado,
\strong{On the regressand noise problem: Model robustness and synergy with regression-adapted noise filters.}
\emph{IEEE Access}, 9:145800-145816, 2021.
\doi{10.1109/ACCESS.2021.3123151}.
}
\seealso{
\code{\link{regRNN}}, \code{\link{regENN}}, \code{\link{regBBNR}}, \code{\link{print.rfdata}}, \code{\link{summary.rfdata}}
}
