% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/NMcheckData.R
\name{NMcheckData}
\alias{NMcheckData}
\title{Check data for Nonmem compatibility or check control stream for
data compatibility}
\usage{
NMcheckData(
  data,
  file,
  covs,
  covs.occ,
  cols.num,
  col.id = "ID",
  col.time = "TIME",
  col.dv = "DV",
  col.mdv = "MDV",
  col.cmt = "CMT",
  col.amt = "AMT",
  col.flagn,
  col.row,
  col.usubjid,
  na.strings,
  return.summary = FALSE,
  quiet = FALSE,
  as.fun
)
}
\arguments{
\item{data}{The data to check. A data.frame, data.table, tibble, or
anything else that can be converted to a data.table.}

\item{file}{As an alternative to checking a data object, use
file to specify a control stream to check. This can be either
a (working or non-working) input control stream or an output
control stream. In this case, NMcheckData checks column names
in the data against the control stream (see NMcheckColnames), reads
the data as Nonmem would, and performs the same checks on the
data as it would when using the data
argument. col.flagn is ignored in this case. Instead,
ACCEPT/IGNORE statements in the control stream are applied. The
file argument is useful for debugging a Nonmem model.}

\item{covs}{Columns that contain subject-level covariates. They
are expected to be non-missing, numeric and not varying within
subjects.}

\item{covs.occ}{A list specifying columns that contain
subject:occasion-level covariates. They are expected to be
non-missing, numeric and not varying within combinations of
subject and occasion. covs.occ=list(PERIOD=c("FED")) means
that FED is the covariate, while PERIOD indicates the
occasion.}

\item{cols.num}{Columns that are expected to be present, numeric
and non-NA. If a character vector is given, the columns are
expected to be used in all rows. If a column is only used for
a subset of rows, use a list and name the elements by
subsetting strings. See examples.}

\item{col.id}{The name of the column that holds the subject
identifier. Default is "ID".}

\item{col.time}{The name of the column holding actual time.}

\item{col.dv}{The name of the column holding the dependent
variable. For now, only one column can be specified, and MDV
is assumed to match this column. Default is DV.}

\item{col.mdv}{The name of the column holding the binary indicator
of whether the dependent variable is missing. Default is MDV.}

\item{col.cmt}{The name(s) of the compartment column(s). These
will be checked to be positive integers for all rows. They are
also used in checks for row duplicates.}

\item{col.amt}{The name of the dose amount column.}

\item{col.flagn}{Optionally, the name of the column holding
numeric exclusion flags. Default value is FLAG and can be
configured using NMdataConf. Disable by using col.flagn=FALSE.}

\item{col.row}{A column with a unique value for each row. Using
such a column is recommended if possible. Default ("ROW") can
be modified using NMdataConf.}

\item{col.usubjid}{Optional unique subject identifier. It is
recommended to keep a unique subject identifier (typically a
character string including an abbreviated study name and the
subject id) from the clinical datasets in the analysis set. If
you supply the name of the column holding this identifier,
NMcheckData will check that it is non-missing, that it is
unique within values of col.id (i.e. that the analysis subject
IDs are unique across actual subjects), and that col.id is
unique within the unique subject ID (a violation of the latter
is less likely).}

\item{na.strings}{Strings to be accepted when trying to convert
characters to numerics. This will typically be a string that
represents missing values. Default is ".". Note that actual NA,
i.e. not a string, is allowed independently of na.strings. See
?NMisNumeric.}

\item{return.summary}{If TRUE (not default), the table summary
that is printed if quiet=FALSE is returned as well. In that
case, a list is returned, and the findings are in an element
called findings.}

\item{quiet}{Keep quiet? Default is not to.}

\item{as.fun}{The default is to return data as a data.frame. Pass
a function (say tibble::as_tibble) in as.fun to convert to
something else. If data.tables are wanted, use
as.fun="data.table". The default can be configured using
NMdataConf.}
}
\description{
Check data in various ways for compatibility with Nonmem. Some
findings are reported not because they would make Nonmem fail,
but because they are typical dataset issues.
}
\details{
The following checks are performed. The term "numeric"
    does not refer to a numeric representation in R, but
    compatibility with Nonmem. The character string "2" is in this
    sense a valid numeric, "id2" is not.  \itemize{

\item Column
    names must be unique and not contain special characters

\item If an exclusion flag is used (for ACCEPT/IGNORE in Nonmem),
    elements must be non-missing and integers. If an exclusion
    flag is found, the rest of the checks are performed on rows
    where that flag equals 0 (zero) only.

\item If a unique row identifier is found, it has to be
non-missing, increasing integers. 

\item col.time (TIME),
    EVID, ID, CMT, MDV: If present, elements must be non-missing
    and numeric.

\item col.time (TIME) must be non-negative

\item EVID must be in {0,1,2,3,4}

\item CMT must be positive integers, but may be missing or zero for EVID==3.

\item MDV must be the binary (1/0) representation of is.na(DV)

\item AMT must be 0 or NA for EVID 0 and 2

\item AMT must be positive for EVID 1 and 4

\item DV must be numeric

\item DV must be missing for EVID in {1,4}.

\item If found, RATE must be numeric, equaling -2 or non-negative for dosing records.

\item If found, SS must be numeric, equaling 0 or 1 for dosing records.

\item If found, ADDL must be a non-negative integer for dosing
records. II must be present.

\item If found, II must be a non-negative integer for dosing
records. ADDL must be present.

\item ID must be positive and values cannot be disjoint, i.e. all
    records for each ID must follow each other. This is
    technically not a requirement in Nonmem, but disjoint IDs are
    most often an error. Use a second ID column if you
    deliberately want to soften this check.

\item TIME cannot be decreasing within ID, unless EVID in {3,4}.

\item All IDs must have doses (EVID in {1,4})

\item All IDs must have observations (EVID==0)

\item If a unique row identifier is used, this must be
    non-missing, increasing, integer

\item Character values must not contain commas (they will mess up
    writing/reading csv)

\item Columns specified in covs argument must be non-missing,
    numeric and not varying within subjects.

\item Columns specified in covs.occ must be
    non-missing, numeric and not varying within combinations of
    subject and occasion.

\item Columns specified in cols.num must be present, numeric
    and non-NA.

}
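
As a rough illustration of the record-level rules above, the
following sketch builds a small hand-made data set in which each
commented row is intended to break the rule named in its comment. All
values are made up for illustration, and the exact wording and number
of findings reported by NMcheckData are not shown here.

\preformatted{
## Hypothetical records; comments name the rule each row is meant to break
bad <- data.frame(
  ID   = c(1, 1, 1, 1),
  TIME = c(0, 1, -1, 2),        # row 3: negative TIME, also decreasing within ID
  EVID = c(1, 0, 0, 5),         # row 4: EVID outside 0,1,2,3,4
  CMT  = c(1, 2, 2, 0),         # row 4: CMT is not a positive integer
  AMT  = c(0, NA, 10, NA),      # row 1: AMT not positive for EVID 1
                                # row 3: AMT given for EVID 0
  DV   = c(3.1, 4.2, NA, 1.0),  # row 1: DV not missing for EVID 1
  MDV  = c(1, 0, 0, 0)          # rows 1 and 3: MDV differs from is.na(DV)
)
## col.flagn=FALSE disables the exclusion flag check, as documented above
NMcheckData(bad, col.flagn = FALSE)
}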
}
\examples{
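## read an example data set shipped with NMdata
dat <- readRDS(system.file("examples/data/xgxr2.rds", package="NMdata"))
## run the default checks
NMcheckData(dat)
## make sure dat is a data.table so the := assignment below works
dat <- data.table::as.data.table(dat)
dat[EVID==0,LLOQ:=3.5]
## expecting STUDY in all rows but LLOQ only for samples (EVID==0)
NMcheckData(dat,cols.num=list(c("STUDY"),"EVID==0"=c("LLOQ")))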
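
## A hedged sketch, not taken from the package authors: a small
## hand-made data set (all column names and values are made up) to
## illustrate the covs, covs.occ and return.summary arguments.
toy <- data.frame(
  ROW    = 1:8,
  ID     = rep(1:2, each = 4),
  TIME   = rep(c(0, 1, 24, 25), 2),
  EVID   = rep(c(1, 0), 4),
  CMT    = rep(c(1, 2), 4),
  AMT    = rep(c(100, NA), 4),
  DV     = c(NA, 5.1, NA, 4.7, NA, 6.0, NA, 5.5),
  MDV    = rep(c(1, 0), 4),
  PERIOD = rep(c(1, 1, 2, 2), 2),
  FED    = rep(c(0, 0, 1, 1), 2),        # constant within ID and PERIOD
  WT     = c(70, 70, 70, 71, rep(82, 4)) # varies within ID 1 on purpose
)
## WT is declared a subject-level covariate and FED an occasion-level
## covariate with PERIOD as the occasion. The exclusion flag check is
## disabled because toy has no flag column.
res <- NMcheckData(toy, covs = "WT", covs.occ = list(PERIOD = c("FED")),
                   col.flagn = FALSE, return.summary = TRUE)
## with return.summary=TRUE a list is returned, and according to the
## argument documentation the findings are in the element "findings"
res$findings

\dontrun{
## Checking a control stream instead of a data object.
## "path/to/run001.mod" is a hypothetical path used for illustration.
NMcheckData(file = "path/to/run001.mod")
}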
}
