\name{fSet}
\alias{fSet}

\title{
Defining the fSet input variable
}

\description{
Several of the statistical methods implemented in package
\pkg{DynTxRegime} allow for subset modeling or limiting
of feasible treatment options. This section details how this
input is to be defined.
}

\details{

In general, input \code{fSet} is used to define subsets of patients 
within an analysis. These subsets can be specified to (1) limit 
available treatments, (2) use different models for the propensity 
score and/or outcome regressions, and/or 
(3) use different decision function models for 
each subset of patients. The combination of inputs \code{moPropen},
\code{moMain}, \code{moCont}, \code{fSet}, and/or \code{regimes} 
determines which of these scenarios is 
being considered. We cover some common situations below.


Regardless of the purpose for specifying \code{fSet}, it must be a 
function that returns a list. There are two options for defining the 
function. Version 1 is that of the original \pkg{DynTxRegime} package. 
In this version, \code{fSet} defines the rules
for determining the subset of treatment options for an INDIVIDUAL.
The first element of the returned list is a character string, which we term
the subset 'nickname'. This nickname is for bookkeeping purposes 
and is used to link models to subsets. The second element 
of the returned list is a vector 
of available treatment options for the subset. The formal arguments of 
the function must include (i) 'data' or (ii) individual covariate 
names as given by the column headers of \code{data}. An example using the 
covariate name input form is

\preformatted{
fSet <- function(a1) \{
  if(a1 > 1)\{
    subset <- list('subA',c(1,2))
  \} else \{
    subset <- list('subB',c(3,4) )
  \}
  return(subset)
\}}
This function indicates that if an individual has covariate a1 > 1, 
they are a member of subset 'subA' and their feasible
treatment options are \{1,2\}. If a1 <= 1, they are a member of subset
'subB' and their feasible treatment options are \{3,4\}.

A more efficient implementation for \code{fSet} is now accepted. In
the second form, \code{fSet} defines the subset of treatment options
for the full DATASET. It is again a function with
formal arguments  (i) 'data' or (ii) individual covariate names as 
given by the column headers of \code{data}. The function returns a list 
containing two elements: 'subsets' and 'txOpts.'  Element 'subsets' is 
a list comprising all treatment subsets; each element of the list contains 
the nickname and treatment options for a single subset. Element
'txOpts' is a character vector giving, for each individual, the
nickname of the subset to which that individual belongs. In this format, 
the definition of \code{fSet} equivalent to that given above is:

\preformatted{
fSet <- function(a1) \{
  subsets <- list(list('subA', c(1,2)),
                  list('subB', c(3,4)))
  txOpts <- rep('subB', length(a1))
  txOpts[a1 > 1] <- 'subA'

  return(list("subsets" = subsets,
              "txOpts" = txOpts))
\}}

Though a bit more complicated, this version is much more efficient as
it processes the entire dataset at once rather than each individual 
separately.
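
As an informal check that the two forms agree, the sketch below applies
a version 1 function to each individual and compares the resulting
nicknames with the membership vector returned by the version 2 function.
The names \code{fSetV1} and \code{fSetV2} and the values of \code{a1} are
hypothetical and chosen only for illustration.

\preformatted{
# Version 1: evaluated for a single individual
fSetV1 <- function(a1) \{
  if (a1 > 1) \{
    subset <- list('subA', c(1,2))
  \} else \{
    subset <- list('subB', c(3,4))
  \}
  return(subset)
\}

# Version 2: evaluated once for the full dataset
fSetV2 <- function(a1) \{
  subsets <- list(list('subA', c(1,2)),
                  list('subB', c(3,4)))
  txOpts <- rep('subB', length(a1))
  txOpts[a1 > 1] <- 'subA'
  return(list("subsets" = subsets,
              "txOpts" = txOpts))
\}

# Hypothetical covariate values for four individuals
a1 <- c(0.5, 1.0, 1.5, 2.0)

# Nickname assigned to each individual under each version
nickV1 <- sapply(a1, function(x) fSetV1(x)[[1]])
nickV2 <- fSetV2(a1)$txOpts

# TRUE when both versions assign the same subsets
identical(nickV1, nickV2)
}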

The simplest scenario involving \code{fSet} is to define feasible 
treatment options and the rules that dictate how those treatment 
options are determined. For example, 
responder/non-responder scenarios are often encountered in
multiple-decision-point settings. In one such scenario,
patients who respond to the first-stage treatment
remain on the original treatment, whereas those who
do not respond to the first-stage treatment
have all treatment options available to them at the second stage. 
In this case, the 
propensity score models for the second stage
are fit using only 'non-responders', for whom 
more than one treatment option is available. 

An example of an appropriate \code{fSet} function for
the second stage is
\preformatted{ 
fSet <- function(data) \{ 
   if(data$responder == 0L)\{ 
     subset <- list('subA',c(1L,2L))
   \} else if(data$tx1 == 1L) \{ 
     subset <- list('subB',c(1L) )
   \} else if(data$tx1 == 2L) \{ 
     subset <- list('subC',c(2L) )
   \} 
   return(subset) 
\} }
in the version 1 format or, in the version 2 format,
\preformatted{
fSet <- function(data) \{
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L)),
                  list('subC', c(2L)))
  txOpts <- character(nrow(data))
  txOpts[data$tx1 == 1L] <- 'subB'
  txOpts[data$tx1 == 2L] <- 'subC'
  txOpts[data$responder == 0L] <- 'subA'

  return(list("subsets" = subsets,
              "txOpts" = txOpts))
\}}


The functions above specify that patients with covariate responder = 0 
are members of subset 'subA', for which the feasible treatments are 
A = (1,2). Patients with covariate responder = 1 are members of 
subset 'subB' or 'subC', depending on the first-stage treatment
received, and have only that treatment available. If
\code{fSet} is specified in this way, 
\code{moPropen} must be an object of class \code{"modelObj"};
the propensity model will be fit using only those patients with 
responder = 0. If outcome regression is used by the method,
\code{moMain} and \code{moCont} can be either objects
of class \code{"modelObj"}, if all patients are to be used
to obtain parameter estimates, or lists of objects of class \code{"ModelObjSubset"},
if subsets are to be analyzed individually.
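
As a minimal sketch of the model inputs for this scenario, the code
below defines a single propensity model and a single main-effects 
outcome model using \code{buildModelObj()} from package \pkg{modelObj}. 
The covariate \code{x1} and the choice of an intercept-only logistic 
propensity model are assumptions made only for illustration.

\preformatted{
library(modelObj)

# Single propensity model; per the text above, it is fit using only
# patients with responder = 0, the only subset with more than one option
moPropen <- buildModelObj(model = ~1,
                          solver.method = 'glm',
                          solver.args = list('family'='binomial'),
                          predict.method = 'predict.glm',
                          predict.args = list(type='response'))

# Single main-effects outcome model fit using all patients;
# covariate x1 is hypothetical
moMain <- buildModelObj(model = ~x1,
                        solver.method = 'lm')
}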

For a scenario where all patients have the same set of treatment
options available, but subsets of patients are to be analyzed using 
different models, we can define \code{fSet} as
\preformatted{ 
fSet <- function(data) \{ 
   if(data$a1 == 1)\{ 
     subset <- list('subA',c(1L,2L))
   \} else \{ 
     subset <- list('subB',c(1L,2L) )
   \} 
   return(subset) 
\} }
in the version 1 format or, in the version 2 format,
\preformatted{
fSet <- function(data)
\{
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L,2L)))
  txOpts <- rep('subB', nrow(data))
  txOpts[data$a1 == 1L] <- 'subA'

  return(list("subsets" = subsets,
              "txOpts" = txOpts))
\}}

Here, all patients have the same treatment options available, A = (1,2),
but different regression models (case 2 above) and/or different 
decision function models (case 3 above) will be fit for each
subset. If different propensity score models are used, \code{moPropen} 
must be a list of objects of class \code{"modelObjSubset"}.
For example,
\preformatted{ 
  propenA <- buildModelObjSubset(model = ~1,
                                 solver.method = 'glm',
                                 solver.args = list('family'='binomial'),
                                 predict.method = 'predict.glm',
                                 predict.args = list(type='response'),
                                 subset = 'subA')

  propenB <- buildModelObjSubset(model = ~1,
                                 solver.method = 'glm',
                                 solver.args = list('family'='binomial'),
                                 predict.method = 'predict.glm',
                                 predict.args = list(type='response'),
                                 subset = 'subB')

  moPropen <- list(propenA, propenB)
 }
If different decision function models are to be fit, \code{regimes}
would take a form similar to
\preformatted{ 
  regimes <- list( 'subA' = ~x1 + x2,
                   'subB' = ~x2 )
}
Notice that the names of the elements of \code{regimes} and the subsets passed to
\code{buildModelObjSubset()} correspond to the nicknames defined by \code{fSet},
i.e., 'subA' or 'subB'. These nicknames are used for bookkeeping and 
link subsets to the appropriate models.
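
Because the link between subsets and models relies entirely on matching
nicknames, it can be useful to confirm that they agree before running an
analysis. The following is a minimal sketch of such a check for an
\code{fSet} in the version 2 format; the two-row \code{data} object is
hypothetical, and the check is not part of the package.

\preformatted{
fSet <- function(data) \{
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L,2L)))
  txOpts <- rep('subB', nrow(data))
  txOpts[data$a1 == 1L] <- 'subA'
  return(list("subsets" = subsets,
              "txOpts" = txOpts))
\}

regimes <- list( 'subA' = ~x1 + x2,
                 'subB' = ~x2 )

# Hypothetical data containing only the covariate used by fSet
data <- data.frame(a1 = c(1L, 2L))

# Nicknames defined by fSet
nicknames <- sapply(fSet(data)$subsets, function(s) s[[1]])

# Names in regimes with no matching subset; character(0) if all match
unmatched <- setdiff(names(regimes), nicknames)
unmatched
}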


For a single-decision-point analysis, \code{fSet}
is a single function. For multiple-decision-point analyses,
\code{fSet} is a list of functions in which each element 
corresponds to a decision point: the first element to the
first decision point, the second element to the second, and so on.
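
A minimal sketch of a two-decision-point specification in the version 2 
format is given below. It assumes that all patients share the same
options at the first decision point, expressed here as a single subset 
with the hypothetical nickname 'all', and that the responder rule 
described above applies at the second. The function names and the 
three-row \code{data} object are likewise hypothetical.

\preformatted{
# First decision point: every patient has options (1,2)
fSetFirst <- function(data) \{
  subsets <- list(list('all', c(1L,2L)))
  txOpts <- rep('all', nrow(data))
  return(list("subsets" = subsets,
              "txOpts" = txOpts))
\}

# Second decision point: responders remain on their first treatment
fSetSecond <- function(data) \{
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L)),
                  list('subC', c(2L)))
  txOpts <- character(nrow(data))
  txOpts[data$tx1 == 1L] <- 'subB'
  txOpts[data$tx1 == 2L] <- 'subC'
  txOpts[data$responder == 0L] <- 'subA'
  return(list("subsets" = subsets,
              "txOpts" = txOpts))
\}

# Element k of the list corresponds to decision point k
fSet <- list(fSetFirst, fSetSecond)

# Hypothetical data: one non-responder and two responders
data <- data.frame(responder = c(0L, 1L, 1L),
                   tx1 = c(1L, 1L, 2L))

# Subset membership at each decision point
membership <- lapply(fSet, function(f) f(data)$txOpts)
}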
}


