% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ddi_read.R
\name{read_ihgis_codebook}
\alias{read_ihgis_codebook}
\title{Read metadata from an IHGIS extract's codebook files}
\usage{
read_ihgis_codebook(cb_file, tbls_file = NULL, raw = FALSE)
}
\arguments{
\item{cb_file}{Path to a .zip archive containing an IHGIS extract, an IHGIS
data dictionary (\verb{_datadict.csv}) file, or an IHGIS codebook (.txt) file.}

\item{tbls_file}{If \code{cb_file} is the path to an IHGIS data dictionary .csv
file, the path to the \verb{_tables.csv} metadata file from the same IHGIS
extract. If both files are in the same directory, the \verb{_tables.csv} file is
loaded automatically. If you have moved it, provide its new path here.}

\item{raw}{If \code{TRUE}, return a character vector containing the lines of
\code{cb_file} rather than an \code{ipums_ddi} object. Defaults to \code{FALSE}.

If \code{TRUE}, \code{cb_file} must be a .zip archive or a .txt codebook file.}
}
\value{
If \code{raw = FALSE}, an \code{ipums_ddi} object with metadata about the variables
contained in the data for the extract associated with the given \code{cb_file}.

If \code{raw = TRUE}, a character vector with one element for each line of the
given \code{cb_file}.
}
\description{
\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#experimental}{\figure{lifecycle-experimental.svg}{options: alt='[Experimental]'}}}{\strong{[Experimental]}}

Read the variable metadata contained in an IHGIS extract into an
\code{\link{ipums_ddi}} object.

Because IHGIS variable metadata do not adhere to all the standards of
microdata DDI files, some of the \code{ipums_ddi} fields will not be populated.

This function is marked as experimental while we determine whether there
may be a more robust way to standardize codebook reading across IPUMS
aggregate data collections.
}
\details{
IHGIS extracts store variable and geographic metadata in multiple
files:
\itemize{
\item \verb{_datadict.csv} contains the data dictionary with metadata
about the variables included across all files in the extract.
\item \verb{_tables.csv} contains metadata about all IHGIS
tables included in the extract.
\item \verb{_geog.csv} contains metadata about the tabulation geographies included
for any tables in the extract.
\item \verb{_codebook.txt} contains table and variable metadata in human-readable
form, along with citation information for IHGIS data.
}

By default, \code{read_ihgis_codebook()} uses information from all these files and
assumes they exist in the provided extract (.zip) file or directory.
If you have unzipped your IHGIS extract and moved the \verb{_tables.csv} file,
you will need to provide the path to that file in the \code{tbls_file} argument.
Certain variable metadata can still be loaded without the \verb{_geog.csv} or
\verb{_codebook.txt} files. However, if \code{raw = TRUE}, the \verb{_codebook.txt} file
must be present in the .zip archive or provided to \code{cb_file}.

If you no longer have access to these files, consider resubmitting the
extract request that produced the data.

Note that IHGIS codebooks contain metadata for all the datasets contained
in a given extract. Individual data files from the extract may not contain
all of the variables shown in the output of \code{read_ihgis_codebook()}.
}
\examples{
ihgis_file <- ipums_example("ihgis0014.zip")

ihgis_cb <- read_ihgis_codebook(ihgis_file)

# Variable labels and descriptions
ihgis_cb$var_info

# Citation information
ihgis_cb$conditions

# If variable metadata have been lost from a data source, reattach from
# the corresponding `ipums_ddi` object:
ihgis_data <- read_ipums_agg(
  ihgis_file,
  file_select = matches("AAA_g0"),
  verbose = FALSE
)

ihgis_data <- zap_ipums_attributes(ihgis_data)
ipums_var_label(ihgis_data$AAA001)

ihgis_data <- set_ipums_var_attributes(ihgis_data, ihgis_cb)
ipums_var_label(ihgis_data$AAA001)

# Load in raw format
ihgis_cb_raw <- read_ihgis_codebook(ihgis_file, raw = TRUE)

# Use `cat()` to display in the R console in human-readable format
cat(ihgis_cb_raw[1:21], sep = "\n")
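
# A minimal sketch of reading from an unzipped extract. This assumes the
# example archive contains the `_datadict.csv` and `_tables.csv` files
# described in Details; `tmp_dir`, `dd_file`, and `tbls_file` are
# illustrative names only.
tmp_dir <- file.path(tempdir(), "ihgis_unzipped")
unzip(ihgis_file, exdir = tmp_dir)

# Locate the data dictionary and tables metadata wherever they were extracted
dd_file <- list.files(
  tmp_dir,
  pattern = "_datadict[.]csv$",
  recursive = TRUE,
  full.names = TRUE
)[1]

tbls_file <- list.files(
  tmp_dir,
  pattern = "_tables[.]csv$",
  recursive = TRUE,
  full.names = TRUE
)[1]

# `tbls_file` only needs to be supplied if `_tables.csv` has been moved away
# from the data dictionary; it is passed explicitly here for illustration
ihgis_cb_dd <- read_ihgis_codebook(dd_file, tbls_file = tbls_file)

ihgis_cb_dd$var_info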
}
