% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/DoU-preprocess-units.R
\name{DoU_preprocess_units}
\alias{DoU_preprocess_units}
\title{Preprocess the data for the DEGURBA spatial units classification}
\usage{
DoU_preprocess_units(
  units,
  classification,
  pop,
  resample_resolution = NULL,
  dissolve_units_by = NULL
)
}
\arguments{
\item{units}{character / object of class \code{sf}. Path to the vector layer with small spatial units, or an object of class \code{sf} with the small spatial units}

\item{classification}{character / SpatRaster. Path to the grid cell classification of the Degree of Urbanisation, or SpatRaster with the grid cell classification}

\item{pop}{character / SpatRaster. Path to the population grid, or SpatRaster with the population grid}

\item{resample_resolution}{numeric. Resolution to which the grids are resampled during pre-processing. If \code{NULL}, the grids are resampled to the smaller of the two resolutions of the population and classification grids.}

\item{dissolve_units_by}{character. If not \code{NULL}, the units are dissolved by the values in this column. This can be used, for example, to dissolve the spatial units to a certain administrative level (see examples).}
}
\value{
A named list with the data required to execute the spatial units classification procedure, together with its metadata. The list contains the following elements:
\itemize{
\item \code{classification}: the (resampled and cropped) grid cell classification layer
\item \code{pop}: the (resampled and cropped) population grid
\item \code{units}: the (dissolved and filtered) spatial units (object of class \code{sf})
\item \code{metadata}: named list with the metadata of the input files. It contains the elements \code{units}, \code{classification} and \code{pop} (with paths to the respective data sources), as well as \code{resample_resolution} and \code{dissolve_units_by} when these are not \code{NULL}. (Note that when the input sources are passed as objects rather than paths, the metadata might be empty.)
}
}
\description{
The spatial units classification of the Degree of Urbanisation requires three different inputs (all input sources should be in the Mollweide coordinate system):
\itemize{
\item a vector layer with the small spatial units
\item a raster layer with the grid cell classification of the Degree of Urbanisation
\item a raster layer with the population grid
}

The three input layers are pre-processed as follows. The classification grid and population grid are resampled to the \code{resample_resolution} with the nearest neighbour algorithm. In doing this, the values of the population grid are divided by the oversampling ratio (for example, when going from a resolution of 100 m to a resolution of 50 m, each cell is split into four cells, so the values of the grid are divided by 4).

In addition, the function makes sure the extents of the three input layers match. If the bounding box of the units layer is smaller than the extent of the grids, then the grids are cropped to the bounding box of the units layer. Alternatively, if the units layer covers a larger area than the grids, then the units that do not intersect with the grids are discarded (and a warning message is printed). This ensures that the classification algorithm runs efficiently and does not generate any incorrect classifications due to missing data.

For more information about the pre-processing workflow, see \href{https://ghsl.jrc.ec.europa.eu/documents/GHSL_Data_Package_2023.pdf}{GHSL Data Package 2023 (Section 2.7.2.3)}.
}
\examples{
\donttest{
# load the grid data
grid_data <- flexurba::DoU_load_grid_data_belgium()
# load the units and filter for West Flanders
units_data <- flexurba::units_belgium \%>\%
  dplyr::filter(GID_2 == "30000")
# classify the grid
classification <- DoU_classify_grid(data = grid_data)

# preprocess the data for units classification
data1 <- DoU_preprocess_units(
  units = units_data,
  classification = classification,
  pop = grid_data$pop,
  resample_resolution = 50
)
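
# inspect the result: a named list with the elements described under Value
# (classification, pop, units and metadata)
names(data1)
# metadata of the inputs (might be empty when the inputs are passed as objects)
data1$metadata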

# preprocess the data for units classification at level 3 (Belgian districts)
data2 <- DoU_preprocess_units(
  units = units_data,
  classification = classification,
  pop = grid_data$pop,
  resample_resolution = 50,
  dissolve_units_by = "GID_3"
)
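
# Illustration only, not the package's internal code: a minimal sketch of what
# dividing the population values by the oversampling ratio achieves.
# Assumes the terra package is available; toy_pop is a hypothetical 2 x 2
# raster with 100 m cells.
toy_pop <- terra::rast(
  nrows = 2, ncols = 2,
  xmin = 0, xmax = 200, ymin = 0, ymax = 200,
  vals = c(40, 80, 120, 160)
)
# going from 100 m to 50 m splits every cell into (100 / 50)^2 = 4 cells
ratio <- (100 / 50)^2
# disagg() copies each value into the 4 smaller cells (like nearest neighbour);
# dividing by the ratio spreads the population over those cells
toy_pop_50 <- terra::disagg(toy_pop, fact = 2) / ratio
# the total population is the same before and after
terra::global(toy_pop, "sum")
terra::global(toy_pop_50, "sum")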
}
}
