% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/sdf_ml.R
\name{sdf_project}
\alias{sdf_project}
\title{Project features onto principal components}
\usage{
sdf_project(
  object,
  newdata,
  features = dimnames(object$pc)[[1]],
  feature_prefix = NULL,
  ...
)
}
\arguments{
\item{object}{A Spark PCA model object, such as one returned by \code{ml_pca()}}

\item{newdata}{An object coercible to a Spark DataFrame}

\item{features}{A character vector naming the columns of \code{newdata} to
be projected. Defaults to the row names of \code{object$pc}.}

\item{feature_prefix}{The prefix used in naming the output features}

\item{...}{Optional arguments; currently unused.}
}
\description{
Projects the selected feature columns of \code{newdata} onto the principal
components of a fitted Spark PCA model.
}
\section{Transforming Spark DataFrames}{


The family of functions prefixed with \code{sdf_} generally accesses the Scala
Spark DataFrame API directly, as opposed to the \code{dplyr} interface, which
uses Spark SQL. These functions 'force' any pending SQL in a
\code{dplyr} pipeline, so the returned \code{tbl_spark} object no longer
carries the attached 'lazy' SQL operations. Note that
the underlying Spark DataFrame \emph{does} execute its operations lazily:
although the pending set of operations is (currently) not exposed at
the \R level, these operations are only executed when you explicitly
\code{collect()} the table.
}
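\section{Example sketch}{

The following is a minimal sketch of how the arguments above fit together,
not a definitive recipe. It assumes a local Spark installation, a
\code{sparklyr} version that still exports \code{sdf_project()}, and the
built-in \code{iris} data set as stand-in input. The variable names
(\code{iris_tbl}, \code{pca_model}, \code{projected}) and the prefix
\code{"PC"} are illustrative choices only.

\preformatted{
library(sparklyr)

# Assumption: Spark is installed locally and reachable with master = "local".
sc <- spark_connect(master = "local")

# Copy a small data set to Spark. Column names may be sanitised on copy,
# so the feature names are read back from the Spark table itself.
iris_tbl <- sdf_copy_to(sc, iris, "iris", overwrite = TRUE)
features <- setdiff(colnames(iris_tbl), "Species")

# Fit a PCA model, then project the same columns onto its components.
pca_model <- ml_pca(iris_tbl, features = features)
projected <- sdf_project(
  pca_model,
  iris_tbl,
  features = features,
  feature_prefix = "PC"
)

# The result is a Spark table; rows reach R only when it is collected.
head(dplyr::collect(projected))

spark_disconnect(sc)
}

Passing \code{features} explicitly mirrors the default, which is taken from
the row names of \code{object$pc}; it is spelled out here only to make the
column selection visible.
}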

