% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/embed.R
\name{embed}
\alias{embed}
\title{Embed texts with a Transformer model}
\usage{
embed(dat, layers, keep_tokens = TRUE, tokens_method = NULL)
}
\arguments{
\item{dat}{A dataframe with text data, one text per row}

\item{layers}{Integer vector specifying which model layers to aggregate from.}

\item{keep_tokens}{Logical. If \code{TRUE}, token-level embeddings are kept
in the returned object; if \code{FALSE}, they are discarded to save memory.}

\item{tokens_method}{Character scalar controlling how token-level
embeddings are aggregated to word types}
}
\value{
A dataframe in which each row corresponds to one input text and each
column is an embedding dimension.
}
\examples{
df <- data.frame(
  text = c(
    "I slept well and feel great today!",
    "I saw from friends and it went well.",
    "I think I failed that exam. I'm such a disappointment."
  )
)

emb_dat <- embed(
  dat = df,
  layers = 1,
  keep_tokens = FALSE,
  tokens_method = "mean"
)
}
\description{
Cleans a text column and converts it to a dataframe of numeric vectors via
BERT embeddings. Each row of the input dataframe is one text entry.
}
