% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/machine_learning.R
\name{ml}
\alias{ml}
\title{Machine learning process}
\usage{
ml(
  x,
  y,
  x.test = NULL,
  y.test = NULL,
  family_column = NULL,
  split_by_family = FALSE,
  predict = TRUE,
  test_size = 0.25,
  better_smaller = TRUE,
  method = "ranger",
  test = TRUE,
  color_list = NULL
)
}
\arguments{
\item{x}{dataframe with the instances (rows) and their features (columns). It may also include a column with the family data.}

\item{y}{dataframe with the instances (rows) and the corresponding output (KPI) for each algorithm (columns).}

\item{x.test}{dataframe with the test features. It may also include a column with the family data. If NULL, the algorithm will split x into training and test sets.}

\item{y.test}{dataframe with the test outputs. If NULL, the algorithm will split y into training and test sets.}

\item{family_column}{column number of x that indicates the family of each instance. If given, additional options for the training and test set splitting and for the graphics are enabled.}

\item{split_by_family}{boolean indicating whether the sets are split keeping family proportions when x.test and y.test are NULL. This option requires \code{family_column} to be different from NULL.}

\item{predict}{boolean indicating whether predictions will be made. If FALSE, plots will use training data only and no ML column will be displayed.}

\item{test_size}{float with the proportion of instances assigned to the test set when the data are split. It must be a value between 0 and 1.}

\item{better_smaller}{boolean that indicates whether the output (KPI) is better if smaller (TRUE) or larger (FALSE).}

\item{method}{name of the model to be used. The user can choose from any of the models provided by \code{caret}. See \url{http://topepo.github.io/caret/train-models-by-tag.html} for more information about the models supported.}

\item{test}{boolean indicating whether the predictions will be made with the test set or the training set.}

\item{color_list}{list with the colors for the plots. If NULL, or if too few colors are supplied, the colors will be generated automatically.}
}
\value{
A list with the data and plots generated, including:
\itemize{
\item \code{data_obj} An \code{as_data} object with the processed data from the \code{partition_and_normalize()} function.
\item \code{training} An \code{as_train} object with the training results from the \code{AStrain()} function.
\item \code{predictions} A data frame with the predictions from the \code{ASpredict()} function, if \code{predict} is TRUE.
\item \code{table} A table with the summary of the output data.
\item \code{boxplot}, \code{ranking_plot}, \code{figure_comparison}, \code{optml_figure_comparison} and \code{optmlall_figure_comparison} with the corresponding plots.
}
}
\description{
Function that processes the input data, trains the machine learning models, makes predictions and plots the results.
}
\examples{
\donttest{
data(branchingsmall)
machine_learning <- ml(branchingsmall$x, branchingsmall$y, test_size = 0.3,
  family_column = 1, split_by_family = TRUE, method = "glm")
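
# A minimal sketch of how the result might be inspected. It assumes the
# returned list uses the element names given in the Value section
# (for example "table"); the printed output is not shown here.
names(machine_learning)
machine_learning$table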
}
}
