This vignette gives a short introduction to mlrHyperopt and a glance at its key features. For up-to-date information, check the GitHub project page.

Purpose

The main goal of mlrHyperopt is to make hyperparameter optimization easy. Beginners in machine learning, and even experts, often don’t know which parameters should be tuned for a given machine learning method. Experts also don’t necessarily agree on the tuning parameters and their ranges. This package tries to tackle these problems by offering:

  • Recommended parameter space configurations for the most common learners.
  • A fully automatic, zero-configuration, one-line hyperparameter tuning call.
  • Option to upload a good parameter space configuration for a specific learner to share with colleagues and researchers.
  • Possibility to download publicly available parameter space configurations for the machine learning method of your choice.
  • An extensible interface to the full variety of mlr learners and tuning options, namely grid search, CMA-ES, model-based optimization, and random search.

Requirements

As the name indicates, mlrHyperopt relies heavily on mlr. Additionally, mlrMBO is used automatically for purely numeric parameter spaces of dimension 2 or higher. Most of the objects used are documented in mlr. To create your own task, see the mlr tutorial on how to create Learning Tasks, Learners, Tuning Parameter Sets for Learners, and custom resampling strategies.

Getting started

Hyperparameter tuning with mlrHyperopt can be done in one line:

library(mlrHyperopt)
res = hyperopt(iris.task, learner = "classif.randomForest")
res
## Tune result:
## Op. pars: nodesize=10; mtry=2
## mmce.test.mean=0.0466667

For full control over what is happening, you can define every argument yourself or rely only partially on the automatic defaults:

pc = generateParConfig(learner = "classif.randomForest")
# The tuning parameter set:
getParConfigParSet(pc)
##             Type len            Def  Constr Req Tunable Trafo
## nodesize integer   -              1 1 to 10   -    TRUE     -
## mtry     integer   - floor(sqrt(p))  1 to p   -    TRUE     -
# Setting constant values:
pc = setParConfigParVals(pc, par.vals = list(mtry = 3))
hc = generateHyperControl(task = iris.task, par.config = pc)
# Inspecting the resampling strategy used for tuning
getHyperControlResampling(hc)
## Resample description: cross-validation with 10 iterations.
## Predict: test
## Stratification: FALSE
# Changing the resampling strategy
hc = setHyperControlResampling(hc, makeResampleDesc("Bootstrap", iters = 3))

# Starting the hyperparameter tuning
res = hyperopt(iris.task, par.config = pc, hyper.control = hc, show.info = FALSE)
res
## Tune result:
## Op. pars: nodesize=5; mtry=3
## mmce.test.mean=0.0297386
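
For comparison, the steps that hyperopt automates can be written out by hand in plain mlr. The following is only a minimal sketch, not what mlrHyperopt does internally: it assumes random search with 10 evaluations (mlrHyperopt may pick a different tuner, such as mlrMBO), and it hard-codes the search space from the printed parameter set above, with p taken to be the number of features of the task.

```r
library(mlr)

# Assumption: p in the printed parameter set is the number of features
p = getTaskNFeats(iris.task)

# Search space copied by hand from the printed parameter set above
par.set = makeParamSet(
  makeIntegerParam("nodesize", lower = 1, upper = 10),
  makeIntegerParam("mtry", lower = 1, upper = p)
)

# Assumption: random search with 10 evaluations and 10-fold cross-validation
ctrl = makeTuneControlRandom(maxit = 10)
rdesc = makeResampleDesc("CV", iters = 10)

res = tuneParams("classif.randomForest", task = iris.task,
  resampling = rdesc, par.set = par.set, control = ctrl,
  measures = mmce, show.info = FALSE)
res$x  # best parameter setting found
res$y  # its cross-validated misclassification error
```

Every object created here (parameter set, tuning control, resampling description) corresponds to something that generateParConfig and generateHyperControl would otherwise choose for you.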

Sharing Search Spaces

The predefined parameter search spaces in this package do not cover all learners available in mlr, and they don’t claim to be the best search spaces either. You might therefore want to share your ParConfig, which includes the search space as well as constant parameter settings for a certain mlr learner, since it can help other people improve their results. This way you can also share a ParConfig with colleagues and your future self.

At the same time, you can benefit from the online database when you want to use an mlr learner whose tunable parameters you are not familiar with.

Uploading a Parameter Configuration

To upload a Parameter Configuration, you need a ParamSet, an associated learner (given either by the general learner.name or by the concrete learner), and optionally some specific parameter settings (par.vals). Using just the learner.name indicates that this ParConfig may be used for the regression as well as the classification version of this learner.

par.set = makeParamSet(
  makeIntegerParam(
    id = "mtry",
    lower = expression(floor(p^0.25)),
    upper = expression(ceiling(p^0.75)),
    default = expression(round(p^0.5))),
  keys = "p")
par.config = makeParConfig(
  par.set = par.set,
  par.vals = list(ntree = 200),
  learner.name = "randomForest"
)
uploadParConfig(par.config, "jon.doe@example.com")

The upload returns an id. With this id you can later download this specific ParConfig.
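
The bounds of mtry in the example above are written as expressions in the key p, so they only become concrete numbers once a task is known. The following base-R sketch illustrates the idea, assuming p stands for the number of features of the task. The helper resolveBounds is hypothetical and not part of mlrHyperopt or mlr, which resolve such expressions internally.

```r
# The three expressions used for mtry above
bounds = list(
  lower   = expression(floor(p^0.25)),
  upper   = expression(ceiling(p^0.75)),
  default = expression(round(p^0.5))
)

# Hypothetical helper: evaluate each expression with p bound to a number
resolveBounds = function(bounds, p) {
  vapply(bounds, function(e) eval(e[[1]], list(p = p)), numeric(1))
}

resolveBounds(bounds, p = 4)    # iris has 4 features: lower 1, upper 3, default 2
resolveBounds(bounds, p = 100)  # lower 3, upper 32, default 10
```

Writing the bounds this way lets a single shared ParConfig adapt to data sets of different widths.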

Downloading Parameter Configurations

You can download a specific parameter configuration using its id.
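
A minimal sketch of such a download is given here. It assumes that the singular function downloadParConfig takes the id as its only argument, by analogy with the downloadParConfigs calls in this section. The id "1" is a placeholder and not a real upload, so replace it with the id returned by your own upload. The call needs network access to the online database.

```r
library(mlrHyperopt)

# Placeholder id (assumption): use the id returned by uploadParConfig()
my.id = "1"

# Assumed signature: downloadParConfig(id)
par.config = downloadParConfig(my.id)
par.config
```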

If you are looking for parameter configurations for your learner, you can simply run:

my.learner = makeLearner("classif.svm")
# only configurations for the classification svm
svm.configs = downloadParConfigs(learner.class = getLearnerClass(my.learner))
svm.configs
## [[1]]
## Parameter Configuration
##   Parameter Values: cachesize=100, tolerance=0.01
##   Associated Learner: classif.svm
##   Parameter Set:
##            Type len Def            Constr Req Tunable Trafo
## cost    numeric   -   -           0 to 15   -    TRUE     -
## degree  integer   -   3          1 to Inf   Y    TRUE     -
## gamma   numeric   -   -           -5 to 5   -    TRUE     Y
## kernel discrete   -   - polynomial,radial   -    TRUE     -
## 
## [[2]]
## Parameter Configuration
##   Parameter Values: 
##   Associated Learner: classif.svm
##   Note: From github.com/ja-thomas/OMLbots
##   Parameter Set:
##            Type len Def                   Constr Req Tunable Trafo
## cost    numeric   -   -                -10 to 10   -    TRUE     Y
## degree  integer   -   -                   2 to 5   Y    TRUE     -
## gamma   numeric   -   -                -10 to 10   Y    TRUE     Y
## kernel discrete   -   - linear,polynomial,radial   -    TRUE     -
# all svm configurations, regardless of learner type
svm.configs = downloadParConfigs(learner.name = getLearnerName(my.learner))

You can also query by custom key-value pairs:

user.configs = downloadParConfigs(custom.query = list("user_email"="jon.doe@example.com"))