Quick-Start

Installation

Python version 3.6 or above is necessary before installing Hypernets.

Conda

Install Hypernets with conda from the channel conda-forge:

conda install -c conda-forge hypernets

Pip

Install Hypernets with pip command:

pip install hypernets

Optional, to run Hypernets in JupyterLab notebooks, install Hypernets and JupyterLab with command:

pip install hypernets[notebook]
  • Optional, to support experiment visualization base on web, install with command:
pip install hypernets[board]

Optional, to run Hypernets in distributed Dask cluster, install Hypernets with command:

pip install hypernets[dask]

Optional, to support simplified Chinese in feature generation, install jieba package before run Hypernets, or install Hypernets with command:

pip install hypernets[zhcn]

Optional, install all Hypernets components and dependencies with one command:

pip install hypernets[all]

``

Verify installation:

python -m hypernets.examples.smoke_testing

Getting started

In current version, we provide PlainModel (a plain HyperModel implementation), which can be used for hyper-parameter tuning with sklearn machine learning algorithms.

Basically, to search the best model only needs 4 steps:

  • Step 1. Define Search Space
  • Step 2. Select a Searcher
  • Step 3. Select a HyperModel
  • Step 4. Search and get the best model

Define Search space

Firstly, we define a search space for hyper-parameters of DecisionTreeClassifier and MLPClassifier:

from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

from hypernets.core import get_random_state
from hypernets.core.ops import ModuleChoice, HyperInput, ModuleSpace
from hypernets.core.search_space import HyperSpace, Choice, Int


def my_search_space(enable_dt=True, enable_mlp=True):
    space = HyperSpace()

    with space.as_default():
        hyper_input = HyperInput(name='input1')

        estimators = []
        if enable_dt:
            estimators.append(dict(
                cls=DecisionTreeClassifier,
                criterion=Choice(["gini", "entropy"]),
                splitter=Choice(["best", "random"]),
                max_depth=Choice([None, 3, 5, 10, 20, 50]),
                random_state=get_random_state(),
            ))

        if enable_mlp:
            estimators.append(dict(
                cls=MLPClassifier,
                max_iter=Int(500, 5000, step=500),
                activation=Choice(['identity', 'logistic', 'tanh', 'relu']),
                solver=Choice(['lbfgs', 'sgd', 'adam']),
                learning_rate=Choice(['constant', 'invscaling', 'adaptive']),
                random_state=get_random_state(),
            ))

        modules = [ModuleSpace(name=f'{e["cls"].__name__}', **e) for e in estimators]
        outputs = ModuleChoice(modules)(hyper_input)
        space.set_inputs(hyper_input)

    return space

Training with PlainModel

Turning and scoring with my_search_space for heart_disease_uci dataset.


def train():
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report
    from hypernets.core.callbacks import SummaryCallback
    from hypernets.examples.plain_model import PlainModel
    from hypernets.searchers import make_searcher
    from hypernets.tabular.datasets import dsutils

    X = dsutils.load_heart_disease_uci()
    y = X.pop('target')

    X_train, X_eval, y_train, y_eval = \
        train_test_split(X, y, test_size=0.3)
    
    # make MCTS searcher
    searcher = make_searcher('mcts', my_search_space, optimize_direction='max')
    callbacks = [SummaryCallback()]
    
    # create HyperModel and do 'search' action
    hm = PlainModel(searcher=searcher, reward_metric='f1', callbacks=callbacks)
    hm.search(X_train, y_train, X_eval, y_eval, )
    
    # get best estimator
    best = hm.get_best_trial()
    estimator = hm.final_train(best.space_sample, X_train, y_train)
    
    # scoring
    y_pred = estimator.predict(X_eval)
    print(classification_report(y_eval, y_pred))


if __name__ == '__main__':
    train()

Run the example, we will get console output:

17:36:56 I hypernets.u.common.py 147 - 2 class detected, {0, 1}, so inferred as a [binary classification] task
17:36:56 I hypernets.c.meta_learner.py 22 - Initialize Meta Learner: dataset_id:e10ae1d61123d55062f7d3b64b79a6e1
17:36:56 I hypernets.c.callbacks.py 235 - 
Trial No:1
--------------------------------------------------------------
(0) Module_ModuleChoice_1.hp_or:                            0
(1) DecisionTreeClassifier.criterion:                    gini
(2) DecisionTreeClassifier.splitter:                     best
(3) DecisionTreeClassifier.max_depth:                       5
--------------------------------------------------------------
...

              precision    recall  f1-score   support

           0       0.68      0.67      0.67        42
           1       0.72      0.73      0.73        49

    accuracy                           0.70        91
   macro avg       0.70      0.70      0.70        91
weighted avg       0.70      0.70      0.70        91