hypernets.experiment package
Submodules
hypernets.experiment.cfg module
hypernets.experiment.compete module
-
class hypernets.experiment.compete.CompeteExperiment(hyper_model, X_train, y_train, X_eval=None, y_eval=None, X_test=None, eval_size=0.3, train_test_split_strategy=None, cv=None, num_folds=3, task=None, id=None, callbacks=None, random_state=None, scorer=None, data_adaption=None, data_adaption_target=None, data_adaption_memory_limit=0.05, data_adaption_min_cols=0.3, data_cleaner_args=None, feature_generation=False, feature_generation_trans_primitives=None, feature_generation_max_depth=1, feature_generation_categories_cols=None, feature_generation_continuous_cols=None, feature_generation_datetime_cols=None, feature_generation_latlong_cols=None, feature_generation_text_cols=None, collinearity_detection=False, drift_detection=True, drift_detection_remove_shift_variable=True, drift_detection_variable_shift_threshold=0.7, drift_detection_threshold=0.7, drift_detection_remove_size=0.1, drift_detection_min_features=10, drift_detection_num_folds=5, feature_selection=False, feature_selection_strategy=None, feature_selection_threshold=None, feature_selection_quantile=None, feature_selection_number=None, down_sample_search=None, down_sample_search_size=None, down_sample_search_time_limit=None, down_sample_search_max_trials=None, ensemble_size=20, feature_reselection=False, feature_reselection_estimator_size=10, feature_reselection_strategy=None, feature_reselection_threshold=1e-05, feature_reselection_quantile=None, feature_reselection_number=None, pseudo_labeling=False, pseudo_labeling_strategy=None, pseudo_labeling_proba_threshold=None, pseudo_labeling_proba_quantile=None, pseudo_labeling_sample_number=None, pseudo_labeling_resplit=False, retrain_on_wholedata=False, log_level=None, **kwargs)
Bases: hypernets.experiment.compete.SteppedExperiment
A powerful experiment strategy for AutoML with a set of advanced features.
Machine learning modeling for tabular data still faces many challenges, such as imbalanced data, data drift, and poor generalization ability. These challenges cannot be completely solved by pipeline search alone, so HyperNets introduces a more powerful tool, CompeteExperiment. CompeteExperiment is composed of a series of steps, of which pipeline search is just one. It also includes advanced steps such as data cleaning, data drift handling, two-stage search, and ensembling.
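The idea of an experiment built from a series of steps can be illustrated with a minimal sketch. The classes and the `run_steps` helper below are hypothetical stand-ins written for this illustration; they are not the hypernets implementation and only borrow the `fit_transform` method name and two step names from this page.

```python
# Hypothetical, simplified stand-ins; these are NOT the hypernets classes.
class Step:
    def __init__(self, name):
        self.name = name

    def fit_transform(self, columns):
        raise NotImplementedError


class DropConstantColumns(Step):
    """Toy 'data cleaning' step: drop columns holding a single distinct value."""

    def fit_transform(self, columns):
        return {k: v for k, v in columns.items() if len(set(v)) > 1}


class PickFirstColumn(Step):
    """Toy 'search' step: stands in for pipeline search by recording a choice."""

    def fit_transform(self, columns):
        self.chosen = sorted(columns)[0]
        return columns


def run_steps(steps, columns):
    """Run each step in order, feeding each step's output to the next."""
    for step in steps:
        columns = step.fit_transform(columns)
    return columns


steps = [DropConstantColumns("data_clean"), PickFirstColumn("space_searching")]
result = run_steps(steps, {"a": [1, 1, 1], "b": [1, 2, 3], "c": [0, 1, 0]})
```

Here the search step only ever sees the columns that survived cleaning, which is the point of placing pipeline search inside a longer sequence of steps.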
-
class hypernets.experiment.compete.DaskEnsembleStep(experiment, name, scorer=None, ensemble_size=7)
-
class hypernets.experiment.compete.DaskPseudoLabelStep(experiment, name, estimator_builder_name, strategy=None, proba_threshold=None, proba_quantile=None, sample_number=None, resplit=False)
-
class hypernets.experiment.compete.DataAdaptionStep(experiment, name, target=None, memory_limit=0.05, min_cols=0.3)
-
class hypernets.experiment.compete.DataCleanStep(experiment, name, data_cleaner_args=None, cv=False, train_test_split_strategy=None)
Bases: hypernets.experiment.compete.FeatureSelectStep
-
cache_transform(hyper_model, X_train, y_train, X_test=None, X_eval=None, y_eval=None, **kwargs)
-
fit_transform(**kwargs)
-
-
class hypernets.experiment.compete.DriftDetectStep(experiment, name, remove_shift_variable, variable_shift_threshold, threshold, remove_size, min_features, num_folds)
Bases: hypernets.experiment.compete.FeatureSelectStep
-
fit_transform(**kwargs)
-
-
class hypernets.experiment.compete.EnsembleStep(experiment, name, scorer=None, ensemble_size=7)
-
class hypernets.experiment.compete.EstimatorBuilderStep(experiment, name)
Bases: hypernets.experiment.compete.ExperimentStep
-
build_estimator(hyper_model, X_train, y_train, X_test=None, X_eval=None, y_eval=None, **kwargs)
-
-
class hypernets.experiment.compete.ExperimentStep(experiment, name)
Bases: sklearn.base.BaseEstimator
-
STATUS_FAILED = 1
-
STATUS_NONE = -1
-
STATUS_RUNNING = 10
-
STATUS_SKIPPED = 2
-
STATUS_SUCCESS = 0
-
elapsed_seconds
-
fit_transform(hyper_model, X_train, y_train, X_test=None, X_eval=None, y_eval=None, **kwargs)
-
task
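The status constants are plain integers, so a small lookup can make them readable in logs. The following is a minimal sketch: the constant values are copied from the ExperimentStep listing, while the `status_label` helper and the label strings are hypothetical and not part of hypernets.

```python
# Values copied from the ExperimentStep listing; everything else is hypothetical.
STATUS_NONE = -1
STATUS_SUCCESS = 0
STATUS_FAILED = 1
STATUS_SKIPPED = 2
STATUS_RUNNING = 10

_STATUS_LABELS = {
    STATUS_NONE: "none",
    STATUS_SUCCESS: "success",
    STATUS_FAILED: "failed",
    STATUS_SKIPPED: "skipped",
    STATUS_RUNNING: "running",
}


def status_label(code):
    """Return a readable label for a status code, or a marker for unknown codes."""
    return _STATUS_LABELS.get(code, f"unknown({code})")
```

For example, `status_label(STATUS_RUNNING)` gives `"running"`, and a code that is not listed falls back to the `unknown(...)` form rather than raising.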
-
-
class hypernets.experiment.compete.FeatureGenerationStep(experiment, name, trans_primitives=None, continuous_cols=None, datetime_cols=None, categories_cols=None, latlong_cols=None, text_cols=None, max_depth=1, feature_selection_args=None)
-
class hypernets.experiment.compete.FeatureImportanceSelectionStep(experiment, name, strategy, threshold, quantile, number)
Bases: hypernets.experiment.compete.FeatureSelectStep
-
fit_transform(**kwargs)
-
-
class hypernets.experiment.compete.FeatureSelectStep(experiment, name)
Bases: hypernets.experiment.compete.ExperimentStep
-
cache_transform(hyper_model, X_train, y_train, X_test=None, X_eval=None, y_eval=None, **kwargs)
-
selected_features
-
unselected_features
-
-
class hypernets.experiment.compete.FinalTrainStep(experiment, name, retrain_on_wholedata=False)
-
class hypernets.experiment.compete.MulticollinearityDetectStep(experiment, name)
Bases: hypernets.experiment.compete.FeatureSelectStep
-
fit_transform(**kwargs)
-
-
class hypernets.experiment.compete.PermutationImportanceSelectionStep(experiment, name, scorer, estimator_size, strategy, threshold, quantile, number)
-
class hypernets.experiment.compete.PseudoLabelStep(experiment, name, estimator_builder_name, strategy=None, proba_threshold=None, proba_quantile=None, sample_number=None, resplit=False)
-
class hypernets.experiment.compete.SpaceSearchStep(experiment, name, cv=False, num_folds=3)
-
class hypernets.experiment.compete.SpaceSearchWithDownSampleStep(experiment, name, cv=False, num_folds=3, size=None, max_trials=None, time_limit=None)
-
class hypernets.experiment.compete.StepNames
Bases: object
-
DATA_ADAPTION = 'data_adaption'
-
DATA_CLEAN = 'data_clean'
-
DRIFT_DETECTION = 'drift_detection'
-
ENSEMBLE = 'ensemble'
-
FEATURE_GENERATION = 'feature_generation'
-
FEATURE_IMPORTANCE_SELECTION = 'feature_selection'
-
FEATURE_RESELECTION = 'feature_reselection'
-
FINAL_ENSEMBLE = 'final_ensemble'
-
FINAL_MOO = 'final_moo'
-
FINAL_SEARCHING = 'two_stage_searching'
-
FINAL_TRAINING = 'final_train'
-
MULITICOLLINEARITY_DETECTION = 'multicollinearity_detection'
-
PSEUDO_LABELING = 'pseudo_labeling'
-
SPACE_SEARCHING = 'space_searching'
-
TRAINING = 'training'
-
-
class hypernets.experiment.compete.SteppedExperiment(steps, *args, **kwargs)
Bases: hypernets.experiment._experiment.Experiment
-
train(hyper_model, X_train, y_train, X_test, X_eval=None, y_eval=None, **kwargs)
Run an experiment.
Parameters:
- hyper_model (HyperModel)
- X_train
- y_train
- X_test
- X_eval
- y_eval
-
hypernets.experiment.general module
hypernets.experiment.job module
hypernets.experiment.report module
-
class hypernets.experiment.report.ExcelReportRender(file_path: str = './report.xlsx', theme='default')
Bases: hypernets.experiment.report.ReportRender
-
MAX_CELL_LENGTH = 50
-
-
class hypernets.experiment.report.FeatureTrans(feature, method, stage, reason, remark)
Bases: tuple
-
feature
Alias for field number 0
-
method
Alias for field number 1
-
reason
Alias for field number 3
-
remark
Alias for field number 4
-
stage
Alias for field number 2
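The "Alias for field number" entries suggest that FeatureTrans behaves like a named tuple with the field order given in its signature. The sketch below rebuilds an equivalent type with `collections.namedtuple` to show how positional and named access line up; it does not import hypernets, and the record values are made up for illustration.

```python
from collections import namedtuple

# Stand-in with the same field order as the listing:
# feature=0, method=1, stage=2, reason=3, remark=4.
FeatureTrans = namedtuple(
    "FeatureTrans", ["feature", "method", "stage", "reason", "remark"]
)

# Hypothetical record; the values are illustrative only.
record = FeatureTrans(
    feature="age",
    method="drop",
    stage="drift_detection",
    reason="example reason",
    remark=None,
)

# Named and positional access refer to the same fields.
same_stage = record.stage == record[2]
```

Because the type is a tuple, records can be unpacked or compared positionally, while the field names keep report-building code readable.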
-
-
class hypernets.experiment.report.FeatureTransCollector(steps: List[hypernets.experiment._extractor.StepMeta])
Bases: object
-
METHOD_ADD = 'add'
-
METHOD_DROP = 'drop'
-