Hyperparameter Search
The framework provides built-in support for hyperparameter optimization using either Grid Search or Bayesian Optimization (Optuna). All experiments are configured and executed via the ubermain.py entry point.
System Configuration (ubermain.py)
The ubermain.py file controls the execution mode, parallelization, and the base configuration of the experiment.
import os

from src.hyperparameter_search.multitasking import run_multitask
from src.hyperparameter_search.search_space import SearchSpace, OptunaSearchSpace

if __name__ == "__main__":
    results_dir = os.path.abspath("results")
    experiment_name = "my_experiment"

    # --- Parallelization & Hardware ---
    n_processes = 4  # Total parallel processes
    use_gpu = True
    num_processes_per_gpu = 2  # Parallel trainings per GPU
    device_ids = []  # Empty = automatic distribution

    # --- Search Mode Selection ---
    use_optuna = False  # True for Optuna, False for Grid Search

    # --- Grid Search Config ---
    full_grid = True  # True = Cartesian product; False = pairwise
    n_runs_per_configuration = 1  # Repeats per config (for stability)

    # --- Optuna Config ---
    optuna_num_runs = 100  # Total trial budget

    # --- Model Configuration ---
    config = {
        "n_epochs": 1000,
        # ... config containing SearchSpace or OptunaSearchSpace objects ...
    }

    # Execute
    run_multitask(
        results_dir, experiment_name, n_runs_per_configuration, config,
        n_processes,
        "automatic-distribution" if use_gpu and not device_ids else (device_ids if use_gpu else "no-gpu"),
        num_processes_per_gpu,
        use_torch_compile=False,
        use_optuna=use_optuna,
        optuna_num_trials=optuna_num_runs,
        get_all_gridsearch_combinations=full_grid,
    )
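The inline device-argument expression passed to run_multitask can be hard to parse. The helper below is purely illustrative (not part of the framework) and makes the three cases explicit:

```python
def resolve_device_argument(use_gpu, device_ids):
    # GPUs enabled, no explicit ids: let the framework distribute automatically.
    if use_gpu and not device_ids:
        return "automatic-distribution"
    # GPUs enabled with explicit ids: use those ids; otherwise run CPU-only.
    return device_ids if use_gpu else "no-gpu"

assert resolve_device_argument(True, []) == "automatic-distribution"
assert resolve_device_argument(True, [0, 1]) == [0, 1]
assert resolve_device_argument(False, []) == "no-gpu"
```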
Grid Search
Grid search runs one training for each combination of the parameter values defined in the config.
Configuration Options
- full_grid = True: Tests every possible combination (Cartesian product) of the defined SearchSpace lists.
- full_grid = False: Pairs parameters by their list index. All SearchSpace lists must have the same length. Useful for comparing specific pre-defined configurations (e.g., "Small Model" vs "Large Model").
- n_runs_per_configuration: Repeats each configuration N times (e.g., to average out random initialization effects).
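The difference between the two grid modes can be sketched with plain Python lists (hidden_dim and learning_rate below are stand-in parameter lists, not the framework's API):

```python
from itertools import product

# Two hypothetical SearchSpace value lists of length 3.
hidden_dim = [10, 20, 50]
learning_rate = [1e-3, 1e-4, 1e-2]

# full_grid = True: Cartesian product -> 3 x 3 = 9 configurations.
full = list(product(hidden_dim, learning_rate))
assert len(full) == 9

# full_grid = False: pair values by list index -> 3 configurations.
# This is why all lists must have the same length in pairwise mode.
paired = list(zip(hidden_dim, learning_rate))
assert paired == [(10, 1e-3), (20, 1e-4), (50, 1e-2)]
```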
Defining Parameters
Use the SearchSpace class within the config dictionary to define lists of values.
from src.hyperparameter_search.search_space import SearchSpace
config = {
    # ...
    "latent_step": {
        "hyperparameters": {
            # Defines a grid over 3 values.
            # "hidden_dim" is the label used in the result folder name.
            "hidden_dim": SearchSpace("hidden_dim", [10, 20, 50]),
        }
    }
}
Optuna Search
Optuna performs automated Bayesian optimization (using the TPE sampler) to find good hyperparameters efficiently. It does so by monitoring the model's performance on a predefined metric: any metric calculated on the test set during training (see evaluation during training for details) or the test loss. For configuration details, refer to the early stopping config, as both share the same configuration.
Configuration Options
- use_optuna = True: Activates Optuna mode.
- optuna_num_runs: The total budget of trials to perform.
- Parallelism: Controlled by n_processes. Trials run in parallel, and results are reported back to a shared study.
Defining Search Spaces
Use the OptunaSearchSpace static methods (int, float, categorical, list) to define ranges.
Basic Types
from src.hyperparameter_search.search_space import OptunaSearchSpace
"hyperparameters": {
    # Integer range [10, 100], step 10
    "hidden_dim": OptunaSearchSpace.int("hidden_dim", 10, 100, step=10),
    # Float range (log scale)
    "learning_rate": OptunaSearchSpace.float("lr", 1e-4, 1e-1, log=True),
    # Categorical choice
    "activation": OptunaSearchSpace.categorical("act", ["relu", "tanh"]),
}
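Conceptually, these three space types sample values much like Optuna's trial.suggest_int, trial.suggest_float(..., log=True), and trial.suggest_categorical. The standalone sketch below (illustrative helper names, no Optuna dependency) shows the sampling semantics:

```python
import math
import random

def suggest_int(rng, low, high, step=1):
    # Pick one value from the grid low, low+step, ..., high.
    n_steps = (high - low) // step
    return low + rng.randint(0, n_steps) * step

def suggest_float_log(rng, low, high):
    # Sample uniformly in log space, as log=True does.
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def suggest_categorical(rng, choices):
    # Pick one of the discrete choices.
    return rng.choice(choices)

rng = random.Random(0)
assert suggest_int(rng, 10, 100, step=10) in range(10, 101, 10)
assert 1e-4 <= suggest_float_log(rng, 1e-4, 1e-1) <= 1e-1
assert suggest_categorical(rng, ["relu", "tanh"]) in ("relu", "tanh")
```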
Lists
Search over a list of values (e.g., specific layer sizes).
"layer_sizes": OptunaSearchSpace.list(
    "layers",
    # Fixed-length list of 2 entries
    num_entries=OptunaSearchSpace.int("len", 2, 2),
    # Each entry is an int between 10 and 50
    entry=OptunaSearchSpace.int("size", 10, 50),
)
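A list space samples a length first and then each entry independently, mirroring the num_entries / entry structure above. A minimal sketch of that two-stage sampling (illustrative helper, not the framework's internals):

```python
import random

def suggest_list(rng, min_len, max_len, entry_low, entry_high):
    # Stage 1: sample the list length (fixed when min_len == max_len).
    length = rng.randint(min_len, max_len)
    # Stage 2: sample each entry from its own integer range.
    return [rng.randint(entry_low, entry_high) for _ in range(length)]

rng = random.Random(42)
layers = suggest_list(rng, 2, 2, 10, 50)  # fixed length of 2
assert len(layers) == 2
assert all(10 <= v <= 50 for v in layers)
```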
Dependency Injection
Some hyperparameters can depend on others (e.g., ensuring dim_forcing <= latent_dim). Dependencies are specified using dot notation (path to value) or a dictionary with a transformation.
"dim_forcing": OptunaSearchSpace.int(
    "forcing",
    min=1,
    # Max value constrained by another config value
    max="latent_dim",
    step=1,
)

# With transformation (e.g., max is half of hidden_dim)
"other_param": OptunaSearchSpace.int(
    "label",
    min=1,
    max={
        "path": "latent_step.hyperparameters.hidden_dim",
        "transformation": lambda x: x // 2,
    },
)
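Both dependency forms boil down to walking a dot-separated path through the nested config and optionally applying a transformation to the value found there. A small sketch of that resolution (illustrative helper, not the framework's actual implementation):

```python
def resolve_dependency(config, spec):
    # Accept either a bare dot-path string or a {"path", "transformation"} dict.
    if isinstance(spec, str):
        spec = {"path": spec, "transformation": lambda x: x}
    value = config
    # Walk the nested dict key by key.
    for key in spec["path"].split("."):
        value = value[key]
    return spec["transformation"](value)

config = {"latent_step": {"hyperparameters": {"hidden_dim": 20}}}
assert resolve_dependency(config, "latent_step.hyperparameters.hidden_dim") == 20
assert resolve_dependency(
    config,
    {"path": "latent_step.hyperparameters.hidden_dim",
     "transformation": lambda x: x // 2},
) == 10
```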
Optuna Dashboard
Optuna provides a built-in tool to monitor the runs and the optimization process. This can be accessed by running the following command:
optuna-dashboard sqlite:///<log_directory>/optuna_study.db
where <log_directory> is results_dir/experiment_name.