In the configuration management library Hydra, it is possible to only partially instantiate classes defined in configuration using the _partial_ keyword. The library explains that this results in a functools.partial. I wonder how this interacts with seeding. E.g. with

  • PyTorch's torch.manual_seed()
  • Lightning's seed_everything
  • etc.

My reasoning is that if I use the _partial_ keyword while specifying all parameters for __init__, then I essentially obtain a factory that can be called after setting the seed, in order to do multiple runs. But this assumes that _partial_ does not already bake the seed in. To my understanding that should not be the case. Is that correct?
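
For illustration, here is a minimal sketch of what I have in mind, using plain functools.partial and the standard random module (class and parameter names are made up):

import random
from functools import partial


class Model:
    def __init__(self, dim):
        # Draws from the global RNG at call time, not when the partial is created
        self.weight = [random.random() for _ in range(dim)]


factory = partial(Model, dim=3)  # no RNG state is captured here

random.seed(0)
run_a = factory().weight
random.seed(0)
run_b = factory().weight
assert run_a == run_b  # identical, because the seed was set before each call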


3 Answers


Hydra does not run any third-party code until you call hydra.utils.instantiate. So you can set your seeds before each use of instantiate or, in the case of a partial, before each call to the partial.

Here is a complete toy example, based on Hydra's documentation overview, which creates a partial used to instantiate an optimizer, plus a model that takes the callable optim_partial as an argument.

# config.yaml
model:
  _target_: "__main__.MyModel"
  optim_partial:
    _partial_: true
    _target_: __main__.MyOptimizer
    algo: SGD
  lr: 0.01

# Python script, in the same directory as config.yaml
from functools import partial
from typing import Callable
import random
from pprint import pprint

import hydra
from omegaconf import DictConfig, OmegaConf


class MyModel:
    def __init__(self, lr, optim_partial: Callable[..., "MyOptimizer"]):
        self.lr = lr
        self.optim_partial = optim_partial
        # Each call to the stored partial builds a fresh optimizer
        self.optim1 = self.optim_partial()
        self.optim2 = self.optim_partial()


class MyOptimizer:
    def __init__(self, algo):
        # Draws from the global RNG at construction time, so the printed value
        # shows which seed was active when the partial was called
        print(algo, random.randint(0, 10000))


@hydra.main(config_name="config", config_path="./", version_base=None)
def main(cfg: DictConfig):
    # Check out the config
    pprint(OmegaConf.to_container(cfg, resolve=False))
    print("type of cfg.model.optim_partial", type(cfg.model.optim_partial))
    
    # Create the functools.partial
    optim_partial: partial[MyOptimizer] = hydra.utils.instantiate(cfg.model.optim_partial)
    # Set the seed before you call the partial
    random.seed(42)
    optimizer1: MyOptimizer = optim_partial()
    optimizer2: MyOptimizer = optim_partial()
    random.seed(42)
    optimizer1b: MyOptimizer = optim_partial()
    optimizer2b: MyOptimizer = optim_partial()

    # model is not a partial; set the seed before instantiating it
    random.seed(42)
    model: MyModel = hydra.utils.instantiate(cfg.model)


if __name__ == "__main__":
    main()
# Output
{'model': {'_target_': '__main__.MyModel',
           'lr': 0.01,
           'optim_partial': {'_partial_': True,
                             '_target_': '__main__.MyOptimizer',
                             'algo': 'SGD'}}}
type of cfg.model.optim_partial <class 'omegaconf.dictconfig.DictConfig'>
SGD 1824
SGD 409
SGD 1824
SGD 409
SGD 1824
SGD 409

Generally speaking, Hydra is independent of PyTorch and does not directly interact with it (except via plugins). _partial_ has nothing at all to do with PyTorch or seeding.

At a glance, what you are suggesting should work, but it's best to verify it yourself.
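
For example, a quick check along these lines would confirm it. This is only a sketch: it assumes a config entry model with _partial_: true whose target draws random numbers in its __init__ (here torch.nn.Linear, whose weights are randomly initialized).

# config.yaml assumed for this sketch:
# model:
#   _target_: torch.nn.Linear
#   _partial_: true
#   in_features: 4
#   out_features: 2
import torch
import hydra
from omegaconf import DictConfig


@hydra.main(config_name="config", config_path="./", version_base=None)
def main(cfg: DictConfig) -> None:
    # instantiate returns a functools.partial; no constructor has run yet
    model_partial = hydra.utils.instantiate(cfg.model)

    torch.manual_seed(0)
    first = model_partial()
    torch.manual_seed(0)
    second = model_partial()

    # If the partial had baked the RNG state in, the two weight tensors would differ
    print(torch.equal(first.weight, second.weight))  # expected: True


if __name__ == "__main__":
    main()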

Your understanding is correct. Using _partial_ in Hydra simply returns a functools.partial object and does not immediately execute the class constructor or otherwise "bake in" the seed. As a result, the seed is not fixed at the time the partial is created. You can safely call torch.manual_seed(...) or any other seed-setting function just before you invoke the partial object multiple times for reproducible runs.

A common pattern is something like:

import torch
from hydra import compose, initialize
from hydra.utils import instantiate

# For example, your Hydra config defines a partial for your model (_partial_: true)
with initialize(config_path="conf", version_base=None):
    cfg = compose(config_name="config")

# instantiate returns a functools.partial(MyModel, ...) because of _partial_: true
model_partial = instantiate(cfg.model)

# Then in your experiment loop:
for seed in [123, 456, 789]:
    torch.manual_seed(seed)
    model = model_partial()  # actually instantiate the model under the given seed
    # train or evaluate your model
