Join the Discussion 💬
Share ideas, ask questions, and chat with us over at hydra-zen’s discussion board.
Tip: 🎓 Using hydra-zen for your research project? Cite us!
Welcome to hydra-zen's documentation!
hydra-zen is a Python library that makes the Hydra framework simpler and more elegant to use. Use hydra-zen to design your project to be:
- Configurable: Change deeply-nested parameters and swap out entire pieces of your program, all from the command line.
- Repeatable: Each run of your code is self-documenting; the full configuration of your software is saved alongside your results.
- Scalable: Launch multiple runs of your software, whether on your local machine or across multiple nodes of a cluster.
hydra-zen eliminates all hand-written yaml configs from your Hydra project. It does so by providing functions that dynamically and automatically generate dataclass-based configs for your code. It also provides a custom config-store API and task-function wrapper, which help to eliminate most of the Hydra-specific boilerplate from your project.
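To make this concrete, here is a minimal sketch of that auto-generation (the toy function `scale` is hypothetical; `builds`, `instantiate`, and `to_yaml` are hydra-zen functions):

```python
from hydra_zen import builds, instantiate, to_yaml

def scale(x: float, factor: float = 2.0) -> float:
    return x * factor

# builds() inspects scale's signature and returns a dataclass-based
# config type -- no hand-written yaml required.
Conf = builds(scale, populate_full_signature=True)

print(to_yaml(Conf))             # _target_: __main__.scale, x: ???, factor: 2.0
print(instantiate(Conf, x=3.0))  # calls scale(x=3.0, factor=2.0) -> 6.0
```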
hydra-zen is fully compatible with Hydra and is appropriate for use in both rapid prototypes and production-grade code. It is also great for making your data science and machine learning research reproducible. hydra-zen provides specialized support for using NumPy, JAX, PyTorch, and Lightning (a.k.a. PyTorch-Lightning) in your Hydra application.
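For example, here is a hedged sketch of configuring a PyTorch optimizer with a partial config, a common pattern in such applications (assumes torch is installed):

```python
import torch
from hydra_zen import builds, instantiate

# Adam needs `params` only at runtime, so we configure everything
# else now and supply the model's parameters later.
AdamConf = builds(torch.optim.Adam, lr=1e-3, zen_partial=True)

partial_adam = instantiate(AdamConf)  # functools.partial(Adam, lr=0.001)
opt = partial_adam(torch.nn.Linear(2, 2).parameters())
```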
hydra-zen at a glance
Suppose you have the following library code.
```python
# Contents of baby_torch.py
# Note: no Hydra/hydra-zen specific code here!

def relu(x): ...

def sigmoid(x): ...


class Model:
    def __init__(self, activation, nlayers, logits=False) -> None:
        self.summary = f"Model:\n-{activation=}\n-{nlayers=}\n-{logits=}"


class DataLoader:
    def __init__(self, batch_size=10, shuffle_batch=True):
        self.summary = f"DataLoader:\n-{batch_size=}\n-{shuffle_batch=}\n"


def train_fn(model: Model, dataloader: DataLoader, num_epochs: int = -1):
    print(f"Training with {num_epochs=}\n")
    print(model.summary, end="\n\n")
    print(dataloader.summary)
```
We want to be able to configure and run train_fn from the command line, while being able to modify all aspects of its inputs, including parameters nested in Model and DataLoader. hydra-zen makes short work of this: we can create and store custom configurations for all parts of this library code and generate a CLI that reflects the resulting hierarchical config.
```python
# Contents of train.py
from hydra_zen import just, store

from baby_torch import DataLoader, Model, relu, sigmoid, train_fn

# Automatically generate and store configs for `Model`
model_store = store(group="model")
model_store(Model, name="generic")
model_store(Model, nlayers=100, name="big")
model_store(Model, nlayers=2, name="tiny")

# Configure that relu/sigmoid should "just" be imported,
# not initialized, during a run.
activation_store = store(group="model/activation")
activation_store(just(relu), name="relu")
activation_store(just(sigmoid), name="sigmoid")

data_store = store(group="dataloader")
data_store(DataLoader, name="train")
data_store(DataLoader, shuffle_batch=False, name="test")

# Configure the top-level function that will be executed from
# the CLI; provide the default model & dataloader configs to use.
store(
    train_fn,
    hydra_defaults=[
        "_self_",
        # default config:
        # - 'big' model using relu activation
        # - train-mode dataloader
        {"model": "big"},
        {"model/activation": "relu"},
        {"dataloader": "train"},
    ],
)

if __name__ == "__main__":
    from hydra_zen import zen

    store.add_to_hydra_store()

    # Generate the CLI for train_fn.
    zen(train_fn).hydra_main(
        config_name="train_fn",
        config_path=None,
        version_base="1.3",
    )
    # Hydra will accept configuration options from the CLI and
    # merge them with the stored configs.
    #
    # hydra-zen then instantiates these configs -- creating the
    # Model & DataLoader instances -- and passes them to train_fn,
    # running the training code.
    #
    # Hydra records the exact, reproducible config for each run,
    # and saves the results in an auto-generated, configurable
    # output dir.
```
Now we can configure and run train_fn from the CLI exposed by train.py:
```console
$ python train.py
Training with num_epochs=-1

Model:
-activation=<function relu at 0x0000016B9C10F280>
-nlayers=100
-logits=False

DataLoader:
-batch_size=10
-shuffle_batch=True

$ python train.py num_epochs=2 model/activation=sigmoid
Training with num_epochs=2

Model:
-activation=<function sigmoid at 0x00000185640D4280>
-nlayers=100
-logits=False

DataLoader:
-batch_size=10
-shuffle_batch=True

$ python train.py model=tiny model.logits=True dataloader.batch_size=22
Training with num_epochs=-1

Model:
-activation=<function relu at 0x0000016B9C10F280>
-nlayers=2
-logits=True

DataLoader:
-batch_size=22
-shuffle_batch=True
```
Each run’s reproducible configuration will be saved as a yaml file; by default Hydra places these in a time-stamped directory.
```console
$ less outputs/2023-03-11/12-13-14/.hydra/config.yaml
_target_: baby_torch.train_fn
model:
  _target_: baby_torch.Model
  activation:
    path: baby_torch.sigmoid
    _target_: hydra_zen.funcs.get_obj
  nlayers: 100
  logits: false
dataloader:
  _target_: baby_torch.DataLoader
  batch_size: 10
  shuffle_batch: true
num_epochs: 2
```
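This saved config is also all you need to reproduce the run programmatically. Here is a minimal sketch, assuming the yaml file above exists and that baby_torch is importable; load_from_yaml and instantiate are hydra-zen functions:

```python
from hydra_zen import instantiate, load_from_yaml

# Load the exact config that Hydra recorded for the run.
cfg = load_from_yaml("outputs/2023-03-11/12-13-14/.hydra/config.yaml")

# Re-creates the Model & DataLoader instances and calls train_fn
# just as the original `python train.py num_epochs=2 ...` run did.
instantiate(cfg)
```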
hydra-zen works with arbitrary Python code bases; this example happens to mimic a machine learning application, but hydra-zen is ultimately application-agnostic.
You can read more about hydra-zen’s config store and its auto-config capabilities here.
Attention, Hydra users:
If you are already using Hydra, let’s cut to the chase: the most important benefit of using hydra-zen is that it automatically and dynamically generates structured configs for you.
This means that it is much easier and safer to write and maintain the configs for your Hydra applications:
- Write all of your configs in Python. No more yaml files!
- Write less, stop repeating yourself, and get more out of your configs.
- Get automatic type-safety via builds()'s signature inspection.
- Validate your configs before launching your application.
- Leverage auto-config support for additional types, like functools.partial, that are not natively supported by Hydra.
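As a minimal sketch of that signature inspection (the toy function `f` is hypothetical, and the exact error message may differ):

```python
from hydra_zen import builds, instantiate

def f(x: int, y: int = 2):
    return x + y

Conf = builds(f, x=1, populate_full_signature=True)
assert instantiate(Conf) == 3  # calls f(x=1, y=2)

# builds() inspects f's signature, so invalid parameters are caught
# at config-creation time -- before a run is ever launched.
try:
    builds(f, z=10)
except TypeError as e:
    print(e)  # reports that f has no parameter named 'z'
```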
hydra-zen also provides Hydra users with powerful, novel functionality. With it, we can:
- Add enhanced runtime type-checking for our Hydra application, via pydantic, beartype, and other third-party libraries (sketched below).
- Design configs with specialized behaviors, like configs with meta-fields.
- Leverage a powerful functionality-injection framework in our Hydra applications.
- Run static type-checkers on our config-creation code to catch incompatibilities with Hydra.
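As a hedged sketch of the first point, using hydra-zen's beartype integration (assumes beartype is installed; the toy function `run_mode` is hypothetical):

```python
from typing import Literal

from hydra_zen import builds, instantiate
from hydra_zen.third_party.beartype import validates_with_beartype

def run_mode(mode: Literal["train", "eval"]):
    return mode

# Hydra itself cannot validate Literal types; zen_wrappers injects
# beartype's runtime checker around run_mode during instantiation.
Conf = builds(
    run_mode, populate_full_signature=True, zen_wrappers=validates_with_beartype
)

instantiate(Conf, mode="train")  # OK
# instantiate(Conf, mode="test")  # beartype raises: not 'train' or 'eval'
```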
Installation
hydra-zen is lightweight: its only dependencies are hydra-core and typing-extensions. To install it, run:
```console
$ pip install hydra-zen
```
If instead you want to try out the features in the upcoming version, you can install the latest pre-release of hydra-zen with:
```console
$ pip install --pre hydra-zen
```
Learning About hydra-zen
Our docs are divided into four sections: Tutorials, How-Tos, Explanations, and Reference.
If you want to get a bird’s-eye view of what hydra-zen is all about, or if you are completely new to Hydra, check out our Tutorials. For folks who are savvy Hydra users, our How-Tos and Reference materials can help acquaint you with the unique capabilities that are offered by hydra-zen. Finally, Explanations provide readers with taxonomies, design principles, recommendations, and other articles that will enrich their understanding of hydra-zen and Hydra.
Note that each page in our reference documentation features extensive examples and explanations of how the various components of hydra-zen work. Check it out!
- Tutorials
- Create and Launch a Basic Application with Hydra
- Add a Command Line Interface to Our Application
- Design a Hierarchical Interface for an Application
- Provide Swappable Configuration Groups
- Inject Novel Functionality via the Application’s Configurable Interface
- Configure and Run scikit-learn’s Classifier Comparison Example
- Run Boilerplate-Free ML Experiments with PyTorch Lightning & hydra-zen
- How-To Guides
- Explanation
- Reference
- Changelog
- 0.13.1rc1 - 2024-07-13
- 0.13.0 - 2024-04-29
- 0.12.1 - 2024-01-21
- 0.12.0 - 2023-12-07
- 0.11.0 - 2023-07-13
- Documentation - 2023-03-11
- 0.10.2 - 2023-07-04
- 0.10.1 - 2023-05-23
- 0.10.0 - 2023-03-05
- Documentation - 2023-01-22
- 0.9.1 - 2023-01-13
- 0.9.0 - 2022-12-30
- 0.8.0 - 2022-09-13
- 0.7.1 - 2022-06-22
- 0.7.0 - 2022-05-10
- 0.6.0 - 2022-03-09
- 0.5.0 - 2022-01-27
- 0.4.1 - 2021-12-06
- 0.4.0 - 2021-12-05
- 0.3.1 - 2021-11-13
- 0.3.0 - 2021-10-27
- 0.2.0 - 2021-08-12
- 0.1.0 - 2021-08-04
- Footnotes