Getting Started#
As an example of how to use Soleil, we will build a system to train a basic classifier. The approach presented is a Soleil port of the CIFAR classification example on the PyTorch website.
Note
An installable Python package with the code for this example can be found in the source sub-directory <soleil code root>/soleil_examples. You can install these examples as follows:
cd <soleil code root>/soleil_examples
pip install .
For convenience, ./solconf sub-directories are placed inside each example module directory.
Model and train/eval routines#
The CIFAR classifier example consists of three main components, 1) the model, 2) the training routine and 3) the evaluation routine. We show the function and initializer signatures for these components below – the details of the implementation beyond the parameter names are not necessary when building solconf modules.
Note that, as is common, the train and eval routines share some parameters.
Note
Soleil assumes that modules with components such as these are installed (or at least in the Python path) and accessible with standard Python import statements.
class Net(nn.Module):
    def __init__(self): ...

def train(net, trainloader, optimizer, criterion, path): ...

def eval(testloader, net, path): ...
The solconf package#
A solconf package is a directory hierarchy containing *.solconf files that is analogous to a Python package containing modules and nested sub-packages. When loaded, the package’s *.solconf files will be instantiated as SolConfModule objects. Unlike a Python package, solconf packages do not need to be installed – their root configurations can be loaded by file path using load_solconf().
Note
Soleil package root configurations are *.solconf files within the package that are intended to be loaded by the user using load_solconf(). All *.solconf files can be root configurations if they resolve (i.e., if overrides for all req() members are supplied when loading).
Since our aim is to create a training system, we will create a root configuration called train.solconf inside our solconf package folder:
train.solconf#
# soleil_examples/cifar/solconf/train.solconf

from soleil.solconf import *
from torch import nn

# The callable that resolves this module
type: as_type = "soleil_examples.cifar.train:train"


# The parameters of the as_type member, declared below, are
# `net`, `optimizer`, `criterion`, `trainloader` and `path`.


class net:
    type: as_type = "soleil_examples.cifar.conv_model:Net"


class optimizer:
    type: as_type = "torch.optim:SGD"
    params = resolved(net).parameters()
    lr = 0.001
    momentum = 0.9


criterion = nn.CrossEntropyLoss()

# `data` is hidden -- it is not passed to the as_type callable
data: hidden = load(".data.default")

trainloader = data.trainloader

path = "/tmp/soleil_cifar_example"
The as_type member#
The first member of this module specifies that the module describes a call to the training routine:
type: as_type = "soleil_examples.cifar.train:train"
The as_type annotation on the type member indicates to Soleil that 1) the member will contain a callable that will resolve the module and that 2) all other non-hidden module members will be gathered, resolved and passed to this callable as keyword arguments. If, as in this case, the member’s value is a string with format <module>:<entity>, the as_type modifier will further retrieve the actual callable and assign it to the as_type member. Note that this is only a convenience, and one could also assign the actual callable directly to the as_type member:
from soleil_examples.cifar.train import train
type: as_type = train
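To make the two roles of as_type concrete, the following is a minimal sketch in plain Python. It is not Soleil's implementation: the helper names locate and call_description are hypothetical, and a plain dictionary stands in for a solconf module. It only illustrates looking up a <module>:<entity> string and passing the remaining non-hidden members as keyword arguments.

```python
import importlib


def locate(spec):
    """Turn a '<module>:<entity>' string into the object it names."""
    module_name, _, entity = spec.partition(":")
    obj = importlib.import_module(module_name)
    # Walk dotted entity names such as "Outer.Inner".
    for part in entity.split("."):
        obj = getattr(obj, part)
    return obj


def call_description(members, hidden=()):
    """Toy stand-in for resolution: call members['type'] with every other
    non-hidden member passed as a keyword argument."""
    target = members["type"]
    if isinstance(target, str):
        target = locate(target)
    kwargs = {k: v for k, v in members.items() if k != "type" and k not in hidden}
    return target(**kwargs)


# Hypothetical description of the call fractions.Fraction(numerator=3, denominator=4).
value = call_description(
    {"type": "fractions:Fraction", "numerator": 3, "denominator": 4, "note": "not passed"},
    hidden={"note"},
)
print(value)  # 3/4
```

The hidden member is dropped before the call, which mirrors how members annotated with hidden are kept out of the keyword arguments.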
Note
Annotations such as as_type and hidden are called modifiers in Soleil parlance and change the behavior of the member they annotate. See the Modifier syntax section for more details on their usage or the cheatsheet’s modifiers section for a full list of available modifiers.
The next two members (net and optimizer) also include a nested as_type-annotated member. The first member describes an instance of the soleil_examples.cifar.conv_model:Net model shown above.
Description attributes vs. instance attributes#
Continuing our analysis of train.solconf, the second member – optimizer – describes an instance of PyTorch’s torch.optim:SGD optimizer. This description poses a problem since instantiating the optimizer requires a call to net.parameters() to let the optimizer know what parameters we will optimize. But at this point we only have net’s description and not the actual instance, so we cannot call net.parameters(). We hence create a special object resolved(net) that will lazily evaluate all nested attributes, subscripts and calls to net, resolving these until the entire solconf module is resolved:
params = resolved(net).parameters()
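The idea of deferring attribute accesses, subscripts and calls can be sketched in plain Python as follows. This is an illustrative toy, not Soleil's resolved(): the Lazy class, the evaluate function and the stand-in Net class are all hypothetical names introduced for this example.

```python
class Lazy:
    """Records attribute accesses, subscripts and calls for later replay."""

    def __init__(self, source, ops=()):
        self._source = source  # zero-argument callable producing the real object
        self._ops = ops        # recorded operations, applied in order

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        return Lazy(self._source, self._ops + (("attr", name),))

    def __getitem__(self, key):
        return Lazy(self._source, self._ops + (("item", key),))

    def __call__(self, *args, **kwargs):
        return Lazy(self._source, self._ops + (("call", args, kwargs),))


def evaluate(lazy):
    """Build the real object, then replay the recorded operations on it."""
    obj = lazy._source()
    for op in lazy._ops:
        if op[0] == "attr":
            obj = getattr(obj, op[1])
        elif op[0] == "item":
            obj = obj[op[1]]
        else:
            obj = obj(*op[1], **op[2])
    return obj


class Net:  # stand-in for the real model class
    def parameters(self):
        return ["weight", "bias"]


built = []


def build_net():
    built.append(Net())
    return built[-1]


params = Lazy(build_net).parameters()  # only recorded; no Net exists yet
print(len(built))        # 0
print(evaluate(params))  # ['weight', 'bias']
```

Nothing is instantiated when the expression is written; the model is only built when the recorded chain is finally evaluated.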
If we did not need to enable configuration of net, we could have instead assigned the instance of net directly, obviating the need for the lazy evaluation trick via resolved described above.
Member criterion, for example, is initialized directly to an instance of CrossEntropyLoss:
criterion = nn.CrossEntropyLoss()
For completeness, a possible configurable description of criterion could instead be:
class criterion:
    type: as_type = "torch.nn:CrossEntropyLoss"
    label_smoothing = 0.0
    ignore_index = -100
Loading sub-modules#
Since a description of the data used to train and test the model is complex and a concern of its own, we create it in a separate solconf module that we load with the load() function:
data: hidden = load(".data.default")
The path provided to the load() function follows rules similar to module paths provided to Python import statements. The main difference is that absolute paths will refer to top-level sub-modules within the same package. In this case, since the data sub-package and the root config train.solconf are both at the root of the package, then load(".data.default") and load("data.default") would both load the same sub-module.
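The path rules can be pictured with the toy function below. It is only a sketch under stated assumptions: solconf_file is a hypothetical helper, and the treatment of leading dots (one dot for the directory of the loading file, each extra dot climbing one level) is borrowed from Python's relative imports rather than confirmed for Soleil.

```python
from pathlib import PurePosixPath


def solconf_file(package_root, loading_file, path):
    """Map a dotted load()-style path to a *.solconf file path (toy model)."""
    stripped = path.lstrip(".")
    dots = len(path) - len(stripped)
    if dots == 0:
        # Absolute path: start at the root of the same solconf package.
        base = PurePosixPath(package_root)
    else:
        # Relative path: start at the directory of the loading file.
        base = PurePosixPath(loading_file).parent
        for _ in range(dots - 1):
            base = base.parent  # assumed: each extra dot climbs one directory
    return base.joinpath(*stripped.split(".")).with_suffix(".solconf")


root = "solconf"
train = "solconf/train.solconf"

# Both spellings name the same file when the loader sits at the package root.
print(solconf_file(root, train, ".data.default"))  # solconf/data/default.solconf
print(solconf_file(root, train, "data.default"))   # solconf/data/default.solconf
```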
Inheriting descriptions#
The data description solconf module "data.default" contains the following code:
# soleil_examples/cifar/solconf/data/default.solconf

from soleil.solconf import *
import torchvision.transforms as transforms


transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
)


## Declaring unresolved templates
@hidden
class dataset:
    type: as_type = "torchvision.datasets:CIFAR10"
    root = "/tmp/soleil_examples/cifar"
    train = req()
    download = True
    transform = transform


@hidden
class dataloader:
    type: as_type = "torch.utils.data:DataLoader"
    dataset = req()
    shuffle = req()
    batch_size = 4
    num_workers = 2


## Declaring resolvable datasets
class trainset(dataset):
    train = True


class testset(dataset):
    train = False


## Declaring resolvable dataloaders
class trainloader(dataloader):
    dataset = trainset
    shuffle = True


class testloader(dataloader):
    # The test loader reads from the test split
    dataset = testset
    shuffle = False
The module contains two template descriptions, dataset and dataloader, from which the training and testing dataset and dataloader descriptions are derived. These template descriptions cannot be resolved on their own because they contain unspecified required members:
@hidden
class dataset:
    ...
    train = req()
    ...

@hidden
class dataloader:
    ...
    dataset = req()
    shuffle = req()
    ...
The training and testing datasets and dataloaders inherit all the non-required members and override the required members, making them resolvable:
class trainset(dataset):
    train = True

class testset(dataset):
    train = False

class trainloader(dataloader):
    dataset = trainset
    shuffle = True

class testloader(dataloader):
    dataset = testset
    shuffle = False
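The template pattern relies on ordinary class inheritance, which the following plain-Python sketch imitates. It does not use Soleil: the Req sentinel and the missing_members helper are hypothetical stand-ins for req() and for Soleil's own check of whether a description can be resolved.

```python
class Req:
    """Sentinel marking a member that a derived description must supply."""


def missing_members(description):
    """Names of members still set to a Req sentinel, following inheritance."""
    names = set()
    for klass in description.__mro__:
        names.update(k for k in vars(klass) if not k.startswith("__"))
    return sorted(n for n in names if isinstance(getattr(description, n), Req))


class dataset:
    root = "/tmp/soleil_examples/cifar"
    train = Req()
    download = True


class trainset(dataset):
    train = True


class testset(dataset):
    train = False


print(missing_members(dataset))   # ['train']
print(missing_members(trainset))  # []
print(testset.root)               # /tmp/soleil_examples/cifar
```

The derived classes keep every inherited member and only replace the required one, so they no longer report anything missing.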
Differentiating instantiations#
A given Soleil resolvable (e.g., trainset above) always resolves to the same instance of the description:
from soleil import resolve
obj1 = resolve(trainset)
obj2 = resolve(trainset)
assert obj1 is obj2
This makes it possible to pass the same object to multiple as_type members.
When different instances are required, one needs to derive a description for each instance:
obj1 = resolve(trainset)
class trainset2(trainset): pass
obj2 = resolve(trainset2)
assert obj1 is not obj2
This can also be done with the convenience method derive:
trainset2 = derive(trainset)
obj1 = resolve(trainset)
obj2 = resolve(trainset2)
assert obj1 is not obj2
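The behavior described in this section can be imitated with a cache keyed by the description class. The sketch below is a toy model and not Soleil's resolve() or derive(): the functions here are hypothetical re-implementations, and a plain dict stands in for the dataset class.

```python
_cache = {}


def members_of(description):
    """Collect non-dunder members, letting derived classes override bases."""
    merged = {}
    for klass in reversed(description.__mro__):
        merged.update({k: v for k, v in vars(klass).items() if not k.startswith("__")})
    return merged


def resolve(description):
    """Build the described object once per description class and reuse it."""
    if description not in _cache:
        members = members_of(description)
        factory = members.pop("type")
        _cache[description] = factory(**members)
    return _cache[description]


def derive(description):
    """Create an empty subclass, i.e. a new description with the same members."""
    return type(description.__name__, (description,), {})


class trainset:
    type = dict  # stand-in for a real dataset class
    train = True


trainset2 = derive(trainset)

print(resolve(trainset) is resolve(trainset))   # True
print(resolve(trainset) is resolve(trainset2))  # False
print(resolve(trainset2))                       # {'train': True}
```

Because the cache key is the class itself, a derived description yields a separate but identically configured instance.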
The submodule and choices overridables#
One common situation in machine learning experiments is the need to swap out one component for a new variant. An example is the model described in train.solconf:
class net:
    type: as_type = "soleil_examples.cifar.conv_model:Net"
Doing so without modifying existing solconf files is useful, and submodule() offers a way to do it: one first creates a new *.solconf file for each variant and places all such variants in the same sub-package.
For example, we can place two model descriptions models/conv.solconf and models/fc.solconf inside sub-package models/. Using the special load variant submodule() in
net = submodule('.models', 'conv')
tells Soleil to load the model description in Soleil module .models.conv if no override is provided, and to otherwise use the override value as the module name. As an example, one could load the fully connected variant of the model using
$ solex train.solc net='"fc"'
Another useful function similarly providing special overridable abilities is choices() – it works like submodule() but lets you explicitly provide the value for each string key as opposed to requiring these to be names of sub-modules in a specific sub-package.
As an example, we can rewrite the submodule()-based model selection mechanism above with choices() as follows:
net = choices(
    {'conv': load('.models.conv'), 'fc': load('.models.fc')},
    'conv'
)
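The selection logic shared by the two overridables can be sketched in plain Python as follows. These are hypothetical toy functions, not Soleil's choices() or submodule(): here the override is passed as an explicit argument, and strings stand in for loaded model descriptions.

```python
def choices(options, default, override=None):
    """Return options[override] when an override is given, else options[default]."""
    key = default if override is None else override
    if key not in options:
        raise ValueError(f"unknown choice {key!r}; expected one of {sorted(options)}")
    return options[key]


def submodule_path(package, default, override=None):
    """Build the dotted module path that a submodule()-style lookup would load."""
    return f"{package}.{default if override is None else override}"


models = {"conv": "description of the conv model", "fc": "description of the fc model"}

print(choices(models, "conv"))                  # description of the conv model
print(choices(models, "conv", override="fc"))   # description of the fc model
print(submodule_path(".models", "conv", "fc"))  # .models.fc
```

The difference between the two is where the candidate values come from: a mapping supplied inline, or the modules found in a sub-package.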
train2.solconf#
In order to run evaluations on the trained model, we need to build an eval.solconf configuration. Since the eval() and train() functions share common parameters, it makes sense to inherit some of these parameters from the train configuration when building the eval configuration. To support this, we will modify our train.solconf configuration, wrapping all the parameters in a class that we can later inherit from (relative to train.solconf, the changes are the @promoted class wrapper and the global _params variable used by optimizer):
# soleil_examples/cifar/solconf/train2.solconf

from soleil.solconf import *
from torch import nn


@promoted
class _:
    # The callable that resolves this module
    type: as_type = "soleil_examples.cifar.train:train"

    # The parameters of the as_type member, declared below, are
    # `net`, `optimizer`, `criterion`, `trainloader` and `path`.

    class net:
        type: as_type = "soleil_examples.cifar.conv_model:Net"

    global _params
    _params = resolved(net).parameters()

    class optimizer:
        type: as_type = "torch.optim:SGD"
        params = _params
        lr = 0.001
        momentum = 0.9

    criterion = nn.CrossEntropyLoss()

    # `data` is hidden -- it is not passed to the as_type callable
    data: hidden = load(".data.default")

    trainloader = data.trainloader

    path = "/tmp/soleil_cifar_example"
Promoted module classes#
The @promoted decorator applied to this class (see the @promoted and class _: lines at the top of train2.solconf) is a syntactic convenience that ensures that, when loading train2.solconf using, e.g.,
load_solconf("./train2.solconf")
the returned value continues to be whatever object was described in the module – in this case the output of function train(...) – as opposed to the dictionary {'_':train(...)}.
Similarly, when loading a submodule within a solconf file,
train_class = load("./train2.solconf")
will return the promoted class “_” as opposed to the solconf module of type SolConfModule.
In general, promotion will make the syntax for CLI overrides more natural and the output of sub-modules loaded within solconf files more intuitive.
Note
It is good practice to always wrap the members of a module in a promoted class. Doing so makes it possible to derive that module to create new root configurations and improves override syntax. Note that all module members outside the class are in effect hidden.
Going back to our example in train2.solconf, wrapping the contents of the module in a class created the following problem: the local context of the nested optimizer class can no longer see the net variable defined in the local context of the containing class _. We address that problem by defining a global variable _params in the parent local context, where net is visible, and using that in the nested class (see the global _params, _params = resolved(net).parameters() and params = _params lines in train2.solconf). The variable is implicitly hidden due to its underscore prefix, and also because it is part of the globals and not the class’s locals.
eval.solconf#
We can now use train2.solconf as a base to build an eval configuration:
from soleil.solconf import *


@promoted
class _(_train := spawn(".train2")):
    type: as_type = "soleil_examples.cifar.eval:eval"

    optimizer: hidden
    criterion: hidden
    trainloader: hidden

    testloader = _train.data.testloader
Module inheritance with spawn#
The configuration in eval.solconf consists of a @promoted-decorated class that derives from a spawned module:
@promoted
class _(_train := spawn(".train2")):
Function spawn() assumes that it receives a path to a module that likewise contains a promoted class. It then creates a new package and loads the spawned module in that package, passing along any overrides that were specified within the source package. The returned class is hence part of a new package and is a different class than the one obtained by loading the module instead (e.g., using load(".train2")). Using spawn as opposed to load allows overrides to be specified more naturally, while ensuring that overrides continue to be applied at variable definition time.
Todo
Add a += assignment operator that allows overrides to be applied after the target description (read class or module) is created. This will not support links to dependent variables.