Hierarchisation Scheme
A Hierarchisation Scheme is a strategy for creating model parameters from two sources: a shared "Group" source and a specific "Subject" source.
It is defined not by a single class, but by a logic pattern implemented inside each Model (Encoder or DSR Step).
Common Schemes
1. None (Baseline)
- Concept: Every parameter is fully shared. There is no subject adaptation.
- Group Params: Full weight matrices (e.g. `W`, `h`).
- Subject Params: None.
- Construction: `W = W_group`, `h = h_group` (the group parameters are used as-is).
2. Linear Projection
- Concept: Parameters lie on a low-dimensional manifold. Each subject is a point on this manifold.
- Group Params: A large "basis" tensor or projection matrix (`P_W`, `P_h`).
- Subject Params: A small "coordinate" vector (`s`).
- Construction: `W = P_W · s`, `h = P_h · s` (the basis is contracted against the subject's coordinate vector).
- Use Case: Efficient adaptation when you have many subjects but limited data per subject.
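The linear-projection construction can be sketched in a few lines. This is an illustrative stand-in, not the codebase's implementation; the dimension names (`feat_dim`, `lat_dim`) are assumptions chosen to match the parameter names above:

```python
import numpy as np

# Hypothetical dimensions (illustrative, not taken from the codebase).
lat_dim = 8    # latent state size of the model
feat_dim = 3   # subject "coordinate" dimension

rng = np.random.default_rng(0)

# Group params: basis tensors mapping the coordinate space to full matrices.
P_W = rng.standard_normal((feat_dim, lat_dim, lat_dim))
P_h = rng.standard_normal((feat_dim, lat_dim))

# Subject params: one small coordinate vector per subject.
s = rng.standard_normal(feat_dim)

# Construction: contract the coordinate vector against the basis.
W = np.einsum("fij,f->ij", P_W, s)   # shape (lat_dim, lat_dim)
h = np.einsum("fi,f->i", P_h, s)     # shape (lat_dim,)

# Per-subject storage is feat_dim numbers instead of lat_dim**2 + lat_dim.
print(W.shape, h.shape, s.size)      # (8, 8) (8,) 3
```

Note how adding a subject costs only `feat_dim` new parameters, which is what makes the scheme attractive in the many-subjects, little-data regime.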
3. Outer Product
- Concept: Perturb a shared base matrix with a rank-1 update.
- Construction: `W = W_group + u vᵀ`, where `u` and `v` are subject-level vectors forming the rank-1 perturbation.
Implementation Guide
To implement or modify a scheme, you work within the `initialize_*` and `construct_params` methods of the specific model (e.g., `src/models/dsr_model/models/alrnn/base.py`).
Step 1: Define the Scheme Logic
Each of these methods switches on the scheme dictionary (`hierarchisation_scheme["scheme"]`) to decide how to behave.
Example from `LinearProjectionALRNN`:
```python
# Initialization Phase
@classmethod
def initialize_group_level_params(cls, hierarchisation_scheme, hyperparameters):
    if hierarchisation_scheme["scheme"] == "linear-projection":
        # Create the heavy-lifting "basis" matrices
        return nn.ParameterDict({
            "P_W": init_projection(feat_dim, lat_dim, lat_dim),
            "P_h": init_projection(feat_dim, lat_dim),
        })
    # ... handle other schemes ...

@classmethod
def initialize_subject_level_params(cls, hierarchisation_scheme, ...):
    if hierarchisation_scheme["scheme"] == "linear-projection":
        # Create the lightweight subject-specific vector
        return init_vector(hierarchisation_scheme["feature_dimension"])

# Construction Phase
@classmethod
def construct_params(cls, hierarchisation_scheme, hyperparams, group_params, subject_params):
    if hierarchisation_scheme["scheme"] == "linear-projection":
        p = subject_params                     # the subject vector s
        W = construct(group_params["P_W"], p)  # W = P_W * p
        h = construct(group_params["P_h"], p)  # h = P_h * p
        return {"W": W, "h": h}
```
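The division of labour between the two phases can be seen in a self-contained toy version of this pattern. Everything below (the class name, helper-free initialisation, dict keys) is an illustrative sketch, not the project's actual code:

```python
import numpy as np

class TinyLinearProjectionModel:
    """Toy model showing the initialize/construct split (illustrative only)."""

    @classmethod
    def initialize_group_level_params(cls, scheme, lat_dim):
        # Group phase: build the shared basis tensors once.
        rng = np.random.default_rng(0)
        f = scheme["feature_dimension"]
        return {
            "P_W": rng.standard_normal((f, lat_dim, lat_dim)),
            "P_h": rng.standard_normal((f, lat_dim)),
        }

    @classmethod
    def initialize_subject_level_params(cls, scheme):
        # Subject phase: one small coordinate vector per subject.
        rng = np.random.default_rng(1)
        return rng.standard_normal(scheme["feature_dimension"])

    @classmethod
    def construct_params(cls, scheme, group_params, subject_params):
        # Construction phase: combine the two sources into usable weights.
        if scheme["scheme"] == "linear-projection":
            s = subject_params
            return {
                "W": np.einsum("fij,f->ij", group_params["P_W"], s),
                "h": np.einsum("fi,f->i", group_params["P_h"], s),
            }
        raise ValueError(f"unknown scheme: {scheme['scheme']}")

scheme = {"scheme": "linear-projection", "feature_dimension": 3}
group = TinyLinearProjectionModel.initialize_group_level_params(scheme, lat_dim=8)
subject = TinyLinearProjectionModel.initialize_subject_level_params(scheme)
params = TinyLinearProjectionModel.construct_params(scheme, group, subject)
print(params["W"].shape, params["h"].shape)  # (8, 8) (8,)
```

A trainer would call the two `initialize_*` methods once (group params once overall, subject params once per subject) and `construct_params` on every forward pass.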
Step 2: Define Scheme Helpers
To avoid duplicating code (e.g., the contraction `P_W · s`), helpers are often placed in `src/models/dsr_model/utils/hierarchisation/`.
- See `LinearProjectionSchemeUtils` for standard projection logic.
- See `CommonInitializationUtils` for standard random init.
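A helper of this kind might look as follows. The class name mirrors the one mentioned above, but the method name and its signature are assumptions for illustration:

```python
import numpy as np

class LinearProjectionSchemeUtils:
    """Hypothetical helper sketch; not the codebase's actual implementation."""

    @staticmethod
    def construct(basis, coords):
        # Contract the subject coordinate vector against the leading
        # (feature) axis of the basis, whatever the remaining shape is.
        # Works for both the matrix basis P_W and the vector basis P_h.
        return np.tensordot(coords, basis, axes=([0], [0]))

# Usage: an all-ones basis makes the result easy to check by hand.
P_W = np.ones((3, 4, 4))            # (feat_dim, lat_dim, lat_dim)
s = np.array([1.0, 2.0, 3.0])       # subject coordinate vector
W = LinearProjectionSchemeUtils.construct(P_W, s)
print(W.shape)                      # (4, 4); every entry is 1 + 2 + 3 = 6
```

Keeping the contraction in one place means every model that supports the scheme reuses identical math, which is exactly the duplication this step avoids.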
Step 3: Registering a New Scheme
- Definitions: Add a new configuration TypedDict (if needed) to `src/models/hierarchised_model.py` or the model's specific file.
- Logic: Implement the `if scheme == "new_scheme"` branches in your target model's `base.py` (or a dedicated file like `new_scheme_alrnn.py`).
- Config: Ensure `resolve_config` allows this new scheme string.
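The three steps above might be sketched like this. The TypedDict field names and the body of `resolve_config` are assumptions; the real validation lives in the codebase's own `resolve_config`:

```python
from typing import Literal, TypedDict

# Step 1 (Definitions): one TypedDict per scheme (illustrative field names).
class NoneSchemeConfig(TypedDict):
    scheme: Literal["none"]

class LinearProjectionSchemeConfig(TypedDict):
    scheme: Literal["linear-projection"]
    feature_dimension: int

# Step 3 (Config): a sketch of the validation gate for scheme strings.
ALLOWED_SCHEMES = {"none", "linear-projection", "outer-product"}

def resolve_config(raw: dict) -> dict:
    # Reject anything the models do not implement a branch for.
    if raw.get("scheme") not in ALLOWED_SCHEMES:
        raise ValueError(f"unknown hierarchisation scheme: {raw.get('scheme')}")
    return raw

cfg = resolve_config({"scheme": "linear-projection", "feature_dimension": 3})
print(cfg["scheme"])  # linear-projection
```

Registering a new scheme then means adding its string to the allowed set, its TypedDict, and the matching `if` branch in each model that should support it.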
Why is there no "Scheme Class"?
You might expect a `LinearProjectionScheme` class. We avoid this because different models use schemes differently.
- A "Linear Projection" for a CNN Encoder (projecting convolution filters) is mathematically different from a "Linear Projection" for an RNN (projecting recurrent weights).
- Therefore, the interpretation of the scheme is left to the Model class itself.