
Conversation

@TorkelE
Member

@TorkelE TorkelE commented Jan 10, 2026

This adds a @SymbolicNeuralNetwork macro. Basically

@SymbolicNeuralNetwork NN, p = chain

is equivalent to

NN, p = SymbolicNeuralNetwork(; chain, n_input = num_inputs(chain), n_output = num_outputs(chain), nn_name = :NN, nn_p_name = :p)

On the left-hand side of the `=` you provide the variables in which to store the neural network and its parameterisation. On the right-hand side you provide the neural network architecture. The only other argument you can optionally give is a random number generator:

rng = Xoshiro(111)
@SymbolicNeuralNetwork NN, p = chain, rng

(the interface for this is not that pretty, but it is the best I could come up with, and I figured it would be good to have the option there)
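For readers skimming the thread, here is a minimal end-to-end usage sketch, assuming the macro and constructor from this PR are in scope; the chain and the `n_input`/`n_output` values are illustrative:

```julia
using Lux, Random

# An illustrative chain with a well-defined input (2) and output (1) size.
chain = Lux.Chain(Lux.Dense(2 => 8, tanh), Lux.Dense(8 => 1))

# Macro form: the names NN and p are taken from the left-hand side.
@SymbolicNeuralNetwork NN, p = chain

# Equivalent explicit call (what the macro expands to, per the description above).
NN, p = SymbolicNeuralNetwork(; chain, n_input = 2, n_output = 1,
    nn_name = :NN, nn_p_name = :p)

# Optionally, a random number generator can be supplied after the chain.
rng = Xoshiro(111)
@SymbolicNeuralNetwork NN, p = chain, rng
```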

@TorkelE
Member Author

TorkelE commented Jan 11, 2026

I think the test failure is related to JET, but I am a bit unsure what causes it or why it fails here.

Member

@SebastianM-C SebastianM-C left a comment


The main issue that I have with this is the input size implementation, as I detailed in the other comments. I wanted to add that to LuxCore in the past and ended up deleting that function because it was not possible to implement in a generic enough manner.

Comment on lines 217 to 218
_num_chain_inputs(chain) = chain.layers[1].in_dims
_num_chain_outputs(chain) = chain.layers[end].out_dims
Member


We have the outputsize API from LuxCore, which is more generic than this. See https://lux.csail.mit.edu/stable/api/Building_Blocks/LuxCore#Layer-size
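For context, a hedged sketch of how that API could be used here, assuming the `Lux.outputsize(layer, x, rng)` form referenced later in this thread, where `x` is a sample input and `rng` a random number generator:

```julia
using Lux, Random

chain = Lux.Chain(Lux.Dense(1 => 3, Lux.softplus), Lux.Dense(3 => 1))

# A sample input with the expected input dimension (plus a batch of one),
# and an rng for any initialization the call needs.
x = rand(Float32, 1, 1)
out_size = Lux.outputsize(chain, x, Random.default_rng())  # expected to be (1,)
```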

Member Author


Excellent, yeah, I figured there probably was something more appropriate

end

# Internal functions for determining the number of NN inputs and outputs.
_num_chain_inputs(chain) = chain.layers[1].in_dims
Member


This only works for a Chain where the first layer is Dense (or some other layer with the in_dims field), but there are other layers for which this will error: https://lux.csail.mit.edu/stable/api/Lux/layers#Built-In-Layers

I'm not convinced that we can compute the input size of an arbitrary Lux model. I did try to add this to LuxCore LuxDL/Lux.jl#491 (comment), but I ended up removing that function. There are some layers that can work with arbitrary input sizes, like WrappedFunction, which is why I don't have any automated way of setting this option or any internal validation for the inputs (we have that for the outputs).

Here, `@SymbolicNeuralNetwork` takes the neural network chain as its input, and:
1) Automatically infers nn_name and nn_p_name from the variable names on the left-hand side of the assignment.
2) Automatically infers n_input and n_output from the chain structure.
Member


Unfortunately, I don't think it's possible to determine n_input automatically. See my other comment for more details, but the main counter-argument is that you can have layers that work with any input dimension (for example, think of a custom scaling layer that scales the inputs by some number).
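To make the counter-argument concrete, a small sketch (using Lux's WrappedFunction as the custom scaling layer) where the first layer reveals nothing about the intended input size:

```julia
using Lux

# A custom scaling layer: it scales its input by 2 and happily accepts a
# vector of length 1, 5, or anything else.
chain = Lux.Chain(Lux.WrappedFunction(x -> 2 .* x), Lux.Dense(3 => 1))

# chain.layers[1] is a WrappedFunction with no `in_dims` field, so an
# in_dims-based inference of the chain's input size errors here, even though
# the chain as a whole only accepts length-3 inputs.
```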

Comment on lines +7 to +11
chain = Lux.Chain(
Lux.Dense(1 => 3, Lux.softplus, use_bias = false),
Lux.Dense(3 => 3, Lux.softplus, use_bias = false),
Lux.Dense(3 => 1, Lux.softplus, use_bias = false)
)
Member


All tests use Chains of Dense layers, but a single Dense layer, for example, would error with the current implementation, as would a WrappedFunction layer, which doesn't have any computable input size.
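For instance, a hedged sketch of the failure mode for a bare Dense layer (which is a valid Lux model on its own but has no `.layers` field):

```julia
using Lux

# The helper from the PR assumes a Chain-like container.
_num_chain_inputs(chain) = chain.layers[1].in_dims

dense = Lux.Dense(1 => 3, Lux.softplus, use_bias = false)
# _num_chain_inputs(dense)  # errors: Dense has no field `layers`
```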

@TorkelE
Member Author

TorkelE commented Jan 12, 2026

I didn't realise that you could not compute input sizes for all Lux models; I agree that that would prevent something like this from working.

The only solution I can come up with would be to only allow this for a narrower class of Lux models where the input size can be computed. We then make a note of this in the docs, and when someone uses it for another type of Lux model, we throw an error and point the user to the interface that works more generally. Do you think that would work?

@SebastianM-C
Member

Yeah, that would be a compromise. I suppose we need to have a clear list of what the macro intends to support; I don't have any particular preferences there.

@TorkelE
Member Author

TorkelE commented Jan 12, 2026

What would you say is the largest category of models for which automatically finding the input size is possible/straightforward? It is fine if it is still a fairly limited class. I think a main function of the macro is that it makes really nice demo examples that are easy to understand, which we can put in docs/presentations/posters. For people who are quite deep into NN architecture it is probably less relevant anyway.

@SebastianM-C
Member

SebastianM-C commented Jan 12, 2026

I think these (listed below) would be the layers, plus Chain as the layer container. There are other layers that store things related to the input dimension, but they are not as restrictive, since you can have requirements on only a certain dimension (like for the Conv layer). There is also the complication from the batching dimension, but that is not really usable with UDEs anyway.

| Layer    | Field                    | Input Shape                              |
|----------|--------------------------|------------------------------------------|
| Dense    | `.in_dims`               | `(in_dims, batch)`                       |
| Bilinear | `.in1_dims`, `.in2_dims` | `(in1_dims, batch)`, `(in2_dims, batch)` |
| RNNCell  | `.in_dims`               | `(in_dims, batch)`                       |
| LSTMCell | `.in_dims`               | `(in_dims, batch)`                       |
| GRUCell  | `.in_dims`               | `(in_dims, batch)`                       |
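A hedged sketch of what restricting to this class could look like, dispatching on the listed layer types and erroring with a pointer to the general interface otherwise (the helper names are illustrative, not part of the PR):

```julia
using Lux

# Input size for the layer types in the table above.
_layer_input_size(l::Lux.Dense) = l.in_dims
_layer_input_size(l::Lux.RNNCell) = l.in_dims
_layer_input_size(l::Lux.LSTMCell) = l.in_dims
_layer_input_size(l::Lux.GRUCell) = l.in_dims
_layer_input_size(l::Lux.Bilinear) = (l.in1_dims, l.in2_dims)
_layer_input_size(l) = error(
    "Cannot infer the input size of a $(typeof(l)) layer; please use the " *
    "SymbolicNeuralNetwork constructor and supply `n_input` explicitly.")

# For a Chain, the input size is determined by its first layer.
_num_chain_inputs(chain::Lux.Chain) = _layer_input_size(chain.layers[1])
```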

@TorkelE
Member Author

TorkelE commented Jan 13, 2026

Do you know what I should provide as the last two arguments to `Lux.outputsize(layer, x, rng)`?

I tried checking the docs and source code but didn't find anything.

@TorkelE
Member Author

TorkelE commented Jan 13, 2026

I have tried to update the code to account for different input types. I went through the Lux documentation but couldn't actually find instances where these were used, so I think that my test examples might not be very insightful... I am also unsure whether my approach for the Bilinear layer works. I tried it out for simple inputs, but I am not fully sure whether it would work "in action".
