Add @SymbolicNeuralNetwork macro #103
Conversation
I think the test failure is related to JET, but I am a bit unsure what it is or how it fails here.
SebastianM-C left a comment
The main issue that I have with this is the input size implementation, as I detailed in the other comments. I wanted to add that to LuxCore in the past and ended up deleting that function because it was not possible to implement in a generic enough manner.
src/ModelingToolkitNeuralNets.jl
Outdated

```julia
_num_chain_inputs(chain) = chain.layers[1].in_dims
_num_chain_outputs(chain) = chain.layers[end].out_dims
```
We have the `outputsize` API from LuxCore, which is more generic than this. See https://lux.csail.mit.edu/stable/api/Building_Blocks/LuxCore#Layer-size
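A minimal sketch of how that might be used here (assumptions: the second argument is a sample input and the third a random number generator, per the linked docs; the chain below is illustrative):

```julia
using Lux, Random

chain = Lux.Chain(Lux.Dense(1 => 3), Lux.Dense(3 => 1))

# `outputsize` takes the model, a sample input, and an RNG; it should
# return the output size without the batch dimension.
Lux.outputsize(chain, rand(Float32, 1), Random.default_rng())  # expected: (1,)
```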
Excellent, yeah, I figured there probably was something more appropriate
src/ModelingToolkitNeuralNets.jl
Outdated

```julia
# Internal functions for determining the number of NN inputs and outputs.
_num_chain_inputs(chain) = chain.layers[1].in_dims
```
This only works for a Chain where the first layer is Dense (or some other layer with the `in_dims` field), but there are other layers for which this will error: https://lux.csail.mit.edu/stable/api/Lux/layers#Built-In-Layers

I'm not convinced that we can compute the input size of an arbitrary Lux model. I did try to add this to LuxCore in LuxDL/Lux.jl#491 (comment), but I ended up removing that function. There are some layers that can work with arbitrary input sizes, like WrappedFunction, which is why I don't have any automated way of setting this option or any internal validation for the inputs (we have that for the outputs).
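To illustrate the failure mode, a sketch (not from the PR) of a chain whose first layer has no `in_dims` field:

```julia
using Lux

# WrappedFunction just applies a plain function, so it accepts inputs of
# any size and carries no `in_dims` field to read.
chain = Lux.Chain(
    Lux.WrappedFunction(x -> 2 .* x),
    Lux.Dense(3 => 1)
)

chain.layers[1].in_dims  # throws: WrappedFunction has no field in_dims
```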
Here, `@SymbolicNeuralNetwork` takes the neural network chain as its input, and:
1) automatically infers `nn_name` and `nn_p_name` from the variable names on the left-hand side of the assignment;
2) automatically infers `n_input` and `n_output` from the chain structure.
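A hypothetical invocation consistent with this description (the surface syntax and the names `nn` and `p` are assumptions, not taken from the PR):

```julia
using Lux
using ModelingToolkitNeuralNets

# Hypothetical: nn_name/nn_p_name come from the left-hand side variables,
# n_input/n_output from the first and last layers of the chain.
@SymbolicNeuralNetwork nn, p = Lux.Chain(
    Lux.Dense(1 => 3, Lux.softplus, use_bias = false),
    Lux.Dense(3 => 1, Lux.softplus, use_bias = false)
)
```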
Unfortunately I don't think it's possible to infer `n_input` automatically. See my other comment for more details, but the main counterargument is that you can have layers that work with any input dimension (for example, think about a custom scaling layer that scales the inputs by some number).
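A minimal sketch of such a layer (illustrative; `ScaleLayer` is a made-up name, and the `AbstractLuxLayer` supertype assumes a recent Lux version):

```julia
using Lux

# Scales any input by a fixed factor: no parameters and no intrinsic
# input size, so there is nothing like `in_dims` to read off the layer.
struct ScaleLayer{T} <: Lux.AbstractLuxLayer
    factor::T
end

(l::ScaleLayer)(x, ps, st) = (l.factor .* x, st)

# Valid as the first layer of a Chain for inputs of any length.
chain = Lux.Chain(ScaleLayer(2.0), Lux.WrappedFunction(sum))
```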
```julia
chain = Lux.Chain(
    Lux.Dense(1 => 3, Lux.softplus, use_bias = false),
    Lux.Dense(3 => 3, Lux.softplus, use_bias = false),
    Lux.Dense(3 => 1, Lux.softplus, use_bias = false)
)
```
All tests use Chains of Dense layers, but a single Dense, for example, would error with the current implementation, as would a WrappedFunction layer, which doesn't have any computable input size.
Didn't realise that you could not compute input sizes for all Lux models; I agree that that would prevent something like this from working. The only solution I can come up with would be to only allow this for a narrower class of Lux models where the input size can be computed. We then make a note of this in the docs, and when someone uses it for another type of Lux model, we throw an error and point the user to the interface that works more generally. Do you think that would work?
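One way that compromise might look, as an illustrative sketch rather than code from the PR (error wording assumed):

```julia
# Only infer the input size when the first layer exposes an `in_dims`
# field; otherwise raise an informative error pointing to the explicit
# interface.
function _num_chain_inputs(chain)
    layer = first(chain.layers)
    hasfield(typeof(layer), :in_dims) && return layer.in_dims
    error("Cannot infer the input size for a chain starting with $(typeof(layer)). " *
          "`@SymbolicNeuralNetwork` only supports layers with a fixed input size; " *
          "use the explicit interface and provide the input size yourself.")
end
```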
Yeah, that would be a compromise. I suppose we need to have a clear list of what the macro intends to support; I don't have any particular preferences there.
What would you say is the largest category of models for which automatically finding the input size is possible/straightforward? It is fine if it is still a fairly limited class. I think a main function of the macro is that it makes really nice demo examples that are easy to understand, which we can put in docs/presentations/posters. For people who are quite deep into NN architecture it is probably less relevant anyway.
I think these would be the layers + the …
Do you know what I should provide as the last two arguments to `Lux.outputsize(layer, x, rng)`? I tried checking the docs and source code but didn't find anything.
I have tried to update to account for different input types. I went through the Lux documentation but couldn't actually find instances where these were used, so I think that my test examples might not be very insightful... Also, I'm unsure whether my approach for the bilinear layer worked. I tried it out for simple inputs, but I'm not fully sure whether it would work "in action".
This adds a `@SymbolicNeuralNetwork` macro. Basically

…

is equivalent to

…

On the left-hand side of `=` you provide the variables to store the neural network and parametrization in. On the right-hand side you have the neural network architecture. The only other argument you can potentially give is a random number generator (the interface for this is not that pretty, but the best I came up with, and figured it would be good to have the option there).