Implement xtensor.signal.convolve1d #1868
Conversation
294bc8e to 18ffc19
18ffc19 to c19ec69
pytensor/xtensor/signal.py
Outdated
    - 'valid': The output consists only of elements that do not rely on zero-padding, with shape (..., max(N, M) - min(N, M) + 1,).
    - 'same': The output is the same size as in1, centered with respect to the 'full' output.
dim: str
    The dimension along which to convolve. Must be present in both inputs, and will be present in the output.
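A quick numpy illustration of those shapes (illustrative only, not code from this PR; `np.convolve` follows the same 'valid'/'same' convention as the docstring above):

```python
import numpy as np

in1 = np.ones(10)  # N = 10
in2 = np.ones(4)   # M = 4

# 'valid': no zero-padding, max(N, M) - min(N, M) + 1 = 7 elements
print(np.convolve(in1, in2, mode="valid").shape)  # (7,)
# 'same': same length as in1, centered with respect to the 'full' output
print(np.convolve(in1, in2, mode="same").shape)   # (10,)
```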
I have doubts about this. To make things concrete, suppose I'm doing language modeling, so the input is a table of tokenized sentences with dims (document, word). Then the filter has dims (word,). Are these really the same thing conceptually? The filter is a weight over words, but it's a bit sketchy to say they're both the same dimension.
I'm not 100% sure I'm right, but I want to flag my doubt.
Pragmatically, the alternative is that you need 3 dims to specify this operation: the dim of the first input, the dim of the second input, and the dim of the output?
The output of the operation is definitely the same dimension as the first input. I only have a doubt about what the dims of the kernel are.
In my example, the dimension of the kernel would be "weights" I guess, and the labels are positional lags. If the dimension were words (which in a sense I guess it is), you would expect the labels to also be tokens in the vocabulary, but instead they will be -1, -2, -3, giving positional offsets.
We can allow 1 or 3 dims:
- If 1, it has to be shared.
- If 3, the first is for the left input, the second for the right input (exclusively), and the last for the output.

We can also allow 2 dims, and say the leftmost remains in the output (see the sketch below). I slightly dislike it because it gives preference to one of the inputs, but mathematically the operation is commutative?
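For concreteness, a hypothetical helper (illustrative only, not code from this PR) showing how the 1/2/3-dims spellings could be normalized to `(in1_dim, in2_dim, out_dim)`:

```python
def normalize_convolve_dims(dims):
    """Hypothetical normalization of the 1/2/3-dims options discussed above."""
    if isinstance(dims, str):
        # 1 dim: shared by both inputs and the output
        return dims, dims, dims
    if len(dims) == 2:
        # 2 dims: (in1_dim, in2_dim); the leftmost survives in the output
        in1_dim, in2_dim = dims
        return in1_dim, in2_dim, in1_dim
    # 3 dims: (in1_dim, in2_dim, out_dim), all roles explicit
    in1_dim, in2_dim, out_dim = dims
    return in1_dim, in2_dim, out_dim
```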
data/kernel sounds like ML world, but that's just one application of convolution? Our inputs are simply called in1 and in2.
What you're doing here seems fine, and in the test you got it to work with apply_ufunc -- I would have wagered that it wouldn't have worked in pure xarray. I just don't think it's purely correct, but given that I have no good alternative, I just want to document my thoughts and we can move on.
> data/kernel sounds like ML world, but that's just one application of convolution? Our inputs are simply called in1 and in2.
I think it's signal processing too, but my motivation was to give some concrete things to reason about.
We can allow the 1/2/3-dims syntax. We do something like that for dot, where b may be a vector or a matrix.
I see the point of allowing convolution with distinct labels. I do think in some cases (like autocorrelation?) it makes sense for it to be a shared dimension?
apply_ufunc allows excluding dimensions that need not be aligned, basically the same flag I added to XBlockwise.
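A minimal xarray sketch of that `apply_ufunc` behaviour (illustrative only, not the test from this PR): with a shared core dim, `exclude_dims` is what lets the two inputs differ in length along it.

```python
import numpy as np
import xarray as xr

data = xr.DataArray(np.arange(10.0), dims=["x"])                # length 10
kernel = xr.DataArray(np.array([0.25, 0.5, 0.25]), dims=["x"])  # length 3

out = xr.apply_ufunc(
    lambda a, b: np.convolve(a, b, mode="same"),
    data,
    kernel,
    input_core_dims=[["x"], ["x"]],
    output_core_dims=[["x"]],
    exclude_dims={"x"},  # "x" need not have the same length in both inputs
)
print(out.dims, out.shape)  # ('x',) (10,)
```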
Changed to 2 distinct dims, and reverted the XBlockwise special flag until we need it.
c19ec69 to f71c979
1505b7d to fc7216d
Make sure we don't issue unexpected warnings
fc7216d to 007d060
zeros_right = as_xtensor(
    zeros((in2_core_size - 1) // 2, dtype=in1.dtype), dims=(in1_dim,)
)
in1 = concat([zeros_left, in1, zeros_right], dim=in1_dim)
If you bully me about it I'll make an xtensor version of pad.
why concat is so pretty
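For reference, a minimal numpy sketch of why that zero padding reproduces 'same' mode (illustrative only; it assumes the left pad is `in2_core_size // 2` and that the core op computes a 'valid'-style convolution):

```python
import numpy as np

in1 = np.arange(8.0)                       # N = 8
in2 = np.array([1.0, 2.0, 3.0, 2.0, 1.0])  # M = 5

pad_left = len(in2) // 2          # assumed size of zeros_left
pad_right = (len(in2) - 1) // 2   # mirrors (in2_core_size - 1) // 2 above
padded = np.concatenate([np.zeros(pad_left), in1, np.zeros(pad_right)])

np.testing.assert_allclose(
    np.convolve(padded, in2, mode="valid"),  # 'valid' on the padded input...
    np.convolve(in1, in2, mode="same"),      # ...matches 'same' on the original
)
```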
full_mode = as_scalar(np.bool_(mode == "full"))

xop = XBlockwise(
I thought you took this away; maybe I misunderstood your comment.
I took away the XBlockwise argument that says core input dimensions with the same name need not be aligned in length.
That was needed so both in1 and in2 could have the same dim without needing the same length. Now that they must use separate dims, I didn't need that special argument.
What did you think I meant?
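In `apply_ufunc` terms, the contrast with the shared-dim sketch earlier is that distinct core dims need no exclusion or alignment machinery at all (again illustrative only, not code from this PR):

```python
import numpy as np
import xarray as xr

data = xr.DataArray(np.arange(10.0), dims=["x"])
kernel = xr.DataArray(np.array([0.25, 0.5, 0.25]), dims=["lag"])  # separate dim

out = xr.apply_ufunc(
    lambda a, b: np.convolve(a, b, mode="same"),
    data,
    kernel,
    input_core_dims=[["x"], ["lag"]],  # different names, so no length alignment is attempted
    output_core_dims=[["x"]],
)
print(out.dims, out.shape)  # ('x',) (10,)
```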
Also includes misc cleanup I have been gathering for a while. Commits are logically separated.