Hello,
if I am not mistaken, you have mixed up the temporal and the feature dimension in the GRU input of the latent correlation layer.
So, the input data `x` of the `latent_correlation_layer` has the shape `(batch_size, window_size, number_of_nodes)`.
Then, the permuted input of the GRU has the shape `(number_of_nodes, batch_size, window_size)`.
Since you do not set the `batch_first` flag, the PyTorch GRU implementation expects an input of shape `(sequence_length, batch_size, input_size)` (see the documentation).
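To illustrate what I mean, here is a minimal sketch of the two interpretations. The concrete tensor sizes and the choice `hidden_size = number_of_nodes` (your `units`) are just assumptions for the example:

```python
import torch
import torch.nn as nn

batch_size, window_size, number_of_nodes = 32, 12, 140
units = number_of_nodes  # assumed hidden size, matching the `units` argument

x = torch.randn(batch_size, window_size, number_of_nodes)

# Current code (as I read it): permute to (number_of_nodes, batch_size, window_size).
# With batch_first=False the GRU interprets this as sequence_length = number_of_nodes
# and input_size = window_size, i.e., it iterates over nodes instead of time steps.
gru_current = nn.GRU(input_size=window_size, hidden_size=units)
out_current, h_current = gru_current(x.permute(2, 0, 1))

# A temporal interpretation would instead need (window_size, batch_size, number_of_nodes)
# with input_size = number_of_nodes, so the GRU iterates over the time steps.
gru_temporal = nn.GRU(input_size=number_of_nodes, hidden_size=units)
out_temporal, h_temporal = gru_temporal(x.permute(1, 0, 2))
```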
Moreover, in the paper you say that you use the last hidden state.
In your code you simply use the output of the GRU, which is the hidden state at every single time step and not just the last one.
If you actually used the last hidden state and fixed the allegedly mixed-up dimensions, you would end up with a single vector of length `hidden_size` (in the terms of the PyTorch documentation), which corresponds to your `units` input, i.e., the number of nodes.
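A self-contained sketch of that difference (shapes again only assumed for illustration):

```python
import torch
import torch.nn as nn

window_size, batch_size, number_of_nodes = 12, 32, 140
gru = nn.GRU(input_size=number_of_nodes, hidden_size=number_of_nodes)

x = torch.randn(window_size, batch_size, number_of_nodes)  # (seq_len, batch, input_size)
output, h_n = gru(x)

# output: (window_size, batch_size, hidden_size) -- the hidden state at *every* time step
# h_n:    (num_layers, batch_size, hidden_size)  -- only the hidden state of the last time step
last_hidden = h_n[-1]  # (batch_size, hidden_size): one vector of length hidden_size per sample
```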
Now, I am wondering about the correct dimensionality of your linear projections W^Q and W^K for the query and key representations.
In the current implementation, these are vectors, which would not make much sense in the equations from the paper in combination with a vector R.
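If I read the equations correctly, for the attention to come out as an N x N correlation matrix, R would have to stack the per-node representations as a matrix and W^Q, W^K would have to be square weight matrices, not vectors. A rough sketch of that reading (all sizes assumed by me):

```python
import torch
import torch.nn.functional as F

number_of_nodes, hidden_size = 140, 64  # assumed sizes

# R stacks the per-node hidden representations as rows: (N, d)
R = torch.randn(number_of_nodes, hidden_size)

# For consistent dimensionalities, W^Q and W^K need to be (d, d) matrices.
W_Q = torch.randn(hidden_size, hidden_size)
W_K = torch.randn(hidden_size, hidden_size)

Q = R @ W_Q  # (N, d)
K = R @ W_K  # (N, d)
# Scaled dot-product attention yields the N x N latent correlation matrix.
attention = F.softmax(Q @ K.t() / hidden_size ** 0.5, dim=-1)
```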
My conclusion: to me, it seems like there are major conflicts between the paper and the implementation regarding the latent correlation layer, and the formal approach from the paper either has issues with the dimensionalities or is formulated inaccurately.
Please correct me if I made any mistake.
Best regards
Chris
@Chrixtar I totally agree with you. I also believe this implementation was evaluated on the ECG dataset in a way that doesn't match the description on the ECG dataset website: according to that description, the columns should be the time steps and the rows should be the nodes.
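For illustration, a minimal sketch of what that orientation would imply when loading the data; the file name is a placeholder and this is not the repo's actual loading code:

```python
import pandas as pd

# Assumption: the raw file follows the dataset description, i.e.
# rows = nodes and columns = time steps, so it has to be transposed
# to get the (time_steps, number_of_nodes) layout used elsewhere.
raw = pd.read_csv("ECG_data.csv", header=None)  # placeholder file name
data = raw.values.T  # (time_steps, number_of_nodes)
```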