Skip to content

Conflicting latent correlation layer implementation and equations from paper #21

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Chrixtar opened this issue Nov 8, 2021 · 2 comments

Comments

@Chrixtar
Copy link

Chrixtar commented Nov 8, 2021

Hello,

if I am not mistaken, you have mixed-up the temporal and the feature dimension in the GRU input of the latent correlation layer.
So, the input data x of the latent_correlation_layer has the shape (batch_size, window_size, number_of_nodes).
Then, the permutated input of the GRU has the shape (number_of_nodes, batch_size, window_size).
Since you do not set the batch_first flag, the PyTorch GRU implementation expects an input of shape (sequence_length, batch_size, input_size) (see documentation ).

Moreover, in the paper you say that you use the last hidden state.
In your code you simply use the output of the GRU, which is the hidden state at every single time step and not just the last one.
If you would actually use the last hidden state and fix the allegedly mixed-up dimensions, you would end up with a single vector of length hidden_size in the PyTorch documentation, which corresponds to your units input, i.e., the number of nodes.
Now, I am wondering about the correct dimensionality of your linear projections W^Q and W^K for the query and key representations.
These are vectors in the current implementation, which would not make much sense in the equations from the paper in combination with a vector R.

My conclusion: To me, it seems like there are major conflicts between the paper and the implementation regarding the latent correlation layer and the formal approach from the paper has issues regarding correct dimensionalities or is formulated inaccurately.

Please correct me, if I made any mistake.

Best regards
Chris

@tdingquan
Copy link

I had the same doubt while reading the code.

@JunyiYe
Copy link

JunyiYe commented Apr 12, 2023

@Chrixtar I totally agreed with you. I also believe this implementation was evaluated on the ECG dataset in a way that doesn’t match the description on the ECG dataset website. Specifically, the columns should be the time steps and the rows should be the nodes based on the description in the ECG dataset website.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants