Skip to content

Attributes of fields defined at the class level should not be shared between instances #325

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ylep opened this issue Nov 13, 2023 · 11 comments

Comments

@ylep
Copy link
Member

ylep commented Nov 13, 2023

I am having trouble with the Capsul tests for highres-cortex:

  • when I test each process or pipeline separately all works as expected

    python -m highres_cortex.test.test_capsul SphereTestCase.test_laplacian
    python -m highres_cortex.test.test_capsul SphereTestCase.test_equivolumetric_pipeline
    ...
    
  • but, whenever I run the whole test suite, I get random failures

    python -m highres_cortex.test.test_capsul SphereTestCase
    

The problem is that when I call highres_cortex.capsul.processes.Laplacian after is has been run as part of a pipeline, its fields retain the generate_temporary attribute!

I am using class-level attributes and not instance attributes, because I thought that it was more in line with the philosophy behind traits / pydantic typing.

@denisri

This comment was marked as resolved.

@ylep
Copy link
Member Author

ylep commented Nov 14, 2023

Thanks @denisri for the tip! I have made the change to use a temporary local database as you suggested. Unfortunately, this did not fix the original issue.

I noticed that it is always the same two tests that are consistently failing (test_laplacian and test_upwind_euclidean). I will triple-check that the issue is not on my side...

By the way, since using a temporary database I am getting messages printed to my terminal asynchronously (server has probably shutdown. Exiting.). This is probably harmless, but is a bit annoying.

@ylep ylep changed the title Failures when multiple CapsulEngines are instantiated sequentially in a process Failures when multiple pipelines/processes are run sequentially in a process Nov 14, 2023
@ylep
Copy link
Member Author

ylep commented Nov 14, 2023

I made some progress and characterized the bug a bit more precisely: it happens when certain processes are run in a certain order. For instance, running highres_cortex.capsul.processes.Laplacian after highres_cortex.capsul.isovolume fails systematically:

KEEP_TEMPORARY=1 python -m highres_cortex.test.test_capsul SphereTestCase.test_equivolumetric_pipeline SphereTestCase.test_laplacian --verbose

It behaves as if the highres_cortex.capsul.processes.Laplacian process worked correctly the first time, but somehow got broken after being used in the pipeline (which makes use of it internally). Still investsigating...

@ylep

This comment was marked as outdated.

@denisri
Copy link
Collaborator

denisri commented Nov 14, 2023

I did not take time to look at your pipelines yet, but this definitely looks like a bug in Capsul.
Is the parameter which has a problem linked to a sub-process which is instantiated in both pipelines ?
(I wonder if there is a kind of class-level field which would erroneously share a value between instances, or something similar)

@ylep
Copy link
Member Author

ylep commented Nov 14, 2023

Found it! And yes @denisri you guessed it right: the problem is that when I call highres_cortex.capsul.processes.Laplacian after is has been used in the pipeline, its fields retain the generate_temporary attribute!

I am using class-level attributes and not instance attributes, because I thought that it was more in line with the philosophy behind traits / pydantic typing.

https://github.com/neurospin/highres-cortex/blob/b9b76f008c4b98a828a4267861498c790b13492b/python/highres_cortex/capsul/processes.py#L51-L71

Isn't there a way that we could store instance-level attributes? Maybe the typing fields are not the right place to store such attributes...

@denisri
Copy link
Collaborator

denisri commented Nov 14, 2023

OK, then at least we understand.
Actually we use class-level fields as if they were instance fields, just as a more convenient way to declare them, but currently it's wrong. In Capsul v2, class traits were in fine duplicates as instance traits, so we did not run into this problem (class-level traits thus were not possible). In Capsul v3 we have probably not taken care of this, and fields are not duplicated in instances, thus do not behave the same.
So we have to decide which behaviour we want to implement (assuming it's possible to automatically duplicate class fields on instances).
Do we want to actually have class-level fields (more "pythoninc"), or always have instance fields (as in Capsul v2) ?

Isn't there a way that we could store instance-level attributes? Maybe the typing fields are not the right place to store such attributes...

Maybe not, it's just convenient and easy to do it this way. Another way would probably require some additional development.

@ylep ylep changed the title Failures when multiple pipelines/processes are run sequentially in a process Attributes of fields defined at the class level should not be shared between instances Nov 14, 2023
@sapetnioc
Copy link
Collaborator

In v3, metadata are obtained from the Field instances and for class fields they are stored in class and therefore shared with all instances. It would not be easy to change the API to get metadata from controller instance since metadata are used in many places and can be read/modified by direct attribute access and there is no obvious method to find in code.

So, one option would be to copy all class fields into Process instances during construction. I have the feeling that it may use too much resource for big pipelines but I am not absolutely sure.

Another option could be to make class fields read-only (to avoid users to have the same kind of problem one day), to allow explicit copy of a class fields into an instance field to make it read-write and to store generate_temporary values in a dict in executable instead of in each field. I do not know if there are other internal metadata that are used that way but making class field read-only would raise an exception.

@denisri
Copy link
Collaborator

denisri commented Nov 14, 2023

The copy to instance mechanism can be a solution, and could be done on demand, only for fields for which we want to change metadata. So yes why not. We would need an easy copy function, like Controller.make_instance_field(field_name) or something.

sapetnioc added a commit that referenced this issue Nov 15, 2023
@sapetnioc
Copy link
Collaborator

sapetnioc commented Nov 15, 2023

This issue should be fixed by two PR, populse/soma-base#53 and #326. Controller class fields are now read-only and process instantiation copy them to instance field. It was too difficult to make something smarter by copying fields only when they need to be modified.

sapetnioc added a commit that referenced this issue Nov 16, 2023
#325 adaptation for read-only class fields
ylep added a commit to neurospin/highres-cortex that referenced this issue Nov 16, 2023
populse/capsul#325 has been fixed, remove the workaround
@ylep
Copy link
Member Author

ylep commented Nov 16, 2023

I confirm that the original issue is fixed for me, thanks!

@ylep ylep closed this as completed Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants