Skip to content

Virtual + External options for circuit extraction #155

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
james-isbister opened this issue May 13, 2025 · 17 comments
Open

Virtual + External options for circuit extraction #155

james-isbister opened this issue May 13, 2025 · 17 comments

Comments

@james-isbister
Copy link
Collaborator

No description provided.

@MWolfR
Copy link
Contributor

MWolfR commented May 13, 2025

In the following branch of obi-one: https://github.com/openbraininstitute/obi-one/tree/extrinsic_as_option
I exposed the option to ask the brain build function doing the heavy lifting to also create extrinsic edges from the parent circuit.
However, there are now two issues:

  1. When the "create_external" option mentioned above is set to True, there are multiple errors in brainbuilder -> extract sub-circuit. They appear to be minor and relatively simply fixed.
  2. I suspect the name of the source population is not properly updated in the edges file representing the new extrinsic input.

ping @chr-pok

@MWolfR
Copy link
Contributor

MWolfR commented May 13, 2025

IMPORTANT: I just noticed the following:
While the virtual node populations from the parent circuit are successfully propagated to the extracted circuit, there is one issue: It looks like the same extraction logic is applied to the virtual populations.

That means, if I extract "Layer4Inhibitory" from an O1 circuit, then also the extrinsic inputs from the rest of S1 are filtered so that only Layer 4 inhibitory source remain. This is obviously not applied to the inputs from VPM / POM.

In any case, that logic does not make sense and should be removed. I do not foresee any use case for it!

@MWolfR
Copy link
Contributor

MWolfR commented May 13, 2025

I just confirmed that the name of the source node population is NOT updated in the edges h5 file! This already affects the original extrinsic inputs from S1 that are propagated to the extracted circuit via "do_virtual=True".

In detail, in the edges file of the extrinsic connectome "external_S1nonbarrel_neurons__S1nonbarrel_neurons__chemical" the following holds:

Inside the edges h5 file there is a dataset under "edges/external_S1nonbarrel_neurons__S1nonbarrel_neurons__chemical/source_node_id". It specifies the source of an extrinsic connection. It has an attribute called "node_population" that identifies the node population of the sources. It must be updated to be the name of the correct node population, i.e., 'external_S1nonbarrel_neurons__S1nonbarrel_neurons__chemical'.

Same is true for the newly created extrinsic connectivity representing the connections from O1 to the extracted circuit.

@chr-pok
Copy link
Collaborator

chr-pok commented May 13, 2025

That means, if I extract "Layer4Inhibitory" from an O1 circuit, then also the extrinsic inputs from the rest of S1 are filtered so that only Layer 4 inhibitory source remain. This is obviously not applied to the inputs from VPM / POM.

In any case, that logic does not make sense and should be removed. I do not foresee any use case for it!

Hmm, this sounds like a potential issue I meant before which could be tricky to handle with node sets in multi-population circuits. Assuming that "Layer4Inhibitory" is just defined as {"layer": 4, "synapse_class": "INH"} it would be resolved in all available populations, except the (virtual) ones which don't have these properties. So my guess is that because of that the filter may be wrongly applied to the extrinsic population(s) as well. Yet to be further investigated...

@chr-pok
Copy link
Collaborator

chr-pok commented May 13, 2025

Inside the edges h5 file there is a dataset under "edges/external_S1nonbarrel_neurons__S1nonbarrel_neurons__chemical/source_node_id". It specifies the source of an extrinsic connection. It has an attribute called "node_population" that identifies the node population of the sources. It must be updated to be the name of the correct node population, i.e., 'external_S1nonbarrel_neurons__S1nonbarrel_neurons__chemical'.

If I am not mistaken, the correct name of the source node population should be 'external_S1nonbarrel_neurons'.

@MWolfR
Copy link
Contributor

MWolfR commented May 14, 2025

  1. Add the path to the “id_mapping.json” file at a useful place to the circuit_config.json.
  2. Inside the id_mapping.json, add for each population an entry “old_population_name” that is the original node population name the parent circuit. E.g., the entry for “external_S1nonbarrel_neurons” would be “S1nonbarrel_neurons”.
  3. Update the source population name inside the edges.h5 file
  4. Ensure that all edge and node populations are listed in the circuit_config.json
  5. Ensure that no filtering is applied to extrinsic populations. Note that this is maybe not that important. This only affects cases where you extract a circuit from a circuit that is already extracted. Such double-extracted extrinsics may not be that useful anyways.

@james-isbister
Copy link
Collaborator Author

This is one for another day, but I quite strongly agree with what Michael said the other day that in the medium/long-term we ideally wouldn't do any remapping.

I think its very straightforward that a neuron maintains its id from the circuit it was originally placed in

I appreciate that there may be lots of analysis pipelines that expect continuous indices though!

@chr-pok
Copy link
Collaborator

chr-pok commented May 14, 2025

  1. Add the path to the “id_mapping.json” file at a useful place to the circuit_config.json.

Just an idea, we could also add the original IDs directly as a new column to the nodes file.

@chr-pok
Copy link
Collaborator

chr-pok commented May 14, 2025

  1. Ensure that no filtering is applied to extrinsic populations.

We could even remove any existing extrinsic populations before running the circuit extraction. This would require to change the circuit config in-place. In split_population.py in my custom branch split_popul_fix of brainbuilder I have already created support for passing a Circuit object instead of a path which should allow us to apply in-place changes to an existing circuit:

def split_subcircuit(output, node_set_name, circuit, do_virtual, create_external)
    """Split a single subcircuit out of circuit, based on nodeset
    Args:
        ...
        circuit(bluepysnap.Circuit|str): Sonata circuit object or path to circuit_config sonata file
        ...
    """

@MWolfR
Copy link
Contributor

MWolfR commented May 14, 2025

I have in the branch mentioned above implemented the points 1-4 listed above. With respect to point 5 I added the option to specify a prefix such that all node and edge populations with that prefix are deleted from the circuit before it goes through extraction. Not elegant, but it tackles the problem.

However, all of the things would be much better to do on the level of brain builder, instead of the circuit extraction itself. To make it work, I had to hack a lot of things quite ugly.

Additionally, here is another point:
6. The newly created extrinsic node population loses all its properties. Info on locations, mtypes, etc. are not extracted. That should be fixed.

@MWolfR
Copy link
Contributor

MWolfR commented May 16, 2025

After the meeting yesterday and some more discussions and thinking, the following seems to be a working goal:
- When a circuit is an extracted circuit, it references in its circuit_config.json a json file containing an id mapping

- The id mapping file is a json file with the following structure. 
    - A dictionary, keys are the names of all node populations; values are dicts with keys:
        - “old_id” a  list of ids in the originally built circuit
        - “new_id” a  list of ids of the same nodes in the same order in the extracted circuit
        - “old_population” the name of their node population in the originally built circuit.
    - Note that the originally built circuit is not necessarily the parent circuit that is extracted from, but it can also be the grandparent, etc.
    

- When a circuit is extracted from an extracted circuit: Let A be the original circuit, B the circuit extracted from A and C the newly from B extracted circuit. B will contain a node population representing the extrinsic inputs from A\B. C will require a new node population representing the extrinsic inputs from B\C. These two extrinsic populations must be merged into a single population representing extrinsic from A\C. This can be done as follows:
	- When extracting C, the extractor iterates over all node populations. When it encounters a node population that is 
	    - (1) virtual 
	    - (2) in the id_mapping file has an “old_population” name that is equal to the non-virtual node population, then it knows that it represents A\B and must be merged.  

- When spike replay is used it works as follows:
	- The code of a spike replay block has access to the following inputs, either explicitly or through entiticore:
		- Spikes 
		- circuit that the spikes were generated from (circ_o)
		- name of the source node population that the spikes are to be injected for
		- circuit that the spikes are to be injected into (circ_i)
	- If circ_i == circ_o, nothing needs to be done
	- else, spike node ids are re-mapped as follows:
		- First, using the id_mapping of circ_o from “new_id” to “old_id”
		- Then using the id_mapping of circ_i from “old_id” to “new_id”.

@james-isbister
Copy link
Collaborator Author

Sounds good!

I would only suggest using "orig_id" (and "orig_population") rather than "old_id", as 1) it sounds nicer and 2) its less ambiguous I think

@MWolfR
Copy link
Contributor

MWolfR commented May 16, 2025

Created three issues in https://github.com/openbraininstitute/brainbuilder/issues to track the requirements. I have not seen @james-isbister comment when creating them, so I used the "old_id" term instead. We can change that when it's picked up.

@james-isbister
Copy link
Collaborator Author

Also am I right in thinking that the mapping table for say the external biophysical synapses can be of length:

len(union(afferent neurons onto the subcircuit in the original circuit, subcircuit neurons)) ?

@MWolfR
Copy link
Contributor

MWolfR commented May 19, 2025

@james-isbister : What?

@james-isbister
Copy link
Collaborator Author

The mapping file doesn't need to be of length S1 nodes. But just of the length:

(number of nodes in subcircuit) + (number of nodes which project from S1 to the subcircuit) - (number in the intersection of these two)

@evakennyobi
Copy link

it is planned for these sub-issues to be assigned & commence this week?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants