extend code to feature groups #5

Open
shaayaansayed opened this issue Mar 21, 2022 · 1 comment
Comments

@shaayaansayed

Correct me if I'm wrong, but I don't believe the current code is set up to calculate values for feature groups.

Can you confirm I'm understanding this correctly? To extend the code for groups, we would want to select subsets over feature groups rather than individual features. Then when measuring predictiveness, we include all features that are part of the selected feature groups. So for example, if we have groups:

vitals = [blood_pressure, heart_rate]
labs = [sodium, potassium, sugar]
diagnoses = [kidney, heart, liver]

If S = [0, 1] (that is, the vitals and labs groups are selected), then we train a model with blood_pressure, heart_rate, sodium, potassium, and sugar.
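To make the mapping from selected groups to model inputs concrete, here is a minimal sketch. The column order in `feature_names` and the helper `expand_groups` are assumptions made for illustration; neither is part of the package.

```python
# Hypothetical column layout; the ordering of features is an assumption.
feature_names = [
    "blood_pressure", "heart_rate",   # vitals
    "sodium", "potassium", "sugar",   # labs
    "kidney", "heart", "liver",       # diagnoses
]

# Groups in the order used in the example: 0 = vitals, 1 = labs, 2 = diagnoses.
groups = [
    ["blood_pressure", "heart_rate"],
    ["sodium", "potassium", "sugar"],
    ["kidney", "heart", "liver"],
]


def expand_groups(selected, groups, feature_names):
    """Map a list of selected group indices to the column indices they cover."""
    column_of = {name: i for i, name in enumerate(feature_names)}
    return [column_of[name] for g in selected for name in groups[g]]


# S = [0, 1] selects vitals and labs, i.e. the first five columns.
columns = expand_groups([0, 1], groups, feature_names)
print(columns)
```

The resulting column indices are what would be used to subset the design matrix before fitting the model for that group subset.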

Would we need to normalize anything?

@bdwilliamson
Owner

Which function are you using?

For both vim and cv_vim, you should be able to input a vector of indices to the argument s. For your example, if your predictors are [blood pressure, heart rate, sodium, potassium, sugar], you could input s = [0,1] to consider the importance of vitals as a group.

Groups aren't currently set up in spvim. To extend to groups, we would (a) create a partition of the feature space into the groups (in your example, vitals, labs, and diagnoses), (b) measure predictiveness using each combination of the feature groups [in your example, 2^3 = 8 combinations: all variables, no variables, vitals alone, labs alone, diagnoses alone, vitals + labs, vitals + diagnoses, labs + diagnoses], and (c) combine these together using the Shapley formula. The normalization constant would differ from the one used for individual-variable Shapley values, since it depends on the number of groups rather than the number of features.
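Steps (a) through (c) can be sketched with exact enumeration over group subsets. This is an illustration only, not the estimation strategy implemented in spvim; the function `group_shapley` and the toy `predictiveness` function are hypothetical, and in practice `predictiveness(S)` would be an estimated measure (for example, R-squared) of a model fit on the features in the groups in `S`. The weights use the standard Shapley formula with the number of groups K in place of the number of features.

```python
from itertools import combinations
from math import factorial


def group_shapley(predictiveness, n_groups):
    """Exact Shapley values over groups.

    predictiveness: callable taking a frozenset of group indices and
        returning the predictiveness of the model using those groups.
    n_groups: number of groups K in the partition.
    """
    values = []
    for j in range(n_groups):
        others = [g for g in range(n_groups) if g != j]
        total = 0.0
        for size in range(n_groups):
            # Shapley weight |S|! (K - |S| - 1)! / K! for subsets of this size.
            weight = (factorial(size) * factorial(n_groups - size - 1)
                      / factorial(n_groups))
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (predictiveness(s | {j}) - predictiveness(s))
        values.append(total)
    return values


# Toy additive predictiveness (made-up numbers, for illustration only):
# each group contributes a fixed amount, so its Shapley value equals it.
contribution = {0: 0.5, 1: 0.25, 2: 0.125}


def predictiveness(selected):
    return sum(contribution[g] for g in selected)


shapley = group_shapley(predictiveness, n_groups=3)
print(shapley)
```

With K groups this requires 2^K evaluations of the predictiveness measure, which is manageable for a handful of groups but grows quickly. The values sum to the predictiveness of all groups minus that of no groups.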

I don't have time for this at the moment (and I think @jjfeng probably doesn't either -- though she may have thought about it a bit), so if you want to create a PR that would be fantastic!


2 participants