Matroid lifting (Graph to Combinatorial) #32
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #32 +/- ##
=======================================
Coverage ? 74.05%
=======================================
Files ? 19
Lines ? 790
Branches ? 0
=======================================
Hits ? 585
Misses ? 205
Partials ? 0
View full report in Codecov by Sentry.
ready for review
Hello @gescalona1 ! Thank you for your submission. As we near the end of the challenge, I am collecting participant info for the purpose of selecting and announcing winners. Please email me (or have one member of your team email me) at [email protected] so I can share access to the voting form. In your email, please include:
Before July 12, make sure that your submission respects all Submission Requirements laid out on the challenge page. Any submission that fails to meet these criteria will be automatically disqualified.
Update 7/18: Added highlights (after overview)
In this PR, I talk about a way to convert graphs to matroids, which are (almost) combinatorial complexes.
See the accompanying notebook for more details.
Overview
For a given graph $G = (V, E)$
Important note: Matroids are general structures, and we can always perform a trivial lifting by transforming a matroid to a CC. For example, we can use the set of spanning trees as the k-cells of a complex, where $k = |V| - 1 - 1$ (the matroid rank $|V| - 1$ of a spanning tree, shifted down by one). However, we can go further and use graph curve matroids, where graph curves are connected, projective algebraic curves which correspond to the connectivity of the dual graphs of the original graph.
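A minimal sketch of the trivial lifting mentioned above, assuming a brute-force enumeration of spanning trees on a small toy graph. The names `is_forest`, `spanning_trees`, and `top_cells` are illustrative and are not taken from this PR, and the edge set is assumed to play the role of the ground set.

```python
from itertools import combinations


def is_forest(num_vertices, edges):
    """Return True if the edge set is acyclic (union-find cycle check)."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True


def spanning_trees(num_vertices, edges):
    """Brute force: an acyclic edge subset of size |V| - 1 is a spanning tree."""
    return [
        frozenset(subset)
        for subset in combinations(edges, num_vertices - 1)
        if is_forest(num_vertices, subset)
    ]


# Toy graph (an assumption for illustration): a 4-cycle with one chord.
V = 4
E = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]

trees = spanning_trees(V, E)
cell_rank = V - 1 - 1  # matroid rank |V| - 1, shifted down by one
top_cells = {tree: cell_rank for tree in trees}
```

The enumeration is exponential in the number of edges, so it only serves to make the construction concrete on small graphs.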
Note: the image is missing one blue edge, but the result is the same.
To accommodate the tests, smaller graphs were used. Here is the commit with the notebook that has slightly bigger graphs.
New Datasets
The datasets used were taken from houseofgraphs, which offers different datasets based on connectivity, graph genus, etc. For simplicity, they are manually coded in; we could consider using the site as a whole as a source of datasets.
Mathematical Foundation
This is covered in more detail below, but the liftings are inspired by the abstract notion of linear independence formulated by Whitney, called matroids. In particular, we also use graph curve matroids, formulated by Geiger et al. [2]. They represent the hyperplane sections of the graph curve obtained from the dual graph of a graph.
Original Approaches
To my current knowledge, applying matroids as combinatorial complexes has not been done yet. I also made very minor modifications to the definition of matroids to make it compatible with the definition of combinatorial complexes.
Non-deterministic Procedures
While not implemented, there does exist a non-deterministic procedure by way of sampling via Markov chains. See [4] for more details.
What is a Matroid?
A matroid is a very familiar geometric structure that encompasses several combinatorial structures, and I found the opportunity to apply them here! There is a wide variety of definitions, but I will focus on 2 of them.
We call a set system $(S, \mathcal{I})$ a matroid $\mathcal{M} = (S, \mathcal{I})$, where we understand $S$ to be some ground set, and if $I \in \mathcal{I} \subseteq 2^S$, then $I$ is considered independent. Moreover, $\mathcal{I}$ satisfies these 3 properties, for $A, B \subseteq S$:
1. $\emptyset \in \mathcal{I}$.
2. If $A \subseteq B$ and $B \in \mathcal{I}$, then $A \in \mathcal{I}$.
3. If $A, B \in \mathcal{I}$ and $|A| < |B|$, then there exists $b \in B \setminus A$ such that $A + b \in \mathcal{I}$.
Note: I will use A + b to mean the union of the set A and the singleton set B' = {b}.
In particular, the first and second conditions tell us that a matroid is an (abstract) simplicial complex. The 3rd condition is what makes matroids special, and to explain it I will introduce the 2nd, equivalent definition of a matroid.
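As a sanity check, the three independence conditions (the empty set is independent, subsets of independent sets are independent, and a smaller independent set can be extended by an element of a larger one) can be verified by brute force. The sketch below assumes the graphic matroid of a small toy graph, where a set of edges is independent when it is acyclic; the helper names are illustrative and are not part of this PR.

```python
from itertools import combinations


def is_forest(num_vertices, edges):
    """Return True if the edge set is acyclic (union-find cycle check)."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


def powerset(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# Toy graph (an assumption for illustration): a 4-cycle with one chord.
V = 4
E = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
independent = {A for A in powerset(E) if is_forest(V, A)}

# Condition 1: the empty set is independent.
has_empty = frozenset() in independent

# Condition 2: every subset of an independent set is independent.
downward_closed = all(
    B in independent for A in independent for B in powerset(A)
)

# Condition 3 (exchange): a smaller independent set can always be
# extended by some element of a larger independent set.
exchange = all(
    any(A | {b} in independent for b in B - A)
    for A in independent
    for B in independent
    if len(A) < len(B)
)
```

Conditions 1 and 2 alone describe a simplicial complex on the edges; condition 3 is the part that fails for a general complex.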
Let $\mathit{rank}: 2^S \to \mathbb{Z}_{\geq 0}$ be what we will call the matroid rank function. It is one if it follows these properties, for $A \subseteq S, a \in S$:
The above is equivalent, because we can say that $A \in \mathcal{I}$ if and only if $\mathit{rank}(A) = |A|$.
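To illustrate how a rank function recovers the independent sets, here is a small sketch that again assumes a graphic matroid on a toy graph. There the rank of an edge set is the number of vertices minus the number of connected components it induces, and a set is independent exactly when its rank equals its size. The function names are illustrative only.

```python
from itertools import combinations


def graphic_rank(num_vertices, edges):
    """rank(A) = |V| - (number of connected components of (V, A))."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    rank = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            rank += 1  # each successful merge removes one component
    return rank


def is_independent(num_vertices, edges):
    """A set is independent exactly when rank(A) = |A|."""
    return graphic_rank(num_vertices, edges) == len(edges)


# Toy graph (an assumption for illustration): a 4-cycle with one chord.
V = 4
E = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]

triangle = [(0, 1), (1, 2), (0, 2)]  # contains a cycle
path = [(0, 1), (1, 2), (2, 3)]      # a spanning tree

independent_count = sum(
    is_independent(V, subset)
    for size in range(len(E) + 1)
    for subset in combinations(E, size)
)
```

Here `triangle` has rank 2 but size 3, so it is dependent, while `path` has rank 3 and size 3, so it is independent.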
From this we can see that a matroid is a combinatorial complex, albeit with some minor modifications: a combinatorial complex does not allow the empty set as a cell, and the matroid rank is off by 1 from the rank of a normal combinatorial complex.
Matroids are combinatorial complexes
So, why would we ever want to use such an object instead of the more general combinatorial complex? In my opinion, the most important property with regard to machine learning is the 3rd one, which is better known as the submodular property for functions. Alternative definitions include:
Submodular (monotone) functions, like matroid rank functions, give characteristics of information diversity on sets, where higher diversity is weighted more [1]. Many problems in machine learning can be described in terms of submodular minimization/maximization.
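Submodularity can be checked exhaustively on a small example. The sketch below assumes the rank function of a graphic matroid on a toy graph and tests two standard equivalent forms: $\mathit{rank}(A \cup B) + \mathit{rank}(A \cap B) \leq \mathit{rank}(A) + \mathit{rank}(B)$, and the diminishing-returns form, where adding an element to a smaller set helps at least as much as adding it to a larger one. The helper names are illustrative.

```python
from itertools import combinations


def graphic_rank(num_vertices, edges):
    """rank(A) = |V| - (number of connected components of (V, A))."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    rank = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            rank += 1
    return rank


def powerset(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# Toy graph (an assumption for illustration): a 4-cycle with one chord.
V = 4
E = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
subsets = list(powerset(E))
rank = {A: graphic_rank(V, A) for A in subsets}

# Lattice form of submodularity.
submodular = all(
    rank[A | B] + rank[A & B] <= rank[A] + rank[B]
    for A in subsets
    for B in subsets
)

# Diminishing returns: the marginal gain of e shrinks as the set grows.
diminishing_returns = all(
    rank[A | {e}] - rank[A] >= rank[B | {e}] - rank[B]
    for A in subsets
    for B in subsets
    if A <= B
    for e in E
    if e not in B
)

# Monotonicity: larger sets never have smaller rank.
monotone = all(
    rank[A] <= rank[B] for A in subsets for B in subsets if A <= B
)
```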
Perspectives
We discuss other reasons to use matroids (from other perspectives):
-- In particular, I would like to point out that we can form a partition matroid when lifting from hypergraphs to CCs, where its bases can be the induced graphs from a hypergraph.
Other
In this PR, the method described to create a matroid from a given $G$ is the graph curve matroid construction [2]. The tutorial notebook gives a description of this matroid. I would like to note that the implementation is very slow because it relies on computing the spanning trees of a graph. In practice, however, there are recent results showing that we can sample them efficiently (via MCMC) [4].
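The MCMC idea can be sketched with a basis-exchange ("down-up") random walk on spanning trees: drop a uniformly random edge from the current tree, then add back a uniformly random edge that restores a spanning tree. This is not implemented in this PR; the function names, the toy graph, and the number of steps below are assumptions chosen for illustration, and the step count is arbitrary rather than a mixing-time bound from [4].

```python
import random


def is_forest(num_vertices, edges):
    """Return True if the edge set is acyclic (union-find cycle check)."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


def down_up_step(num_vertices, all_edges, tree, rng):
    """One walk step: remove a random edge, then re-complete to a tree."""
    remaining = set(tree)
    remaining.remove(rng.choice(sorted(remaining)))  # "down" move
    # Every edge that turns the remaining forest back into a spanning tree.
    # The removed edge is always a candidate, so the list is never empty.
    candidates = [
        e
        for e in all_edges
        if e not in remaining and is_forest(num_vertices, remaining | {e})
    ]
    return frozenset(remaining | {rng.choice(candidates)})  # "up" move


# Toy graph (an assumption for illustration): a 4-cycle with one chord.
V = 4
E = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]

rng = random.Random(0)
tree = frozenset([(0, 1), (1, 2), (2, 3)])  # any starting spanning tree
for _ in range(200):  # arbitrary number of steps
    tree = down_up_step(V, E, tree, rng)
```

Each step only needs local cycle checks, so no full enumeration of spanning trees is required.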
The differences between matroids and CCs are that matroids include the empty set and that independent singletons have rank 1 instead of 0. To compensate, we can simply remove the empty set from the matroid and subtract 1 from every rank.
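A minimal sketch of this adjustment, assuming the matroid is given as a collection of independent sets (so the matroid rank of each set is its size). The function name `matroid_to_cc` and the example matroid are illustrative and not taken from this PR.

```python
from itertools import combinations


def matroid_to_cc(independent_sets):
    """Drop the empty set and shift each rank down by one.

    For an independent set A the matroid rank is |A|, so the
    combinatorial-complex rank becomes |A| - 1.
    """
    return {A: len(A) - 1 for A in independent_sets if A}


# Example (an assumption for illustration): the uniform matroid U_{2,3},
# whose independent sets are all subsets of {a, b, c} with at most 2 elements.
ground = ["a", "b", "c"]
independent = [
    frozenset(combo)
    for size in range(3)
    for combo in combinations(ground, size)
]

cells = matroid_to_cc(independent)

# The shifted rank still respects inclusion, as a CC rank function must.
order_preserving = all(
    cells[A] <= cells[B] for A in cells for B in cells if A <= B
)
```

After the shift, singletons sit at rank 0 and pairs at rank 1, matching the usual convention for vertices and edges in a combinatorial complex.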
I also use the HMC model; however, it only uses the part of the complex up to rank 2, which corresponds to specific triangles that are detailed in the tutorial notebook. In practice, matroid rank functions are much more diverse than that.
[1] Bilmes, J. (2022). Submodularity in machine learning and artificial intelligence. arXiv. https://arxiv.org/abs/2202.00132
[2] Geiger, Alheydis, Kevin Kuehn, and Raluca Vlad. "Graph Curve Matroids." arXiv preprint arXiv:2311.08332 (2023). https://arxiv.org/abs/2311.08332.
[3] Sun, T., & Nelson, B. (2023). Greedy Matroid Algorithm And Computational Persistent Homology. arXiv preprint arXiv:2308.01796. Retrieved from https://arxiv.org/abs/2308.01796
[4] Anari, N., Liu, K., Oveis Gharan, S., & Vinzant, C. (2019). Log-Concave Polynomials II: High-Dimensional Walks and an FPRAS for Counting Bases of a Matroid. arXiv preprint arXiv:1811.01816. Retrieved from https://arxiv.org/abs/1811.01816