Category Theory ∩ Machine Learning

Category Theory ∩ Machine Learning (github.com)

118 points by bgavran 3 years ago | 69 comments

I recently wrote a paper on understanding machine learning through the lens of Hopf algebras: https://arxiv.org/abs/2302.01834

Hopf algebras (which are really just tensors with recurrence relations built in) subsume convnets, transformers and diffusion models, and also provide a theoretically better autodiff that operates within single layers as opposed to across entire graphs.

Furthermore, there is a correspondence between Hopf algebras and cyclic linear logic, and Hopf algebras are related to zonotopes, which are polyhedra that have been used in verified numerical computation. I'm strongly convinced the linear logic (LL) connection can provide proofs over zonotopes, which paves the way towards interpretable AI and will be central for XAI.

I know this sounds too good to be true, but Persi Diaconis has also written a paper that shows how useful Hopf algebras are in the context of Markov chains: https://arxiv.org/abs/1206.3620

I'm working on a next-gen, Hopf-algebra-based machine learning framework.

Join my discord if you want to discuss this further https://discord.cofunctional.ai.

====

My account is currently rate limited so I will use this comment to respond to comments below.

red_trumpet: What about Hopf algebras do I not understand?

gaze: Haha, it's been a while since I have commented about QC. What do I not understand about it? And what comment are you referring to?

red_trumpet 3 years ago | |

Your paper didn't pass my smell test at all, tbh. For example, the formulas you write for "product" and "coproduct" in section 3 are literally identical (as "=" is symmetric). In section 4.2 you write "the product is the standard tensor product" with a formula that doesn't involve the map m: A \otimes A \to A at all. The formula you write is the induced product on A \otimes A, assuming that you already have a product on A. The formula for "coproduct" is just an example[1] of a coproduct; not every coproduct has to look that way.

[1] https://en.wikipedia.org/wiki/Coalgebra#Examples
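
For reference, the standard objects behind this criticism can be written out as below. This is textbook notation, assumed here rather than copied from the paper: A is an algebra over a field k, and k[G] is the group algebra of a group G, used only as one familiar example of a coproduct.

```latex
% Product and coproduct are maps in opposite directions:
m : A \otimes A \to A, \qquad \Delta : A \to A \otimes A

% Induced product on A \otimes A, which presupposes the product m on A:
(a \otimes b)\,(c \otimes d) = ac \otimes bd

% One example of a coproduct (on basis elements g of k[G]); others exist:
\Delta(g) = g \otimes g
```

The second line is a product on A \otimes A, not the map m itself, and the third is a particular coproduct rather than the general definition.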

keithalewis 3 years ago | | |

Same here. Sloppily written with very little content. If the author can't take the time to proofread his own paper why should anyone else waste their time?

jurynulifcation 3 years ago | | |

adamnemecek has posted too many comments and is in cooldown phase, but he's asked me to post this comment: "It's the programmer's equal sign. I think that the surrounding text provides a decent explanation of what the deal is.

You are right, there's a missing sentence fragment, "standard tensor product that satisfies the property...".

Read the Diaconis paper."

--- This isn't a sock puppet and I hope this isn't against site rules. I just wanted to try and help facilitate good discussion. I think trumpet brought up some interesting criticisms and felt Adam had a legitimate interest in responding ASAP.

gexaha 3 years ago | |

could you advertise your research a bit less often, please? i see your post like literally almost every other day here

hgsgm 3 years ago | | |

Or at least explain it in a more accessible way. Every time Adam posts about the paper, it gets confused comments and no engagement on the content, because it's pretty deep graduate-level pure math, which is occasionally seen but rare on HN.

nicwilson 3 years ago | |

Interesting papers.

https://arxiv.org/abs/2302.01834 appears to have a typo in section 4.5

S(hg) = S(g)S(g)

looks like it should be S(hg) = S(h)S(g) or S(hg) = S(g)S(h)
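
On which ordering is right: in any Hopf algebra the antipode is an algebra anti-homomorphism, so S(hg) = S(g)S(h), and the other ordering only agrees when the elements commute. A minimal sketch that checks this on the group algebra of the symmetric group S3, where the antipode on basis elements is S(g) = g^{-1}. The helper names are made up for illustration and nothing here is taken from the paper; only basis elements are checked, and the linear extension is left out.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def antipode(g):
    """Antipode of the group algebra on a basis element: S(g) = g^{-1}."""
    return inverse(g)


# All six elements of S3, written as tuples of images of 0, 1, 2.
G = list(permutations(range(3)))

# S(hg) == S(g)S(h) for every pair: the anti-homomorphism property.
anti_ok = all(
    antipode(compose(h, g)) == compose(antipode(g), antipode(h))
    for h in G
    for g in G
)

# S(hg) == S(h)S(g) for every pair: only true if the group is abelian.
homo_ok = all(
    antipode(compose(h, g)) == compose(antipode(h), antipode(g))
    for h in G
    for g in G
)

print(anti_ok, homo_ok)
```

Since S3 is not abelian, the first check passes and the second does not, which is why the order of the factors matters in general.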

adamnemecek 3 years ago | | |

Right, thanks.

ianandrich 3 years ago | |

I just read your Coinductive guide to inductive transformer heads paper.

My mind is blown.

Is the Hopf-algebra-based ML framework you are working on available on your GitHub? I took a glance, but you have 1500 repositories and it wasn't among the first few of them.

adamnemecek 3 years ago | | |

It's in very early stages and it's not there yet, no. Join the discord https://discord.cofunctional.ai or follow my twitter https://twitter.com/adamnemecek1 if you want to follow progress. It might take some time.

umutisik 3 years ago |

It is tempting to believe that category theory will shed new light on and simplify machine learning, just like it did in algebraic geometry, algebraic topology and other mathematical things. This is wishful thinking. Folks who care about doing something useful should stay away from this content.

lgas 3 years ago | |

I'd suggest providing some justification for your declaration if you want anyone to listen to you.

epgui 3 years ago | |

That's a supremely anti-intellectual take.

AlexCoventry 3 years ago |

> Category Theory has been finding increasing applications in machine learning

What's the most compelling application so far?

bawolff 3 years ago | |

Application in the sense they are using it is probably different than the sense you are using it. Although it's still probably a fair question regardless.

AlexCoventry 3 years ago | | |

Any application where Category Theory is making it substantially easier to express the software or reason about it is fair game, from my perspective.

rmdamiao 3 years ago |

Is this simply a consequence of exponential growth in CS publications driven by machine learning or is there something really going on here?

bgavran 3 years ago | |

OP here.

The exponential growth in CS publications is much faster. This repository is simply a testament that CT is slowly ramping up.

It's meant to show what kind of expressive power and breadth current CT models have, which to my knowledge isn't something that's well-known outside of our niche community.

bgavran 3 years ago | | |

It's also meant to suggest where things are going (the kind of chart I have in mind is this one https://twitter.com/bgavran3/status/1422206118688956420 ), though I understand this is something that deserves a much more substantial proof.

moralestapia 3 years ago | | |

>The exponential growth in CS publication is much faster.

So ... yes?

adamnemecek 3 years ago | |

The field needs better foundations. CT is pretty good.

hgsgm 3 years ago | | |

Why? How?

The OP GitHub site doesn't promote any material that introduces the concepts at all. The "survey" paper at the top is nigh-impenetrable. I'm sure the category theorists are having fun modelling machine learning, but it doesn't show how machine learning benefits from the category theory.

KRAKRISMOTT 3 years ago | | |

No. It won't make a significant (if any at all) difference to effectiveness. Rewriting PyTorch in Haskell won't magically get you AGI.

eigenform 3 years ago |

I'm not experienced/well-read in either ML or CT, but a while ago I remember hearing Tai-Danae Bradley equate "knowing a word by the company it keeps" to the Yoneda lemma, and I always thought that was kind of interesting (although I guess I'm not qualified enough to know whether that statement is useful or vacuous)
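
For reference, the statement being alluded to, in standard textbook notation (the symbols C, F, A, B are the usual ones and are assumed here, not taken from the thread; C is a locally small category):

```latex
% Yoneda lemma: natural transformations out of a representable functor
\mathrm{Nat}\big(\mathrm{Hom}_{\mathcal{C}}(-, A),\, F\big) \;\cong\; F(A)
\qquad \text{for every functor } F : \mathcal{C}^{\mathrm{op}} \to \mathbf{Set}

% Corollary: an object is determined by the morphisms into it
\mathrm{Hom}_{\mathcal{C}}(-, A) \cong \mathrm{Hom}_{\mathcal{C}}(-, B)
\;\Longrightarrow\; A \cong B
```

The corollary is the "company it keeps" reading: an object is determined, up to isomorphism, by how every other object maps into it.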

haskellandchill 3 years ago | |

> knowing a word by the company it keeps

I'm still shocked no one has developed language learning software along these lines. I had a prototype in the works for Thai years ago but never got time to get it off the ground. using statistical models trained on a web corpus for a language learning app seems like a no-brainer.

think of it like navigating a word as a point in a graph connected to every example context it is in, with associated words being clickable into similar context bundles. then make it differential between host and target language given a translation so you can see which contexts the translation fails and succeeds in.
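
A minimal sketch of the "word as a point connected to its contexts" idea, assuming whitespace tokenization, a one-word window on each side, and Jaccard overlap as the similarity measure. The function names and the toy corpus are hypothetical, and the host/target-language comparison is not covered.

```python
from collections import defaultdict


def build_contexts(sentences, window=1):
    """Map each word to the set of (left, right) neighbor tuples it appears in."""
    contexts = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            left = tuple(tokens[max(0, i - window):i])
            right = tuple(tokens[i + 1:i + 1 + window])
            contexts[word].add((left, right))
    return contexts


def similar_words(contexts, word):
    """Rank other words by Jaccard overlap of their context sets with `word`."""
    target = contexts[word]
    scores = {}
    for other, ctx in contexts.items():
        if other == word:
            continue
        union = target | ctx
        overlap = len(target & ctx) / len(union) if union else 0.0
        if overlap > 0:
            scores[other] = overlap
    # Highest overlap first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy corpus: "cat" and "dog" occur in exactly the same contexts.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases cars",
]
contexts = build_contexts(corpus)
ranking = similar_words(contexts, "cat")
print(ranking)
```

In an app, each entry of `contexts[word]` would be a clickable example, and `similar_words` would supply the neighboring words to navigate to.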

donnowhy 3 years ago |

category theory is 'natively 2-dimensional' math, i.e. category theory explains everything in terms of graphs, where a graph is made from two different sorts of 'entities', nodes and edges, i.e. objects and morphisms

this being math, I wonder to what extent category theory can be re-expressed in terms of sets.

perhaps a better question is if category theory can be re-expressed (or founded on) functions?

lastly, if category theory can be expressed in terms of functions (i think maybe it can, without sets?), why shouldn't it be expressible in terms of sets? for some reason I don't think sets alone are sufficient; you may have to define functions first (which is possible in terms of sets) before 'expressing' categories starting from set theory.
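
On the sets question: a small category can be written down using only sets and functions between them, namely a set of objects, a set of morphisms with source and target, identities, and a composition table. A minimal sketch with two objects and one non-identity arrow; all names are made up for illustration. Large categories, such as the category of all sets, do not fit this encoding without classes or universes.

```python
from itertools import product

objects = {"A", "B"}
# Each morphism name maps to its (source, target) pair.
morphisms = {"id_A": ("A", "A"), "id_B": ("B", "B"), "f": ("A", "B")}
identity = {"A": "id_A", "B": "id_B"}
# composition[(g, f)] is the composite g∘f (apply f first, then g).
composition = {
    ("id_A", "id_A"): "id_A",
    ("id_B", "id_B"): "id_B",
    ("f", "id_A"): "f",
    ("id_B", "f"): "f",
}


def composable(g, f):
    """g∘f is defined exactly when target(f) equals source(g)."""
    return morphisms[f][1] == morphisms[g][0]


def check_category():
    """Verify typing of composites, identity laws, and associativity."""
    for g, f in product(morphisms, repeat=2):
        if composable(g, f):
            if (g, f) not in composition:
                return False
            gf = composition[(g, f)]
            if morphisms[gf] != (morphisms[f][0], morphisms[g][1]):
                return False
        elif (g, f) in composition:
            return False
    for f, (src, tgt) in morphisms.items():
        if composition[(f, identity[src])] != f:
            return False
        if composition[(identity[tgt], f)] != f:
            return False
    for h, g, f in product(morphisms, repeat=3):
        if composable(g, f) and composable(h, g):
            left = composition[(h, composition[(g, f)])]
            right = composition[(composition[(h, g)], f)]
            if left != right:
                return False
    return True


is_category = check_category()
print(is_category)
```

Everything here is a finite set or a dictionary, which is itself a set of pairs, so for small categories the axioms reduce to ordinary set-theoretic conditions.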