An idiot’s guide to lead optimisation for proteins (magnusross.github.io)
I used to work for Cradle and writing this paper was the last thing I did before leaving – on good terms – to found my own startup. :D And we'll 100% be using Cradle for our lead optimization.
(On the off-chance: I'm at PEGS Boston this week chatting all things AI+antibodies, in particular for rare diseases. If this topic is of interest to any other protein+tech geeks here then send me an email, let's grab coffee.)
[1] e.g. https://proteindf.github.io/
Do you think it's also the case for lead optimization where you typically have some degree of measurements around your starting point, and you are expecting to stay in that local neighborhood for the generated candidates, too?
(Disclaimer: former Cradle employee here)
Yeah it's totally true you can't build a one-size-fits-all foundation model, the data just isn't there. But also... no-one needs that. It's totally fine to tweak a foundation model for any individual problem, and that's the bulk of what is being described in the linked blog post / in the underlying paper.
FWIW whilst at Cradle we had a lot of doubts going into this. Like, thermostability is clearly evolutionarily correlated so it was always pretty likely that by hook or by crook the models could do that correctly. But, binding? Aggregation? Not at all clear that the same principles should hold. And the exciting finding was that yes, yes they do.
I speculate Cradle is taking the approach they are, rather than a structural/spatial one, because structural/spatial models don't work very well on big molecules like proteins! (And/or they are too slow; errors accumulate over space, etc.)
20 different types coded for, but once you get into PTMs that number goes way up.
I can think of:
etanercept
The largest commercial classes of multi-domain therapeutic proteins include the CRISPR (and similar) proteins that drive gene therapies, and the chimeric antigen receptors (and similar) that drive cell therapies.
But lead optimization there looks different from this page’s efforts.
I guess I imagine one of the highest-order obstacles to protein therapeutics to be immunogenicity, which is really hard to design around for a de novo protein.