AI in physics: are we facing a scientific revolution? (4alltech.com)
But then again, pretty much any article that uses AI instead of ML is hogwash too. Are we crediting someone with this one?
[0] https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...
This shows the effect of endlessly dissecting data in search of a result, without a theory, pretty well: https://xkcd.com/882/
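To put a rough number on that comic (a back-of-the-envelope sketch, assuming 20 independent tests at the usual 0.05 significance level, which is my reading of the setup and not something stated in the article):

```python
# Assumption: 20 independent tests, each with a 5% false-positive rate.
alpha = 0.05
n_tests = 20

# Chance that at least one test looks "significant" when no real effect exists.
p_any_false_positive = 1 - (1 - alpha) ** n_tests
print(f"{p_any_false_positive:.3f}")
```

So under those assumptions you would expect a spurious "discovery" in roughly two out of three such searches.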
But neural networks are very good parametric function approximators, generally better than what traditionally gets used in physics (b-splines or whatever). So people have started to design neural networks that are well-suited as function approximators for specific physical systems.
It's fairly straightforward -- it's not an "AI" that has "knowledge" of "physics" -- just using modern techniques and hardware to solve a numerical minimization problem. I think this will probably become pretty widespread. It won't be flashy or exciting though -- it will be boring to anyone but specialists, as the rest of machine learning ought to be.
Let's say I supply a high-dimensional DAE, f(x', x, z) = 0, x(0) = x₀, where classical methods like quadrature are unwieldy. Does the algorithm generate n samples in the solution space by integrating n times and then fitting an NN? With different initial conditions? Or does it perform quadrature with NNs instead of polynomial basis functions?
I was thinking specifically of this and related approaches https://arxiv.org/abs/1909.08423 where they search for the ground state by iteratively using an MCMC sampler and doing SGD. The innovation is a network architecture that takes classic approaches from physics and judiciously replaces parts with flexible NNs.
I had not even considered how things might work if you actually want to think about time.
Do you know if anybody has been running this NN+DiffEq solver stuff on big HPC systems that also have GPUs? If you know of any papers where they tried this, would be interesting to look at.
Is there a paper comparing the performance of this particular solver against the state of the art?
(if you are using GPUs, the AmgX library has a finite-difference solver for Poisson in their examples - very far from the state of the art, but a comparison might put performance in perspective)
Not true. In computational fluid dynamics, variational methods are only one category out of many, and they aren't dominant.
f can be any arbitrary choice that works.
Not sure if the choice of f being a NN is necessarily related to AI, where some cognitive function is being replicated. It is a good function approximator though.
Basically, I think if somebody wants to work in machine learning then they should be encouraged, and I think it's great that barriers to entry are lower than most fields, but the average person should not feel like they need to care about it, and if they do it might be because they have an inaccurate narrative.
https://cacm.acm.org/blogs/blog-cacm/138907-john-mccarthy/fu...
Would an AI be able to learn how to apply multigrid methods?
I would like to see more truly innovative work done on the theoretical side, but I don't think we'll see "AI" bridge the gap between QFT and GR any time soon. I think in order for something like that to happen we need a new approach, as the current approach of throwing deep learning models at it doesn't feel like the right answer.
On a more general note, the SciML organization [1] has been quite successful in helping incorporate more ML into science.
https://www.lesswrong.com/posts/L5JSMZQvkBAx9MD5A/is-gpt-3-c...
And this is without training on the specific task. It's getting scary...
20 years ago I wrote my PhD thesis in physics, using genetic algorithms and neural networks to "guess" some basic physical behaviour in particle physics.
It was difficult to find good reporters because the application was quite exotic but I felt that this is something which would be worth investigating. I quit academia afterwards and did not come back - but I am happy to see that this road is back on the radar.
This then extended to greenfield operations, and finally I moved to information security.
https://computing.fnal.gov/machine-learning/
One of the "AI" applications I remember seeing -- potentially applicable outside physics -- involved using CNNs to read a 2D graph (as in graphical plot, not G = (V,E)) in order to visually detect certain patterns/aberration. (probably many physics groups around the world are doing the same)
At first glance this sounds kind of silly and trivial -- one might say, why not just detect those patterns from the data arrays directly? Instead of from a bitmap image of a plot of the data?
Unfortunately some patterns are contextual. A trained human eye can detect them easily, while writing a foolproof mathematical algorithm is difficult: e.g. it has to pick out the pattern, apply a bunch of exclusion rules etc.
(One instance of this, for example, is an old mechanic telling you what's going on under the hood just from listening to the vibrations of a car, while a traditional DSP algorithm might not be able to do it as reliably because it hasn't seen all the patterns and contexts in which those sounds arise.)
This is a domain where neural networks/transfer learning really shine. They can capture "intuition" by learning the surrounding context, rather than relying on handcrafted features.
So Fermilab has an AI algorithm that looks at millions of graphs via a CNN, which replicates the work of thousands of human physicists looking for patterns. We've already seen examples of this in radiology.
Does that mean when AI finally arrives, it slaughters all of us?
Also, in the simile I think humanity is supposed to be the Old World, so I'm really wondering who we're supposed to find and enslave ...
There are groups that are using graph neural networks to understand statistical mechanics and microscopy. There are also a number of groups working on trying to automate synthesis (most of it is Gaussian process based, a handful of us are trying reinforcement learning--it's painful). On the theory side, there is work speeding up simulation efforts (ex. DFT functionals) as well as determining if models and experiment agree (Eun Ah Kim rocks!).
Outside of my field, there has been a push with Lagrangian/Hamiltonian NNs that is really cool in that you get interpretability for "free" when you encode physics into the structure of the network. Back to my field, Patrick Riley (Google) has played with this in the context of encoding symmetries in a material into the structure of NNs.
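A minimal sketch of what "encoding physics into the structure" can mean for the Hamiltonian case, as I understand it: the model outputs a single scalar H(q, p), and the dynamics are derived from it via Hamilton's equations instead of being predicted directly. Here `hamiltonian` is a hand-written stand-in (a unit harmonic oscillator) for what would be a trained network, and the finite-difference derivatives stand in for autodiff; both are assumptions for illustration.

```python
def hamiltonian(q, p):
    # Stand-in for a learned scalar network H_theta(q, p).
    # Assumption: unit-mass, unit-stiffness harmonic oscillator.
    return 0.5 * p * p + 0.5 * q * q

def time_derivatives(H, q, p, eps=1e-6):
    # Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq.
    dH_dq = (H(q + eps, p) - H(q - eps, p)) / (2 * eps)
    dH_dp = (H(q, p + eps) - H(q, p - eps)) / (2 * eps)
    return dH_dp, -dH_dq

dq_dt, dp_dt = time_derivatives(hamiltonian, 1.0, 0.0)
print(dq_dt, dp_dt)
```

The interpretability comes from the fact that the learned object is an energy-like function you can inspect, not an opaque update rule.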
There are of course challenges. In some fields, there is a huge amount of data--in others, we have relatively small data, but rich models. There are questions on what are the correct representations to use. Not to mention the usual issues of trust/interpretability. There's also a question of talent given opportunities in industry.
Landmark to watch for will be when the first GPT-generated paper gets a citation in a human-authored paper without the human realising.
This affirmation shows the author has little idea about GNNs. GNNs have layers, and each layer is a graph. In order to implement the graph GNNs use the adjacency matrix to propagate information along the edges. But there are multiple layers of GNN, without multiple layers they would not be able to do multi-hop inferences.
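The multi-hop point can be sketched with a stripped-down propagation step. This is only an illustration: real GNN layers add learned weights, normalization and nonlinearities, all omitted here, and the `propagate` helper and the 4-node path graph are made up for the example.

```python
def propagate(adj, h):
    """One message-passing layer: each node sums its own value and its neighbors'."""
    n = len(h)
    return [h[i] + sum(adj[i][j] * h[j] for j in range(n)) for i in range(n)]

# Adjacency matrix of the path graph 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

h0 = [1, 0, 0, 0]          # only node 0 carries a signal
h1 = propagate(adj, h0)    # after 1 layer the signal has reached node 1
h2 = propagate(adj, h1)    # after 2 layers it has reached node 2
h3 = propagate(adj, h2)    # after 3 layers it has reached node 3
print(h1, h2, h3)
```

Node 3 stays at zero until the third layer, which is the sense in which depth buys you hops.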
"If AI is like Columbus, computing power is Santa Maria"
and intractable physics problems are like... indigenous people?
Just in this domain alone, excluding the 100 other applications of ML, and the fact that we haven't even begun optimization in earnest, I certainly believe ML will change the direction of computing. It already has: look at where investment and research dollars have gone. (not to say that trends don't happen, but when I saw the performance results I thought: sh*t, this is big.)
Add to this the rise of the qubit, and the next 50 years are going to be even crazier than the last 50.
Yes, I am a proselytizer of the school of James Gleick. "Faster" was a prophecy[1].
[1] https://www.amazon.com/Faster-Acceleration-Just-About-Everyt...
If anyone is interested, we have one on data science in industry coming up: https://attendee.gotowebinar.com/register/604483936035643777...
What on Earth does that paragraph mean? Parts of the article read to me like they were generated automatically, but other parts don't.
Moore's law appears to be slumping lately.
Re: "This coincides with our previous experience in physics, says Cranmer: 'The language of simple symbolic models describes the universe correctly.'"
As an approximation, yes, but that doesn't mean a "true" formula has necessarily been found.
Not so. https://docs.google.com/spreadsheets/d/1NNOqbJfcISFyMd0EsSrh...
Some people would argue that these things are one; I think otherwise.
Are they somehow using the parameters of the ANN to seed the genetic algorithms (and structure)?
The system managed to reinvent Newton's second law and find a formula to predict the density of dark matter.
(note that symbolic regression is often said to improve explainability but that, left unchecked, it tends to produce huge unwieldy formulas full of magical constants)
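For anyone who hasn't seen symbolic regression, here is a toy version of the idea: score candidate formulas against data and keep the best one. This is a brute-force search over a hand-picked list, not the method used in the paper (real tools search a far larger expression space, typically with genetic algorithms), and the data points are made up.

```python
# Hypothetical noise-free measurements of (mass, acceleration, force).
data = [(1.0, 2.0, 2.0), (2.0, 3.0, 6.0), (3.0, 0.5, 1.5), (4.0, 4.0, 16.0)]

# A tiny hand-picked hypothesis space, purely for illustration.
candidates = {
    "m + a": lambda m, a: m + a,
    "m - a": lambda m, a: m - a,
    "m * a": lambda m, a: m * a,
    "m / a": lambda m, a: m / a,
    "m * a**2": lambda m, a: m * a ** 2,
}

def squared_error(f):
    # Sum of squared differences between the formula and the measured force.
    return sum((f(m, a) - force) ** 2 for m, a, force in data)

best = min(candidates, key=lambda name: squared_error(candidates[name]))
print(best)
```

The unwieldy-formula problem shows up once the search space is open-ended and the data is noisy, which is why practical tools usually penalize expression complexity as well as error.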
Why don't they use CNNs on the data series itself?
The 2D plot is more for training -- a human physicist picks out visual artifacts of interest to bootstrap the training. Humans of course can see blips and weird curves better on a visualization than in a pure data series.
For instance, a human can say if this little tail off a contour bends this way, it's right; if it bends in a different way, it's wrong. Or if a contour is "prickly" or "blobby".
Whether something "looks right" or "wrong" is really hard to mathematically reduce to a parsimonious description, especially when there's variance in the samples -- after all, there could be multiple subtle descriptions of "looks right" or "looks wrong" -- but a CNN is perfect for generalizing based on labeled samples. Similar to a radiologist looking at a scan and toggling isTumor = (True, False).
Nonetheless, "AI" applications are pervasive:
Auto - improved robotics, adaptive cruise control
Finance - High Frequency Trading, Credit Risk Modeling (i.e. your Credit Score)
Health Care - Health insurance risk estimation, Predictive staffing
Government - Predictive policing, recidivism risk, benefits decisions
Retail - improved customer targeting, inventory management
etc etc etc - name an industry, I'll give you 3 examples.
The issue isn't that it's not there; the issue is that it's BORING. And nobody gives a press release saying "we saved 0.4% of COGS from improved inventory demand forecasts," even if that represents $10M, because nobody cares.
But boring doesn't mean it's not a bazillion dollar opportunity for a lot of companies.
AI has already won, most people just don't realize it.
All those successful forms of AI are narrow, not the AGI of science fiction (like Data, Skynet, HAL) or Ray Kurzweil predictions. AI is a tool humans use to extend human capabilities. It always has been. Maybe someday it will be something more.
That said, what definition of AI are you using? It seems to me you're stretching it a bit...
See:
https://arxiv.org/abs/1912.05079
https://chemrxiv.org/articles/Extending_the_Applicability_of...
It's nice, but not really quantum-mechanics level (which is maybe HF, DFT or coupled cluster), which takes a lot more cycles (but also allows optimizing geometries without knowing whether a bond exists).
I think reproducibility can be tackled--at least some journals (shameless plug--I'm a lowly associate editor at Science Advances) are strongly encouraging people to include data/code with publications. I have reviewed papers in Nature Comput. Materials where people have included data/jupyter notebooks (not perfect, but a very good start). It would be great if funding agencies started adding more teeth to requirements on data sharing. However, many more groups are putting their code on GitHub.
Machine "learning" isn't ideal either, but is at least a bit more limited in the scope of what it conveys.
Once you leave the hype baggage behind, it's easier to see the significant progress that these tools - in concert with increased power and data resources - have made in many different areas over the last few years, some of them listed elsewhere in the answers to your question.
- Basically every piece of software that makes recommendations (Netflix, Google, Facebook, YouTube, Instagram, TikTok, etc.) uses machine learning.
- Anything that makes time series forecasts (Uber/Maps ETA prediction, Walmart's 2 hour delivery, etc.) uses machine learning.
- All the most popular speech-to-text assistants (Alexa, Google Assistant, Siri) use machine learning.
- Smartphone cameras use machine learning to enhance picture quality.
- A lot of very highly-used security monitoring solutions (Stripe's fraud detection, CloudFlare's bot detection, etc.) rely on machine learning.
- A surprising number of physical commerce-type situations rely on machine learning (autonomous filling stations, for example, are pretty common in the trucking industry).
- A lot of smart image manipulation tools (Instagram/SnapChat filters, etc.) rely on deep learning.
- Email clients, particularly Gmail, use machine learning for spam filtering and for things like Smart Compose.
- Some infrastructure products use machine learning, as in the case of EC2's predictive autoscaling.
And those are just hyper-scale examples. There's a ton of earlier-stage-but-still-in-production projects doing awesome things with ML:
- Wildlife Protection Solutions legitimately doubled their detection rate of poachers in nature preserves with ML.
- PostEra, Benevolent AI, and a bunch of other ML-based medicine platforms (medicinal chemistry, drug discovery, etc.) have already had exciting results.
- There are a bunch of startups building industry-specific APIs out of models, like Glisten.ai, that are already profitable.
- A number of computer vision products have been brought to market in the healthcare space—Ezra.ai screens full-body MRIs for cancers, SkinVision detects melanomas.
- ML-powered chatbots are a pretty huge market. Olivia (a financial assistant) has something like 500k users. AdmitHub has successfully lowered summer melt (the attrition of college-intending students between spring and fall) at a bunch of colleges. Rasa is an entire platform that helps startups build NLP-powered bots.
Sorry that went a bit long, but basically, the production ML space is incredibly deep, and spans most industries/company sizes. Unfortunately, press coverage of ML tends to treat it as if it's this mystic, sci-fi future technology, and as a result, this "Show me AGI or it's snake oil" mindset naturally emerges.
The expert-system approach to search and speech recognition never worked well.
Digital assistants predicting that they need to remind you about an upcoming flight, going to work, etc. are other examples.
Marketing.
It's really not odd at all. The average person has some familiarity with ML/AI, so you don't have to expend the energy to introduce them to the topic in a way that is understandable and also engaging to them. They already have a baseline, and are likely already aware of some interesting use cases. By contrast, they might not even know what "operations research" is, so you have to be both willing and able to expend the energy to explain the field in a way that is comprehensible and interesting. I'm sure it's possible, but the cross-section of people with the knowledge, the interest, and the social graces to do it is probably small.
To me it seems that a large swath of the science community dislikes buzzwordification and pop science more than they like proliferation of knowledge, based on how negative responses seem to be to things like normie interest in AI here. I would be fascinated to read any peer reviewed studies on the negative impacts of pop science on long term scientific advancement, so that I could understand this bias (and debunk my own bias that more interest in science is better in the long term).
>it will be boring to anyone but specialists, as the rest of machine learning ought to be.
Very fun!
Yes, I exaggerated when I said that, but it's still mostly variational problems.
By that standard, you could interpret almost any numerical method for PDEs used in academia or industry as variational (aside from some fringe ones). By "variational" I mean methods which are designed in a variational way from the start, not ones that can merely be interpreted variationally.
All of what I mentioned are neural networks.
The mere ability to perform computational work is something that virtually even the tiniest piece of hardware has, or even an abacus for that matter.
Finding correlations and using a human to filter the interesting ones from the flukes doesn't make the correlation engine an AI; the intelligence is still in the human.
The other subset of methods, continuous physics-informed neural networks, are described in https://www.sciencedirect.com/science/article/pii/S002199911... .
For a very basic introduction, I wrote some lecture notes on how this is done for a simple ODE with code examples: https://mitmath.github.io/18S096SciML/lecture2/ml
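To give a flavor of the idea without leaving the thread, here is a small stdlib-only sketch. It is not the code from the lecture notes: the toy problem (u' = -u, u(0) = 1), the 3-unit network, the finite-difference gradients and the backtracking step rule are all my own choices to keep it dependency-free. A real implementation would use an autodiff framework and a proper optimizer.

```python
import math

# Toy problem (chosen for illustration): u'(t) = -u(t), u(0) = 1 on [0, 1].
TS = [i / 10 for i in range(11)]  # collocation points

def net(theta, t):
    """Tiny network: 3 tanh hidden units; theta = 3 weights, 3 biases, 3 output weights."""
    w, b, v = theta[0:3], theta[3:6], theta[6:9]
    return sum(v[k] * math.tanh(w[k] * t + b[k]) for k in range(3))

def u(theta, t):
    # Trial solution: satisfies u(0) = 1 exactly, whatever the network outputs.
    return 1.0 + t * net(theta, t)

def loss(theta, h=1e-4):
    # Mean squared ODE residual u' + u; u' is taken by central differences.
    total = 0.0
    for t in TS:
        du = (u(theta, t + h) - u(theta, t - h)) / (2 * h)
        total += (du + u(theta, t)) ** 2
    return total / len(TS)

def grad(theta, eps=1e-6):
    # Finite-difference gradient of the loss with respect to each parameter.
    g = []
    for i in range(len(theta)):
        up, down = theta[:], theta[:]
        up[i] += eps
        down[i] -= eps
        g.append((loss(up) - loss(down)) / (2 * eps))
    return g

def train(theta, steps=200):
    # Gradient descent with backtracking: a step is accepted only if the loss drops.
    step, current = 0.1, loss(theta)
    for _ in range(steps):
        g = grad(theta)
        while step > 1e-12:
            trial = [p - step * gi for p, gi in zip(theta, g)]
            trial_loss = loss(trial)
            if trial_loss < current:
                theta, current = trial, trial_loss
                step *= 2.0
                break
            step *= 0.5
    return theta, current

theta0 = [0.5, -0.3, 0.8, 0.1, -0.2, 0.3, 0.2, -0.1, 0.4]  # arbitrary fixed start
initial_loss = loss(theta0)
theta, final_loss = train(theta0)
print("residual loss:", initial_loss, "->", final_loss)
print("max error vs exp(-t):", max(abs(u(theta, t) - math.exp(-t)) for t in TS))
```

Note that no solution data is used anywhere: the only training signal is how badly the trial function violates the ODE at the collocation points.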
The tests are rarely equivalent, in that standard PDE technology can move to new domains, boundary conditions, materials, etc., without new training phases. If one needs to solve many nearby problems, there are many established techniques for leveraging that similarity. There is active research on ML to refine these techniques, but it isn't a silver bullet.
Far more exciting, IMO, is to use known methods for representing (reference-frame invariant and entropy-compatible) constitutive relations while training their form from observations of the PDE, and to do so using multiscale modeling in which a fine-scale simulation (e.g., atomistic or grain-resolving for granular/composite media) is used to train/support multiscale constitutive relations. In this approach, the PDEs are still solved by "standard" methods such as finite element or finite volume, and thus can be designed with desired accuracy and exact conservation/compatibility properties and generalize immediately to new domains/boundary conditions, but the trained constitutive models are better able to represent real materials.
A good overview paper on ML in the context of multiscale modeling: https://arxiv.org/pdf/2006.02619.pdf
That's like saying "a matrix is a collection of real numbers, so anything you say about one applies to the other".
> You can certainly apply projections to images, I mean this is what photoshop does.
This doesn't seem to refer to anything in the comment you're replying to.
I think I understand why you're putting in the learned derivative operator, but I think it's rarely desirable. Computing derivatives with compatibility properties is a well-studied domain (e.g., finite element exterior calculus), as is tensor invariance theory (e.g., Zheng 1994, though this subject is sorely in need of a modern software-centric review). When the exact theory is known and readily computable, it's hard to see science/engineering value in "learned" surrogates that merely approximate the symmetries.
More generally, it is disheartening to see trends that would conflate discretization errors with modeling errors, lest it bring back the chaos of early turbulence modeling days that prompted this 1986 Editorial Policy Statement for the Journal of Fluids Engineering. https://jedbrown.org/files/RoacheGhiaWhite-JFEEditorialPolic...
I completely agree, which is why the approach I am taking is to only utilize surrogates for things which are unknown or do not have an exact theory. I don't think surrogates will be more efficient than methods developed that exploit specific properties of the problem. In fact, I think the recent proof of convergence for PINNs simultaneously demonstrates this might be an issue (there was no upper bound to the proved convergence rate, but the one they could prove was low order).
>More generally, it is disheartening to see trends that would conflate discretization errors with modeling errors, lest it bring back the chaos of early turbulence modeling days that prompted this 1986 Editorial Policy Statement for the Journal of Fluids Engineering. https://jedbrown.org/files/RoacheGhiaWhite-JFEEditorialPolic....
Agree, this is a difficult issue with approaches that augment numerical approaches with data-driven components. There are ways to validate these trained components independent of the training data (i.e. by using other data), but validation will always be more difficult.
I presume you meant "verify" in the last sentence.
> You can certainly apply projections to images, I mean this is what photoshop does.
What's the relationship of this to anything in the comment you replied to?
That wasn't me. But I can still elaborate: while you can certainly consider a non-color image as a matrix, the operation of multiplying this matrix with a vector is rather meaningless.
While a lot of things can be made into or viewed as matrices, a matrix is typically only meaningful as a representation of a linear map.
So I completely agree with you that throwing away classical knowledge won't go very far, which is why that's not what we're doing. We are utilizing neural networks within and on top of classical methods to try and solve problems where they have not traditionally performed well, or utilizing them to cover epistemic uncertainty from model misspecification.
I think it would be a good topic for a blog post or teaching paper that shows how to do this for very simple problems "end-to-end" (e.g. advection eq., diffusion eq., advection-diffusion, Burgers' eq., Poisson eq., etc.).
I see the appeal in showing that these can be used for very complex problems, but what I want to understand is what are the trade-offs for the most basic hyperbolic, parabolic, and elliptic one-dimensional problems. What's the accuracy? What's the order of convergence in practice? Are there tight upper bounds? (does that even matter?), what's the performance, how does the performance scale with the number of degrees of freedom, what does a good training pipeline look like, what's the cost of training, inference, etc.
There are well-understood methods that are optimal for all of the problems above. Knowing that you can apply these NN for problems without optimal methods is good, but I'd be more convinced that this is not just "NN-all-the-things hype" if I were to understand how these methods fare against problems for which optimal methods are indeed available.
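For reference, this is the kind of baseline measurement I have in mind, sketched for the 1D Poisson problem with a plain second-order finite-difference scheme (the manufactured solution sin(pi x) and the grid sizes are my own choices). The observed order of convergence is estimated by halving the grid spacing and comparing errors.

```python
import math

def poisson_error(n):
    """Solve -u'' = pi^2 sin(pi x), u(0) = u(1) = 0 with n interior points.

    Returns the max-norm error against the exact solution sin(pi x).
    """
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    # Tridiagonal system from central differences: -u[i-1] + 2 u[i] - u[i+1] = h^2 f[i].
    a = [-1.0] * n  # sub-diagonal
    b = [2.0] * n   # diagonal
    c = [-1.0] * n  # super-diagonal
    d = [h * h * math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    u = [0.0] * n
    u[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (d[i] - c[i] * u[i + 1]) / b[i]
    return max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))

e_coarse = poisson_error(15)  # h = 1/16
e_fine = poisson_error(31)    # h = 1/32
order = math.log2(e_coarse / e_fine)
print(f"observed order of convergence: {order:.2f}")
```

The classical scheme comes out at about second order, as the theory predicts. Reporting the same number (plus wall-clock cost) for an NN-based solver on the same problem would make the trade-offs much easier to judge.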