Building AI without a neural network (hivekit.io)
We ran this over two years and saw impressive results - especially for the time. I feel this very much was a "thinking system" in the same sense. And I concur - there are many architectures and approaches to build this sort of thing.
Basically, you create a "population" out of possible placement strategies for orders. You then cycle through "generations" that quickly adapt to the changing market conditions.
Ultimately, you end up with something that - on average - provides execution closer to the target price than traditional, more static strategies. But the problem for us was that the outliers were really far out. Basically, if it got it wrong, it got it really wrong and you were often stuck with a set of half-executed orders that had to be cleaned up manually.
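For readers who haven't seen the population/generation idea in code, here is a minimal sketch. It is not the system described above: the encoding (one number per strategy, read as a price offset), the toy drifting "market", and the names `evolve` and `run` are all assumptions made up for illustration.

```python
import random

def evolve(population, fitness, rng, keep=0.5, sigma=0.1):
    """One generation: keep the fittest fraction, refill with mutated copies."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: max(1, int(len(ranked) * keep))]
    children = [
        rng.choice(survivors) + rng.gauss(0.0, sigma)  # mutate a random survivor
        for _ in range(len(population) - len(survivors))
    ]
    return survivors + children

def run(generations=200, size=30, seed=0):
    rng = random.Random(seed)
    # Hypothetical encoding: each strategy is a single price offset.
    population = [rng.uniform(-1.0, 1.0) for _ in range(size)]
    best_offset = 0.0  # toy stand-in for "current market conditions"
    for _ in range(generations):
        best_offset += rng.gauss(0.0, 0.01)  # conditions drift a little each cycle
        # Fitness: the closer a strategy is to the currently ideal offset, the better.
        fitness = lambda s, t=best_offset: -abs(s - t)
        population = evolve(population, fitness, rng)
    return population, best_offset

population, target = run()
closest = min(population, key=lambda s: abs(s - target))
print(f"target={target:.3f} closest strategy={closest:.3f}")
```

Even in this toy, the population is only ever tuned to recent conditions, which hints at why a sudden regime change can leave every strategy badly wrong at once.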
- uses the information it already has combined with new inputs to form new information based on both (aka reasoning)
- retains learned information over time (aka memory)
- contains feedback loops to eventually eradicate "wrong" information (aka learning)
- converges to similar conclusions in similar circumstances (aka reasoning or possibly instinct)
At least several of the most important parts of thought are there, though it is obviously very different than our own mind. That's OK, both submarines and penguins can move underwater but they do it in very different ways. No reason to think that thought wouldn't have multiple ways in which it could be implemented.
Evolution is also almost certainly not self-aware but neither is a dog and we consider dogs to be capable of thinking too.
Similar to how we have things like NP-hard > NP > (maybe) P, you could have a classification based on how many (types of) information can be processed or something like that. Maybe a similar but separate scale for learning capacity?
Evolution by natural selection is adaptive in the sense you describe, but no one would consider it to be "thinking".
It is important to recognize the limitations of these deep neural networks, and especially GPTs, both of which have been hyped to the point where most see them as the solution to everything. Many deep-learning supporters continue to compare them to the human brain, despite explainability in these models being so limited that not even researchers know what they are doing.
It's one of the reasons why there's very low trust in deep neural networks in general. Humans can be held to account for their actions, but almost all of these neural network systems cannot be held to account when something goes wrong, and yet people accept the output as the answer because 'it is thinking' or 'reasoning'.
> The mechanism behind real world complex systems requires both simple rulesets and a communicative fabric that allows for real time feedback loops. For flocks of starlings this can be as easy as keeping eye contact with the next bird. For termites, it is pheromone trails. For societies, it is language, reputation and status. For capital markets, it is money. And for the internet, it's the physical wiring and network architecture that makes it all possible.
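A rough sketch of the "simple ruleset plus feedback loop" idea in the quote, under heavy assumptions: agents sit on a ring, each one only sees its two neighbours, and headings are plain numbers rather than real angles. The names `step`, `simulate` and `spread` are made up for this example and have nothing to do with the article's actual system.

```python
def step(headings):
    """Local rule: each agent adopts the average of itself and its two neighbours."""
    n = len(headings)
    return [
        (headings[i - 1] + headings[i] + headings[(i + 1) % n]) / 3.0
        for i in range(n)
    ]

def simulate(headings, steps):
    """Apply the local rule repeatedly; no agent ever sees the whole group."""
    for _ in range(steps):
        headings = step(headings)
    return headings

def spread(headings):
    """How far apart the most extreme agents are."""
    return max(headings) - min(headings)

start = [10.0 * i for i in range(10)]  # agents initially point in very different directions
final = simulate(start, 200)
print(spread(start), spread(final))
```

No agent is told the group heading, yet the spread collapses towards zero: the agreement comes from the repeated local feedback, not from any single node.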
As long as this solution provides transparent and explainable results rather than an answer from a black box system, I'm looking forward to this approach.
"Top down" never really works for complex systems, the economy being an obvious example. But we tend to ignore that when thinking about neural networks.
I guess if you wanted to be pedantic, there is no swarm, or in other words the swarm is just a shorthand way of describing the collective decisions of all the nodes.
Where exactly does the emergent behavior come from if the nodes are dumb, e.g. non-autonomous?
“But that leaves large and important areas that GPTs are entirely unfit for: Real-time problem solving in dynamic environments.”
Isn’t Tesla using GPTs at least for vision? What do you call the AI technologies employed by self driving projects and robotics?
It is certainly not a generative pre-trained transformer (GPT) used in FSD or autopilot. Most likely advanced deep learning models for object detection or segmentation with very low latency.
Whatever they are using, it can still easily get confused by similar-looking objects on the road, which is why Tesla requires drivers to keep their eyes on the road and hands on the wheel. Not even Tesla trusts its own AI systems.
Even if humans invent bullshit explanations, in serious cases accountability is handled in the court system, where humans investigate the whole timeline of events in a dispute, which leaves very little room for perverting the course of justice and making everything up.
Hence, in this scenario, lawyers would have liked to know why an AI system gave hallucinated citations when it was used in a legal proceeding. It's even worse that legal experts knowingly trusted it and failed to reason about the results from this AI system, because fundamentally it cannot explain why that issue happened. [0] It goes beyond basic citations, too: with autonomous cars without humans behind the wheel [1], the company (Cruise) was unable to convince the regulators or even explain the crashes, and had to pull the vehicles off the road due to this high amount of risk.
So yet again, explainability in AI with neural networks is still far worse than in humans, and these systems cannot be trusted in high-risk and novel situations.
[0] https://www.theguardian.com/technology/2023/jun/23/two-us-la...
[1] https://www.theguardian.com/us-news/2023/oct/24/driverless-c...
I took a deep learning class before things really took off [0], and what we were taught was that a lot of deep learning research was just trying different structures to see what worked. Even the people who were good at it didn't base their architectures on strong theories for how and why things would work, they just had developed an intuition through lots of trial and error.
All the explanations for what the various structures in a deep learning model are doing are post hoc—they're observations made of the behavior of the systems after we build them.
[0] Edit: Not just a random deep learning class either. The TA for that class created AI Dungeon that semester. The professor came in one day with a story of how they accidentally racked up tens of thousands of dollars in Google Cloud bills in 24 hours when that went viral.
I wish the industry would be more transparent about where all these brilliant ideas come from (spoiler alert: academics from the 1930s-70s, not "genius founders").
I wouldn't characterize it like that ... The brain has a specific learning architecture / dynamics that has been created via evolution under selection pressure to learn (i.e. predict outcomes) better.
As a product of evolution I wouldn't want to call it "architected" or a top-down design, but for the time being it's the only example we have of such a successful learning system so it would make sense to copy what evolution has done, which means copying how it works on all levels.
A simpler example is convolutional neural nets for vision, which were explicitly designed to simplistically mimic some of the behavior of our visual system with its multi-layer (V1, V2, etc) architecture and local learning rules. Sure, there's emergent behavior occurring when data is fed into such a system (either the brain's visual system or a CNN), such as the pattern detectors we see emerge at the lowest level, but this emergent behavior is a result of the overall architecture.
[1]: https://en.wikipedia.org/wiki/Friedrich_Hayek#Economic_calcu...
[2]: https://en.wikipedia.org/wiki/Catallaxy
[3]: https://citeseerx.ist.psu.edu/doc_view/pid/2688969848b753368...
https://nonint.com/2023/06/10/the-it-in-ai-models-is-the-dat...
Even then, the architectures we use are essentially stumbled on. This is alchemy, not modern chemistry.
That said, I don't think that LLMs are yet close to fully utilizing the training data. E.g. they are prone to hallucinating due to not thinking ahead and backing themselves into a corner where they have to say something. One obvious improvement for that one is more forward-looking prediction (i.e. "thinking ahead" - engage brain before opening mouth), which for LLMs can be addressed by tree of thought rollouts and RL.
So, while architecture is not important if all architectures are equally powerful (in which case you'll learn what's available to learn), it certainly does matter if not all are fully up to the job, as it would appear none currently are.
Yet all four of these examples use the same model architecture (transformers).
The eye is part of the brain, for instance. The details are emergent, but there is most definitely a strong top-down architecture encoded in DNA.
That would be slightly annoying, but they did this to then justify points about how the resulting model is "emergent". The research process, and the resulting output, can and (my main point) almost always do, differ as to whether they are truly emergent.
Without tweaking anything (so not RWKV), you could train a GPT level RNN...if you had the compute to burn.
We use Transformers today in large part because they got rid of recurrence and, in effect, could massively parallelize compute.
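To make the parallelism point concrete, here is a toy scalar sketch, not a real model. Everything in it is an assumption for illustration: one number per token, made-up weights `w` and `u`, no learned projections, and non-causal attention. The RNN has to walk the sequence in order because each state needs the previous one, while each attention output depends only on the inputs and can be computed in any order (or all at once).

```python
import math

def rnn_states(xs, w=0.5, u=1.0):
    """Recurrent update: state t depends on state t-1, so this loop is inherently sequential."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w * h + u * x)
        states.append(h)
    return states

def attend(i, xs):
    """Toy self-attention output for position i, using x as query, key and value."""
    scores = [xs[i] * xs[j] for j in range(len(xs))]
    m = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return sum(wt * x for wt, x in zip(weights, xs)) / total

def attention_outputs(xs, order=None):
    """Positions are independent of each other, so any evaluation order gives the same result."""
    order = range(len(xs)) if order is None else order
    out = [None] * len(xs)
    for i in order:
        out[i] = attend(i, xs)
    return out

xs = [0.1, -0.4, 0.7, 0.2]
forward = attention_outputs(xs)
backward = attention_outputs(xs, order=reversed(range(len(xs))))
print(forward == backward)
print(rnn_states(xs))
```

On real hardware the independent positions become one big batched matrix multiply, which is where the training-time speedup over a step-by-step RNN comes from.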