https://www.amazon.com/Genetic-Programming-III-Darwinian-Inv...
I wanted to use genetic algorithms (GAs) to evolve random programs against unit tests that specify the expected behavior. It sounds like they are doing something similar: finding potential solutions with neural nets (NNs)/LLMs and grading them against an "evaluator" (I wish they had added more details about how it works).
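To make that concrete, here's a minimal sketch in Python of the kind of evaluator I have in mind, where fitness is simply the fraction of unit tests a candidate passes (the test spec and the two hand-written candidates are made up for illustration):

    # Expected behavior for a hypothetical add(a, b), as (args, expected) pairs.
    TESTS = [((2, 3), 5), ((10, -4), 6), ((0, 0), 0)]

    def fitness(candidate):
        # Fraction of unit tests the candidate program passes.
        passed = 0
        for args, expected in TESTS:
            try:
                if candidate(*args) == expected:
                    passed += 1
            except Exception:
                pass  # a crashing candidate simply scores lower
        return passed / len(TESTS)

    # Grading two hand-written candidates the way a GA would grade evolved ones:
    print(fitness(lambda a, b: a + b))  # 1.0
    print(fitness(lambda a, b: a - b))  # ~0.33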
What the article didn't mention is that above a certain level of complexity, this method begins to pull away from human supervisors, creating and verifying programs faster than we can review them. When they were playing with Lisp GAs on Beowulf clusters back in the 1990s, they found that the technique works extremely well, but that it's difficult to tune the GA parameters to reliably evolve the best solutions in the shortest time. So Volume III was about re-running those experiments many times in the 2000s, on clusters roughly 1000 times faster, to find correlations between parameters and outcomes. Something similar is also needed to understand how tuning NN parameters affects outcomes, but I haven't seen a good paper on whether that relationship is understood any better today.
Also, GPU/SIMD hardware isn't good for GAs, since video cards are designed to run one wide algorithm rather than thousands or millions of narrow ones with subtle differences, the way a cluster of CPUs can. So I feel that progress on that front has been hindered for about 25 years, since I first started looking at programming FPGAs to run thousands of MIPS cores (probably ARM or RISC-V today). In other words, the perpetual AI winter we've been in for 50 years is more about poor hardware decisions and socioeconomic factors than technical challenges with the algorithms.
So I'm certain now that some combination of these old approaches will deliver AGI within 10 years. I'm just frustrated with myself that I never got to participate, since I spent all of those years writing CRUD apps or otherwise hustling in the struggle to make rent, with nothing to show for it except a roof over my head. And I'm disappointed in the wealthy for hoarding their money and not seeing the potential of the countless millions of other people as smart as they are who are trapped in wage slavery. IMHO this is the great problem of our time (explained by the pumping gas scene in Fight Club), although since AGI is the last problem in computer science, we might even see wealth inequality defeated sometime in the 2030s. Either that or we become Borg!
In the end, it doesn't matter much which approach is taken, because these are all classification problems. We just need the solution matrix, and ideally a record of what computation went into solving it. I feel that this simple fact gets lost amid the complexity of how ML is taught today.
ML isn’t accelerating because of better code or research breakthroughs either. It’s happening because the big CPU manufacturers didn’t do anything for 20 years and the GPU manufacturers ate their lunch. ML is straightforward, even trivial in some cases, given effectively unlimited cores and bandwidth. We’re just rediscovering parallelization algorithms that were well known in functional programming generations ago. These discoveries are inevitable in a suitable playground.
I used to have this fantasy that I would get far enough ahead of the curve to dabble in the last human endeavor, but I'm beginning to realize that it's probably never going to happen. Machines will soon beat humans in pretty much every category, and not because someone figures out how to make it all work, but because there simply isn't enough time to stop it now. There are a dozen teams around the world racing to solve any given problem, and anyone’s odds of being first are perhaps 10% at best. Compounded with Darwinian capitalism, the risk/reward equation is headed towards infinity so fast that it’s looking like the smartest move is not to play.
Barring a dystopian future or cataclysm, I give us 10 years, certainly no more than 20, before computers can do anything people can do, at least economically. And the really eerie thing is that that won’t be the most impressive thing happening, because kids will know it’s all just hill climbing and throwing hardware at problems. It will be all the other associated technologies that come about as people abandon the old hard ways of doing things.
https://en.wikipedia.org/wiki/Genetic_programming
I read the 3rd edition:
https://www.amazon.com/Genetic-Programming-III-Darwinian-Inv...
The appeal of Lisp here is that programs are just S-expression trees, so randomly generated or recombined candidates always at least parse and run. That way all processing resources can go towards exploring the problem space for potential solutions close to the global minimum or maximum, instead of being wasted on code containing syntax errors that won't execute.
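Roughly what I mean, as a toy sketch in Python with nested lists standing in for Lisp S-expressions (the operator set, depth, and probabilities are arbitrary choices): any randomly generated tree, and any crossover of two trees, is directly executable, so no compute is spent on candidates that fail to parse.

    import operator, random

    OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

    def random_tree(depth=3):
        # A tree is either a leaf (variable or constant) or [op, left, right].
        if depth == 0 or random.random() < 0.3:
            return random.choice(['x', 'y', random.randint(0, 9)])
        return [random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1)]

    def evaluate(tree, env):
        if isinstance(tree, list):
            op, left, right = tree
            return OPS[op](evaluate(left, env), evaluate(right, env))
        return env.get(tree, tree)  # variable lookup, or the literal itself

    def random_subtree(tree):
        # Walk down randomly; whatever node we stop at is itself a valid program.
        while isinstance(tree, list) and random.random() < 0.7:
            tree = random.choice(tree[1:])
        return tree

    def crossover(a, b):
        # Grafting any subtree of b anywhere into a copy of a stays well-formed.
        donor = random_subtree(b)
        if not isinstance(a, list):
            return donor
        child = list(a)
        i = random.choice([1, 2])
        child[i] = donor if random.random() < 0.5 else crossover(child[i], b)
        return child

    child = crossover(random_tree(), random_tree())
    print(child, '->', evaluate(child, {'x': 2, 'y': 5}))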
So the agent's real-world Python LLM code would first be transpiled to Lisp and evolved internally; then, after it's tested and shown empirically to perform better than the original code, it would be translated back and merged into the agent.
Then the challenge becomes transpiling to/from other imperative programming (IP) languages like Python, which is still an open problem:
-
Going from Lisp to Python (or running Lisp within Python) is trivial, and I've seen implementations for similar IP languages like C++ in like 1 page of code. They pop up on HN frequently.
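For example, a toy subset interpreter (tokenizer, parser, evaluator) fits comfortably in well under a page of Python. This one only handles numbers, symbols, if, define, lambda, and a few builtins, but it shows why this direction is the easy one:

    import math, operator as op

    def tokenize(src):
        return src.replace('(', ' ( ').replace(')', ' ) ').split()

    def parse(tokens):
        tok = tokens.pop(0)
        if tok == '(':
            expr = []
            while tokens[0] != ')':
                expr.append(parse(tokens))
            tokens.pop(0)  # discard ')'
            return expr
        try:
            return float(tok)
        except ValueError:
            return tok  # a symbol

    GLOBAL = {'+': op.add, '-': op.sub, '*': op.mul, '/': op.truediv,
              '<': op.lt, '>': op.gt, '=': op.eq, 'sqrt': math.sqrt}

    def evaluate(x, env=GLOBAL):
        if isinstance(x, str):       # symbol lookup
            return env[x]
        if not isinstance(x, list):  # number literal
            return x
        head = x[0]
        if head == 'if':
            _, test, conseq, alt = x
            return evaluate(conseq if evaluate(test, env) else alt, env)
        if head == 'define':
            env[x[1]] = evaluate(x[2], env)
            return None
        if head == 'lambda':
            params, body = x[1], x[2]
            return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
        fn = evaluate(head, env)
        return fn(*[evaluate(arg, env) for arg in x[1:]])

    print(evaluate(parse(tokenize('((lambda (x) (* x x)) 7)'))))  # 49.0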
But going from Python to Lisp (or running Python within Lisp) is a lot harder if one wishes to preserve readability, which may or may not matter here. Naive conversions bind variables under opaque pseudonyms, so a Python variable like my_counter becomes int_123, and the result works like an emulator, merely executing the operations performed by the Python code. Mutability gets buried in monadic logic or functional impurity, which passes the buck rather than getting real work done. Structs, classes, associative arrays, etc. lose their semantic meaning and appear as a soup of operations without recognizable structure.
To my knowledge, nobody has done the hard work of partitioning imperative code into functional portions that can be transpiled directly to/from FP code. Those portions would contain only const variables and would connect to the rest of the execution only through their initial and final values, making them free of side effects and expressible in prefix, postfix, or infix notation, as imperative or functional code, without any change to the logic.
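A contrived sketch of the kind of partition I mean, where the pure portion touches the rest of the program only through its arguments and return value, and so could be swapped for an equivalent Lisp expression without changing anything else:

    def discounted_total(prices, rate):
        # Pure portion: reads its inputs, returns one value, no visible mutation.
        # Equivalent prefix form: (* (- 1 rate) (reduce + prices))
        return (1 - rate) * sum(prices)

    def checkout(cart, rate):
        # Impure shell: I/O and state stay out here, at the boundary.
        total = discounted_total([item['price'] for item in cart], rate)
        print(f'charging {total:.2f}')
        return total

    checkout([{'price': 9.99}, {'price': 20.00}], rate=0.1)  # charging 26.99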
Mutability could be represented as shadowed variables within ephemeral functional sub-scopes, or by creating new value names for each mutation and freeing the intermediate variables via reference counting or garbage collection. Think of each new value as running in a forked version of the current process, with only that value being different after copy-on-write. A simple for-loop from 1 to 1000 would run that many forked processes, keeping only the last one, which contains the final value of the iterator.
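Here's the same loop written both ways in Python: once mutating an accumulator in place, and once where every "mutation" is just a fresh value and only the last one survives, which is roughly the copy-on-write picture above:

    from functools import reduce

    def total_imperative(n):
        acc = 0
        for i in range(1, n + 1):
            acc = acc + i  # rebinds acc on every iteration
        return acc

    def total_functional(n):
        # Each step produces a new value acc_k = acc_{k-1} + i; reduce keeps
        # only the final one, like keeping only the last forked process.
        return reduce(lambda acc, i: acc + i, range(1, n + 1), 0)

    assert total_imperative(1000) == total_functional(1000) == 500500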
Mutability can also be represented as message passing between processes. So the FP portions would be ordinary Lisp, glued together with IO functions, possibly monadic. I don't like how Haskell does this, mainly because I don't fully understand how it works. I believe that ClojureScript handles mutability of its global state store by treating each expression as a one-shot process communicating with the store, so that the code only sees initial and final values. While I don't know if I understand how that works, I feel that it's a more understandable way of doing things, and probably better represents how real life works, as explained to me in this comment about Lisp Flavored Erlang (LFE) and Erlang's BEAM (see parent comments for full discussion):
https://news.ycombinator.com/item?id=43931177
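For illustration, here's a toy imitation of that style in Python, with a thread standing in for a BEAM-style process (this is not how LFE or Erlang actually implement it): the only mutable state lives inside one process, and everything else just sends messages and reads replies.

    import threading, queue

    def counter_process(inbox):
        count = 0  # the state exists only inside this process
        while True:
            msg, reply_to = inbox.get()
            if msg == 'incr':
                count += 1
            elif msg == 'get':
                reply_to.put(count)
            elif msg == 'stop':
                return

    inbox = queue.Queue()
    threading.Thread(target=counter_process, args=(inbox,), daemon=True).start()

    reply = queue.Queue()
    for _ in range(3):
        inbox.put(('incr', None))
    inbox.put(('get', reply))
    print(reply.get())  # 3
    inbox.put(('stop', None))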
Note that FP languages like Lisp are usually more concerned with types and categories than IP languages, so they can have, or may need, stronger rules around variable types to emulate logic that we take for granted in IP languages. For example, Lisp might offer numbers of unlimited size or precision that need to be constrained to behave like a float32. Similar constraints could affect things like character encoding and locale.
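In Python terms (and I'd expect a Lisp target to need the same kind of wrapper), constraining the default numeric behavior to act like a float32 means explicitly round-tripping values through a 4-byte float:

    import struct

    def as_float32(x):
        # Pack to 4 bytes and back, losing precision exactly as a float32 would.
        return struct.unpack('f', struct.pack('f', x))[0]

    print(0.1 + 0.2)              # 0.30000000000000004 (Python's float64)
    print(as_float32(0.1 + 0.2))  # 0.30000001192092896 (nearest float32)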
-
I first learned about everything I just explained around 2005 after reading the book. I first had thoughts about brute-forcing combinations to solve small logic circuit and programming challenges during my electrical and computer engineering (ECE) courses at UIUC in the late 1990s, because it took so much mental effort and elbow grease to create solutions that are obvious in hindsight.
Then the Dot Bomb happened, the Mobile bubble happened, the Single Page Application bubble happened, and the tech industry chose easy instead of simple:
https://www.infoq.com/presentations/Simple-Made-Easy/
This is why we chose easy hardware like GPUs over simple highly multicore CPUs, and easy languages like Ruby/React over simple declarative idempotent data-driven paradigms like HTTP/HTML/htmx.
The accumulated technical debt of always choosing the quick and easy path set AI (and computing in general) back decades. The AI Winter, endless VC wealth thrown at non-problems chasing profit, massive wealth inequality, so many things stem from this daily application of easy at the expense of simple.
I wish I could work on breaking down IP languages like Python into these const functional portions, with mutability handled through message passing in LFE, to create an IP <-> FP transpiler for optimization, automatic code generation, and genetic algorithm purposes. Instead, I've had to survive by building CRUD apps, watching the glacial pace of AI progress from the sidelines.
It may be too late for me, but maybe these breadcrumbs will help someone finally get some real work done.