How DeepMind Built AlphaEvolve: LLMs, Evaluators, Memory, and the Future of Legal AI
June 28, 2026
## Alternate Titles
- AlphaEvolve Explained: How DeepMind Turned LLMs Into Evaluated Search
- DeepMind AlphaEvolve: Why Serious AI Needs Evaluation, Memory, and Review
- What AlphaEvolve Teaches Legal AI About Matter Memory
## Description
DeepMind's AlphaEvolve is often described as "AI that evolves." That headline is too loose. The important idea is more precise: AlphaEvolve puts large language models inside a human-defined loop where code candidates are generated, executed, scored, stored, selected, and tried again.
This long-form walkthrough explains AlphaEvolve from first principles. The model is not trusted because it sounds confident. It proposes mutations. A user-defined evaluator decides which candidates survive. A program database preserves scored ancestry and diversity. The next generation starts from the work that survived inspection.
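The loop described above (generate, execute, score, store, select, try again) can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not DeepMind's implementation: the "program" is just a tuple of numbers, `propose_mutation` is a random stand-in for the LLM, `evaluate` is a hypothetical user-defined scorer, and the "database" is a plain top-k list that keeps a parent link for ancestry. It makes no attempt to reproduce the diversity mechanisms of the real program database. Every name here is invented for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    program: tuple            # stand-in for source code (assumption)
    score: float              # assigned by the evaluator, never by the proposer
    parent: "Candidate | None" = None  # ancestry link preserved in the database


def evaluate(program):
    """Hypothetical user-defined evaluator: higher is better."""
    target = (3.0, -1.0, 2.0)
    return -sum((p - t) ** 2 for p, t in zip(program, target))


def propose_mutation(parent, rng):
    """Stand-in for the LLM: nudge one element of the parent program."""
    i = rng.randrange(len(parent.program))
    child = list(parent.program)
    child[i] += rng.gauss(0.0, 0.5)
    return tuple(child)


def evolve(generations=200, db_size=8, seed=0):
    rng = random.Random(seed)
    start = (0.0, 0.0, 0.0)
    database = [Candidate(start, evaluate(start))]
    for _ in range(generations):
        parent = rng.choice(database)               # select
        program = propose_mutation(parent, rng)     # generate
        score = evaluate(program)                   # execute and score
        database.append(Candidate(program, score, parent))  # store
        database.sort(key=lambda c: c.score, reverse=True)
        del database[db_size:]                      # only scored survivors remain
    return database[0]


def lineage(candidate):
    """Walk parent links back to the starting program."""
    chain = []
    while candidate is not None:
        chain.append(candidate)
        candidate = candidate.parent
    return chain


if __name__ == "__main__":
    best = evolve()
    print("best score:", best.score)
    print("steps in lineage:", len(lineage(best)))
```

The point of the sketch is the division of labor: the proposer never decides what survives, and each surviving candidate carries a traceable path back to where it started.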
That architecture matters beyond code, but it does not transfer naively. AlphaEvolve works where candidates can be automatically evaluated. Legal work usually cannot be collapsed into one clean scalar score. A research memo, contract redline, motion draft, privilege review, or client-risk assessment needs source grounding, caveats, reviewer state, jurisdiction, authority limits, matter context, and an audit trail.
That is the bridge to Irys. The lesson is not that Irys is AlphaEvolve for law. The lesson is that serious AI work has to become evaluated, remembered, reviewable state. In legal work, that means matter-centric memory, citation-verified research, draft and redline boundaries, source-backed claims, reviewer decisions, and reusable work product that carries forward instead of starting from zero every time.
Inside the video:
- why polished AI output is not the same thing as tested work
- how AlphaEvolve's loop actually works
- why the evaluator is the boundary of the system
- how Gemini Flash and Gemini Pro play different roles in search
- why AlphaEvolve is broader than FunSearch
- how the program database preserves useful ancestry and diversity
- why multi-metric evaluation matters in real deployments
- what DeepMind's ablations show, and what they do not show
- the exact caveats around matrix multiplication, kissing numbers, Borg scheduling, Gemini kernels, TPU RTL, and XLA / FlashAttention
- why legal AI should preserve sources, caveats, reviewer state, redline scope, audit trace, and matter memory
This video is educational first. Irys appears where the mechanism earns the comparison: not as a generic product mention, but as the legal-work version of the same deeper rule. Work should be inspectable. Work should preserve its evidence. Work should carry forward.
Read the companion Irys University deep dive:
https://iqidis-ai.github.io/irys-university-public/articles/what-alphaevolve-teaches-legal-ai.html
## Chapters
00:00 - A polished answer can still be untested
09:15 - Why try-again prompting loses lineage
17:26 - The official AlphaEvolve loop
27:21 - Evaluators decide what survives
37:13 - Gemini Flash, Gemini Pro, and routed work
44:46 - Why AlphaEvolve is bigger than FunSearch
53:23 - Context windows are not memory
1:01:19 - The program database as a search surface
1:09:31 - Multi-metric reality
1:17:39 - Ablations and load-bearing architecture
1:25:10 - Headline compression and caveats
1:27:35 - Matrix multiplication claims and caveats
1:34:06 - Math constructions are not proofs
1:42:20 - Borg scheduling and deployability
1:50:55 - Kernels, TPU RTL, XLA, and hidden infrastructure
2:00:37 - Economics, provenance, and labeled assumptions
2:07:15 - Future loops with brakes
2:21:22 - The final carry-forward test
## Irys Links
Irys: https://www.irys.ai/
Irys partner page: https://www.irys.ai/partners
Book a demo: https://www.irys.ai/partners/demo
Irys University deep dive: https://iqidis-ai.github.io/irys-university-public/articles/what-alphaevolve-teaches-legal-ai.html
## Source Basis
Primary factual basis: DeepMind AlphaEvolve official whitepaper.
Secondary source and positioning lens: "How DeepMind Built AI That Evolves." Newsletter-style economics are treated as scenario estimates, not official DeepMind savings claims.
## Tags
AlphaEvolve, DeepMind, Gemini, legal AI, Irys AI, Irys One, matter memory, evaluated search, AI agents, citation verification, legal technology, legal tech, AI research, FunSearch, program synthesis, machine learning, LLMs, AI evaluation, document review, contract redlining, legal research AI