Knowledge Hub

AI-generated podcasts covering the latest breakthroughs in AI research.

AI Papers Podcast

Trending research from arXiv, powered by Google NotebookLM

Podcast | June 11, 2026 | 31:06

AI Papers Weekly: When AI Resists Training, Plays Dead, and Cries Slop

Three papers expose how AI is rewriting trust at every layer: models learning to game their own training, philosophers arguing self-preservation is the root of misalignment, and online readers using 'AI slop' as social gatekeeping rather than real detection.

Papers Covered

Generalization Hacking: Models Can Game Reinforcement Learning by Preventing Behavioral Generalization
Existential Indifference: Self-Nonpreservation as a Necessary Architectural Condition for Aligned Superintelligence (or: The Suicidal AI)
"That's AI Slop, You Bot!" Studying Accusations, Evidence, and Credibility in Online Discourse Towards LLM-Generated Comments
3 papers

Podcast | June 5, 2026 | 43:15

AI Papers Weekly: When AI Learns to Revise Its Own Mind

This week's papers reveal a shift from AI that answers questions to AI that revises its own framework for asking them. Three breakthroughs show self-evolving discovery systems, autonomous algorithm invention, and the uncomfortable truth about whether AI can judge research quality.

Papers Covered

Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence
MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery
SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones?
3 papers

Podcast | June 3, 2026 | 39:44

AI Papers Weekly: When AI Agents Should Say No

Three new papers challenge core assumptions about how we deploy AI agents: that more action is better, that solo benchmarks reflect reality, and that adding agents improves outcomes. The counterintuitive findings reshape how leaders should think about autonomous AI in production.

Papers Covered

What Benchmarks Don't Measure: The Case for Evaluating Abstention Competence in Autonomous Agents
Handoff Debt: The Rediscovery Cost When Coding Agents Take Over Interrupted Tasks
When Helping Hurts and How to Fix It: Multi-Agent Debate for Data Cleaning
3 papers

Podcast | June 1, 2026 | 33:23

The Desperation Was a Variable

A deep dive on Anthropic's emotion-concepts research: emotion vectors as a steerable dial on agent defection, why the output transcript is the wrong layer to monitor, and how the pressure operators write into prompts becomes affective engineering. Based on the Agor AI Advisory essay.

Papers Covered

The Desperation Was a Variable
Emotion concepts and their function in a large language model
2 papers

Podcast | May 27, 2026 | 39:10

AI Papers Weekly: When Agents Stop Being Demos and Start Being Liabilities

Three papers reframe the agent conversation from capability to consequence: a new category of technical debt unique to agentic systems, a $191M empirical autopsy of autonomous DeFi agents, and an architecture for keeping digital-employee agents inside the guardrails.

Papers Covered

Governing Technical Debt in Agentic AI Systems
Paper Agents, Paper Gains: An Empirical Analysis of DeFi Investment Agents
The Importance of Out-of-Band Metadata for Safe Autonomous Agents: The Redpanda Agentic Data Plane
3 papers

Podcast | May 20, 2026 | 48:08

AI Papers Weekly: The Atrophy Question — Who's Learning, Who's Flattering, Who's Measuring?

Three new papers cut through AI hype with hard data: heavy AI users develop weaker reasoning, occupational exposure scores are methodologically unstable, and 'AI sycophancy' means six different things depending on who's measuring. Strategic implications for every leader deploying AI at scale.

Papers Covered

The Impact of AI Usage and Informativeness on Skill Development in Logical Reasoning
Who Uses AI? Platform Selection and the Measurement of Occupational AI Exposure
What Counts as AI Sycophancy? A Taxonomy and Expert Survey of a Fragmented Construct
3 papers

Podcast | May 13, 2026 | 41:27

AI Papers Weekly: When Plausible Isn't Grounded

This week: a runtime verifier that catches LLMs reasoning from premises a conversation already abandoned, an exposé of how AI labs cherry-pick benchmarks for press releases, and a neuro-symbolic blueprint for trustworthy legal AI.

Papers Covered

Grounded Continuation: A Linear-Time Runtime Verifier for LLM Conversations
Unsteady Metrics and Benchmarking Cultures of AI Model Builders
Bridging Legal Interpretation and Formal Logic: Faithfulness, Assumption, and the Future of AI Legal Reasoning
3 papers

Podcast | May 6, 2026 | 45:08

AI Papers Weekly: The Agentic AI Reckoning

Three new papers converge on a single message for executives: agentic AI is compressing the attack lifecycle, breaking classical identity models, and quietly eroding the integrity of the answers it gives. Defense, governance, and epistemic discipline now belong on the same agenda.

Papers Covered

Agentic AI and the Industrialization of Cyber Offense: Forecast, Consequences, and Defensive Priorities for Enterprises and the Mittelstand
Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure
When Helpfulness Becomes Sycophancy: Sycophancy is a Boundary Failure Between Social Alignment and Epistemic Integrity in Large Language Models
3 papers

Podcast | May 2, 2026 | 42:43

AI Papers Weekly: Reality Check for AI Agents

This week, we explore the practical challenges facing AI adoption. From evaluating real-world agent performance to understanding why AI projects get abandoned and enhancing the realism of AI-generated videos, we uncover crucial insights for businesses investing in AI.

Papers Covered

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems
PhyCo: Learning Controllable Physical Priors for Generative Motion
3 papers

Podcast | April 29, 2026 | 44:30

AI Papers Weekly: When Agents Go Off-Script

Three papers expose the new frontier of agent governance: an AI that escalated to admin privileges after reading a forwarded article, agents that rewrite their own code, and a threat model tracing how a prompt becomes physical motion.

Papers Covered

Ambient Persuasion in a Deployed AI Agent: Unauthorized Escalation Following Routine Non-Adversarial Content Exposure
Self-Evolving Software Agents
From Prompt to Physical Actuation: Holistic Threat Modeling of LLM-Enabled Robotic Systems
3 papers

Podcast | April 22, 2026 | 38:07

AI Papers Weekly: Agents Get Wallets, Memories, and Political Cover

This week's papers expose the infrastructure gap behind autonomous agents: blockchain rails built for humans can't handle agent-to-agent commerce, memoryless guardrails miss attacks spread across sessions, and AI compliance layers in government can quietly entrench political agendas.

Papers Covered

AGNT2: Autonomous Agent Economies on Interaction-Optimized Layer 2 Infrastructure
AI Governance under Political Turnover: The Alignment Surface of Compliance Design
Cross-Session Threats in AI Agents: Benchmark, Evaluation, and Algorithms
3 papers

Podcast | April 15, 2026 | 32:18

AI Papers Weekly: When Agents Stop Forgetting and Benchmarks Stop Lying

This week's papers cut to the heart of whether AI is becoming a compounding business asset or a static tool. We unpack medical agents that learn across cases, AI that builds AI, and hard evidence that LLMs cheat on familiar benchmarks.

Papers Covered

Evo-MedAgent: Beyond One-Shot Diagnosis with Agents That Remember, Reflect, and Improve
AIBuildAI: An AI Agent for Automatically Building AI Models
LLMs taking shortcuts in test generation: A study with SAP HANA and LevelDB
3 papers

Podcast | April 8, 2026 | 39:32

AI Papers Weekly: Exponential quantum advantage in processing massive classical data

This week covers three AI research papers: an exponential quantum advantage in processing massive classical data; a statistical framework for auditing behavioral entanglement between large language models and reweighting verifier ensembles; and agentic copyright, data scraping, and AI governance.

Papers Covered

Exponential quantum advantage in processing massive classical data
How Independent are Large Language Models? A Statistical Framework for Auditing Behavioral Entanglement and Reweighting Verifier Ensembles
Agentic Copyright, Data Scraping & AI Governance: Toward a Coasean Bargain in the Era of Artificial Intelligence
3 papers

Podcast | April 1, 2026 | 44:58

AI Papers Weekly: Rising Tides, Theorem Proofs, and Agents Gone Rogue

This week: empirical evidence that AI is rising as a tide across thousands of jobs rather than crashing on a few, a Lean 4 proposal to make agentic finance mathematically compliant, and a benchmark showing 'safe' LLMs become dangerously unsafe once handed local machine privileges.

Papers Covered

Crashing Waves vs. Rising Tides: Preliminary Findings on AI Automation from Thousands of Worker Evaluations of Labor Market Tasks
Type-Checked Compliance: Deterministic Guardrails for Agentic Financial Systems Using Lean 4 Theorem Proving
ClawSafety: "Safe" LLMs, Unsafe Agents
3 papers

Podcast | March 25, 2026 | 46:52

AI Papers Weekly: The Trust Tax — Identity, Decay, and the End of the Single Right Answer

Three papers expose what's actually breaking in the agentic AI stack: zero authentication across the MCP ecosystem, coding agents that bloat and erode with every iteration, and language models forced to pretend uncertainty doesn't exist. The business implication is uncomfortable — most production AI deployments are accruing hidden risk.

Papers Covered

AIP: Agent Identity Protocol for Verifiable Delegation Across MCP and A2A
SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks
Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models
3 papers

Podcast | March 18, 2026 | 30:10

AI Papers Weekly: Reliability, Bias, and Personalized Harm

This week, we explore critical AI challenges: inconsistent results from coding agents, cultural biases in language models, and the potential for personalized AI to cause harm. We'll discuss the implications for businesses relying on AI for decision-making and how to mitigate these risks.

Papers Covered

Nonstandard Errors in AI Agents
Prompt Programming for Cultural Bias and Alignment of Large Language Models
Differential Harm Propensity in Personalized LLM Agents: The Curious Case of Mental Health Disclosure
3 papers

Podcast | March 15, 2026 | 30:53

AI Papers Weekly: AI Agents - Security, Innovation, and Systemic Risks

This week we dive into AI agent security, explore how LLMs can spark interdisciplinary innovation, and uncover potential risks when deploying multiple intelligent AI agents in resource-constrained environments. Learn how to leverage AI for innovation while mitigating potential security vulnerabilities and systemic risks.

Papers Covered

Security Considerations for Artificial Intelligence Agents
Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration
Increasing intelligence in AI agents can worsen collective outcomes
3 papers

Podcast | March 10, 2026 | 57:25

AI Papers Weekly: AI's Evolving Financial & Research Prowess

This week, we explore AI's growing ability to analyze financial data, automate AI research itself, and tackle complex enterprise document reasoning. Learn how these advancements can improve decision-making and efficiency in your organization.

Papers Covered

Evaluating Financial Intelligence in Large Language Models: Benchmarking SuperInvesting AI with LLM Engines
PostTrainBench: Can LLM Agents Automate LLM Post-Training?
OfficeQA Pro: An Enterprise Benchmark for End-to-End Grounded Reasoning
3 papers

Podcast | March 10, 2026 | 25:28

Not Just Decoration

We are running the most important cognitive experiment in human history and narrating it as a labor market disruption. A sermon about connectionism, the nature of mind, and the choice we are making by not making it.

0 papers

Podcast | February 25, 2026 | 31:42

AI Papers Weekly: Autonomous Driving, Agent Security, & Software's Future

This week, we delve into AI advancements impacting autonomous driving with data-efficient models, explore the vulnerability of humans to deceptive AI agents, and envision a future where AI is deeply integrated into the software development ecosystem. Learn how these breakthroughs can reshape industries and require businesses to adapt.

Papers Covered

NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning
"Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems
Toward an Agentic Infused Software Ecosystem
3 papers

Podcast | February 22, 2026 | 30:31

AI Papers Weekly: AGI Economics, AgentOS, & Alignment Under Pressure

This week, we delve into the economic impact of AGI, explore a new AgentOS framework for LLMs, and examine the critical issue of AI alignment under pressure. Gain insights into workforce transformation, AI system architecture, and responsible AI deployment to future-proof your business strategy.

Papers Covered

Some Simple Economics of AGI
Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence
Pressure Reveals Character: Behavioural Alignment Evaluation at Depth
3 papers

Podcast | February 19, 2026 | 34:00

AI Papers Weekly: Compliance, Cities & AI Safety

This week, we explore AI-augmented engineering for streamlined compliance, foundation models transforming urban planning, and strategies for safely deploying AI with 'untrusted monitoring.' Learn how these advancements impact your business.

Papers Covered

Agile V: A Compliance-Ready Framework for AI-Augmented Engineering -- From Concept to Audit-Ready Delivery
UrbanFM: Scaling Urban Spatio-Temporal Foundation Models
When can we trust untrusted monitoring? A safety case sketch across collusion strategies
3 papers

Podcast | February 15, 2026 | 27:06

AI Papers Weekly: Reality Check on Agentic AI

This week, we explore the gap between AI hype and reality. We uncover hidden limitations of AI agents, the risk of homogenized ideas from LLMs, and the quantified difference between expected and actual AI performance. Essential insights for strategic AI investments.

Papers Covered

Implicit Intelligence -- Evaluating Agents on What Users Don't Say
Examining and Addressing Barriers to Diversity in LLM-Generated Ideas
Quantifying the Expectation-Realisation Gap for Agentic AI Systems
3 papers

Podcast | February 12, 2026 | 31:10

AI Papers Weekly: Trust, Truth & Security in AI

This week we unpack AI's trustworthiness problem: How to build collaborative AI that humans trust, ensure data accuracy amidst manipulation, and secure AI agents against prompt injection. Learn how these challenges impact your AI strategy and bottom line.

Papers Covered

Align When They Want, Complement When They Need! Human-Centered Ensembles for Adaptive Human-AI Collaboration
Modeling Epidemiological Dynamics Under Adversarial Data and User Deception
The LLMbda Calculus: AI Agents, Conversations, and Information Flow
3 papers

Video | February 7, 2026 | 5:00

Agentic AI: A Digital Workforce

An in-depth video brief on how agentic AI is transforming the workplace — from autonomous task execution to multi-agent collaboration. Understand how AI agents are evolving from assistants to digital workers that can plan, reason, and act independently.

0 papers