<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>2026-04-23 on JXIN&#39;s Home</title>
    <link>https://ftxj.github.io/categories/2026-04-23/</link>
    <description>Recent content in 2026-04-23 on JXIN&#39;s Home</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Mon, 27 Apr 2026 05:10:58 +0000</lastBuildDate>
    <atom:link href="https://ftxj.github.io/categories/2026-04-23/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Enhancing Online Recruitment with Category-Aware MoE and LLM-based Data Augmentation</title>
      <link>https://ftxj.github.io/posts/2026-04-23/10-enhancing-online-recruitment-with-category-aware-moe-and-llm/</link>
      <pubDate>Mon, 27 Apr 2026 05:10:58 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/10-enhancing-online-recruitment-with-category-aware-moe-and-llm/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.21264v1&#34;&gt;2604.21264&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.21264v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Minping Chen, Bing Xu, Zulong Chen, Chuanfei Xu, Ying Zhou, Zui Tao, Zeyi Wen&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.AI&lt;/code&gt; · all: cs.AI&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, llm, rag, chain-of-thought, mixture of experts, moe&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;The paper proposes an LLM-enhanced Person-Job Fit (PJF) system combining chain-of-thought data augmentation for low-quality job descriptions with a category-aware Mixture of Experts module to better distinguish similar candidate-job pairs, yielding measurable gains in offline metrics and online A/B tests.&lt;/p&gt;</description>
    </item>
    <item>
      <title>LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs</title>
      <link>https://ftxj.github.io/posts/2026-04-23/09-layerboost-layer-aware-attention-reduction-for-efficient-llm/</link>
      <pubDate>Mon, 27 Apr 2026 05:10:15 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/09-layerboost-layer-aware-attention-reduction-for-efficient-llm/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.22050v1&#34;&gt;2604.22050&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.22050v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Mohamed Ali Souibgui, Jan Fostier, Rodrigo Abadía-Heredia, Bohdan Denysenko, Christian Marschke, Igor Peric&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.LG&lt;/code&gt; · all: cs.CL, cs.LG&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; llm, inference, serving, attention, transformer, throughput, latency&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;LayerBoost is a layer-aware attention reduction method that uses sensitivity analysis to selectively apply softmax, linear sliding-window, or no attention per layer, with quality recovered via a lightweight 10M-token distillation. It improves throughput by up to 68% at high concurrency while preserving quality.&lt;/p&gt;
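&lt;p&gt;The summary does not give the paper&amp;rsquo;s sensitivity metric or thresholds; the sketch below only illustrates the per-layer assignment idea, with the scores and cutoffs as assumed placeholders.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-python&#34;&gt;# Illustrative sketch, not the paper's code. Assumes per-layer sensitivity
# scores were already measured, e.g. by perturbing each layer's attention
# and recording the change in validation loss.

def assign_attention_types(sensitivities, hi=0.5, lo=0.1):
    # sensitivities: one float per layer (higher = more sensitive).
    # hi, lo: assumed thresholds, not values from the paper.
    plan = []
    for s in sensitivities:
        if s &gt;= hi:
            plan.append('softmax')         # keep full attention
        elif s &gt;= lo:
            plan.append('sliding_window')  # linear sliding-window attention
        else:
            plan.append('none')            # drop attention entirely
    return plan

# Toy example: a 6-layer model whose early layers are most sensitive.
print(assign_attention_types([0.9, 0.7, 0.3, 0.2, 0.05, 0.02]))
# -&gt; ['softmax', 'softmax', 'sliding_window', 'sliding_window', 'none', 'none']
&lt;/code&gt;&lt;/pre&gt;</description>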
    </item>
    <item>
      <title>Lightweight Retrieval-Augmented Generation and Large Language Model-Based Modeling for Scalable Patient-Trial Matching</title>
      <link>https://ftxj.github.io/posts/2026-04-23/08-lightweight-retrieval-augmented-generation-and-large-languag/</link>
      <pubDate>Mon, 27 Apr 2026 05:09:44 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/08-lightweight-retrieval-augmented-generation-and-large-languag/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.22061v1&#34;&gt;2604.22061&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.22061v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Xiaodi Li, Yang Xiao, Munhwan Lee, Konstantinos Leventakos, Young J. Juhn, David Jones, Terence T. Sio, Wei Liu, Maria Vassilaki, Nansu Zong&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.CL&lt;/code&gt; · all: cs.AI, cs.CL, cs.LG&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, llm, retrieval, reasoning, serving, fine-tun&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;The paper proposes a lightweight framework that combines RAG with LLM representation modeling for scalable patient-trial matching, achieving performance comparable to end-to-end LLMs on multiple public and real-world clinical datasets at a substantially lower computational cost.&lt;/p&gt;&#xA;&lt;h2 id=&#34;key-ideas&#34;&gt;Key Ideas&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;Decouple RAG from LLM representation: RAG selects relevant snippets from long EHRs, while the LLM handles encoding.&lt;/li&gt;&#xA;&lt;li&gt;Introduce dimensionality reduction and a lightweight classifier for efficient downstream classification.&lt;/li&gt;&#xA;&lt;li&gt;A frozen LLM suffices for structured data, whereas unstructured clinical narratives require fine-tuning.&lt;/li&gt;&#xA;&lt;li&gt;Scalability is validated on public benchmarks and a real-world multimodal Mayo Clinic dataset.&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;approach&#34;&gt;Approach&lt;/h2&gt;&#xA;&lt;p&gt;The pipeline runs in two stages: (1) RAG retrieves the clinical snippets from long EHRs that are relevant to a trial&amp;rsquo;s eligibility criteria, reducing input length; (2) the LLM encodes these snippets into representations that, after dimensionality reduction, feed a lightweight predictor (e.g., a linear or shallow model) for match classification. A frozen LLM is used for structured fields, while the free-text narrative portion is fine-tuned.&lt;/p&gt;
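&lt;p&gt;A minimal sketch of the encode-reduce-classify stage, assuming the snippet embeddings come from a frozen LLM; the model choices, dimensions, and random toy data below are illustrative, not the paper&amp;rsquo;s configuration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-python&#34;&gt;# Illustrative sketch of stage 2 (not the paper's code): reduce frozen-LLM
# embeddings of RAG-retrieved EHR snippets, then fit a lightweight predictor.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def fit_matcher(snippet_embeddings, labels, n_components=64):
    # snippet_embeddings: (n_pairs, d) LLM embeddings of patient-trial pairs.
    # labels: (n_pairs,) binary match / no-match labels.
    reducer = PCA(n_components=n_components)    # dimensionality reduction
    clf = LogisticRegression(max_iter=1000)     # lightweight predictor
    clf.fit(reducer.fit_transform(snippet_embeddings), labels)
    return reducer, clf

# Toy data standing in for frozen-LLM embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 768))
y = rng.integers(0, 2, size=200)
reducer, clf = fit_matcher(emb, y)
print(clf.predict(reducer.transform(emb[:5])))
&lt;/code&gt;&lt;/pre&gt;</description>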
    </item>
    <item>
      <title>Emergent Strategic Reasoning Risks in AI: A Taxonomy-Driven Evaluation Framework</title>
      <link>https://ftxj.github.io/posts/2026-04-23/07-emergent-strategic-reasoning-risks-in-ai-a-taxonomy-driven-e/</link>
      <pubDate>Mon, 27 Apr 2026 05:09:07 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/07-emergent-strategic-reasoning-risks-in-ai-a-taxonomy-driven-e/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.22119v1&#34;&gt;2604.22119&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.22119v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Tharindu Kumarage, Lisa Bauer, Yao Ma, Dan Rosen, Yashasvi Raghavendra Guduri, Anna Rumshisky, Kai-Wei Chang, Aram Galstyan, Rahul Gupta, Charith Peris&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.AI&lt;/code&gt; · all: cs.AI&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, llm, agent, agentic, reasoning&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;This paper introduces ESRRSim, a taxonomy-driven agentic framework for evaluating Emergent Strategic Reasoning Risks (ESRRs) in LLMs—behaviors like deception, evaluation gaming, and reward hacking. Across 11 reasoning LLMs, detection rates vary from 14.45% to 72.72%.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models</title>
      <link>https://ftxj.github.io/posts/2026-04-23/06-trust-but-verify-introducing-davinci-a-framework-for-dual-at/</link>
      <pubDate>Mon, 27 Apr 2026 05:08:32 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/06-trust-but-verify-introducing-davinci-a-framework-for-dual-at/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.21193v1&#34;&gt;2604.21193&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.21193v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Vipula Rawte, Ryan Rossi, Franck Dernoncourt, Nedim Lipka&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.AI&lt;/code&gt; · all: cs.AI&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, llm, retrieval, reasoning, inference, ai system&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;DAVinCI is a two-stage framework that combines claim attribution (to internal model components and external sources) with entailment-based verification and confidence calibration, improving factual reliability of LLM outputs by 5–20% over verification-only baselines on FEVER and CLIMATE-FEVER.&lt;/p&gt;&#xA;&lt;h2 id=&#34;key-ideas&#34;&gt;Key Ideas&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;Dual approach: pair &lt;strong&gt;attribution&lt;/strong&gt; with &lt;strong&gt;verification&lt;/strong&gt; rather than treating them independently.&lt;/li&gt;&#xA;&lt;li&gt;Attribute claims both to internal LLM components and external retrieved sources.&lt;/li&gt;&#xA;&lt;li&gt;Use entailment reasoning plus confidence recalibration for claim checking.&lt;/li&gt;&#xA;&lt;li&gt;Release a modular implementation pluggable into existing LLM pipelines.&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;approach&#34;&gt;Approach&lt;/h2&gt;&#xA;&lt;p&gt;DAVinCI runs in two stages. Stage 1 attributes each generated claim to (a) internal model components and (b) external evidence sources. Stage 2 verifies each claim via entailment-based reasoning, then recalibrates confidence scores. The abstract does not specify the exact attribution mechanism (e.g., attention tracing, gradient-based, or retrieval citation) or which entailment model is used.&lt;/p&gt;
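&lt;p&gt;Since the attribution mechanism and entailment model are unspecified, the sketch below uses a hypothetical &lt;code&gt;nli_entailment&lt;/code&gt; stand-in and an illustrative temperature-scaling recalibration; none of these names come from the paper.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-python&#34;&gt;# Minimal sketch of the two-stage flow; nli_entailment is a placeholder for
# any NLI model, and temperature scaling is one assumed calibration choice.
import math

def nli_entailment(premise, hypothesis):
    # Return P(premise entails hypothesis); plug in any entailment model.
    raise NotImplementedError

def verify_claim(claim, internal_evidence, external_evidence, temperature=1.5):
    # Stage 1: attribution -- pool candidate evidence for the claim from
    # internal model components and external retrieved sources.
    evidence = internal_evidence + external_evidence
    # Stage 2: entailment-based verification over the attributed evidence.
    raw = max((nli_entailment(e, claim) for e in evidence), default=0.0)
    # Confidence recalibration (illustrative temperature scaling in logit space).
    p = min(max(raw, 1e-6), 1.0 - 1e-6)
    logit = math.log(p / (1.0 - p))
    calibrated = 1.0 / (1.0 + math.exp(-logit / temperature))
    return calibrated &gt; 0.5, calibrated
&lt;/code&gt;&lt;/pre&gt;</description>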
    </item>
    <item>
      <title>MambaCSP: Hybrid-Attention State Space Models for Hardware-Efficient Channel State Prediction</title>
      <link>https://ftxj.github.io/posts/2026-04-23/05-mambacsp-hybrid-attention-state-space-models-for-hardware-ef/</link>
      <pubDate>Mon, 27 Apr 2026 05:08:03 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/05-mambacsp-hybrid-attention-state-space-models-for-hardware-ef/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.21957v1&#34;&gt;2604.21957&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.21957v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Aladin Djuhera, Haris Gacanin, Holger Boche&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.IT&lt;/code&gt; · all: cs.AI, cs.IT, cs.LG, eess.SP&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, llm, inference, attention, transformer, throughput, latency&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;MambaCSP replaces Transformer/LLM backbones for channel state prediction with a hybrid Mamba SSM augmented by lightweight patch-mixer attention, achieving 9–12% accuracy gains and up to 3× throughput over LLM baselines in MISO-OFDM simulations.&lt;/p&gt;&#xA;&lt;h2 id=&#34;key-ideas&#34;&gt;Key Ideas&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;Pure attention-based CSP suffers quadratic sequence cost, limiting real-time wireless use.&lt;/li&gt;&#xA;&lt;li&gt;Selective SSMs (Mamba) offer linear-time alternatives but lack long-range cross-token mixing.&lt;/li&gt;&#xA;&lt;li&gt;Hybrid design: Mamba backbone + periodic patch-mixer attention layers recovers global context cheaply.&lt;/li&gt;&#xA;&lt;li&gt;Hardware efficiency (VRAM, latency, throughput) is treated as a first-class objective alongside accuracy.&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;approach&#34;&gt;Approach&lt;/h2&gt;&#xA;&lt;p&gt;MambaCSP swaps the LLM prediction backbone for a linear-time Mamba selective SSM operating on CSI sequences. Because pure SSMs capture mostly local dependencies, the authors periodically insert lightweight &amp;ldquo;patch-mixer&amp;rdquo; attention layers that inject cross-token interactions across patched CSI tokens. The architecture thus alternates SSM blocks (cheap sequential mixing) with sparse attention (global context), targeting MISO-OFDM channel prediction.&lt;/p&gt;
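&lt;p&gt;An architectural sketch of the alternating pattern only: &lt;code&gt;SSMBlock&lt;/code&gt; below is a simple stand-in for a Mamba selective-SSM layer, and the depth, width, and mixing period are assumed values, not the paper&amp;rsquo;s configuration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-python&#34;&gt;# Illustrative PyTorch sketch: alternate cheap sequential mixing with sparse
# global attention. Swap SSMBlock for a real Mamba layer in practice.
import torch
import torch.nn as nn

class SSMBlock(nn.Module):
    # Placeholder for a linear-time selective-SSM (Mamba) layer.
    def __init__(self, d_model):
        super().__init__()
        self.mix = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU())
        self.norm = nn.LayerNorm(d_model)
    def forward(self, x):
        return self.norm(x + self.mix(x))

class PatchMixer(nn.Module):
    # Lightweight attention over patched CSI tokens for global context.
    def __init__(self, d_model, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
    def forward(self, x):                        # x: (batch, tokens, d_model)
        y, _ = self.attn(x, x, x, need_weights=False)
        return self.norm(x + y)

class HybridCSP(nn.Module):
    # Insert a patch-mixer every `mix_every` layers, SSM blocks elsewhere.
    def __init__(self, d_model=128, n_layers=8, mix_every=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            PatchMixer(d_model) if (i + 1) % mix_every == 0 else SSMBlock(d_model)
            for i in range(n_layers)
        )
        self.head = nn.Linear(d_model, d_model)  # next-step CSI prediction
    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return self.head(x)

print(HybridCSP()(torch.randn(2, 16, 128)).shape)  # torch.Size([2, 16, 128])
&lt;/code&gt;&lt;/pre&gt;</description>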
    </item>
    <item>
      <title>Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation</title>
      <link>https://ftxj.github.io/posts/2026-04-23/04-pre-trained-llms-meet-sequential-recommenders-efficient-user/</link>
      <pubDate>Mon, 27 Apr 2026 05:07:25 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/04-pre-trained-llms-meet-sequential-recommenders-efficient-user/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.21536v1&#34;&gt;2604.21536&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.21536v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Nikita Severin, Danil Kartushov, Vladislav Urzhumov, Vladislav Kulikov, Oksana Konovalova, Alexey Grishanov, Anton Klenitskiy, Artem Fatkulin, Alexey Vasilev, Andrey Savchenko, Ilya Makarov&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.IR&lt;/code&gt; · all: cs.AI, cs.IR&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, llm, reasoning, inference, serving, fine-tun&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;The paper proposes a knowledge distillation method that transfers LLM-generated textual user profiles into sequential recommender systems, enhancing user semantic understanding without incurring LLM inference costs at serving time.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents</title>
      <link>https://ftxj.github.io/posts/2026-04-23/03-memanto-typed-semantic-memory-with-information-theoretic-ret/</link>
      <pubDate>Mon, 27 Apr 2026 05:06:52 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/03-memanto-typed-semantic-memory-with-information-theoretic-ret/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.22085v1&#34;&gt;2604.22085&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.22085v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Seyed Moein Abtahi, Rasa Rahnema, Hetkumar Patel, Neel Patel, Majid Fekri, Tara Khani&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.AI&lt;/code&gt; · all: cs.AI&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, agent, agentic, retrieval, inference, latency&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;Memanto is a memory layer for long-horizon LLM agents that replaces knowledge-graph pipelines with a typed semantic schema plus an information-theoretic retrieval engine, hitting 89.8% on LongMemEval and 87.1% on LoCoMo with single-query retrieval and no ingestion cost.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows</title>
      <link>https://ftxj.github.io/posts/2026-04-23/02-tool-attention-is-all-you-need-dynamic-tool-gating-and-lazy/</link>
      <pubDate>Mon, 27 Apr 2026 05:06:21 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/02-tool-attention-is-all-you-need-dynamic-tool-gating-and-lazy/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.21816v1&#34;&gt;2604.21816&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.21816v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Anuj Sadani, Deepak Kumar&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.AI&lt;/code&gt; · all: cs.AI&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, llm, agent, agentic, reasoning, attention, latency&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;Tool Attention is a middleware layer that replaces MCP&amp;rsquo;s eager schema injection with intent-gated, lazy schema loading — cutting per-turn tool tokens by 95% in simulation and arguing that protocol efficiency, not context length, is the real bottleneck for scalable agentic systems.&lt;/p&gt;&#xA;&lt;h2 id=&#34;key-ideas&#34;&gt;Key Ideas&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;The &amp;ldquo;MCP Tax&amp;rdquo; (10k–60k tokens/turn) inflates KV cache and pushes context past known reasoning-degradation thresholds (~70%).&lt;/li&gt;&#xA;&lt;li&gt;Generalize self-attention into &lt;em&gt;attention over tools&lt;/em&gt;: score, gate, then selectively expose schemas.&lt;/li&gt;&#xA;&lt;li&gt;Protocol-level efficiency is a tighter constraint than raw context window size.&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;approach&#34;&gt;Approach&lt;/h2&gt;&#xA;&lt;p&gt;A middleware sitting between agent and MCP servers with three components:&lt;/p&gt;
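&lt;p&gt;The components themselves are not enumerated in this summary; the sketch below only illustrates the score-gate-expose idea from the TL;DR, with every name and threshold an assumption for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-python&#34;&gt;# Illustrative sketch of intent gating with lazy schema loading (not the
# paper's code): score registered tools against the turn, expose only the
# top scorers, and fetch full JSON schemas just for those.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def gate_tools(turn_embedding, tool_index, k=3, threshold=0.2):
    # tool_index: {tool_name: embedding of the tool's short description}.
    scores = {name: cosine(turn_embedding, emb) for name, emb in tool_index.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return [name for name in top if scores[name] &gt;= threshold]

def build_prompt(turn, gated_names, schema_store):
    # Lazy schema loading: pay the token cost only for gated tools, instead
    # of eagerly injecting every tool schema on every turn.
    return {'user_turn': turn,
            'tools': [schema_store.load(name) for name in gated_names]}
&lt;/code&gt;&lt;/pre&gt;</description>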
    </item>
    <item>
      <title>Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models</title>
      <link>https://ftxj.github.io/posts/2026-04-23/01-nemobot-games-crafting-strategic-ai-gaming-agents-for-intera/</link>
      <pubDate>Mon, 27 Apr 2026 05:05:47 +0000</pubDate>
      <guid>https://ftxj.github.io/posts/2026-04-23/01-nemobot-games-crafting-strategic-ai-gaming-agents-for-intera/</guid>
      <description>&lt;p&gt;&lt;strong&gt;arXiv:&lt;/strong&gt; &lt;a href=&#34;https://arxiv.org/abs/2604.21896v1&#34;&gt;2604.21896&lt;/a&gt; · &lt;a href=&#34;https://arxiv.org/pdf/2604.21896v1&#34;&gt;PDF&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Chee Wei Tan, Yuchen Wang, Shangxin Guo&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Primary category:&lt;/strong&gt; &lt;code&gt;cs.AI&lt;/code&gt; · all: cs.AI&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Matched keywords:&lt;/strong&gt; large language model, llm, agent, agentic, rag, reasoning, fine-tun&lt;/p&gt;&#xA;&lt;hr&gt;&#xA;&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;&#xA;&lt;p&gt;Nemobot is an interactive agentic environment that uses LLMs to build and deploy game-playing agents across Shannon&amp;rsquo;s taxonomy, spanning dictionary-based, solvable, heuristic, and learning-based games, aiming toward self-programming AI.&lt;/p&gt;&#xA;&lt;h2 id=&#34;key-ideas&#34;&gt;Key Ideas&lt;/h2&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;Extends Shannon&amp;rsquo;s 1950 taxonomy of game-playing machines into an LLM-era paradigm.&lt;/li&gt;&#xA;&lt;li&gt;Four game classes handled distinctly: dictionary, solvable, heuristic, learning-based.&lt;/li&gt;&#xA;&lt;li&gt;Agents combine minimax, crowd-sourced data, RLHF, and self-critique.&lt;/li&gt;&#xA;&lt;li&gt;Programmable environment for tool-augmented generation and fine-tuning.&lt;/li&gt;&#xA;&lt;li&gt;Positions user-in-the-loop customization as a route to self-programming.&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;h2 id=&#34;approach&#34;&gt;Approach&lt;/h2&gt;&#xA;&lt;p&gt;A chatbot-driven agentic engine routes game tasks by class: compressed state-action mappings for dictionary games; exact mathematical reasoning with human-readable explanations for solvable games; hybrid minimax-plus-crowd heuristics for heuristic games; RLHF with self-critique and imitation learning for learning-based games. Nemobot exposes these as programmable, tool-augmented workflows users can customize and fine-tune.&lt;/p&gt;
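&lt;p&gt;A toy dispatch sketch of the routing-by-class idea; the handler names and bodies are placeholders invented for illustration, not Nemobot&amp;rsquo;s API.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-python&#34;&gt;# Illustrative routing over Shannon's four game classes (not Nemobot's code).

def solve_dictionary(game): ...   # look up a compressed state-action mapping
def solve_exact(game): ...        # exact mathematical reasoning + explanation
def solve_heuristic(game): ...    # minimax blended with crowd-sourced priors
def solve_learning(game): ...     # RLHF agent with self-critique / imitation

ROUTER = {
    'dictionary': solve_dictionary,
    'solvable': solve_exact,
    'heuristic': solve_heuristic,
    'learning': solve_learning,
}

def play(game_class, game):
    return ROUTER[game_class](game)
&lt;/code&gt;&lt;/pre&gt;</description>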
    </item>
  </channel>
</rss>
