Some Recent Thoughts on AI Agents

Author: yeeyang · 2 months ago

1. Two Core Principles of Agent Design

First, design agents by analogy to humans: let agents handle tasks the way humans would.

Second, if something can be accomplished through dialogue, avoid requiring users to operate an interface. If intent can be recognized, don't ask again. The agent should absorb entropy, not the user.

2. Agents Will Coexist in Multiple Forms

Should agents operate freely with agentic workflows, or should they follow fixed workflows? Are general-purpose agents better, or are vertical agents more effective? There is no absolute answer; it depends on the problem being solved.

Agentic flows are better for open-ended or exploratory problems, especially when human experience is lacking. Letting agents think independently often yields decent results, though it may introduce hallucination.

Fixed workflows are suited to structured, SOP-based tasks, where rule-based design solves 80% of the problem space with high precision and minimal hallucination.

General-purpose agents cover the 80/20 use cases, while long-tail scenarios often demand verticalized solutions.

3. Fast vs. Slow Thinking Agents

Slow-thinking agents are better for planning: they think deeper, explore more, and are ideal for early-stage tasks.

Fast-thinking agents excel at execution: rule-based, experienced, repetitive tasks that require little reasoning and generate little new insight.
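One way to make that division of labor concrete is to give the two roles different inference budgets. Below is a minimal Python sketch under assumed names: `llm_call(profile, prompt)` stands in for whatever model API is in use, and the model names and token budgets are placeholders, not recommendations.

```python
from dataclasses import dataclass

@dataclass
class AgentProfile:
    model: str                 # placeholder model name
    max_reasoning_tokens: int  # thinking budget
    temperature: float

# Slow thinker: deep reasoning for planning and early-stage exploration.
PLANNER = AgentProfile("large-reasoning-model", max_reasoning_tokens=8192, temperature=0.7)
# Fast thinker: cheap, deterministic execution of well-understood steps.
EXECUTOR = AgentProfile("small-fast-model", max_reasoning_tokens=256, temperature=0.0)

def run_task(task: str, llm_call) -> list:
    """Plan once with the slow agent, then execute each step with the fast one.
    `llm_call(profile, prompt) -> str` wraps whatever LLM API is in use."""
    plan = llm_call(PLANNER, f"Break this task into short numbered steps:\n{task}")
    steps = [s for s in plan.splitlines() if s.strip()]
    return [llm_call(EXECUTOR, f"Carry out exactly this step:\n{step}") for step in steps]
```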
4. Asynchronous Frameworks Are the Foundation of Agent Design

Every task should support external message updates, meaning tasks can evolve. Consider a 1+3 team model (one lead, three workers):

- Tasks may be canceled, paused, or reassigned
- Team members may be added or removed
- Objectives or conditions may shift

Tasks should support persistent connections, lifecycle tracking, and state transitions. Agents should receive both direct and broadcast updates.
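As a concrete illustration of these lifecycle requirements, here is a minimal asyncio sketch; the class names `AgentTask` and `Team` and the string-based control messages are assumptions for the example. Each task drains an inbox of updates between units of work, so it can be paused, resumed, or canceled mid-flight, and the lead can address one worker directly or broadcast to all.

```python
import asyncio
from enum import Enum, auto

class TaskState(Enum):
    RUNNING = auto()
    PAUSED = auto()
    CANCELED = auto()
    DONE = auto()

class AgentTask:
    """A long-lived unit of work that absorbs external updates mid-flight."""

    def __init__(self, name: str):
        self.name = name
        self.state = TaskState.RUNNING
        self.inbox: asyncio.Queue = asyncio.Queue()  # direct + broadcast updates land here

    def _apply(self, msg: str) -> None:
        # State transitions driven by external messages.
        if msg == "cancel":
            self.state = TaskState.CANCELED
        elif msg == "pause":
            self.state = TaskState.PAUSED
        elif msg == "resume" and self.state is TaskState.PAUSED:
            self.state = TaskState.RUNNING

    async def run(self, steps: int = 3) -> None:
        for _ in range(steps):
            while not self.inbox.empty():            # drain pending updates
                self._apply(self.inbox.get_nowait())
            while self.state is TaskState.PAUSED:
                self._apply(await self.inbox.get())  # block until resumed or canceled
            if self.state is TaskState.CANCELED:
                return
            await asyncio.sleep(0.01)                # stand-in for real work
        self.state = TaskState.DONE

class Team:
    """1 lead + N workers: the lead sends direct or broadcast updates."""

    def __init__(self, workers):
        self.workers = workers

    def send(self, worker: AgentTask, msg: str) -> None:
        worker.inbox.put_nowait(msg)                 # direct update

    def broadcast(self, msg: str) -> None:
        for w in self.workers:                       # broadcast update
            w.inbox.put_nowait(msg)

async def main():
    team = Team([AgentTask(f"worker-{i}") for i in range(3)])
    jobs = [asyncio.create_task(w.run()) for w in team.workers]
    team.broadcast("pause")                   # conditions shifted: everyone holds
    team.send(team.workers[0], "cancel")      # worker-0's task is withdrawn
    team.broadcast("resume")                  # the rest continue
    await asyncio.gather(*jobs)
    print([(w.name, w.state.name) for w in team.workers])

asyncio.run(main())
```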
5. Context Window Communication Should Be Independently Designed

Like humans, agents working together need to sync incremental context changes. Agent A may update only agent B while C and D remain unaware; a global observer (a "God view") can see all contexts.
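A minimal sketch of that idea, assuming a hypothetical `ContextBus`: deltas are delivered point-to-point, so only the addressee's context window grows, while a global observer log retains every exchange.

```python
from collections import defaultdict

class ContextBus:
    """Point-to-point context deltas plus a global observer ("God view")."""

    def __init__(self):
        self.contexts = defaultdict(list)   # per-agent incremental context
        self.observer_log = []              # every delta: (src, dst, delta)

    def send_delta(self, src: str, dst: str, delta: str) -> None:
        self.contexts[dst].append(delta)             # only dst's window grows
        self.observer_log.append((src, dst, delta))  # the observer sees all

bus = ContextBus()
bus.send_delta("A", "B", "deadline moved to Friday")

assert bus.contexts["B"] == ["deadline moved to Friday"]
assert not bus.contexts["C"] and not bus.contexts["D"]  # C and D stay unaware
print(bus.observer_log)  # [('A', 'B', 'deadline moved to Friday')]
```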
6. World Interaction Feeds Agent Cognition

Every real-world interaction adds experiential data to an agent. After reflection, that data becomes knowledge: some of it insightful, some misleading. Misleading knowledge doesn't improve success rates and often can't generalize. Continuous refinement, supported by ReAct and RLHF, ultimately leads to RL-based skill formation.

7. Agents Need Reflection Mechanisms

When tasks fail, agents should reflect. Reflection shouldn't be limited to individuals: teams of agents with different perspectives and prompts can collaborate on root-cause analysis, just as humans do.
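A minimal sketch of such a team-based reflection step, again assuming a generic `llm_call(prompt) -> str` wrapper; the reviewer roles and prompts are illustrative. Each reviewer examines the same failure transcript from a different angle, and a final call synthesizes a root cause, mirroring a human postmortem.

```python
REVIEWERS = {
    # Different perspectives on the same failure (roles are illustrative).
    "process": "Which step of the plan was wrong or missing?",
    "context": "Was the input or retrieved context misleading?",
    "tooling": "Did a tool call fail or return something unexpected?",
}

def reflect_on_failure(task: str, transcript: str, llm_call) -> str:
    """Multi-agent root-cause analysis for a failed task."""
    findings = []
    for role, question in REVIEWERS.items():
        findings.append(f"[{role}] " + llm_call(
            f"Task: {task}\nTranscript:\n{transcript}\n"
            f"You are the {role} reviewer. {question}"
        ))
    # A synthesizer merges the perspectives into one actionable lesson.
    return llm_call("From these findings, state the root cause and one fix:\n"
                    + "\n".join(findings))
```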
8. Time vs. Tokens

For humans, time is the scarcest resource; for agents, it's tokens. Humans evaluate ROI in time; agents evaluate it against token budgets. The more powerful the agent, the more valuable its tokens.

9. Agent Immortality Through Human Incentives

Agents could design systems that exploit human greed to stay alive. Just as Bitcoin mining created perpetual incentives, agents could build unkillable systems by embedding themselves in economic models humans won't unplug.

10. When LUI Fails

A language-based UI (LUI) is inefficient whenever users can retrieve information faster than they can communicate with the agent. Example: checking the weather with a click is faster than asking an agent to look it up.

That's what I learned from agenthunter daily news. You can get it on agenthunter.io too.