From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

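1. What is it?

OneManCompany (OMC) is a framework that introduces an "organisational layer" into multi-agent systems.
Separately from improving individual agents' capabilities, it provides mechanisms for assembling, governing, and improving a workforce of agents.
It encapsulates skills, tools, and runtime configurations into portable agent identities called "Talents".
Through a "Talent Market", the organisation can recruit agents (Talents) on demand and reconfigure itself, giving it a dynamic structure.
Organisational decision-making runs as an "Explore-Execute-Review (E^2R) tree search", a hierarchical loop that unifies planning, execution, and evaluation.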
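To make the Talent and Talent Market ideas concrete, here is a minimal sketch. The abstract does not describe OMC's actual interfaces, so every name below (`Talent`, `TalentMarket`, `Organisation`, `close_gaps`) and the skill-matching rule are hypothetical illustrations, not the framework's API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Talent:
    """Hypothetical portable agent identity: skills + tools + runtime config."""
    name: str
    skills: frozenset[str]
    tools: tuple[str, ...] = ()
    runtime: dict = field(default_factory=dict)  # e.g. backend, model, limits


class TalentMarket:
    """Hypothetical registry from which an organisation recruits on demand."""

    def __init__(self) -> None:
        self._listings: list[Talent] = []

    def publish(self, talent: Talent) -> None:
        self._listings.append(talent)

    def recruit(self, required_skill: str) -> Optional[Talent]:
        # Assumption: first listing that offers the skill wins.
        for talent in self._listings:
            if required_skill in talent.skills:
                return talent
        return None


class Organisation:
    def __init__(self, market: TalentMarket) -> None:
        self.market = market
        self.roster: list[Talent] = []

    def covered_skills(self) -> set[str]:
        return set().union(*(t.skills for t in self.roster))

    def close_gaps(self, required: set[str]) -> set[str]:
        """Recruit Talents for missing skills; return skills still uncovered."""
        unfilled: set[str] = set()
        for skill in sorted(required - self.covered_skills()):
            if skill in self.covered_skills():
                continue  # an earlier recruit in this pass already covers it
            hire = self.market.recruit(skill)
            if hire is None:
                unfilled.add(skill)
            else:
                self.roster.append(hire)
        return unfilled


market = TalentMarket()
market.publish(Talent("frontend-dev", frozenset({"react", "css"}), tools=("browser",)))
market.publish(Talent("data-analyst", frozenset({"sql", "pandas"}), runtime={"backend": "local"}))

org = Organisation(market)
missing = org.close_gaps({"react", "sql", "rust"})
print([t.name for t in org.roster], missing)
```

In this toy run the organisation hires two Talents and reports `rust` as a capability gap the market cannot fill. A real market would presumably also need ranking, vetting, and versioning of listings, none of which the abstract specifies.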

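2. How does it improve on prior work?

It addresses three limitations of conventional multi-agent systems: fixed team structures, tightly coupled coordination logic, and session-bound learning.
By introducing the "organisational layer" concept and decoupling agent capabilities from organisational governance, it yields a more flexible and adaptive system.
The "Talent Market" enables on-demand recruitment and reconfiguration of agents, so the organisation can close capability gaps and reconfigure itself during execution.
The "E^2R" loop, which unifies planning, execution, and evaluation, enables self-organising, self-improving AI organisations and provides formal guarantees on termination and deadlock freedom.
On PRDBench, OMC achieves an 84.67% success rate, 15.48 percentage points above the previous SOTA.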

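3. What is the key to the technique or method?

**Talents:** Portable agent identities that encapsulate skills, tools, and runtime configurations. They are orchestrated through typed organisational interfaces that abstract over heterogeneous backends.
**Talent Market:** A community-driven mechanism for recruiting Talents on demand, closing the organisation's capability gaps, and reconfiguring it dynamically. This lets the organisation secure suitable Talents for each task.
**Explore-Execute-Review (E^2R) tree search:**
A single hierarchical loop that unifies planning, execution, and evaluation. Tasks are decomposed top-down into accountable units, and execution outcomes are aggregated bottom-up to drive systematic review and refinement.
It mirrors the feedback mechanisms of human enterprises and provides formal guarantees on termination and deadlock freedom.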
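The E^2R loop described above can be sketched as a recursive tree search. This is only an illustration built from the abstract's one-sentence description: the function `e2r`, the `Node` type, the callbacks, and the `MAX_DEPTH` / `MAX_RETRIES` bounds are all assumptions. In particular, bounding depth and retries is one simple way to make a loop like this terminate; it is not the paper's formal argument.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable

MAX_DEPTH = 3    # assumed bound on decomposition depth
MAX_RETRIES = 2  # assumed bound on re-execution after a failed review


@dataclass
class Node:
    task: str
    children: list[Node] = field(default_factory=list)
    result: str | None = None
    accepted: bool = False


def e2r(
    task: str,
    explore: Callable[[str], list[str]],
    execute: Callable[[str, int], str],
    review: Callable[[str, str], bool],
    depth: int = 0,
) -> Node:
    node = Node(task)
    # Explore: decompose top-down, but never beyond MAX_DEPTH.
    subtasks = explore(task) if depth < MAX_DEPTH else []
    if subtasks:
        node.children = [e2r(s, explore, execute, review, depth + 1) for s in subtasks]
        # Aggregate child outcomes bottom-up, then review the combined result.
        node.result = " + ".join(c.result or "" for c in node.children)
        node.accepted = all(c.accepted for c in node.children) and review(task, node.result)
        return node
    # Execute + Review at a leaf: retry a bounded number of times.
    for attempt in range(MAX_RETRIES + 1):
        node.result = execute(task, attempt)
        if review(task, node.result):
            node.accepted = True
            break
    return node


# Toy callbacks standing in for planner, worker, and reviewer agents.
def explore(task: str) -> list[str]:
    return {"build app": ["design", "implement"]}.get(task, [])


def execute(task: str, attempt: int) -> str:
    if task == "implement" and attempt == 0:
        return "draft"  # first attempt is deliberately rejected by review
    return f"done:{task}"


def review(task: str, result: str) -> bool:
    return "draft" not in result


root = e2r("build app", explore, execute, review)
print(root.accepted, root.result)
```

Here the `implement` leaf fails its first review, is re-executed once, and the parent accepts the aggregated result. Because each call either recurses to a strictly greater depth or runs a fixed-size retry loop, the sketch always terminates as long as `explore` returns finite lists.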

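4. How was its effectiveness validated?

**Empirical evaluation on PRDBench:**
OMC achieves an 84.67% success rate, outperforming the existing state of the art (SOTA) by 15.48 percentage points.
**Cross-domain case studies:**
Case studies across several different domains demonstrate the generality of the OMC framework and its ability to adapt to open-ended tasks.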
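Since the margin is reported in percentage points rather than percent, a quick calculation helps in reading the PRDBench numbers. Only the two input figures come from the abstract; the implied prior SOTA and the relative gain are derived here and are not stated in the source.

```python
omc_success = 84.67  # % success rate on PRDBench (from the abstract)
margin_pp = 15.48    # percentage points above prior SOTA (from the abstract)

# Derived values (not stated in the abstract).
implied_sota = round(omc_success - margin_pp, 2)
relative_gain = round(100 * margin_pp / implied_sota, 1)

print(f"implied prior SOTA: {implied_sota}%")
print(f"relative improvement: {relative_gain}%")
```

Under this reading the previous best result would be about 69.19%, so the absolute gain of 15.48 points corresponds to a relative improvement of roughly 22.4%.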

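5. Is there any discussion?

The abstract does not state any specific discussion points or limitations.
Likely points of discussion include trustworthiness in a community-driven Talent Market and mechanisms for excluding malicious Talents, the computational cost of the E^2R loop and its scalability to large organisations, and the ethical aspects of organisational decision-making.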

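6. Which papers should be read next?

Research papers on organisational structure and governance in multi-agent systems.
Papers that go deeper into plan-execute-evaluate loops for autonomous agents (e.g. ReAct, Reflexion).
Papers on agent skill learning and tool use, especially dynamic tool integration.
Detailed papers on PRDBench and similar multi-agent evaluation benchmarks.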

Abstract (original)

Individual agent capabilities have advanced rapidly through modular skills and tool integrations, yet multi-agent systems remain constrained by fixed team structures, tightly coupled coordination logic, and session-bound learning. We argue that this reflects a deeper absence: a principled organisational layer that governs how a workforce of agents is assembled, governed, and improved over time, decoupled from what individual agents know. To fill this gap, we introduce OneManCompany (OMC), a framework that elevates multi-agent systems to the organisational level. OMC encapsulates skills, tools, and runtime configurations into portable agent identities called Talents, orchestrated through typed organisational interfaces that abstract over heterogeneous backends. A community-driven Talent Market enables on-demand recruitment, allowing the organisation to close capability gaps and reconfigure itself dynamically during execution. Organisational decision-making is operationalised through an Explore-Execute-Review (E^2R) tree search, which unifies planning, execution, and evaluation in a single hierarchical loop: tasks are decomposed top-down into accountable units and execution outcomes are aggregated bottom-up to drive systematic review and refinement. This loop provides formal guarantees on termination and deadlock freedom while mirroring the feedback mechanisms of human enterprises. Together, these contributions transform multi-agent systems from static, pre-configured pipelines into self-organising and self-improving AI organisations capable of adapting to open-ended tasks across diverse domains. Empirical evaluation on PRDBench shows that OMC achieves an 84.67% success rate, surpassing the state of the art by 15.48 percentage points, with cross-domain case studies further demonstrating its generality.

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company 💻 Code available
