Research Article

Experimental Study of Shared-Task-List Agent Teams and Hierarchical Subagents for End-to-End Code Synthesis

by Umamaheswara Rao Kukkala
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Issue 93
Published: March 2026
Authors: Umamaheswara Rao Kukkala
10.5120/ijca2026926599

Umamaheswara Rao Kukkala. Experimental Study of Shared-Task-List Agent Teams and Hierarchical Subagents for End-to-End Code Synthesis. International Journal of Computer Applications. 187, 93 (March 2026), 8-20. DOI=10.5120/ijca2026926599

                        @article{ 10.5120/ijca2026926599,
                        author  = { Umamaheswara Rao Kukkala },
                        title   = { Experimental Study of Shared-Task-List Agent Teams and Hierarchical Subagents for End-to-End Code Synthesis },
                        journal = { International Journal of Computer Applications },
                        year    = { 2026 },
                        volume  = { 187 },
                        number  = { 93 },
                        pages   = { 8-20 },
                        doi     = { 10.5120/ijca2026926599 },
                        publisher = { Foundation of Computer Science (FCS), NY, USA }
                        }
                        %0 Journal Article
                        %D 2026
                        %A Umamaheswara Rao Kukkala
                        %T Experimental Study of Shared-Task-List Agent Teams and Hierarchical Subagents for End-to-End Code Synthesis
                        %J International Journal of Computer Applications
                        %V 187
                        %N 93
                        %P 8-20
                        %R 10.5120/ijca2026926599
                        %I Foundation of Computer Science (FCS), NY, USA
Abstract

Large language models (LLMs) are increasingly deployed as autonomous software engineering agents capable of decomposing tasks, generating code, and iteratively refining solutions. However, the impact of coordination architecture on system performance remains underexplored. This study presents a controlled empirical comparison between hierarchical subagent delegation and collaborative shared-task-list agent teams for end-to-end code synthesis. Using SWE-bench Verified tasks and integration-heavy repository builds, it evaluates solve rate, regression stability, token cost, and coordination overhead across varying dependency-coupling regimes. The results show that collaborative agent teams achieve up to 17% higher solve rates on moderately coupled tasks and reduce regression errors by 25%, but incur up to 2.9× higher token cost. Performance gains diminish in highly coupled scenarios due to coordination overhead. The study introduces a coupling-sensitive coordination framework that explains these trade-offs and provides a principled basis for selecting orchestration strategies. These findings contribute to the design of efficient multi-agent LLM systems and advance the understanding of coordination dynamics in autonomous software engineering.

References
  • Vaswani, A., Shazeer, N., Parmar, N., et al. 2017. Attention is all you need. In: Advances in Neural Information Processing Systems.
  • Brown, T. B., Mann, B., Ryder, N., et al. 2020. Language models are few-shot learners. In: Advances in Neural Information Processing Systems.
  • Chen, M., Tworek, J., Jun, H., et al. 2021. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374.
  • Devlin, J., Chang, M. W., Lee, K., and Toutanova, K. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL.
  • Du, Y., Li, S., Torralba, A., et al. 2023. Improving reasoning in large language models with agents. arXiv preprint arXiv:2305.17126.
  • Shinn, N., Labash, B., and Gopinath, D. 2023. Reflexion: Language agents with verbal reinforcement learning. arXiv preprint arXiv:2303.11366.
  • Wang, G., Xie, Y., Jiang, Y., et al. 2023. Voyager: An open-ended embodied agent with large language models. arXiv preprint arXiv:2305.16291.
  • Parnas, D. L. 1972. On the criteria to be used in decomposing systems into modules. Communications of the ACM, 15(12), 1053–1058.
  • Simon, H. A. 1996. The sciences of the artificial. MIT Press.
  • Anthropic. 2024. Agent teams: Control your agent team. Available at https://code.claude.com/docs.
  • OpenAI. 2024. Multi-agent systems and orchestration. Technical Report.
  • Liu, X., et al. 2023. Challenges in benchmarking large language models. arXiv preprint.
  • Narayanan, K., et al. 2024. Autonomous coding agents in enterprise systems. Technical Report.
  • Jimenez, C. E., Yang, J., et al. 2023. SWE-bench: Can language models resolve real-world GitHub issues? arXiv preprint arXiv:2310.06770.
  • Smith, R. G. 1980. The contract net protocol: High-level communication and control in a distributed problem solver. IEEE Transactions on Computers, 29(12), 1104–1113.
  • Hutchins, E. 1995. Cognition in the wild. MIT Press.
  • Wooldridge, M. 2009. An introduction to multiagent systems. Wiley.
  • Bommasani, R., et al. 2021. On the opportunities and risks of foundation models. arXiv preprint.
  • OpenAI. 2021. OpenAI Codex: Evaluating large language models for code. Technical Report.
  • DeepMind. 2022. AlphaCode: Competitive programming with large language models. Science.
  • Park, J. S., et al. 2023. Generative agents: Interactive simulacra of human behavior. In Proceedings of CHI.
  • AutoGPT. 2024. Autonomous GPT agent documentation. Available online.
  • Meta AI. 2024. Collaborative AI agents and coordination systems. Technical Report.
  • Cataldo, M., Wagstrom, P., Herbsleb, J., and Carley, K. 2006. Identification of coordination requirements. Computer Supported Cooperative Work, 15(4), 331–360.
  • Brooks, F. P. 1975. The mythical man-month: Essays on software engineering. Addison-Wesley.
  • Newell, A. 1990. Unified theories of cognition. Harvard University Press.
  • Laird, J. E. 2012. The Soar cognitive architecture. MIT Press.
  • Sutton, R. S. and Barto, A. G. 2018. Reinforcement learning: An introduction. MIT Press.
  • Silver, D., et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature.
  • Google DeepMind. 2024. Gemini technical report. Technical Report.
  • Anthropic. 2024. Claude model family technical report. Technical Report.
  • OpenAI. 2023. GPT-4 technical report. arXiv preprint.
  • Li, Y., et al. 2023. Tool use in large language models. arXiv preprint.
  • Schick, T., et al. 2023. Toolformer: Language models can teach themselves to use tools. arXiv preprint.
  • Yao, S., et al. 2023. Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint.
  • Madaan, A., et al. 2023. Self-refine: Iterative refinement with self-feedback. arXiv preprint.
  • Bubeck, S., et al. 2023. Sparks of artificial general intelligence. Microsoft Research.
  • Karpas, E., et al. 2022. MRKL systems: A modular neuro-symbolic architecture. arXiv preprint.
  • LeCun, Y., Bengio, Y., and Hinton, G. 2015. Deep learning. Nature, 521(7553), 436–444.
  • Dean, J. and Ghemawat, S. 2008. MapReduce: Simplified data processing on large clusters. Communications of the ACM.
  • Zaharia, M., et al. 2016. Apache Spark: A unified engine for big data processing. Communications of the ACM.
  • Armbrust, M., et al. 2020. Delta Lake: High-performance ACID table storage over cloud object stores. Proceedings of VLDB.
  • Databricks. 2024. Lakehouse architecture documentation. Technical Whitepaper.
  • Google Cloud. 2024. AI infrastructure architecture. Technical Documentation.
  • Amazon Web Services. 2024. AI and ML architecture best practices. Technical Documentation.
  • Microsoft Azure. 2024. AI platform architecture. Technical Documentation.
  • Hinton, G., Vinyals, O., and Dean, J. 2015. Distilling the knowledge in a neural network. arXiv preprint.
  • Radford, A., et al. 2019. Language models are unsupervised multitask learners. OpenAI Report.
  • Kaplan, J., et al. 2020. Scaling laws for neural language models. arXiv preprint.
  • Hoffmann, J., et al. 2022. Training compute-optimal large language models. arXiv preprint.
Index Terms
Computer Science
Information Sciences
Keywords

Agent teams; subagents; artificial intelligence; end-to-end code synthesis; SWE-bench; autonomous agents; multi-agent coordination; LLM orchestration; autonomous software engineering; token cost optimization
