ChatDev/MultiAgentEbook/book_simulation/data.csv

,image_path,title,author,summary,affiliation
0,./images/4d.png,(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts,"Minghao Wu, Yulin Yuan, Gholamreza Haffari, Longyue Wang","Recent advancements in machine translation (MT) have significantly enhancedtranslation quality across various domains. However, the translation of literarytexts remains a formidable challenge due to their complex language, figurative ex-pressions, and cultural nuances. In this work, we introduce a novel multi-agentframework based on large language models (LLMs) for literary translation, im-plemented as a company called TRANSAGENTS, which mirrors traditional trans-lation publication process by leveraging the collective capabilities of multipleagents, to address the intricate demands of translating literary works. To evaluatethe effectiveness of our system, we propose two innovative evaluation strategies:Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP).MHP assesses translations from the perspective of monolingual readers of the tar-get language, while BLP uses advanced LLMs to compare translations directlywith the original texts. Empirical findings indicate that despite lower d-BLEUscores, translations from TRANSAGENTS are preferred by both human evalua-tors and LLMs over human-written references, particularly in genres requiringdomain-specific knowledge. We also highlight the strengths and limitations ofTRANSAGENTS through case studies and suggests directions for future research.","Monash University, University of Macau, Tencent AI Lab"
1,./images/(perhaps)_beyond_human_translation_20240520.png,Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents,"Junkai Li, Siyu Wang, Meng Zhang, Weitao Li, Yunghwei Lai, Xinhui Kang, Weizhi Ma, Yang Liu","In this paper, we introduce a simulacrum of hospital called Agent Hospital that simulates theentire process of treating illness. All patients, nurses, and doctors are autonomous agents powered bylarge language models (LLMs). Our central goal is to enable a doctor agent to learn how to treat illnesswithin the simulacrum. To do so, we propose a method called MedAgent-Zero. As the simulacrum cansimulate disease onset and progression based on knowledge bases and LLMs, doctor agents can keepaccumulating experience from both successful and unsuccessful cases. Simulation experiments show thatthe treatment performance of doctor agents consistently improves on various tasks. More interestingly,the knowledge the doctor agents have acquired in Agent Hospital is applicable to real-world medicarebenchmarks. After treating around ten thousand patients (real-world doctors may take over two years),the evolved doctor agent achieves a state-of-the-art accuracy of 9",Tsinghua University
2,./images/agent_hospital_a_simulacrum_20240505.png,AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems,"Junjie Zhang, Yupeng Hou, Ruobing Xie, Wenqi Sun, Julian McAuley, Wayne Xin Zhao, Leyu Lin, Ji-Rong Wen","Recently, there has been an emergence of employing LLM-poweredagents as believable human proxies, based on their remarkabledecision-making capability. However, existing studies mainly focuson simulating human dialogue. Human non-verbal behaviors, suchas item clicking in recommender systems, although implicitly ex-hibiting user preferences and could enhance the modeling of users,have not been deeply explored. The main reasons lie in the gapbetween language modeling and behavior modeling, as well as theincomprehension of LLMs about user-item relations.To address this issue, we propose AgentCF for simulating user-item interactions in recommender systems through agent-basedcollaborative filtering. We creatively consider not only users butalso items as agents, and develop a collaborative learning approachthat optimizes both kinds of agents together. Specifically, at eachtime step, we first prompt the user and item agents to interact au-tonomously. Then, based on the disparities between the agents’decisions and real-world interaction records, user and item agentsare prompted to reflect on and adjust the misleading simulationscollaboratively, thereby modeling their two-sided relations. The op-timized agents can also propagate their preferences to other agentsin subsequent interactions, implicitly capturing the collaborative fil-tering idea. Overall, the optimized agents exhibit diverse interactionbehaviors within our framework, including user-item, user-user,item-item, and collective interactions. The results show that theseagents can demonstrate personalized behaviors akin to those of real-world individuals, sparking the development of next-generationuser behavior simulation.","Renmin University of China, UC San Diego, Tencent"
3,./images/agentcf_collaborative_learning_with_20231013.png,AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors,"Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min Chan, Heyang Yu, Yaxi Lu, Yi-Hsin Hung, Chen Qian, Yujia Qin, Xin Cong, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou","Autonomous agents empowered by Large Language Models (LLMs) have under-gone significant improvements, enabling them to generalize across a broad spec-trum of tasks. However, in real-world scenarios, cooperation among individuals isoften required to enhance the efficiency and effectiveness of task accomplishment.Hence, inspired by human group dynamics, we propose a multi-agent frameworkAGENTVERSE that can effectively orchestrate a collaborative group of expert agentsas a greater-than-the-sum-of-its-parts system. Our experiments demonstrate thatAGENTVERSE can proficiently deploy multi-agent groups that outperform a singleagent. Extensive experiments on text understanding, reasoning, coding, tool utiliza-tion, and embodied AI confirm the effectiveness of AGENTVERSE. Moreover, ouranalysis of agent interactions within AGENTVERSE reveals the emergence of spe-cific collaborative behaviors, contributing to heightened group efficiency. Our codehas been released at https://github.com/OpenBMB/AgentVerse/.","Tsinghua University, Beijing University of Posts and Telecommunications, Tencent Inc."
4,./images/agentverse_facilitating_multi-agent_collaboration_20230821.png,AI Hospital: Interactive Evaluation and Collaboration of LLMs as Intern Doctors for Clinical Diagnosis,"Zhihao Fan, Jialong Tang, Wei Chen, Siyuan Wang, Zhongyu Wei, Jun Xi, Fei Huang, Jingren Zhou","The incorporation of Large Language Models(LLMs) in healthcare marks a significant ad-vancement. However, the application has pre-dominantly been limited to discriminative andquestion-answering tasks, which does not fullyleverage their interactive potential. To addressthis limitation, our paper presents AI Hospital,a framework designed to build a real-time in-teractive diagnosis environment. To simulatethe procedure, we collect high-quality medicalrecords to create patient, examiner, and medicaldirector agents. AI Hospital is then utilized forthe interactive evaluation and collaboration ofLLMs. Initially, we create a Multi-View Medi-cal Evaluation (MVME) benchmark where vari-ous LLMs serve as intern doctors for interactivediagnosis. Subsequently, to improve diagnosticaccuracy, we introduce a collaborative mech-anism that involves iterative discussions anda dispute resolution process under the supervi-sion of the medical director. In our experiments,we validate the reliability of AI Hospital. Theresults not only explore the feasibility of applyLLMs in clinical consultation but also confirmthe effectiveness of the dispute resolution fo-cused collaboration method.","Alibaba Inc., Huazhong University of Science and Technology, Fudan University"
5,./images/ai_hospital_interactive_evaluation_20240215.png,Are you in a Masquerade? Exploring the Behavior and Impact of Large Language Model Driven Social Bots in Online Social Networks,"Siyu Li, Jin Yang, Kui Zhao","As the capabilities of Large Language Models (LLMs) emerge, they not only assist in accomplishing traditional tasks within more efficient paradigms but also stimulate the evolution of social bots. Researchers have begun exploring the implementation of LLMs as the driving core of social bots, enabling more efficient and user-friendly completion of tasks like profile completion, social behavior decision-making, and social content generation. However, there is currently a lack of systematic research on the behavioral characteristics of LLMs-driven social bots and their impact on social networks. We have curated data from Chirper, a Twitter-like social network populated by LLMs-driven social bots and embarked on an exploratory study. Our findings indicate that: (1) LLMs-driven social bots possess enhanced individual-level camouflage while exhibiting certain collective characteristics; (2) these bots have the ability to exert influence on online communities through toxic behaviors; (3) existing detection methods are applicable to the activity environment of LLMs-driven social bots but may be subject to certain limitations in effectiveness. Moreover, we have organized the data collected in our study into the Masquerade-23 dataset, which we have publicly released, thus addressing the data void in the subfield of LLMs-driven social bots behavior datasets. Our research outcomes provide primary insights for the research and governance of LLMs-driven social bots within the research community.",Sichuan University
6,./images/are_you_in_a_20230719.png,BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis,"Shuhang Lin, Wenyue Hua, Lingyao Li, Che-Jui Chang, Lizhou Fan, Jianchao Ji, Hang Hua, Mingyu Jin, Jiebo Luo, Yongfeng Zhang","This paper presents BattleAgent, a detailed emulation demonstration system thatcombines the Large Vision-Language Model (VLM) and Multi-Agent System(MAS). This novel system aims to simulate complex dynamic interactions amongmultiple agents, as well as between agents and their environments, over a period oftime. It emulates both the decision-making processes of leaders and the viewpointsof ordinary participants, such as soldiers. The emulation showcases the currentcapabilities of agents, featuring fine-grained multi-modal interactions betweenagents and landscapes. It develops customizable agent structures to meet specificsituational requirements, for example, a variety of battle-related activities likescouting and trench digging. These components collaborate to recreate historicalevents in a lively and comprehensive manner while offering insights into thethoughts and feelings of individuals from diverse viewpoints. The technologicalfoundations of BattleAgent establish detailed and immersive settings for historicalbattles, enabling individual agents to partake in, observe, and dynamically respondto evolving battle scenarios. This methodology holds the potential to substantiallydeepen our understanding of historical events, particularly through individualaccounts. Such initiatives can also aid historical research, as conventional historicalnarratives often lack documentation and prioritize the perspectives of decision-makers, thereby overlooking the experiences of ordinary individuals. This biaseddocumentation results in a considerable gap in our historical understanding, as manystories remain untold......","Rutgers University, University of Michigan, University of Rochester"
7,./images/battleagent_multi-modal_dynamic_emulation_20240423.png,Can Large Language Model Agents Simulate Human Trust Behaviors?,"Chengxing Xie, Canyu Chen, Feiran Jia, Ziyu Ye, Kai Shu, Adel Bibi, Ziniu Hu, Philip Torr, Bernard Ghanem, Guohao Li","Large Language Model (LLM) agents have beenincreasingly adopted as simulation tools to modelhumans in applications such as social science.However, one fundamental question remains: canLLM agents really simulate human behaviors? Inthis paper, we focus on one of the most criticalbehaviors in human interactions, trust, and aim toinvestigate whether or not LLM agents can sim-ulate human trust behaviors. We first find thatLLM agents generally exhibit trust behaviors, re-ferred to as agent trust, under the framework ofTrust Games, which are widely recognized in be-havioral economics. Then, we discover that LLMagents can have high behavioral alignment withhumans regarding trust behaviors, particularly forGPT-4, indicating the feasibility to simulate hu-man trust behaviors with LLM agents. In addition,we probe into the biases in agent trust and thedifferences in agent trust towards agents and hu-mans. We also explore the intrinsic properties ofagent trust under conditions including advancedreasoning strategies and external manipulations.We further offer important implications of ourdiscoveries for various scenarios where trust isparamount. Our study provides new insights intothe behaviors of LLM agents and the fundamentalanalogy between LLMs and humans.","KAUST, Illinois Institute of Technology, Pennsylvania State University, The University of Chicago, University of Oxford, California Institute of Technology"
8,./images/can_large_language_model_20240207.png,ChatDev: Communicative Agents for Software Development,"Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu, Maosong Sun","Software development is a complex task thatnecessitates cooperation among multiple mem-bers with diverse skills. Numerous studies useddeep learning to improve specific phases in awaterfall model, such as design, coding, andtesting.However, the deep learning modelin each phase requires unique designs, lead-ing to technical inconsistencies across variousphases, which results in a fragmented and in-effective development process. In this paper,we introduce ChatDev, a chat-powered soft-ware development framework in which special-ized agents driven by large language models(LLMs) are guided in what to communicate(via chat chain) and how to communicate (viacommunicative dehallucination). These agentsactively contribute to the design, coding, andtesting phases through unified language-basedcommunication, with solutions derived fromtheir multi-turn dialogues. We found their uti-lization of natural language is advantageousfor system design, and communicating in pro-gramming language proves helpful in debug-ging. This paradigm demonstrates how linguis-tic communication facilitates multi-agent col-laboration, establishing language as a unify-ing bridge for autonomous task-solving amongLLM agents. The code and data are availableat https://github.com/OpenBMB/ChatDev.","Tsinghua University, The University of Sydney, BUPT, Modelbest Inc."
9,./images/chatdev_communicative_agents_for_20230716.png,CompeteAI: Understanding the Competition Dynamics in Large Language Model-based Agents,"Qinlin Zhao, Jindong Wang, Yixuan Zhang, Yiqiao Jin, Kaijie Zhu, Hao Chen, Xing Xie","Large language models (LLMs) have been widelyused as agents to complete different tasks, suchas personal assistance or event planning. Whilemost of the work has focused on cooperationand collaboration between agents, little workexplores competition, another important mech-anism that promotes the development of soci-ety and economy. In this paper, we seek to ex-amine the competition dynamics in LLM-basedagents. We first propose a general framework forstudying the competition between agents. Then,we implement a practical competitive environ-ment using GPT-4 to simulate a virtual town withtwo types of agents, including restaurant agentsand customer agents. Specifically, the restaurantagents compete with each other to attract morecustomers, where competition encourages themto transform, such as cultivating new operatingstrategies. Simulation experiments reveal severalinteresting findings at the micro and macro lev-els, which align well with existing market andsociological theories. We hope that the frame-work and environment can be a promising testbedto study the competition that fosters understand-ing of society. Code is available at: https://github.com/microsoft/competeai.","University of Science and Technology of China, Microsoft Research, William & Mary, Georgia Institute of Technology, Carnegie Mellon University"
10,./images/competeai_understanding_the_competition_20231026.png,EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities,"Nian Li, Chen Gao, Mingyu Li, Yong Li, Qingmin Liao","The advent of artificial intelligence has led to agrowing emphasis on data-driven modeling inmacroeconomics, with agent-based modeling(ABM) emerging as a prominent bottom-upsimulation paradigm. In ABM, agents (e.g.,households, firms) interact within a macroe-conomic environment, collectively generatingmarket dynamics. Existing agent modeling typ-ically employs predetermined rules or learning-based neural networks for decision-making.However, customizing each agent presents sig-nificant challenges, complicating the modelingof agent heterogeneity. Additionally, the in-fluence of multi-period market dynamics andmultifaceted macroeconomic factors are oftenoverlooked in decision-making processes. Inthis work, we introduce EconAgent, a largelanguage model-empowered agent with human-like characteristics for macroeconomic simu-lation. We first construct a simulation envi-ronment that incorporates various market dy-namics driven by agents’ decisions regardingwork and consumption. Through the perceptionmodule, we create heterogeneous agents withdistinct decision-making mechanisms.Fur-thermore, we model the impact of macroeco-nomic trends using a memory module, whichallows agents to reflect on past individual ex-periences and market dynamics. Simulationexperiments show that EconAgent can makerealistic decisions, leading to more reasonablemacroeconomic phenomena compared to exist-ing rule-based or learning-based agents. Ourcodes are released at https://github.com/tsinghua-fib-lab/ACL24-EconAgent.",Tsinghua University
11,./images/econagent_large_language_model-empowered_20231016.png,Epidemic Modeling with Generative Agents,"Ross Williams, Niyousha Hosseinichimeh, Aritra Majumdar, Navid Ghaffarzadegan","This study offers a new paradigm of individual-level modeling to address the grand challenge of incorporating human behavior in epidemic models. Using generative artificial intelligence in an agent-based epidemic model, each agent is empowered to make its own reasonings and decisions via connecting to a large language model such as ChatGPT. Through various simulation experiments, we present compelling evidence that generative agents mimic real-world behaviors such as quarantining when sick and self-isolation when cases rise. Collectively, the agents demonstrate patterns akin to multiple waves observed in recent pandemics followed by an endemic period. Moreover, the agents successfully flatten the epidemic curve. This study creates potential to improve dynamic system modeling by offering a way to represent human brain, reasoning, and decision making.",Virginia Tech
12,./images/epidemic_modeling_with_generative_20230711.png,Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View,"Jintian Zhang, Xin Xu, Ningyu Zhang, Ruibo Liu, Bryan Hooi, Shumin Deng","As Natural Language Processing (NLP) sys-tems are increasingly employed in intricate so-cial environments, a pressing query emerges:Can these NLP systems mirror human-esquecollaborative intelligence, in a multi-agent so-ciety consisting of multiple large language mod-els (LLMs)? This paper probes the collabora-tion mechanisms among contemporary NLPsystems by melding practical experiments withtheoretical insights. We fabricate four unique‘societies’ comprised of LLM agents, whereeach agent is characterized by a specific ‘trait’(easy-going or overconfident) and engages incollaboration with a distinct ‘thinking pattern’(debate or reflection).Through evaluatingthese multi-agent societies on three benchmarkdatasets, we discern that certain collaborativestrategies not only outshine previous top-tierapproaches but also optimize efficiency (usingfewer API tokens). Moreover, our results fur-ther illustrate that LLM agents manifest human-like social behaviors, such as conformity andconsensus reaching, mirroring foundational so-cial psychology theories. In conclusion, weintegrate insights from social psychology tocontextualize the collaboration of LLM agents,inspiring further investigations into the collab-oration mechanism for LLMs. We have sharedour code and datasets1, hoping to catalyze fur-ther research in this promising avenue.","Zhejiang University, National University of Singapore, NUS-NCS Joint Lab, Google DeepMind"
13,./images/exploring_collaboration_mechanisms_for_20231003.png,Generative Agents: Interactive Simulacra of Human Behavior,"Joon Sung Park, Joseph C. O'Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, Michael S. Bernstein","Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time. We demonstrate through ablation that the components of our agent architecture--observation, planning, and reflection--each contribute critically to the believability of agent behavior. By fusing large language models with computational, interactive agents, this work introduces architectural and interaction patterns for enabling believable simulations of human behavior.","Stanford University, Google Research, Google DeepMind"
14,./images/generative_agents_interactive_simulacra_20230407.png,Humanoid Agents: Platform for Simulating Human-like Generative Agents,"Zhilin Wang, Yu Ying Chiu, Yu Cheung Chiu","Just as computational simulations of atoms, molecules and cells have shaped the way we study the sciences, true-to-life simulations of human-like agents can be valuable tools for studying human behavior. We propose Humanoid Agents, a system that guides Generative Agents to behave more like humans by introducing three elements of System 1 processing: Basic needs (e.g. hunger, health and energy), Emotion and Closeness in Relationships. Humanoid Agents are able to use these dynamic elements to adapt their daily activities and conversations with other agents, as supported with empirical experiments. Our system is designed to be extensible to various settings, three of which we demonstrate, as well as to other elements influencing human behavior (e.g. empathy, moral values and cultural background). Our platform also includes a Unity WebGL game interface for visualization and an interactive analytics dashboard to show agent statuses over time.","University of Washington, NVIDIA, The University of Hong Kong"
15,./images/humanoid_agents_platform_for_20231009.png,Language Agents as Digital Representatives in Collective Decision-Making,"Jarrett, Daniel and Pislar, Miruna and Bakker, Michiel A and Tessler, Michael Henry and Koster, Raphael and Balaguer, Jan and Elie, Romuald and Summerfield, Christopher and Tacchetti, Andrea","Consider the process of collective decision-making, in which a group of individualsinteractively select a preferred outcome from among a universe of alternatives. Inthis context, “representation” is the activity of making an individual’s preferencespresent in the process via participation by a proxy agent—i.e. their “representative”.To this end, learned models of human behavior have the potential to fill this role,with practical implications for multi-agent scenario studies and mechanism design.In this work, we investigate the possibility of training language agents to behavein the capacity of representatives of human agents, appropriately expressing thepreferences of those individuals whom they stand for. First, we formalize the settingof collective decision-making—as the episodic process of interaction between agroup of agents and a decision mechanism. On this basis, we then formalize theproblem of digital representation—as the simulation of an agent’s behavior to yieldequivalent outcomes from the mechanism. Finally, we conduct an empirical casestudy in the setting of consensus-finding among diverse humans, and demonstratethe feasibility of fine-tuning large language models to act as digital representatives.",Google DeepMind
16,./images/language_agents_as_digital_20231108.png,LLM-Driven Agents for Influencer Selection in Digital Advertising Campaigns,"Xiaoqing Zhang, Xiuying Chen, Yuhan Liu, Jianzhou Wang, Zhenxing Hu, Rui Yan","In the digital world, influencers are pivotal as opinion leaders, shap-ing the views and choices of their influencees. Modern advertisingoften follows this trend, where marketers choose appropriate in-fluencers for product endorsements, based on thorough marketanalysis. Previous studies on influencer selection have typicallyrelied on numerical representations of individual opinions andinteractions, a method that simplifies the intricacies of social dy-namics. With the development of large language models (LLMs),we now have the opportunity to capture the nuanced exchangesof information within social networks. Hence, in this work, wefirst introduce an Influencer Dynamics Simulator (IDS), helpingpromoters identify and select the right influencers to market theirproducts, based on LLM simulation. Concretely, we first propose aninfluencer-influencee engagement-based pre-selection module toscreen potential influencer candidates. Subsequently, a simulation isconstructed for these candidates and their influencees. Each user isrepresented as an LLM-based agent, drawing from their interactionhistory to deduce their profile and interests. The influencee agentswill predict their behavior in response to influencer advertising. Fi-nally, we develop a ranking metric designed to pinpoint influencerswho are most likely to drive product purchases based on feedbackfrom their influencees. To evaluate our framework, we collect areal-world advertising network dataset, including social relations,post and comment content, and user behaviors.......","Renmin University of China, King Abdullah University of Science and Technology, Moonshot AI"
17,./images/llm-driven_agents_for_influencer_20240322.png,Lyfe Agents: Generative agents for low-cost real-time social interactions,"Zhao Kaiya, Michelangelo Naim, Jovana Kondic, Manuel Cortes, Jiaxin Ge, Shuying Luo, Guangyu Robert Yang, Andrew Ahn","Highly autonomous generative agents powered by large language models promise to simulate intricate social behaviors in virtual societies. However, achieving real-time interactions with humans at a low computational cost remains challenging. Here, we introduce Lyfe Agents. They combine low-cost with real-time responsiveness, all while remaining intelligent and goal-oriented. Key innovations include: (1) an option-action framework, reducing the cost of high-level decisions; (2) asynchronous self-monitoring for better self-consistency; and (3) a Summarize-and-Forget memory mechanism, prioritizing critical memory items at a low cost. We evaluate Lyfe Agents' self-motivation and sociability across several multi-agent scenarios in our custom LyfeGame 3D virtual environment platform. When equipped with our brain-inspired techniques, Lyfe Agents can exhibit human-like self-motivated social reasoning. For example, the agents can solve a crime (a murder mystery) through autonomous collaboration and information exchange. Meanwhile, our techniques enabled Lyfe Agents to operate at a computational cost 10-100 times lower than existing alternatives. Our findings underscore the transformative potential of autonomous generative agents to enrich human social experiences in virtual worlds.","Massachusetts Institute of Technology, Peking University, LyfeAL"
18,./images/lyfe_agents_generative_agents_20231003.png,MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents,"Yuan Li, Yixuan Zhang, Lichao Sun","Significant advancements have occurred in the application of Large LanguageModels (LLMs) for various tasks and social simulations. Despite this, their capac-ities to coordinate within task-oriented social contexts are under-explored. Suchcapabilities are crucial if LLMs are to effectively mimic human-like social be-havior and produce meaningful results. To bridge this gap, we introduce collab-orative generative agents, endowing LLM-based Agents with consistent behaviorpatterns and task-solving abilities. We situate these agents in a simulated job fairenvironment as a case study to scrutinize their coordination skills. We proposea novel framework that equips collaborative generative agents with human-likereasoning abilities and specialized skills. Our evaluation demonstrates that theseagents show promising performance. However, we also uncover limitations thathinder their effectiveness in more complex coordination tasks. Our work providesvaluable insights into the role and evolution of LLMs in task-oriented social sim-ulations.","University of Cambridge, William & Mary, Lehigh University"
19,./images/metaagents_simulating_interactions_of_20231010.png,On Generative Agents in Recommendation,"An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-Seng Chua","Recommender systems are the cornerstone of today's information dissemination, yet a disconnect between offline metrics and online performance greatly hinders their development. Addressing this challenge, we envision a recommendation simulator, capitalizing on recent breakthroughs in human-level intelligence exhibited by Large Language Models (LLMs). We propose Agent4Rec, a user simulator in recommendation, leveraging LLM-empowered generative agents equipped with user profile, memory, and actions modules specifically tailored for the recommender system. In particular, these agents' profile modules are initialized using real-world datasets (e.g. MovieLens, Steam, Amazon-Book), capturing users' unique tastes and social traits; memory modules log both factual and emotional memories and are integrated with an emotion-driven reflection mechanism; action modules support a wide variety of behaviors, spanning both taste-driven and emotion-driven actions. Each agent interacts with personalized recommender models in a page-by-page manner, relying on a pre-implemented collaborative filtering-based recommendation algorithm. We delve into both the capabilities and limitations of Agent4Rec, aiming to explore an essential research question: ``To what extent can LLM-empowered generative agents faithfully simulate the behavior of real, autonomous humans in recommender systems?'' Extensive and multi-faceted evaluations of Agent4Rec highlight both the alignment and deviation between agents and user-personalized preferences. Beyond mere performance comparison, we explore insightful experiments, such as emulating the filter bubble effect and discovering the underlying causal relationships in recommendation tasks.","National University of Singapore, Tsinghua University, University of Science and Technology of China"
20,./images/on_generative_agents_in_20231016.png,"Out of One, Many: Using Language Models to Simulate Human Samples","Lisa P. Argyle, Ethan C. Busby, Nancy Fulda, Joshua Gubler, Christopher Rytting, David Wingate","We propose and explore the possibility that language models can be studied as effective proxies for specific human sub-populations in social science research. Practical and research applications of artificial intelligence tools have sometimes been limited by problematic biases (such as racism or sexism), which are often treated as uniform properties of the models. We show that the ""algorithmic bias"" within one such tool -- the GPT-3 language model -- is instead both fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups. We term this property ""algorithmic fidelity"" and explore its extent in GPT-3. We create ""silicon samples"" by conditioning the model on thousands of socio-demographic backstories from real human participants in multiple large surveys conducted in the United States. We then compare the silicon and human samples to demonstrate that the information contained in GPT-3 goes far beyond surface similarity. It is nuanced, multifaceted, and reflects the complex interplay between ideas, attitudes, and socio-cultural context that characterize human attitudes. We suggest that language models with sufficient algorithmic fidelity thus constitute a novel and powerful tool to advance understanding of humans and society across a variety of disciplines.",Brigham Young University
21,./images/out_of_one_many_20220914.png,Quantifying the Impact of Large Language Models on Collective Opinion Dynamics,"Chao Li, Xing Su, Haoying Han, Cong Xue, Chunmo Zheng, Chao Fan","The process of opinion expression and exchange is a critical component of democratic societies. As people interact with large language models (LLMs) in the opinion shaping process different from traditional media, the impacts of LLMs are increasingly recognized and being concerned. However, the knowledge about how LLMs affect the process of opinion expression and exchange of social opinion networks is very limited. Here, we create an opinion network dynamics model to encode the opinions of LLMs, cognitive acceptability and usage strategies of individuals, and simulate the impact of LLMs on opinion dynamics in a variety of scenarios. The outcomes of the simulations inform about effective demand-oriented opinion network interventions. The results from this study suggested that the output opinion of LLMs has a unique and positive effect on the collective opinion difference. The marginal effect of cognitive acceptability on collective opinion formation is nonlinear and shows a decreasing trend. When people partially rely on LLMs, the exchange process of opinion becomes more intense and the diversity of opinion becomes more favorable. In fact, there is 38.6% more opinion diversity when people all partially rely on LLMs, compared to prohibiting the use of LLMs entirely. The optimal diversity of opinion was found when the fractions of people who do not use, partially rely on, and fully rely on LLMs reached roughly 4:12:1. Our experiments also find that introducing extra agents with opposite/neutral/random opinions, we can effectively mitigate the impact of biased/toxic output from LLMs. Our findings provide valuable insights into opinion dynamics in the age of LLMs, highlighting the need for customized interventions tailored to specific scenarios to address the drawbacks of improper output and use of LLMs."," Zhejiang University, Clemson University, "
22,./images/quantifying_the_impact_of_20230807.png,S3: Social-network Simulation System with Large Language Model-Empowered Agents,"Chen Gao, Xiaochong Lan, Zhihong Lu, Jinzhu Mao, Jinghua Piao, Huandong Wang, Depeng Jin, Yong Li","Simulation plays a crucial role in addressing various challenges within socialscience. It offers extensive applications such as state prediction, phenomena ex-planation, and policy-making support, among others. In this work, we harness thehuman-like capabilities of large language models (LLMs) in sensing, reasoning,and behaving, and utilize these qualities to construct the S3 system (short forSocial network Simulation System). Adhering to the widely employed agent-basedsimulation paradigm, we employ fine-tuning and prompt engineering techniques toensure that the agent’s behavior closely emulates that of a genuine human withinthe social network. Specifically, we simulate three pivotal aspects: emotion, at-titude, and interaction behaviors. By endowing the agent in the system with theability to perceive the informational environment and emulate human actions, weobserve the emergence of population-level phenomena, including the propagationof information, attitudes, and emotions. We conduct an evaluation encompassingtwo levels of simulation, employing real-world social network data. Encouragingly,the results demonstrate promising accuracy. This work represents an initial step inthe realm of social network simulation empowered by LLM-based agents. We an-ticipate that our endeavors will serve as a source of inspiration for the developmentof simulation systems within, but not limited to, social science.",Tsinghua University
23,./images/s3_social-network_simulation_system_20230727.png,Simulating Opinion Dynamics with Networks of LLM-based Agents,"Yun-Shiuan Chuang, Agam Goyal, Nikunj Harlalka, Siddharth Suresh, Robert Hawkins, Sijia Yang, Dhavan Shah, Junjie Hu, Timothy T. Rogers","Accurately simulating human opinion dynam-ics is crucial for understanding a variety of soci-etal phenomena, including polarization and thespread of misinformation. However, the agent-based models (ABMs) commonly used for suchsimulations often over-simplify human behav-ior. We propose a new approach to simulat-ing opinion dynamics based on populations ofLarge Language Models (LLMs). Our findingsreveal a strong inherent bias in LLM agents to-wards producing accurate information, leadingsimulated agents to consensus in line with sci-entific reality. This bias limits their utility forunderstanding resistance to consensus viewson issues like climate change. After induc-ing confirmation bias through prompt engineer-ing, however, we observed opinion fragmenta-tion in line with existing agent-based modelingand opinion dynamics research. These insightshighlight the promise and limitations of LLMagents in this domain and suggest a path for-ward: refining LLMs with real-world discourseto better simulate the evolution of human be-liefs.",University of Wisconsin-Madison
24,./images/simulating_opinion_dynamics_with_20231116.png,Simulating Social Media Using Large Language Models to Evaluate Alternative News Feed Algorithms,"Petter Törnberg, Diliara Valeeva, Justus Uitermark, Christopher Bail",". Social media is often criticized for amplifyingtoxic discourse and discouraging constructive conversa-tions. But designing social media platforms to promotebetter conversations is inherently challenging. This paperasks whether simulating social media through a combina-tion of Large Language Models (LLM) and Agent-BasedModeling can help researchers study how different newsfeed algorithms shape the quality of online conversations.We create realistic personas using data from the Ameri-can National Election Study to populate simulated socialmedia platforms. Next, we prompt the agents to readand share news articles — and like or comment uponeach other’s messages — within three platforms that usedifferent news feed algorithms. In the first platform, userssee the most liked and commented posts from users whomthey follow. In the second, they see posts from all users —even those outside their own network. The third platformemploys a novel “bridging” algorithm that highlights poststhat are liked by people with opposing political views. Wefind this bridging algorithm promotes more constructive,non-toxic, conversation across political divides than theother two models. Though further research is needed toevaluate these findings, we argue that LLMs hold consid-erable potential to improve simulation research on socialmedia and many other complex social settings.","University of Amsterdam, Duke University"
25,./images/simulating_social_media_using_20231005.png,Social Simulacra: Creating Populated Prototypes for Social Computing Systems,"Joon Sung Park, Lindsay Popowski, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, Michael S. Bernstein","Social computing prototypes probe the social behaviors that mayarise in an envisioned system design. This prototyping practiceis currently limited to recruiting small groups of people. Unfortu-nately, many challenges do not arise until a system is populatedat a larger scale. Can a designer understand how a social systemmight behave when populated, and make adjustments to the de-sign before the system falls prey to such challenges? We intro-duce social simulacra, a prototyping technique that generates abreadth of realistic social interactions that may emerge when a so-cial computing system is populated. Social simulacra take as inputthe designer’s description of a community’s design—goal, rules, andmember personas—and produce as output an instance of that designwith simulated behavior, including posts, replies, and anti-socialbehaviors. We demonstrate that social simulacra shift the behaviorsthat they generate appropriately in response to design changes, andthat they enable exploration of “what if?” scenarios where commu-nity members or moderators intervene. To power social simulacra,we contribute techniques for prompting a large language modelto generate thousands of distinct community members and theirsocial interactions with each other; these techniques are enabled bythe observation that large language models’ training data alreadyincludes a wide variety of positive and negative behavior on socialmedia platforms. In evaluations, we show that participants are of-ten unable to distinguish social simulacra from actual communitybehavior and that social computing designers successfully refinetheir social computing designs when using social simulacra.","Stanford University, Google Research"
26,./images/social_simulacra_creating_populated_20220808.png,The Wisdom of Partisan Crowds: Comparing Collective Intelligence in Humans and LLM-based Agents,"Yun-Shiuan Chuang, Siddharth Suresh, Nikunj Harlalka, Agam Goyal, Robert Hawkins, Sijia Yang, Dhavan Shah, Junjie Hu, Timothy T. Rogers","Human groups are able to converge on more accurate beliefs through deliberation,even in the presence of polarization and partisan bias — a phenomenon known asthe “wisdom of partisan crowds.” Generated agents powered by Large LanguageModels (LLMs) are increasingly used to simulate human collective behavior, yetfew benchmarks exist for evaluating their dynamics against the behavior of hu-man groups. In this paper, we examine the extent to which the wisdom of partisancrowds emerges in groups of LLM-based agents that are prompted to role-playas partisan personas (e.g., Democrat or Republican). We find that they not onlydisplay human-like partisan biases, but also converge to more accurate beliefsthrough deliberation as humans do. We then identify several factors that interferewith convergence, including the use of chain-of-thought prompt and lack of detailsin personas. Conversely, fine-tuning on human data appears to enhance conver-gence. These findings show the potential and limitations of LLM-based agents asa model of human collective intelligence.",University of Wisconsin-Madison
27,./images/the_wisdom_of_partisan_20231116.png,To Infinity and Beyond- SHOW-1 and Showrunner Agents in Multi-Agent Simulations,"Philipp Maas, Frank Carey, Chris Wheeler, Edward Saatchi, Pete Billington, Jessica Yaffa Shamash","In this work we present our approach to generating high-quality episodic content for IP’s (Intellectual Property) using large language models (LLMs), custom state-of- the art diffusion models and our multi-agent simulation for contextualization, story progression and behavioral control. Powerful LLMs such as GPT-4 were trained on a large corpus of TV show data which lets us believe that with the right guidance users will be able to rewrite entire seasons.""That Is What Entertainment Will Look Like. Maybe people are still upset about the last season of Game of Thrones. Imagine if you could ask your A.I. to make a new ending that goes a different way and maybe even put yourself in there as a main character or something.”. ",Fable Studio
28,./images/to_infinity_and_beyond_20230724.png,Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation,"Xinyi Mou, Zhongyu Wei, Xuanjing Huang","Social media has emerged as a cornerstone ofsocial movements, wielding significant influ-ence in driving societal change. Simulatingthe response of the public and forecasting thepotential impact has become increasingly im-portant. However, existing methods for simu-lating such phenomena encounter challengesconcerning their efficacy and efficiency in cap-turing the behaviors of social movement par-ticipants. In this paper, we introduce a hybridframework HiSim for social media user simu-lation, wherein users are categorized into twotypes. Core users are driven by Large Lan-guage Models, while numerous ordinary usersare modeled by deductive agent-based models.We further construct a Twitter-like environmentto replicate their response dynamics followingtrigger events. Subsequently, we develop amulti-faceted benchmark SoMoSiMu-Benchfor evaluation and conduct comprehensive ex-periments across real-world datasets. Exper-imental results demonstrate the effectivenessand flexibility of our method","Fudan University, Shanghai Collaborative Innovation Center of Intelligent Visual Computing"
29,./images/unveiling_the_truth_and_20240226.png,User Behavior Simulation with Large Language Model based Agents,"Lei Wang, Jingsen Zhang, Hao Yang, Zhiyuan Chen, Jiakai Tang, Zeyu Zhang, Xu Chen, Yankai Lin, Ruihua Song, Wayne Xin Zhao, Jun Xu, Zhicheng Dou, Jun Wang, Ji-Rong Wen","Simulating high quality user behavior data has always been a fundamental problem in human-centered applications, where the major difficulty originates from the intricate mechanism of human decision process. Recently, substantial evidences have suggested that by learning huge amounts of web knowledge, large language models (LLMs) can achieve human-like intelligence. We believe these models can provide significant opportunities to more believable user behavior simulation. To inspire such direction, we propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors. Based on extensive experiments, we find that the simulated behaviors of our method are very close to the ones of real humans. Concerning potential applications, we simulate and study two social phenomenons including (1) information cocoons and (2) user conformity behaviors. This research provides novel simulation paradigms for human-centered applications.","Renmin University of China, Beijing Key Laboratory of Big Data Management and Analysis Methods, University College London"
30,./images/user_behavior_simulation_with_20230605.png,Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies,"Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai","We introduce a new type of test, called a Turing Experiment (TE), for evaluating to what extent a given language model, such as GPT models, can simulate different aspects of human behavior. A TE can also reveal consistent distortions in a language model's simulation of a specific human behavior. Unlike the Turing Test, which involves simulating a single arbitrary individual, a TE requires simulating a representative sample of participants in human subject research. We carry out TEs that attempt to replicate well-established findings from prior studies. We design a methodology for simulating TEs and illustrate its use to compare how well different language models are able to reproduce classic economic, psycholinguistic, and social psychology experiments: Ultimatum Game, Garden Path Sentences, Milgram Shock Experiment, and Wisdom of Crowds. In the first three TEs, the existing findings were replicated using recent models, while the last TE reveals a ""hyper-accuracy distortion"" present in some language models (including ChatGPT and GPT-4), which could affect downstream applications in education and the arts.","Olin College of Engineering, Georgia Tech, Microsoft Research"
31,./images/using_large_language_models_20220818.png,War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars,"Wenyue Hua, Lizhou Fan, Lingyao Li, Kai Mei, Jianchao Ji, Yingqiang Ge, Libby Hemphill, Yongfeng Zhang","Can we avoid wars at the crossroads of history? This question has been pursued byindividuals, scholars, policymakers, and organizations throughout human history.In this research, we attempt to answer the question based on the recent advancesof Artificial Intelligence (AI) and Large Language Models (LLMs). We proposeWarAgent, an LLM-powered multi-agent AI system, to simulate the participatingcountries, their decisions, and the consequences, in historical international conflicts,including the World War I (WWI), the World War II (WWII), and the WarringStates Period (WSP) in Ancient China. By evaluating the simulation effectiveness,we examine the advancements and limitations of cutting-edge AI systems’ abilitiesin studying complex collective human behaviors such as international conflictsunder diverse settings. In these simulations, the emergent interactions amongagents also offer a novel perspective for examining the triggers and conditions thatlead to war. Our findings offer data-driven and AI-augmented insights that canredefine how we approach conflict resolution and peacekeeping strategies. Theimplications stretch beyond historical analysis, offering a blueprint for using AI tounderstand human history and possibly prevent future international conflicts. Codeand data are available at https://github.com/agiresearch/WarAgent.",Rutgers University
32,./images/war_and_peace_(waragent)_20231128.png,To be Continued...,Your Contributions are Welcome!,,