
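A cloud-drive collection of 350+ multimodal large model papers (original English PDFs), covering LLM Agent/RLHF/multimodal chain-of-thought/visual generation/instruction tuning/assistive vision… and more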


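The "Treasured Materials" (珍藏资料素材) series is a set of curated material and resource collections compiled exclusively by 01资源网, covering everyday life, study, and work.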



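A multimodal large model is a large deep learning model that can process several types of data, such as text, images, and audio. Traditional deep learning models mostly handle a single data type, for example text-only or image-only models. A multimodal large model handles several data types at once, which lets it understand and process complex information more completely.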

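Models of this kind perform well across a wide range of tasks. In natural language processing, for example, a model can consider both the text and any related images, which can improve accuracy and efficiency on the task. Multimodal large models therefore offer a powerful tool for complex tasks that involve several data types.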

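This collection gathers 350+ multimodal large model papers as original English PDFs, shared via Baidu Netdisk (百度网盘), and covers LLM Agent/RLHF/multimodal chain-of-thought/visual generation/instruction tuning/assistive vision… and related topics.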

The directory is as follows:

【Multimodal Large Model Papers】 [ 1.81GB ]
┃    ┣━━ Multimodal Large Model Materials Collection [ 1.81GB ]
┃    ┃    ┣━━ Two Multimodal Large Model Survey Papers [ 55.96MB ]
┃    ┃    ┃    ┣━━ Microsoft's most comprehensive survey: Multimodal Foundation Models From Specialists to General-Purpose Assistants.pdf [ 55.47MB ]
┃    ┃    ┃    ┣━━ First survey: A Survey on Multimodal Large Language Models.pdf [ 497.36kB ]
┃    ┃    ┣━━ LLM Agent and RLHF Papers [ 702.07MB ]
┃    ┃    ┃    ┣━━ LLM RLHF Paper Collection [ 91.35MB ]
┃    ┃    ┃    ┃    ┣━━ WebGPT Browser-assisted question-answering with human feedback.pdf [ 1.51MB ]
┃    ┃    ┃    ┃    ┣━━ Visual ChatGPT Talking, Drawing and Editing with Visual Foundation Models.pdf [ 2.44MB ]
┃    ┃    ┃    ┃    ┣━━ Training language models to follow instructions with human feedback.pdf [ 1.83MB ]
┃    ┃    ┃    ┃    ┣━━ Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback.pdf [ 9.03MB ]
┃    ┃    ┃    ┃    ┣━━ Teaching language models to support answers with verified quotes.pdf [ 1.88MB ]
┃    ┃    ┃    ┃    ┣━━ Scaling Laws for Reward Model Overoptimization.pdf [ 3.10MB ]
┃    ┃    ┃    ┃    ┣━━ Scalable agent alignment via reward modeling a research direction.pdf [ 671.51kB ]
┃    ┃    ┃    ┃    ┣━━ Reward learning from human preferences and demonstrations in Atari.pdf [ 2.79MB ]
┃    ┃    ┃    ┃    ┣━━ Revisiting the Weaknesses of Reinforcement Learning for Neural Machine Translation.pdf [ 605.94kB ]
┃    ┃    ┃    ┃    ┣━━ Red Teaming Language Models to Reduce Harms Methods, Scaling Behaviors, and Lessons Learned.pdf [ 3.20MB ]
┃    ┃    ┃    ┃    ┣━━ Recursively Summarizing Books with Human Feedback.pdf [ 2.05MB ]
┃    ┃    ┃    ┃    ┣━━ Quark Controllable Text Generation with Reinforced Unlearning.pdf [ 4.46MB ]
┃    ┃    ┃    ┃    ┣━━ Pretraining Language Models with Human Preferences.pdf [ 1.41MB ]
┃    ┃    ┃    ┃    ┣━━ Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning.pdf [ 8.37MB ]
┃    ┃    ┃    ┃    ┣━━ Learning to summarize with human feedback.pdf [ 1.75MB ]
┃    ┃    ┃    ┃    ┣━━ Learning to summarize from human feedback.pdf [ 1.75MB ]
┃    ┃    ┃    ┃    ┣━━ Is Reinforcement Learning (Not) for Natural Language Processing Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization.pdf [ 2.52MB ]
┃    ┃    ┃    ┃    ┣━━ Interactive Learning from Policy-Dependent Human Feedback.pdf [ 485.17kB ]
┃    ┃    ┃    ┃    ┣━━ InstructGPT Training language models to follow instructions with human feedback.pdf [ 1.83MB ]
┃    ┃    ┃    ┃    ┣━━ Improving alignment of dialogue agents via targeted human judgements.pdf [ 4.01MB ]
┃    ┃    ┃    ┃    ┣━━ GPT-4 Technical Report.pdf [ 5.13MB ]
┃    ┃    ┃    ┃    ┣━━ Fine-Tuning Language Models from Human Preferences.pdf [ 1.03MB ]
┃    ┃    ┃    ┃    ┣━━ Few-shot Preference Learning for Human-in-the-Loop RL.pdf [ 5.51MB ]
┃    ┃    ┃    ┃    ┣━━ Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning.pdf [ 1.02MB ]
┃    ┃    ┃    ┃    ┣━━ Discovering Language Model Behaviors with Model-Written Evaluations.pdf [ 2.70MB ]
┃    ┃    ┃    ┃    ┣━━ Deep TAMER Interactive Agent Shaping in High-Dimensional State Spaces.pdf [ 1.67MB ]
┃    ┃    ┃    ┃    ┣━━ Deep Reinforcement Learning from Human Preferences.pdf [ 3.13MB ]
┃    ┃    ┃    ┃    ┣━━ Constitutional AI Harmlessness from AI Feedback.pdf [ 1.50MB ]
┃    ┃    ┃    ┃    ┣━━ Better Aligning Text-to-Image Models with Human Preference.pdf [ 4.85MB ]
┃    ┃    ┃    ┃    ┣━━ Aligning Language Models with Preferences through f-divergence Minimization.pdf [ 9.15MB ]
┃    ┃    ┃    ┣━━ LLM Agent Paper Collection [ 610.72MB ]
┃    ┃    ┃    ┃    ┣━━ You Only Look at Screens Multimodal Chain-of-Action Agents.pdf [ 5.76MB ]
┃    ┃    ┃    ┃    ┣━━ Voyager An open-ended embodied agent with large language models.pdf [ 15.13MB ]
┃    ┃    ┃    ┃    ┣━━ Trustworthy LLMs a Survey and Guideline for Evaluating Large Language Models’ Alignment.pdf [ 1.69MB ]
┃    ┃    ┃    ┃    ┣━━ TPTU Task Planning and Tool Usage of Large Language Model-based AI Agents.pdf [ 11.42MB ]
┃    ┃    ┃    ┃    ┣━━ Towards More Human-Like AI Communication.pdf [ 3.31MB ]
┃    ┃    ┃    ┃    ┣━━ Towards a unified agent with foundation models.pdf [ 6.05MB ]
┃    ┃    ┃    ┃    ┣━━ ToolLLM Facilitating large language models to master 16000+ real-world apis.pdf [ 2.03MB ]
┃    ┃    ┃    ┃    ┣━━ Toolformer Language models can teach themselves to use tools.pdf [ 728.49kB ]
┃    ┃    ┃    ┃    ┣━━ Steve-Eye Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds.pdf [ 1.69MB ]
┃    ┃    ┃    ┃    ┣━━ Self-Alignment with Instruction Backtranslation.pdf [ 1.92MB ]
┃    ┃    ┃    ┃    ┣━━ SeamlessM4T-Massively Multilingual & Multimodal Machine Translation.pdf [ 3.66MB ]
┃    ┃    ┃    ┃    ┣━━ SAPIEN Affective Virtual Agents Powered by Large Language Models.pdf [ 326.60kB ]
┃    ┃    ┃    ┃    ┣━━ ROLELLM BENCHMARKING, ELICITING, AND ENHANCING ROLE-PLAYING ABILITIES OF LARGE LANGUAGE MODELS.pdf [ 5.45MB ]
┃    ┃    ┃    ┃    ┣━━ REX Rapid Exploration and eXploitation for AI agents.pdf [ 742.15kB ]
┃    ┃    ┃    ┃    ┣━━ Retroformer Retrospective Large Language Agents with Policy Gradient Optimization.pdf [ 1.59MB ]
┃    ┃    ┃    ┃    ┣━━ Reinforcement Learning for Generative AI A Survey.pdf [ 1.64MB ]
┃    ┃    ┃    ┃    ┣━━ React Synergizing reasoning and acting in language models.pdf [ 708.06kB ]
┃    ┃    ┃    ┃    ┣━━ Quantifying the Impact of Large Language Models on Collective Opinion Dynamics.pdf [ 2.26MB ]
┃    ┃    ┃    ┃    ┣━━ Put Your Money Where Your Mouth Is Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena.pdf [ 1.47MB ]
┃    ┃    ┃    ┃    ┣━━ Pal Program-aided language models.pdf [ 1.25MB ]
┃    ┃    ┃    ┃    ┣━━ OKR-Agent An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation.pdf [ 1.91MB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal Web Navigation with Instruction-Finetuned Foundation Models.pdf [ 1.05MB ]
┃    ┃    ┃    ┃    ┣━━ Mind the Gap Improving Success Rate of Vision-and-Language Navigation by Revisiting Oracle Success Routes.pdf [ 1.75MB ]
┃    ┃    ┃    ┃    ┣━━ Memory Sandbox Transparent and Interactive Memory Management for Conversational Agents.pdf [ 1.03MB ]
┃    ┃    ┃    ┃    ┣━━ Memory augmented large language models are computationally universal.pdf [ 261.50kB ]
┃    ┃    ┃    ┃    ┣━━ LLM-Deliberation Evaluating LLMs with Interactive Multi-Agent Negotiation Game.pdf [ 4.77MB ]
┃    ┃    ┃    ┃    ┣━━ Lemur Harmonizing Natural Language and Code for Language Agents.pdf [ 895.25kB ]
┃    ┃    ┃    ┃    ┣━━ Learning to Reason and Memorize with Self-Notes.pdf [ 741.79kB ]
┃    ┃    ┃    ┃    ┣━━ Learning to Identify Critical States for Reinforcement Learning from Videos.pdf [ 11.76MB ]
┃    ┃    ┃    ┃    ┣━━ Large Language Models for Information Retrieval A Survey.pdf [ 3.41MB ]
┃    ┃    ┃    ┃    ┣━━ Language Agents for Detecting Implicit Stereotypes in Text-to-image Models at Scale.pdf [ 2.92MB ]
┃    ┃    ┃    ┃    ┣━━ Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models.pdf [ 1,008.87kB ]
┃    ┃    ┃    ┃    ┣━━ InstructionGPT-4 A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4.pdf [ 5.04MB ]
┃    ┃    ┃    ┃    ┣━━ Gorilla Large language model connected with massive apis.pdf [ 1.23MB ]
┃    ┃    ┃    ┃    ┣━━ Generative agents Interactive simulacra of human behavior.pdf [ 11.53MB ]
┃    ┃    ┃    ┃    ┣━━ Formally Specifying the High-Level Behavior of LLM-Based Agents.pdf [ 924.36kB ]
┃    ┃    ┃    ┃    ┣━━ Few-shot learning with retrieval augmented language models.pdf [ 1.01MB ]
┃    ┃    ┃    ┃    ┣━━ Exploring Large Language Models for Communication Games An Empirical Study on Werewolf.pdf [ 2.94MB ]
┃    ┃    ┃    ┃    ┣━━ Evaluating Large Language Models at Evaluating Instruction Following.pdf [ 745.95kB ]
┃    ┃    ┃    ┃    ┣━━ Dynamic LLM-Agent Network An LLM-agent Collaboration Framework with Agent Team Optimization.pdf [ 1.04MB ]
┃    ┃    ┃    ┃    ┣━━ Does Role-Playing Chatbots Capture the Character Personalities Assessing Personality Traits for Role-Playing Chatbots.pdf [ 1.42MB ]
┃    ┃    ┃    ┃    ┣━━ Diversifying AI Towards Creative Chess with AlphaZero.pdf [ 6.16MB ]
┃    ┃    ┃    ┃    ┣━━ Deception Abilities Emerged in Large Language Models.pdf [ 598.38kB ]
┃    ┃    ┃    ┃    ┣━━ Cumulative Reasoning With Large Language Models.pdf [ 1.07MB ]
┃    ┃    ┃    ┃    ┣━━ Consciousness in Artificial Intelligence Insights from the Science of Consciousness.pdf [ 1.95MB ]
┃    ┃    ┃    ┃    ┣━━ Communicative agents for software development.pdf [ 10.47MB ]
┃    ┃    ┃    ┃    ┣━━ Code Llama Open Foundation Models for Code.pdf [ 1.60MB ]
┃    ┃    ┃    ┃    ┣━━ CLIN A Continually Learning Language Agent for Rapid Task Adaptation and Generalization.pdf [ 3.41MB ]
┃    ┃    ┃    ┃    ┣━━ ChatMOF An Autonomous AI System for Predicting and Generating Metal-Organic Frameworks.pdf [ 1.74MB ]
┃    ┃    ┃    ┃    ┣━━ CHATANYTHING FACETIME CHAT WITH LLM-ENHANCED PERSONAS.pdf [ 3.59MB ]
┃    ┃    ┃    ┃    ┣━━ Chain-of-thought prompting elicits reasoning in large language models.pdf [ 1.05MB ]
┃    ┃    ┃    ┃    ┣━━ Chain of hindsight aligns language models with feedback.pdf [ 1.81MB ]
┃    ┃    ┃    ┃    ┣━━ Benchmarking Large Language Models as AI Research Agents.pdf [ 757.12kB ]
┃    ┃    ┃    ┃    ┣━━ AutoAgents A Framework for Automatic Agent Generation.pdf [ 14.68MB ]
┃    ┃    ┃    ┃    ┣━━ Auto-GPT for Online Decision Making Benchmarks and Additional Opinions.pdf [ 764.81kB ]
┃    ┃    ┃    ┃    ┣━━ Augmenting Language Models with Long-Term Memory.pdf [ 728.68kB ]
┃    ┃    ┃    ┃    ┣━━ AudioLDM 2 Learning Holistic Audio Generation with Self-supervised Pretraining.pdf [ 11.09MB ]
┃    ┃    ┃    ┃    ┣━━ An Embodied Generalist Agent in 3D World.pdf [ 24.69MB ]
┃    ┃    ┃    ┃    ┣━━ Ambient Adventures Teaching ChatGPT on Developing Complex Stories.pdf [ 398.05kB ]
┃    ┃    ┃    ┃    ┣━━ All in One Multi-task Prompting for Graph Neural Networks.pdf [ 1.27MB ]
┃    ┃    ┃    ┃    ┣━━ AgentTuning Enabling Generalized Agent Abilities for LLMs.pdf [ 3.15MB ]
┃    ┃    ┃    ┃    ┣━━ Agent Instructs Large Language Models to be General Zero-Shot Reasoners.pdf [ 2.45MB ]
┃    ┃    ┃    ┃    ┣━━ Adapting LLM Agents Through Communication.pdf [ 23.22MB ]
┃    ┃    ┃    ┃    ┣━━ A real-world webagent with planning, long context understanding, and program synthesis.pdf [ 1.99MB ]
┃    ┃    ┃    ┃    ┣━━ A Language-Agent Approach to Formal Theorem-Proving.pdf [ 801.10kB ]
┃    ┃    ┃    ┃    ┣━━ NeurIPS2023 LLM Agent [ 22.87MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Using Large Language Model Annotations for Valid Downstream Statistical Inference in Social Science Design-Based Semi-Supervised Learning.pdf [ 1.25MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples.pdf [ 929.77kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ LLMs for Semi-Automated Data Science Introducing CAAFE for Context-Aware Automated Feature Engineering.pdf [ 657.33kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning.pdf [ 1,002.76kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Large Language Models of Code Fail at Completing Code with Potential Bugs.pdf [ 1.45MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Large Language Models can Implement Policy Iteration.pdf [ 988.15kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Large Language Models as Commonsense Knowledge for Large-Scale Task Planning.pdf [ 5.72MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Large Language Models Are Semi-Parametric Reinforcement Learning Agents.pdf [ 1.72MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ GPT4Tools Teaching Large Language Model to Use Tools via Self-instruction.pdf [ 1.06MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Describe, Explain, Plan and Select Interactive Planning with LLMs Enables Open-World Multi-Task Agents.pdf [ 8.18MB ]
┃    ┃    ┃    ┃    ┣━━ LLM-based Agent Applications [ 90.91MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models.pdf [ 2.26MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ WebGPT Browser-assisted question-answering with human feedback.pdf [ 1.51MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ The Hitchhiker’s Guide to Program Analysis A Journey with Large Language Models.pdf [ 4.84MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ SwiftSage A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks.pdf [ 3.51MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks.pdf [ 1.06MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ SheetCopilot Bringing Software Productivity to the Next Level through Large Language Models.pdf [ 11.74MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ ScienceWorld Is your Agent Smarter than a 5th Grader.pdf [ 1.14MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ RoCo Dialectic Multi-Robot Collaboration with Large Language Models.pdf [ 10.21MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ ProAgent Building Proactive Cooperative AI with Large Language Models.pdf [ 1.92MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Plan, Eliminate, and Track — Language Models are Good Teachers for Embodied Agents.pdf [ 2.31MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Multi-Agent Collaboration Harnessing the Power of Intelligent LLM Agents.pdf [ 246.49kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ MetaGPT Meta Programming for A Multi-Agent Collaborative Framework.pdf [ 16.03MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Language Models as Zero-Shot Planners Extracting Actionable Knowledge for Embodied Agents.pdf [ 2.68MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ InterAct Exploring the Potentials of ChatGPT as a Cooperative Agent.pdf [ 753.43kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback.pdf [ 2.49MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Improving Factuality and Reasoning in Language Models through Multiagent Debate.pdf [ 1.68MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Do Embodied Agents Dream of Pixelated Sheep Embodied Decision Making using Language Guided World Modelling.pdf [ 972.11kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ ChatMOF An Autonomous AI System for Predicting and Generating Metal-Organic Frameworks.pdf [ 1.74MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ ChatLLM Network More brains, More intelligence.pdf [ 934.86kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ ChatEval Towards Better LLM-based Evaluators through Multi-Agent Debate.pdf [ 775.24kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ CGMI Configurable General Multi-Agent Interaction Framework.pdf [ 3.62MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Agents An Open-source Framework for Autonomous Language Agents.pdf [ 1.66MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ 3D-LLM Injecting the 3D World into Large Language Models.pdf [ 16.91MB ]
┃    ┃    ┃    ┃    ┣━━ LLM-based Agent Evaluation [ 3.06MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ On the Planning Abilities of Large Language Models (A Critical Investigation with a Proposed Benchmark).pdf [ 1.64MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Evaluating Cognitive Maps and Planning in Large Language Models with CogEval.pdf [ 1.41MB ]
┃    ┃    ┃    ┃    ┣━━ LLM-based Agent Construction [ 84.68MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Visual Instruction Tuning.pdf [ 4.69MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Tree of Thoughts Deliberate Problem Solving with Large Language Models..pdf [ 830.88kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model Explorations with GPT4-Vision and Beyond.pdf [ 4.10MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Think-on-Graph Deep and Responsible Reasoning of Large Language Model on Knowledge Graph.pdf [ 3.67MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ SwiftSage A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks.pdf [ 3.51MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Reflexion language agents with verbal reinforcement learning..pdf [ 681.16kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ PandaGPT One Model To Instruction-Follow Them All.pdf [ 7.91MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ MiniGPT-4 Enhancing Vision-Language Understanding with Advanced Large Language Models.pdf [ 3.57MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ LLM+P Empowering Large Language Models with Optimal Planning Proficiency.pdf [ 2.76MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Learning Distributed Representations of Sentences from Unlabelled Data.pdf [ 264.48kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Large Language Models as Tool Makers.pdf [ 1.12MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ InternGPT Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language.pdf [ 8.44MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ InstructBLIP Towards General-purpose Vision-Language Models with Instruction Tuning.pdf [ 3.65MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ HuggingGPT Solving AI Tasks with ChatGPT and its Friends in Hugging Face.pdf [ 3.34MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Clever Hans or Neural Theory of Mind Stress Testing Social Reasoning in Large Language Models.pdf [ 3.52MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ CAMEL Communicative Agents for “Mind” Exploration of Large Scale Language Model Society..pdf [ 7.45MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ AVIS Autonomous Visual Information Seeking with Large Language Model Agent.pdf [ 9.35MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ AutoGen Enabling Next-Gen LLM Applications via Multi-Agent Conversation.pdf [ 3.54MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ AudioGPT Understanding and Generating Speech, Music, Sound, and Talking Head.pdf [ 6.00MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Agents An Open-source Framework for Autonomous Language Agents.pdf [ 1.66MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Agent Instructs Large Language Models to be General Zero-Shot Reasoners.pdf [ 2.45MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity.pdf [ 2.21MB ]
┃    ┃    ┃    ┃    ┣━━ ICLR2024 LLM Agent [ 133.17MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Welfare Diplomacy Benchmarking Language Model Cooperation.pdf [ 16.59MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ WebArena A Realistic Web Environment for Building Autonomous Agents.pdf [ 9.43MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ SOTOPIA Interactive Evaluation for Social Intelligence in Language Agents.pdf [ 4.07MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ SmartPlay A Benchmark for LLMs as Intelligent Agents.pdf [ 13.18MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Playing repeated games with Large Language Models.pdf [ 2.35MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ MindAgent Emergent Gaming Interaction.pdf [ 26.70MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Lyfe Agents generative agents for low-cost real-time social interaction.pdf [ 3.98MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game.pdf [ 745.07kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Identifying the Risks of LM Agents with an LM-Emulated Sandbox.pdf [ 3.11MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Exploring Collaboration Mechanisms for LLM Agents A Social Psychology View.pdf [ 5.84MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Evaluating Multi-Agent Coordination Abilities in Large Language Models.pdf [ 1.11MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Building Cooperative Embodied Agents Modularly with Large Language Models.pdf [ 13.99MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Avalon’s Game of Thoughts Battle Against Deception through Recursive Contemplation.pdf [ 5.53MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ AgentVerse Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors.pdf [ 4.45MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ AgentBench Evaluating LLMs as Agents.pdf [ 22.10MB ]
┃    ┃    ┃    ┃    ┣━━ EMNLP2023 LLM Agent [ 22.80MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Theory of Mind for Multi-Agent Collaboration via Large Language Models.pdf [ 553.14kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ The CoT Collection Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning.pdf [ 1.84MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ SelfCheckGPT Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models.pdf [ 934.05kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Reasoning with Language Model is Planning with World Model.pdf [ 939.42kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning.pdf [ 1.11MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ MoT Memory-of-Thought Enables ChatGPT to Self-Improve.pdf [ 878.96kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Logic-LM Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning.pdf [ 1,023.29kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Humanoid Agents Platform for Simulating Human-like Generative Agents.pdf [ 1.49MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Examining Inter-Consistency of Large Language Models Collaboration An In-depth Analysis via Debate.pdf [ 2.61MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents.pdf [ 2.28MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ CRYSTAL Introspective Reasoners Reinforced with Self-Feedback.pdf [ 1.41MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ ChatCoT Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models.pdf [ 1.25MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Character-LLM A Trainable Agent for Role-Playing.pdf [ 980.42kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ AutoTrial Prompting Language Models for Clinical Trial Design.pdf [ 655.30kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ API-Bank A Comprehensive Benchmark for Tool-Augmented LLMs.pdf [ 875.53kB ]
┃    ┃    ┃    ┃    ┃    ┣━━ Answering Questions by Meta-Reasoning over Multiple Chains of Thought.pdf [ 1.54MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ AgentSims An Open-Source Sandbox for Large Language Model Evaluation.pdf [ 1.36MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ A Zero-Shot Language Agent for Computer Control with Structured Reflection.pdf [ 1.23MB ]
┃    ┃    ┃    ┃    ┣━━ 2 Surveys [ 7.83MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ The Rise and Potential of Large Language ModelBased Agents A Survey.pdf [ 6.86MB ]
┃    ┃    ┃    ┃    ┃    ┣━━ A Survey on Large Language Model-based Autonomous Agents.pdf [ 988.48kB ]
┃    ┃    ┣━━ Top 28 Multimodal Large Models [ 420.06MB ]
┃    ┃    ┃    ┣━━ WeMM [ 3.50MB ]
┃    ┃    ┃    ┃    ┣━━ WeMM-main.zip [ 3.50MB ]
┃    ┃    ┃    ┣━━ VPGTrans [ 2.60MB ]
┃    ┃    ┃    ┃    ┣━━ VPGTrans-main.zip [ 2.60MB ]
┃    ┃    ┃    ┣━━ VisualGLM-6B [ 10.31MB ]
┃    ┃    ┃    ┃    ┣━━ VisualGLM-6B-main.zip [ 10.31MB ]
┃    ┃    ┃    ┣━━ SPHINX [ 15.00MB ]
┃    ┃    ┃    ┃    ┣━━ LLaMA2-Accessory-main.zip [ 15.00MB ]
┃    ┃    ┃    ┣━━ Skywork-MM [ 6.22MB ]
┃    ┃    ┃    ┃    ┣━━ Skywork-MM-main.zip [ 6.22MB ]
┃    ┃    ┃    ┣━━ Qwen-VL-Chat [ 25.51MB ]
┃    ┃    ┃    ┃    ┣━━ Qwen-VL-master.zip [ 25.51MB ]
┃    ┃    ┃    ┣━━ PandaGPT [ 14.07MB ]
┃    ┃    ┃    ┃    ┣━━ PandaGPT-main.zip [ 14.07MB ]
┃    ┃    ┃    ┣━━ Otter [ 323.71kB ]
┃    ┃    ┃    ┃    ┣━━ Otter-main.zip [ 323.71kB ]
┃    ┃    ┃    ┣━━ Octopus [ 554.68kB ]
┃    ┃    ┃    ┃    ┣━━ UnifiedMultimodalInstructionTuning-main.zip [ 554.68kB ]
┃    ┃    ┃    ┣━━ Multimodal-GPT [ 107.26kB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal-GPT-main.zip [ 107.26kB ]
┃    ┃    ┃    ┣━━ Muffin [ 6.02MB ]
┃    ┃    ┃    ┃    ┣━━ Muffin-main.zip [ 6.02MB ]
┃    ┃    ┃    ┣━━ mPLUG-Owl [ 10.48MB ]
┃    ┃    ┃    ┃    ┣━━ mPLUG-Owl-main.zip [ 10.48MB ]
┃    ┃    ┃    ┣━━ MMICL [ 11.72MB ]
┃    ┃    ┃    ┃    ┣━━ MIC-master.zip [ 11.72MB ]
┃    ┃    ┃    ┣━━ MiniGPT-4 [ 38.46MB ]
┃    ┃    ┃    ┃    ┣━━ MiniGPT-4-main.zip [ 38.46MB ]
┃    ┃    ┃    ┣━━ Lynx [ 2.40MB ]
┃    ┃    ┃    ┃    ┣━━ lynx-llm-main.zip [ 2.40MB ]
┃    ┃    ┃    ┣━━ LRV-Instruction [ 22.27MB ]
┃    ┃    ┃    ┃    ┣━━ LRV-Instruction-main.zip [ 22.27MB ]
┃    ┃    ┃    ┣━━ LLaVA [ 12.65MB ]
┃    ┃    ┃    ┃    ┣━━ LLaVA-main.zip [ 12.65MB ]
┃    ┃    ┃    ┣━━ LLaMA-Adapter V2 [ 31.88MB ]
┃    ┃    ┃    ┃    ┣━━ LLaMA-Adapter-main.zip [ 31.88MB ]
┃    ┃    ┃    ┣━━ Lion [ 963.89kB ]
┃    ┃    ┃    ┃    ┣━━ Lion-main.zip [ 963.89kB ]
┃    ┃    ┃    ┣━━ LaVIN [ 10.26MB ]
┃    ┃    ┃    ┃    ┣━━ LaVIN-main.zip [ 10.26MB ]
┃    ┃    ┃    ┣━━ InternLM-XComposer [ 5.44MB ]
┃    ┃    ┃    ┃    ┣━━ InternLM-XComposer-main.zip [ 5.44MB ]
┃    ┃    ┃    ┣━━ InstructBLIP [ 65.62MB ]
┃    ┃    ┃    ┃    ┣━━ LAVIS-main.zip [ 65.62MB ]
┃    ┃    ┃    ┣━━ ImageBind_LLM [ 31.88MB ]
┃    ┃    ┃    ┃    ┣━━ ImageBind_LLM.zip [ 31.88MB ]
┃    ┃    ┃    ┣━━ GPT-4V [ 2.82MB ]
┃    ┃    ┃    ┃    ┣━━ GPTV_System_Card.pdf [ 2.82MB ]
┃    ┃    ┃    ┣━━ GIT2 [ 498.95kB ]
┃    ┃    ┃    ┃    ┣━━ GenerativeImage2Text-main.zip [ 498.95kB ]
┃    ┃    ┃    ┣━━ Cheetor [ 11.86MB ]
┃    ┃    ┃    ┃    ┣━━ Cheetah-main.zip [ 11.86MB ]
┃    ┃    ┃    ┣━━ BLIVA [ 11.08MB ]
┃    ┃    ┃    ┃    ┣━━ BLIVA-main.zip [ 11.08MB ]
┃    ┃    ┃    ┣━━ BLIP-2 [ 65.62MB ]
┃    ┃    ┃    ┃    ┣━━ LAVIS-main.zip [ 65.62MB ]
┃    ┃    ┣━━ 5 Multimodal Large Model Research Directions [ 211.32MB ]
┃    ┃    ┃    ┣━━ Unified Vision Models [ 7.85MB ]
┃    ┃    ┃    ┃    ┣━━ You Need Multiple Exiting Dynamic Early Exiting for.pdf [ 859.28kB ]
┃    ┃    ┃    ┃    ┣━━ VLMO Unified Vision-Language Pre-Training with.pdf [ 565.37kB ]
┃    ┃    ┃    ┃    ┣━━ Unified Vision-Language Pre-Training for Image Captioning and VQA.pdf [ 657.01kB ]
┃    ┃    ┃    ┃    ┣━━ UNIFIED VISION AND LANGUAGE PROMPT LEARNING.pdf [ 1.73MB ]
┃    ┃    ┃    ┃    ┣━━ Pro-tuning Unified Prompt Tuning for Vision Tasks.pdf [ 2.84MB ]
┃    ┃    ┃    ┃    ┣━━ BLIP Bootstrapping Language-Image Pre-training for.pdf [ 1.24MB ]
┃    ┃    ┃    ┣━━ Visual Generation [ 39.51MB ]
┃    ┃    ┃    ┃    ┣━━ TextPainter Multimodal Text Image Generation with Visual-harmony and Text-comprehension for Poster Design.pdf [ 10.44MB ]
┃    ┃    ┃    ┃    ┣━━ Opal Multimodal Image Generation for News Illustration.pdf [ 4.75MB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal Prompt Retrieval for Generative Visual Question Answering.pdf [ 5.23MB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation.pdf [ 1.43MB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal Differential Network for Visual Question Generation.pdf [ 1.44MB ]
┃    ┃    ┃    ┃    ┣━━ KM-BART Knowledge Enhanced Multimodal BART for Visual Commonsense Generation.pdf [ 4.85MB ]
┃    ┃    ┃    ┃    ┣━━ Generation of Multimodal Justification Using Visual Word Constraint Model for Explainable Computer-Aided Diagnosis.pdf [ 583.59kB ]
┃    ┃    ┃    ┃    ┣━━ Enabling Robots to Draw and Tell Towards Visually Grounded Multimodal Description Generation.pdf [ 1.33MB ]
┃    ┃    ┃    ┃    ┣━━ Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zeroshot Classification and Retrieval of Videos.pdf [ 9.48MB ]
┃    ┃    ┃    ┣━━ Visual Understanding [ 62.07MB ]
┃    ┃    ┃    ┃    ┣━━ UReader Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model.pdf [ 5.49MB ]
┃    ┃    ┃    ┃    ┣━━ TouchStone Evaluating Vision-Language Models by Language Models.pdf [ 12.45MB ]
┃    ┃    ┃    ┃    ┣━━ PDFVQA A New Dataset for Real-World VQA on PDF Documents.pdf [ 5.36MB ]
┃    ┃    ┃    ┃    ┣━━ On the Performance of Multimodal Language Models.pdf [ 527.67kB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal Transformer for Multimodal Machine Translation.pdf [ 1.84MB ]
┃    ┃    ┃    ┃    ┣━━ mPLUG-DocOwl Modularized Multimodal Large Language Model for Document Understanding.pdf [ 5.78MB ]
┃    ┃    ┃    ┃    ┣━━ M3IT A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning.pdf [ 2.33MB ]
┃    ┃    ┃    ┃    ┣━━ LLaVAR Enhanced Visual Instruction Tuning for Text-Rich Image Understanding.pdf [ 17.44MB ]
┃    ┃    ┃    ┃    ┣━━ DocFormerv2 Local Features for Document Understanding.pdf [ 2.63MB ]
┃    ┃    ┃    ┃    ┣━━ Cream Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models.pdf [ 8.25MB ]
┃    ┃    ┃    ┣━━ Multimodal Agents [ 57.23MB ]
┃    ┃    ┃    ┃    ┣━━ You Only Look at Screens Multimodal Chain-of-Action Agents.pdf [ 5.71MB ]
┃    ┃    ┃    ┃    ┣━━ The Importance of Multimodal Emotion Conditioning and Affect Consistency for Embodied Conversational Agents.pdf [ 24.91MB ]
┃    ┃    ┃    ┃    ┣━━ SPRING Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph.pdf [ 4.61MB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal Speech Recognition for Language-Guided Embodied Agents.pdf [ 2.04MB ]
┃    ┃    ┃    ┃    ┣━━ Instruction-Following Agents with Multimodal Transformer.pdf [ 1.59MB ]
┃    ┃    ┃    ┃    ┣━━ Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback.pdf [ 7.59MB ]
┃    ┃    ┃    ┃    ┣━━ Guide Your Agent with Adaptive Multimodal Rewards.pdf [ 5.34MB ]
┃    ┃    ┃    ┃    ┣━━ Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting from Multimodal Data.pdf [ 5.19MB ]
┃    ┃    ┃    ┃    ┣━━ A Contextualized Real-Time Multimodal Emotion Recognition for Conversational Agents using Graph Convolutional Networks in Reinforcement Learning.pdf [ 260.40kB ]
┃    ┃    ┃    ┣━━ LLM-Powered Multimodal Large Models [ 44.65MB ]
┃    ┃    ┃    ┃    ┣━━ X-LLM Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages.pdf [ 4.09MB ]
┃    ┃    ┃    ┃    ┣━━ SCITUNE Aligning Large Language Models with Scientific Multimodal.pdf [ 3.54MB ]
┃    ┃    ┃    ┃    ┣━━ MME A Comprehensive Evaluation Benchmark for Multimodal Large Language Models.pdf [ 7.63MB ]
┃    ┃    ┃    ┃    ┣━━ MM-Vet Evaluating Large Multimodal Models.pdf [ 9.28MB ]
┃    ┃    ┃    ┃    ┣━━ Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models.pdf [ 6.93MB ]
┃    ┃    ┃    ┃    ┣━━ Contextual Object Detection with Multimodal Large Language Models.pdf [ 13.19MB ]
┃    ┃    ┣━━ 4 Key Multimodal Large Model Techniques [ 459.08MB ]
┃    ┃    ┃    ┣━━ Multimodal Instruction Tuning [ 177.32MB ]
┃    ┃    ┃    ┃    ┣━━ X-LLM Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages.pdf [ 4.09MB ]
┃    ┃    ┃    ┃    ┣━━ Visual Instruction Tuning.pdf [ 5.01MB ]
┃    ┃    ┃    ┃    ┣━━ Visual Instruction Tuning with Polite Flamingo.pdf [ 1.65MB ]
┃    ┃    ┃    ┃    ┣━━ VisionLLM Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks.pdf [ 4.54MB ]
┃    ┃    ┃    ┃    ┣━━ VideoChat Chat-Centric Video Understanding.pdf [ 2.85MB ]
┃    ┃    ┃    ┃    ┣━━ Video-LLaMA An Instruction-tuned Audio-Visual Language Model for Video Understanding.pdf [ 2.01MB ]
┃    ┃    ┃    ┃    ┣━━ Video-ChatGPT Towards Detailed Video Understanding via Large Vision and Language Models.pdf [ 10.51MB ]
┃    ┃    ┃    ┃    ┣━━ Shikra Unleashing Multimodal LLM’s Referential Dialogue Magic.pdf [ 6.99MB ]
┃    ┃    ┃    ┃    ┣━━ PMC-VQA Visual Instruction Tuning for Medical Visual Question Answering.pdf [ 4.30MB ]
┃    ┃    ┃    ┃    ┣━━ PandaGPT One Model To Instruction-Follow Them All.pdf [ 7.88MB ]
┃    ┃    ┃    ┃    ┣━━ MultiModal-GPT A Vision and Language Model for Dialogue with Humans.pdf [ 2.13MB ]
┃    ┃    ┃    ┃    ┣━━ MultiInstruct Improving Multi-Modal Zero-Shot Learning via Instruction Tuning.pdf [ 1.21MB ]
┃    ┃    ┃    ┃    ┣━━ mPLUG-Owl Modularization Empowers Large Language Models with Multimodality.pdf [ 15.20MB ]
┃    ┃    ┃    ┃    ┣━━ MiniGPT-4 Enhancing Vision-Language Understanding with Advanced Large Language Models.pdf [ 3.53MB ]
┃    ┃    ┃    ┃    ┣━━ MIMIC-IT Multi-Modal In-Context Instruction Tuning.pdf [ 9.35MB ]
┃    ┃    ┃    ┃    ┣━━ Macaw-LLM Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration.pdf [ 10.48MB ]
┃    ┃    ┃    ┃    ┣━━ M3IT A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning.pdf [ 2.33MB ]
┃    ┃    ┃    ┃    ┣━━ LMEye An Interactive Perception Network for Large Language Models.pdf [ 10.12MB ]
┃    ┃    ┃    ┃    ┣━━ LLaVAR Enhanced Visual Instruction Tuning for Text-Rich Image Understanding.pdf [ 17.44MB ]
┃    ┃    ┃    ┃    ┣━━ LLaVA-Med Training a Large Language-and-Vision Assistant for Biomedicine in One Day.pdf [ 3.53MB ]
┃    ┃    ┃    ┃    ┣━━ LLaMA-Adapter V2 Parameter-Efficient Visual Instruction Model.pdf [ 3.15MB ]
┃    ┃    ┃    ┃    ┣━━ LLaMA-Adapter Efficient Fine-tuning of Language Models with Zero-init Attention.pdf [ 1.26MB ]
┃    ┃    ┃    ┃    ┣━━ Listen, Think, and Understand.pdf [ 1.25MB ]
┃    ┃    ┃    ┃    ┣━━ LAMM Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark.pdf [ 4.04MB ]
┃    ┃    ┃    ┃    ┣━━ InstructBLIP Towards General-purpose Vision-Language Models with Instruction Tuning.pdf [ 3.60MB ]
┃    ┃    ┃    ┃    ┣━━ GPT4Tools Teaching Large Language Model to Use Tools via Self-instruction.pdf [ 1,023.39kB ]
┃    ┃    ┃    ┃    ┣━━ DetGPT Detect What You Need via Reasoning.pdf [ 20.15MB ]
┃    ┃    ┃    ┃    ┣━━ Cheap and Quick Efficient Vision-Language Instruction Tuning for Large Language Models.pdf [ 2.86MB ]
┃    ┃    ┃    ┃    ┣━━ ChatBridge Bridging Modalities with Large Language Model as a Language Catalyst.pdf [ 9.54MB ]
┃    ┃    ┃    ┃    ┣━━ Aligning Large Multi-Modal Model with Robust Instruction Tuning.pdf [ 5.30MB ]
┃    ┃    ┃    ┣━━ Multimodal Chain of Thought [ 72.13MB ]
┃    ┃    ┃    ┃    ┣━━ Visual Programming Compositional visual reasoning without training.pdf [ 21.01MB ]
┃    ┃    ┃    ┃    ┣━━ Visual ChatGPT Talking, Drawing and Editing with Visual Foundation Models.pdf [ 2.34MB ]
┃    ┃    ┃    ┃    ┣━━ Visual Chain of Thought Bridging Logical Gaps with Multimodal Infillings.pdf [ 3.39MB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal Chain-of-Thought Reasoning in Language Models.pdf [ 827.10kB ]
┃    ┃    ┃    ┃    ┣━━ MM-REACT Prompting ChatGPT for Multimodal Reasoning and Action.pdf [ 19.38MB ]
┃    ┃    ┃    ┃    ┣━━ Let’s Think Frame by Frame Evaluating Video Chain of Thought with Video Infilling and Prediction.pdf [ 2.83MB ]
┃    ┃    ┃    ┃    ┣━━ Learn to Explain Multimodal Reasoning via Thought Chains for Science Question Answering.pdf [ 7.69MB ]
┃    ┃    ┃    ┃    ┣━━ Explainable Multimodal Emotion Reasoning.pdf [ 1.52MB ]
┃    ┃    ┃    ┃    ┣━━ EmbodiedGPT Vision-Language Pre-Training via Embodied Chain of Thought.pdf [ 1.68MB ]
┃    ┃    ┃    ┃    ┣━━ Chameleon Plug-and-Play Compositional Reasoning with Large Language Models.pdf [ 2.09MB ]
┃    ┃    ┃    ┃    ┣━━ Chain of Thought Prompt Tuning in Vision Language Models.pdf [ 4.60MB ]
┃    ┃    ┃    ┃    ┣━━ Caption Anything Interactive Image Description with Diverse Multimodal Controls.pdf [ 4.78MB ]
┃    ┃    ┃    ┣━━ Multimodal In-Context Learning [ 31.96MB ]
┃    ┃    ┃    ┃    ┣━━ Revisiting Disentanglement and Fusion on Modality and Context in Conversational Multimodal Emotion Recognition.pdf [ 1.96MB ]
┃    ┃    ┃    ┃    ┣━━ Proactive Human-Robot Interaction using Visuo-Lingual Transformers.pdf [ 608.48kB ]
┃    ┃    ┃    ┃    ┣━━ Multimodal Foundation Models For Echocardiogram Interpretation.pdf [ 1.79MB ]
┃    ┃    ┃    ┃    ┣━━ MMHQA-ICL Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images.pdf [ 1.35MB ]
┃    ┃    ┃    ┃    ┣━━ Link-Context Learning for Multimodal LLMs.pdf [ 13.30MB ]
┃    ┃    ┃    ┃    ┣━━ Lightweight In-Context Tuning for Multimodal Unified Models.pdf [ 3.40MB ]
┃    ┃    ┃    ┃    ┣━━ Large Language Models are Visual Reasoning Coordinators.pdf [ 5.42MB ]
┃    ┃    ┃    ┃    ┣━━ Language as the Medium Multimodal Video Classification through text only.pdf [ 1.11MB ]
┃    ┃    ┃    ┃    ┣━━ HowToCaption Prompting LLMs to Transform Video Annotations at Scale.pdf [ 3.02MB ]
┃    ┃    ┃    ┣━━ LLM-Aided Visual Reasoning [ 177.68MB ]
┃    ┃    ┃    ┃    ┣━━ Visual Programming Compositional visual reasoning without training.pdf [ 21.01MB ]
┃    ┃    ┃    ┃    ┣━━ Visual ChatGPT Talking, Drawing and Editing with Visual Foundation Models.pdf [ 2.34MB ]
┃    ┃    ┃    ┃    ┣━━ ViperGPT Visual Inference via Python Execution for Reasoning.pdf [ 31.64MB ]
┃    ┃    ┃    ┃    ┣━━ SuS-X Training-Free Name-Only Transfer of Vision-Language Models.pdf [ 8.91MB ]
┃    ┃    ┃    ┃    ┣━━ Socratic Models Composing Zero-Shot Multimodal Reasoning with Language.pdf [ 3.71MB ]
┃    ┃    ┃    ┃    ┣━━ Retrieving-to-Answer Zero-Shot Video Question Answering with Frozen Large Language Models.pdf [ 1.97MB ]
┃    ┃    ┃    ┃    ┣━━ Prompt, Generate, then Cache Cascade of Foundation Models makes Strong Few-shot Learners.pdf [ 3.56MB ]
┃    ┃    ┃    ┃    ┣━━ PointCLIP V2 Adapting CLIP for Powerful 3D Open-world Learning.pdf [ 1.57MB ]
┃    ┃    ┃    ┃    ┣━━ MM-REACT Prompting ChatGPT for Multimodal Reasoning and Action.pdf [ 19.38MB ]
┃    ┃    ┃    ┃    ┣━━ Mindstorms in Natural Language-Based Societies of Mind.pdf [ 23.37MB ]
┃    ┃    ┃    ┃    ┣━━ LayoutGPT Compositional Visual Planning and Generation with Large Language Models.pdf [ 17.61MB ]
┃    ┃    ┃    ┃    ┣━━ IdealGPT Iteratively Decomposing Vision and Language Reasoning via Large Language Models.pdf [ 1.20MB ]
┃    ┃    ┃    ┃    ┣━━ HuggingGPT Solving AI Tasks with ChatGPT and its Friends in HuggingFace.pdf [ 3.24MB ]
┃    ┃    ┃    ┃    ┣━━ GPT4Tools Teaching Large Language Model to Use Tools via Self-instruction.pdf [ 1,023.39kB ]
┃    ┃    ┃    ┃    ┣━━ ChatGPT Asks BLIP-2 Answers Automatic Questioning Towards Enriched Visual Descriptions.pdf [ 11.54MB ]
┃    ┃    ┃    ┃    ┣━━ Chameleon Plug-and-Play Compositional Reasoning with Large Language Models.pdf [ 2.09MB ]
┃    ┃    ┃    ┃    ┣━━ Caption Anything Interactive Image Description with Diverse Multimodal Controls.pdf [ 4.78MB ]
┃    ┃    ┃    ┃    ┣━━ AssistGPT A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn.pdf [ 2.61MB ]
┃    ┃    ┃    ┃    ┣━━ Accountable Textual-Visual Chat Learns to Reject Human Instructions in Image Re-creation.pdf [ 16.14MB ]