Top Papers of the Week (Jan 29 - Feb 4)
1). OLMo: Accelerating the Science of Language Models (paper)
OLMo, a state-of-the-art open language model, offers full transparency in training data, code, and evaluation tools, empowering the research community to advance language modeling science and innovation.
2). RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval (paper)
RAPTOR, a tree-structured retrieval system, integrates multi-level context into LLMs, outperforming traditional methods on QA tasks and setting new benchmarks with GPT-4.
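The recursive build described above can be sketched in a few lines. This is a minimal illustration only: the paper clusters chunks with Gaussian mixture models and summarizes with an LLM, whereas here grouping is by fixed window and `summarize` is a hypothetical stub.

```python
# Sketch of RAPTOR-style recursive tree building (simplified: fixed-size
# grouping instead of GMM clustering, and a stub instead of an LLM summary).

def summarize(texts):
    # Placeholder for an LLM summary call; here we just truncate the joined text.
    return " ".join(texts)[:80]

def build_tree(chunks, group_size=2):
    """Recursively group chunks and summarize each group until one root remains.

    Returns all layers, leaves first, so retrieval can search every level
    of abstraction rather than only the raw chunks."""
    layers = [list(chunks)]
    while len(layers[-1]) > 1:
        current = layers[-1]
        groups = [current[i:i + group_size] for i in range(0, len(current), group_size)]
        layers.append([summarize(g) for g in groups])
    return layers

layers = build_tree(["chunk one", "chunk two", "chunk three", "chunk four"])
```

Retrieval then embeds every node in every layer, so a query can match either a fine-grained leaf or a high-level summary.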
3). Corrective Retrieval Augmented Generation (paper)
The paper proposes Corrective Retrieval Augmented Generation (CRAG) to enhance the robustness of language models by addressing inaccurate retrievals. CRAG uses a lightweight evaluator to assess document quality and triggers actions like refining, correcting, or augmenting with web search, improving performance across various generation tasks.
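The control flow above reduces to a score-to-action mapping. A minimal sketch, assuming the evaluator returns a confidence score in [0, 1]; the thresholds and action names here are illustrative, not the paper's exact values.

```python
# Sketch of CRAG's decision logic: a lightweight retrieval evaluator scores the
# retrieved documents, and the score triggers one of three corrective actions.
# Thresholds are illustrative placeholders.

def crag_action(score, upper=0.7, lower=0.3):
    """Map an evaluator confidence score to a corrective action."""
    if score >= upper:
        return "correct"    # keep docs, refine them into key knowledge strips
    if score <= lower:
        return "incorrect"  # discard docs, fall back to web search
    return "ambiguous"      # combine refined docs with web-search results

action = crag_action(0.9)
```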
4). Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling (paper)
The paper introduces Web Rephrase Augmented Pre-training (WRAP), a method that uses instruction-tuned language models to rephrase web documents in specific styles for more efficient pre-training of large language models (LLMs). WRAP improves pre-training speed and performance on various NLP tasks by incorporating synthetic rephrases alongside real data, demonstrating the utility of synthetic data for LLM training.
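The recipe is essentially a data-pipeline step. A minimal sketch, assuming a hypothetical `rephrase` helper in place of the instruction-tuned LLM the paper actually uses; the style names are illustrative.

```python
import random

# Sketch of WRAP-style corpus construction: pair each real web document with a
# rephrased version in a chosen style, then pre-train on both together.

STYLES = ["wikipedia-like", "question-answer"]  # illustrative style prompts

def rephrase(doc, style):
    # Placeholder for an LLM call such as "Rephrase the following in {style} style".
    return f"[{style}] {doc}"

def build_corpus(real_docs, seed=0):
    """Return real documents plus one synthetic rephrase per document."""
    rng = random.Random(seed)
    synthetic = [rephrase(d, rng.choice(STYLES)) for d in real_docs]
    return real_docs + synthetic  # train on real and synthetic text together
```

Keeping the real documents in the mix is what the paper credits for robustness: the synthetic rephrases speed up learning, while the raw web text preserves coverage.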
5). The Power of Noise: Redefining Retrieval for RAG Systems (paper)
The paper explores the impact of information retrieval components on Retrieval-Augmented Generation (RAG) systems, finding that irrelevant documents can unexpectedly enhance performance by over 30%. This challenges conventional IR strategies and suggests a need for new approaches tailored to RAG systems.
6). A Survey on Hallucination in Large Vision-Language Models (paper)
This survey examines hallucinations in Large Vision-Language Models (LVLMs), identifying causes, evaluation methods, and mitigation strategies, aiming to guide future research and practical applications.
7). SliceGPT: Compress Large Language Models by Deleting Rows and Columns (paper)
SliceGPT is a post-training sparsification scheme for large language models that reduces parameter count by slicing rows and columns out of weight matrices, largely preserving performance while requiring fewer GPUs and delivering faster inference.
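The slicing idea can be sketched with a PCA-style projection. This is a simplified view, not the paper's method: SliceGPT relies on a computational-invariance transform of transformer blocks, but the core step of shrinking a weight matrix along the principal directions of its input activations looks roughly like this.

```python
import numpy as np

# Simplified sketch of weight slicing: project input activations onto their top
# principal directions, then delete the corresponding rows of the weight matrix.

def slice_weight(W, X, keep):
    """Shrink W (d_in x d_out) to (keep x d_out) along the top `keep`
    principal directions of the activations X (n x d_in)."""
    cov = X.T @ X                                   # activation covariance
    eigvals, Q = np.linalg.eigh(cov)                # ascending eigenvalues
    Q_top = Q[:, np.argsort(eigvals)[::-1][:keep]]  # d_in x keep, top directions
    W_sliced = Q_top.T @ W                          # keep x d_out: rows deleted
    X_proj = X @ Q_top                              # inputs in the kept subspace
    return W_sliced, X_proj

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
X = rng.normal(size=(16, 8))
W_s, X_s = slice_weight(W, X, keep=3)
```

The sliced layer computes `X_s @ W_s`, a smaller matrix multiply, which is where the memory and latency savings come from.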
8). AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning (paper | webpage)
AnimateLCM is a novel framework for accelerating video generation using decoupled consistency learning and teacher-free adaptation, achieving high-fidelity results with minimal steps and compatibility with personalized diffusion models.
9). Advances in 3D Generation: A Survey (paper)
This survey explores advancements in 3D content generation, covering methodologies, datasets, and applications, with a focus on generative models and neural scene representations, highlighting the transition from 2D to 3D and the challenges involved.
10). Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception (paper | code)
Mobile-Agent is an autonomous multi-modal mobile device agent that uses visual perception to identify and navigate app interfaces, planning and executing tasks without system-specific customizations. It achieves high accuracy and completion rates, even in complex multi-app operations.
AIGC News of the Week (Jan 29 - Feb 4)
1). Yann LeCun - "Objective-Driven AI: Towards Machines that can Learn, Reason, and Plan", January 24, 2024 (video | slides)
2). CS294-158-SP24: Deep Unsupervised Learning, Spring 2024 (link)
3). Modular RAG and RAG Flow: Part II (link)
4). Web LLM attacks (link)
5). A new leak revealed a potential makeover for Google Bard and the launch of Google's top-tier Gemini Ultra model. (link)