Daily Papers
1. Pearl: A Production-ready Reinforcement Learning Agent (paper | webpage | code)
Reinforcement Learning (RL) is a flexible framework for pursuing long-term goals, but it has to contend with delayed rewards, partial observability, and the exploration-exploitation trade-off. Existing RL libraries each address only some of these challenges. This paper introduces Pearl, a modular RL agent software package built to handle them together, and supports its claim of industry readiness with benchmark results.
2. Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia (paper)
Agent-based modeling, long used in social and natural sciences, is evolving with Large Language Models (LLMs). Generative Agent-Based Models (GABMs) leverage LLMs for agents to apply common sense, recall knowledge, control digital technologies via API, and communicate internally and externally. Concordia, a new library, simplifies creating these simulations in physical or digital environments. Agents in Concordia act based on LLM calls and associative memory. A unique 'Game Master' agent, inspired by tabletop RPGs, simulates the environment and translates agent actions into physical or digital outcomes, including handling external API integrations. Concordia's flexibility supports diverse scientific research and real-world digital service evaluations, including user simulations and synthetic data generation.
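The agent and Game Master loop described above can be sketched in a few lines. This is a toy illustration only, not Concordia's API: every class and function name here (`AssociativeMemory`, `Agent`, `GameMaster`, `run_episode`, `stub_llm`) is hypothetical, the "LLM" is a canned stub, and memory retrieval is plain word overlap rather than anything the library actually does.

```python
from dataclasses import dataclass, field


@dataclass
class AssociativeMemory:
    """Toy memory (hypothetical): stores text and retrieves by word overlap."""
    entries: list = field(default_factory=list)

    def add(self, text: str) -> None:
        self.entries.append(text)

    def retrieve(self, query: str, k: int = 2) -> list:
        query_words = set(query.lower().split())
        # Rank stored entries by how many words they share with the query.
        ranked = sorted(
            self.entries,
            key=lambda e: len(query_words & set(e.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption); returns a canned reply."""
    if "game master" in prompt.lower():
        return "The attempt succeeds."
    return "I look around the room."


@dataclass
class Agent:
    name: str
    memory: AssociativeMemory = field(default_factory=AssociativeMemory)

    def act(self, observation: str) -> str:
        # Store the observation, recall related memories, then ask the "LLM".
        self.memory.add(observation)
        recalled = self.memory.retrieve(observation)
        prompt = f"You are {self.name}. Memories: {recalled}. What do you do?"
        return stub_llm(prompt)


class GameMaster:
    """Turns an agent's free-text action into an outcome in the environment."""

    def __init__(self) -> None:
        self.log = []

    def resolve(self, agent_name: str, action: str) -> str:
        prompt = f"As game master, decide the outcome of: {agent_name} does '{action}'."
        outcome = stub_llm(prompt)
        event = f"{agent_name}: {action} -> {outcome}"
        self.log.append(event)
        return event


def run_episode(agents: list, steps: int) -> list:
    gm = GameMaster()
    observation = "The simulation begins."
    for _ in range(steps):
        for agent in agents:
            action = agent.act(observation)
            # The resolved event becomes the next agent's observation.
            observation = gm.resolve(agent.name, action)
    return gm.log
```

Calling `run_episode([Agent("Alice"), Agent("Bob")], steps=2)` returns the Game Master's event log, one entry per agent turn. In a real setup the stub would be replaced by actual LLM calls, and the Game Master would also be the natural place to route external API requests.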
3. Controllable Human-Object Interaction Synthesis (paper | webpage)
This study focuses on generating realistic human-object interactions in 3D environments, guided by language descriptions. The proposed Controllable Human-Object Interaction Synthesis (CHOIS) method synchronizes object and human motions using a conditional diffusion model. Inputs include language descriptions, initial states of objects and humans, and sparse object waypoints. These waypoints, derived from high-level planning, ensure motion grounding in the scene. Traditional diffusion models struggle with aligning object motion to waypoints and ensuring realistic interactions, particularly in precise hand-object contacts and floor-grounded contacts. CHOIS addresses these challenges by introducing an object geometry loss to better align generated object motion with waypoints. Additionally, it incorporates guidance terms during the diffusion model's sampling process to enforce contact constraints, enhancing the realism of the interactions.
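To make the two ideas above more concrete, here is a minimal sketch of a waypoint-alignment penalty and a single contact-guidance update. It is an illustration under simplifying assumptions, not the paper's formulation: `waypoint_loss` and `floor_guidance_step` are hypothetical names, positions are plain 3D tuples, and the guidance term is a simple floor-penetration penalty applied directly to a trajectory rather than inside a diffusion sampler.

```python
def waypoint_loss(trajectory, waypoints):
    """Mean squared distance between object positions and sparse waypoints.

    trajectory: list of (x, y, z) positions, one per frame.
    waypoints:  dict mapping frame index -> target (x, y, z).
    """
    total = 0.0
    for frame, target in waypoints.items():
        position = trajectory[frame]
        total += sum((p - t) ** 2 for p, t in zip(position, target))
    return total / len(waypoints)


def floor_guidance_step(trajectory, floor_z=0.0, step_size=1.0):
    """One gradient step on the penalty 0.5 * max(0, floor_z - z) ** 2.

    The gradient with respect to z is -(floor_z - z) for points below the
    floor and 0 otherwise, so the update lifts penetrating points upward
    and leaves all other points unchanged.
    """
    guided = []
    for x, y, z in trajectory:
        penetration = max(0.0, floor_z - z)
        guided.append((x, y, z + step_size * penetration))
    return guided
```

In a diffusion setting, an update like `floor_guidance_step` would typically be applied to the model's prediction at each sampling step, so that constraints are enforced at generation time without retraining.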
4. Scaling Laws of Synthetic Images for Model Training ... for Now (paper | code)
This paper investigates the use of synthetic images from advanced text-to-image models for training vision systems, assessing their effectiveness compared to real images. Key factors affecting scaling behavior include the text prompts, the guidance scale, and the choice of text-to-image model. While synthetic images show scaling trends similar to real images in CLIP training, they underperform in supervised image classifier training, primarily because the generators struggle with certain concepts. The study finds synthetic data most effective when real images are limited, in out-of-distribution cases, or when combined with real images, as seen in CLIP model training. Despite these limitations, synthetic images hold potential for augmenting training datasets, especially where real images are scarce.
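A common way to quantify a scaling trend like the ones discussed above is to fit a power law, loss ≈ a · n^(-b), by linear regression in log-log space. The sketch below shows that fit using only the standard library. The function name `fit_power_law` is hypothetical, the data points are made up for illustration, and this is not necessarily the fitting procedure used in the paper.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss ~ a * size ** (-b) by least squares on log-transformed data.

    Returns (a, b). Assumes all sizes and losses are strictly positive.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least squares slope and intercept of log(loss) vs log(size).
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Made-up data generated from an exact power law, for illustration only.
sizes = [1e3, 1e4, 1e5, 1e6]
losses = [2.0 * n ** -0.25 for n in sizes]
a, b = fit_power_law(sizes, losses)
```

Fitting one curve for synthetic data and one for real data, then comparing the exponents `b`, gives a compact way to say which source improves faster as the dataset grows.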
5. Relightable Gaussian Codec Avatars (paper | webpage)
This paper introduces Relightable Gaussian Codec Avatars, a method for building high-fidelity head avatars that can be relit in real time and animated to show novel expressions. The geometry model is based on 3D Gaussians and captures fine details such as hair strands. The appearance model, which handles the various materials of a human head, uses learnable radiance transfer with global illumination-aware spherical harmonics to produce realistic reflections. The method also includes relightable eye models for better eye reflections and gaze control, and remains efficient under a range of lighting conditions. It outperforms existing methods and demonstrates real-time avatar relighting on consumer VR headsets, balancing efficiency and fidelity.
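For readers unfamiliar with radiance transfer in a spherical harmonics (SH) basis, the core computation is a dot product between per-point transfer coefficients and the SH coefficients of the lighting. The sketch below uses the first two SH bands only. The function names `sh_basis_l1` and `shade` are hypothetical, and the transfer coefficients are hand-picked here, whereas in the paper they are learned.

```python
import math


def sh_basis_l1(direction):
    """Real spherical harmonics for bands 0 and 1 at a direction.

    Returns [Y_0^0, Y_1^-1, Y_1^0, Y_1^1]. The direction is normalized first.
    """
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c0 = math.sqrt(1.0 / (4.0 * math.pi))  # about 0.282095
    c1 = math.sqrt(3.0 / (4.0 * math.pi))  # about 0.488603
    return [c0, c1 * y, c1 * z, c1 * x]


def shade(transfer, light):
    """Outgoing intensity as the dot product of transfer and light coefficients."""
    return sum(t * l for t, l in zip(transfer, light))


# A point light from straight above, projected onto the SH basis
# (projecting a delta function at a direction yields the basis values there).
light = sh_basis_l1((0.0, 0.0, 1.0))
# Hand-picked transfer vector for a surface point that responds to light from +z.
transfer = [0.5, 0.0, 1.0, 0.0]
intensity = shade(transfer, light)
```

Because relighting reduces to this dot product, changing the illumination only means swapping the `light` vector, which is part of what makes this family of methods fast enough for real-time use.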
AI News
1. Introducing StableLM Zephyr 3B: A New Addition to StableLM, Bringing Powerful LLM Assistants to Edge Devices (Stability AI news)
2. Grok begins rolling out to X Premium+ users (twitter)
3. Inflection's Pi chatbot comes to Android (twitter)
AI Repo
1. Awesome-AIGC-3D: A curated list of awesome AIGC 3D papers (repo)
2. HALOs: Human-Centered Loss Functions (repo)
3. radames/Enhance-This-DemoFusion-SDXL (huggingface space)
4. PurpleLlama: A set of tools to assess and improve LLM security (repo)