Dates: 2023.5.8-2023.5.14
This week's highlights
1. Meta releases ImageBind, a large model that spans six senses (modalities); it is already open source
Code: github.com
2. Google releases PaLM 2, an updated version of its PaLM large model, which surpasses GPT-4 on some tasks
Introduction: blog.google
Technical report: ai.google
Latest research:
Cognitive Reframing of Negative Thoughts through Human-Language Model Interaction
Paper: arxiv.org
Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos
Paper: arxiv.org
A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding
Paper: arxiv.org
Composite Motion Learning with Task Control
Paper: arxiv.org
Code: github.com
Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization
Paper: arxiv.org
Code: github.com
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Paper: arxiv.org
Controllable Light Diffusion for Portraits
Paper: arxiv.org
Locally Attentional SDF Diffusion for Controllable 3D Shape Generation
Paper: arxiv.org
FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance
Paper: arxiv.org
Code Execution with Pre-trained Language Models
Paper: arxiv.org
Sketching the Future (STF): Applying Conditional Control Techniques to Text-to-Video Models
Paper: arxiv.org
Homepage: sketchingthefuture.github.io
HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion
Paper: arxiv.org
Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era
Paper: arxiv.org
Do LLMs Understand User Preferences? Evaluating LLMs On User Rating Prediction
Paper: arxiv.org
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
Paper: arxiv.org
Code: github.com
Simple Token-Level Confidence Improves Caption Correctness
Paper: arxiv.org
Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting
Paper: arxiv.org
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Paper: arxiv.org
Chain-of-Dictionary Prompting Elicits Translation in Large Language Models
Paper: arxiv.org
Bot or Human? Detecting ChatGPT Imposters with A Single Question
Paper: arxiv.org
V2Meow: Meowing to the Visual Beat via Music Generation
Paper: arxiv.org
Exploiting Diffusion Prior for Real-World Image Super-Resolution
Paper: arxiv.org
TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis
ArtGPT-4: Artistic Vision-Language Understanding with Adapter-enhanced MiniGPT-4
Paper: arxiv.org
Model: huggingface.co
Towards best practices in AGI safety and governance: A survey of expert opinion
Paper: arxiv.org
Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models
Language models can explain neurons in language models
Homepage: openai.com
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
Paper: arxiv.org
Distributed training techniques behind giant AI models
Emergence: the unifying theme of 21st-century science
Courses:
LLM Bootcamp - Spring 2023
Introduction to Data-Centric AI
Hugging Face blog updates:
Text-to-Video: The Task, Challenges and the Current State
Chinese version: mp.weixin.qq.com
Run a Chatgpt-like Chatbot on a Single GPU with ROCm
Assisted Generation: a new direction toward low-latency text generation
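Business:
Overseas data shows at least four GPT-based chat apps with monthly revenue above US$1 million
Midjourney official Chinese version · closed beta application
Lu Qi's latest talk, with full transcript, complete slides, and video: the new paradigm brought by large models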
Examples:
Open LLM Leaderboard
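threestudio: a unified framework for creating 3D content
Major ControlNet update: precise image editing from prompts alone, with the art style preserved
Hugging Face releases Transformers Agent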