AIGC Weekly | Issue 28

pxiaoer

May 22, 2023

Period: 2023.5.15-2023.5.21

This Week's Highlights

1. ChatGPT iOS app launched

Coverage:

mp.weixin.qq.com

2. Stability AI open-sources StableStudio

GitHub: github.com

3. DragGAN

Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

Paper: vcai.mpi-inf.mpg.de

Unofficial implementation: github.com

Coverage: mp.weixin.qq.com

Latest research:

Dr. LLaMA: Improving Small Language Models in Domain-Specific QA via Generative Data Augmentation
Paper: arxiv.org
Masked Audio Text Encoders are Effective Multi-Modal Rescorers
Paper: arxiv.org
RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs
Paper: arxiv.org
GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content
Paper: arxiv.org
Small Models are Valuable Plug-ins for Large Language Models
Paper: arxiv.org
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
Paper: arxiv.org
Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Paper: arxiv.org
Understanding 3D Object Interaction from a Single Image
Homepage: jasonqsy.github.io
Demo: huggingface.co
Towards Expert-Level Medical Question Answering with Large Language Models
Paper: arxiv.org
Dual-Alignment Pre-training for Cross-lingual Sentence Embedding
Paper: arxiv.org
SoundStorm: Efficient Parallel Audio Generation
Paper: arxiv.org
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation
Paper: arxiv.org
Common Diffusion Noise Schedules and Sample Steps are Flawed
Paper: arxiv.org
Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
Paper: arxiv.org
What You See is What You Read? Improving Text-Image Alignment Evaluation
Paper: arxiv.org
Smart Word Suggestions for Writing Assistance
Paper: arxiv.org
GitHub: github.com
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Paper: arxiv.org
Explaining black box text modules in natural language with language models
Paper: arxiv.org
A Video Is Worth 4096 Tokens: Verbalize Story Videos To Understand Them In Zero Shot
Paper: arxiv.org
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
Paper: arxiv.org
GitHub: github.com
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities
Paper: arxiv.org
GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework
Homepage: ai-muzic.github.io
LDM3D: Latent Diffusion Model for 3D
Paper: arxiv.org
Going Denser with Open-Vocabulary Part Segmentation
Paper: arxiv.org
Code: github.com
Comparing Software Developers with ChatGPT: An Empirical Investigation
Paper: arxiv.org
Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning
Paper: arxiv.org
Pengi: An Audio Language Model for Audio Tasks
Paper: arxiv.org
Cross-Lingual Supervision improves Large Language Models Pre-training
Paper: arxiv.org
WebCPM: the first open-source model for Chinese question answering with web search support
mp.weixin.qq.com