AIGC Weekly | Issue 21

Apr 03, 2023

Period: 2023.3.27-2023.4.2

This Week's Highlights

Microsoft's TaskMatrix.AI, which aims to complete tasks by connecting large models with millions of APIs, is an interesting proposal.

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

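The TaskMatrix.AI system proposed in the paper aims to understand multimodal input and then generate code that calls APIs to complete the task. It has a unified-format API platform and task library, which makes it easy for developers to customize models and easy for large models to call them.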

TaskMatrix.AI has lifelong learning ability: it can learn to combine models and APIs to complete new tasks, and the process is interpretable.
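
To make the "unified-format API platform" idea concrete, here is a minimal Python sketch. It is not the paper's implementation: `ApiSpec`, `ApiPlatform`, the keyword-overlap selector, and the two toy APIs are all hypothetical, and the hand-written plan stands in for the code a foundation model would generate.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ApiSpec:
    """One entry in a hypothetical unified-format API platform."""
    name: str
    description: str
    call: Callable[..., Any]


class ApiPlatform:
    """Toy registry; every name here is illustrative, not from the paper."""

    def __init__(self) -> None:
        self._apis: Dict[str, ApiSpec] = {}

    def register(self, spec: ApiSpec) -> None:
        self._apis[spec.name] = spec

    def select(self, request: str) -> List[str]:
        """Return names of APIs whose description shares a word with the request."""
        words = set(request.lower().split())
        return [
            spec.name
            for spec in self._apis.values()
            if words & set(spec.description.lower().split())
        ]

    def execute(self, plan: List[Tuple[str, Dict[str, Any]]]) -> List[Any]:
        """Run a plan given as (api_name, keyword_arguments) pairs, in order."""
        return [self._apis[name].call(**kwargs) for name, kwargs in plan]


platform = ApiPlatform()
platform.register(ApiSpec(
    "resize_image", "resize an image",
    lambda width, height: f"resized to {width}x{height}",
))
platform.register(ApiSpec(
    "send_email", "send an email message",
    lambda to: f"sent to {to}",
))

# Step 1: narrow the candidate APIs for a request (assumed keyword matching).
candidates = platform.select("please resize this image")

# Step 2: a hand-written plan standing in for model-generated code.
plan = [("resize_image", {"width": 64, "height": 64})]
results = platform.execute(plan)
```

Because every API shares the same shape (name, description, callable), adding a new capability is just another `register` call, which is roughly the property the unified format is meant to provide.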

There are four key components:

The paper also uses RLHF to improve the capabilities of the multimodal model and the API Selector.

Tasks it can complete:

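My take: the arrival of large models and multimodal models has clearly raised the capability of dialogue systems, so many of the ideas imagined during the smart-speaker wars can now be picked up and pursued again.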

Reference resources:

HuggingGPT connects an LLM (such as ChatGPT) with the various AI models in a machine learning community (such as HuggingFace) to solve AI tasks.

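The main idea is to use ChatGPT to plan tasks when a user request arrives, select models according to the function descriptions available in HuggingFace, execute each subtask with the selected AI model, and summarize the response from the execution results.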

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
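
The four stages described above (task planning, model selection, task execution, response summary) can be sketched as a small Python pipeline. This is only an illustration under loud assumptions: the `MODELS` registry, the keyword-based `plan_tasks`, and the lambda "models" are hypothetical stand-ins, and no real ChatGPT or HuggingFace call is made.

```python
from typing import Callable, Dict, List, Tuple

# (description, callable) pair standing in for a hosted model.
Model = Tuple[str, Callable[[str], str]]

# Hypothetical registry keyed by task type; not a real HuggingFace API.
MODELS: Dict[str, List[Model]] = {
    "image-captioning": [("toy captioner", lambda x: f"caption of {x}")],
    "translation": [("toy translator", lambda x: f"translation of {x}")],
}


def plan_tasks(request: str) -> List[str]:
    """Stage 1: task planning. A keyword lookup replaces the LLM planner."""
    keywords = {"describe": "image-captioning", "translate": "translation"}
    lowered = request.lower()
    return [task for word, task in keywords.items() if word in lowered]


def select_model(task: str) -> Model:
    """Stage 2: model selection. Here we simply take the first registered model."""
    return MODELS[task][0]


def run_pipeline(request: str, payload: str) -> str:
    """Stages 3 and 4: execute each subtask in order, then summarize."""
    steps = []
    for task in plan_tasks(request):
        description, model = select_model(task)
        payload = model(payload)  # feed each output into the next subtask
        steps.append(f"{task} via {description}: {payload}")
    return "; ".join(steps) if steps else "no matching task"


summary = run_pipeline("Describe this picture and translate it", "cat.png")
```

In the real system the planner and the final summary are both produced by the LLM; the sketch keeps only the control flow so the division of labor is visible.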

Latest technology:

GlyphDraw: Learning to Draw Chinese Characters in Image Synthesis Models Coherently
Paper: arxiv.org
Self-Refine: Iterative Refinement with Self-Feedback
Paper: arxiv.org
Homepage: selfrefine.info
A $300 stand-in for ChatGPT! Stanford's 13-billion-parameter "Vicuna" arrives and crushes "Alpaca"
mp.weixin.qq.com
Language Models can Solve Computer Tasks
Paper: arxiv.org
Llama-X is open source! A call for every NLPer to help push LLaMA toward becoming a state-of-the-art LLM
mp.weixin.qq.com
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
Paper: arxiv.org
AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
Paper: arxiv.org
Homepage: avatar-craft.github.io
Training Language Models with Language Feedback at Scale
Paper: arxiv.org
GPTEVAL: NLG Evaluation using GPT-4 with Better Human Alignment
Paper: arxiv.org
HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images
Paper: arxiv.org
StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing
Paper: arxiv.org/abs/2303.15649
Is ChatGPT about to wipe out the data annotation industry? 20 times cheaper than humans, and more accurate too
mp.weixin.qq.com
OpenFlamingo released
laion.ai
Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion
Paper: arxiv.org
Code: sony.github.io
Your Diffusion Model is Secretly a Zero-Shot Classifier
Paper: arxiv.org
Homepage: diffusion-classifier.github.io
How to evaluate large language models
mp.weixin.qq.com
GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents
Paper: arxiv.org

Courses:

Business:

Case studies: