What operations are part of a standard Transformer block? (Select all that apply.)多项选择题
A
Layer normalization
B
Convolutional layer
C
Residual connection
D
Self-attention layer
登录即可查看完整答案
我们收录了全球超50000道真实原题与详细解析,现在登录,立即获得答案。
类似问题
Which of the following best describes a key property of Transformer architecture used in LLMs?
The transformer architecture only consists of a single encoder component and single decoder component.
The full transformer architecture consists of an encoder component and a decoder component. For models to specialize in accomplishing certain tasks and/or to best handle specific types of input/output data, some language models only use the encoder component, only use the decoder component, or use both the encoder and decoder (i.e., use the full transformer architecture).
What is the main architectural innovation in ChatGPT that allows it to handle complex language tasks?
更多留学生实用工具
希望你的学习变得更简单
加入我们,立即解锁 海量真题 与 独家解析,让复习快人一步!