In a self-attention transformer network, which of the following is true for sinusoid-based positional encoding vectors?
Multiple-choice question
Similar questions
Which of the following best describes positional encoding in Transformers?
What is combined with the inputs (embeddings) to the transformer architecture that encodes contextual information that can be used by attention mechanisms to create embeddings with more context?
In the positional encoding of Transformers, sinusoidal functions are used with different formulas for odd and even indices, incorporating the term 10000^(2i/d_model). Analyze the following statements and choose the correct explanations for the effects of increasing or decreasing the constant 10000. See the formula below: Hint: Lec 19, Slides 29-32.
Which of the following trees corresponds to a potential parse of the ambiguous sentence below, with correct syntactic categories? Some diagnostics are provided.
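For reference, the sinusoidal positional encoding that several of the questions above refer to can be sketched as follows. This is a minimal illustration of the commonly cited formulation, in which even index 2i uses sin(pos / 10000^(2i/d_model)) and odd index 2i+1 uses the matching cosine. It is not taken from the lecture slides mentioned above. The function name, the `base` parameter, and the restriction to an even `d_model` are assumptions made for this example.

```python
import math


def sinusoidal_positional_encoding(num_positions, d_model, base=10000.0):
    """Return a num_positions x d_model table (list of lists) of encodings.

    Hypothetical helper for illustration; `base` is the constant 10000
    from the formula, exposed so its effect can be explored.
    """
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model // 2):
            # One frequency per (sin, cos) pair; larger i means a longer wavelength.
            angle = pos / (base ** (2 * i / d_model))
            row.append(math.sin(angle))  # even index 2i
            row.append(math.cos(angle))  # odd index 2i + 1
        table.append(row)
    return table


pe = sinusoidal_positional_encoding(num_positions=4, d_model=8)
print(pe[0])  # position 0: every sine is 0.0 and every cosine is 1.0
```

Under these assumptions, each vector depends only on the position and not on the token, and every vector has the same squared norm of d_model / 2, because each sine and cosine pair contributes exactly 1. Increasing `base` stretches the wavelengths of the higher-index dimensions, and decreasing it shortens them.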