In the positional encoding of Transformers, sinusoidal functions are used with different formulas for odd and even indices, incorporating the term 10000^(2i/d_model). Analyze the following statements and choose the correct explanations for the effects of increasing or decreasing the constant 10000. See the formula below. Hint: Lec 19, Slides 29-32. Multiple-choice question (select all that apply).

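[Question image: the sinusoidal positional encoding formula]
PE(pos, 2i) = sin(pos / 10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))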
A

Decreasing the constant from 10000 to 10 would result in a narrower frequency range of positional encodings, potentially making it harder for the model to differentiate positions in longer sequences.

B

Increasing the constant from 10000 to 10000000 would generate a wider range of frequencies in the positional encodings. This could enhance the model's ability to discern positions in longer sequences but might make the encoding too complex, leading to difficulties in learning positional relationships effectively.
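
To make the effect of the constant concrete, here is a minimal sketch that computes the encoding and the range of wavelengths it spans for three values of the constant. It assumes the standard sinusoidal formulation, in which dimension pair i has wavelength 2*pi * base^(2i/d_model). The function names, the parameter name `base`, and the choice d_model = 512 are illustrative assumptions and do not come from the question.

```python
import math


def positional_encoding(pos, d_model, base=10000.0):
    """Encode one position: sin on even indices, cos on odd indices."""
    pe = []
    for i in range(d_model // 2):
        angle = pos / base ** (2 * i / d_model)
        pe.extend([math.sin(angle), math.cos(angle)])
    return pe


def wavelength_range(d_model, base):
    """Shortest and longest wavelength (in positions) over all dimension pairs."""
    shortest = 2 * math.pi  # pair i = 0, independent of base
    # pair i = d_model/2 - 1 has exponent (d_model - 2) / d_model
    longest = 2 * math.pi * base ** ((d_model - 2) / d_model)
    return shortest, longest


# d_model = 512 is an assumed value chosen only for illustration.
for base in (10.0, 10000.0, 10000000.0):
    lo, hi = wavelength_range(512, base)
    print(f"base={base:>10.0f}: wavelengths from {lo:.2f} to {hi:.1f} positions")
```

The shortest wavelength stays at 2*pi regardless of the constant, while the longest grows roughly in proportion to it. A smaller constant therefore compresses the set of frequencies into a narrower band, and a larger one stretches it.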
