Context: Let's look at a simple example of why vanishing and exploding gradients occur in RNNs. Consider a univariate RNN with the following update rules:

$z(t) = u\,x(t) + w\,h(t-1)$
$h(t) = \phi(z(t))$

To keep things simple, assume $\phi$ is the identity function, i.e., $\phi(i) = i$. Suppose we have a final loss $L$ and have computed the derivative $\partial L / \partial h_T$ for some $t = T$. Using the update rules, the value of $\partial h_T / \partial h_1$ comes out to be $w^{(aT + b)}$.

Main Question: What is the value of $a$? (Numerical-answer question)
