Question Bank
#839
Why Deep Sigmoids Stop Learning
EasyMachine Learning
Problem
Compute the maximum of for the sigmoid . Use it to show why stacking sigmoid layers starves early layers of gradient, explain what ReLU changes, and name ReLU's own failure mode.
Your answer
Accepts decimals, fractions (5/12), and percentages (25%).