You’ve probably seen very deep nets like ResNets, which use skip connections to fight vanishing gradients. FractalNet takes a different path: it builds its depth by repeatedly applying a simple “fractal” expansion rule, with no residual connections at all. That gives us many sub-paths of different lengths, sharing parameters wherever they overlap in the self-similar structure, so the net can act “shallow” or “deep” as needed during training. Let’s break down the math and the code behind it, and explore why this neat trick works.
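To get a feel for the expansion rule before the details, here is a tiny pure-Python sketch. It assumes the rule as stated in the FractalNet paper: a one-column block is a single conv layer, and a block with C+1 columns stacks two C-column blocks in sequence and joins the result with one extra conv layer applied in parallel. The helper names `path_lengths` and `conv_count` are made up for illustration; no real layers are built, we only count them.

```python
def path_lengths(columns: int) -> set:
    """Conv-layer counts of every input-to-output path in a block with `columns` columns."""
    if columns < 1:
        raise ValueError("columns must be >= 1")
    if columns == 1:
        # Base case: a single conv layer, so one path of length 1.
        return {1}
    inner = path_lengths(columns - 1)
    # Deep branch: two copies of the smaller block in sequence, so any
    # path through the first copy can continue along any path in the second.
    deep = {a + b for a in inner for b in inner}
    # Shallow branch: the single parallel conv layer added by the expansion.
    return deep | {1}


def conv_count(columns: int) -> int:
    """Total number of conv layers in a block with `columns` columns."""
    if columns < 1:
        raise ValueError("columns must be >= 1")
    # Each expansion duplicates the smaller block and adds one conv layer.
    return 1 if columns == 1 else 2 * conv_count(columns - 1) + 1


if __name__ == "__main__":
    for c in range(1, 5):
        print(c, sorted(path_lengths(c)), conv_count(c))
```

Under these assumptions, the longest path doubles with every expansion while a one-layer path always remains, which is the “shallow or deep” behavior in miniature.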