I was curious about this paper:
arxiv.org/pdf/2202.05826.pdf
It shows how to train a network that can solve complex tasks while being trained only on simple ones.
The idea is to apply the same weight-tied block over and over until it converges on the solution.
The more complex the problem, the more iterations you run before getting there.
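Roughly, the recipe looks like this. A minimal sketch in PyTorch — the names and layer sizes are mine, not the paper's code, but the weight-tied loop is the core trick (the paper's nets also feed the raw input back in at every iteration, which I mimic with a concatenation):

```python
import torch
import torch.nn as nn

class IterativeSolver(nn.Module):
    def __init__(self, channels=32, in_channels=3):
        super().__init__()
        self.encode = nn.Conv2d(in_channels, channels, 3, padding=1)
        # The block whose weights are shared across every iteration.
        self.block = nn.Sequential(
            nn.Conv2d(channels + in_channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(),
        )
        # Per-pixel logits, e.g. "is this pixel on the solution path?"
        self.decode = nn.Conv2d(channels, 2, 3, padding=1)

    def forward(self, x, n_iters):
        h = self.encode(x)
        for _ in range(n_iters):
            # Same weights every step; only the hidden state evolves.
            h = self.block(torch.cat([h, x], dim=1))
        return self.decode(h)
```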
For the maze task, it produces intermediate solutions at each iteration, gradually converging on the full path.
This way, a fully convolutional network can learn on small mazes, and it becomes intuitive why it scales to larger ones if you run it for more iterations.
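Hypothetical usage, continuing the sketch above: because the block is fully convolutional and weight-tied, the same trained weights accept any input size, and you just crank up the iteration count for harder instances:

```python
model = IterativeSolver()
small = torch.randn(1, 3, 16, 16)    # stand-in for a small training maze
large = torch.randn(1, 3, 128, 128)  # stand-in for a much bigger test maze

out_small = model(small, n_iters=10)
out_large = model(large, n_iters=100)  # more iterations for a harder problem
print(out_small.shape, out_large.shape)  # (1, 2, 16, 16) (1, 2, 128, 128)
```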
Not that impressive after all.
Stable Diffusion is also based on training a network to solve one small part of the problem (removing a bit of noise), and it's by running it through several steps that you generate an interesting image.
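The analogy in heavily simplified, made-up code — this is nowhere near Stable Diffusion's actual sampler, and `denoiser` and the update rule are placeholders, but it's the same loop-a-small-step pattern:

```python
import torch

def generate(denoiser, steps=50, shape=(1, 3, 64, 64)):
    x = torch.randn(shape)  # start from pure noise
    for t in reversed(range(steps)):
        pred_noise = denoiser(x, t)  # the one small task the net was trained on
        x = x - pred_noise / steps   # crude step toward a clean image
        if t > 0:
            x = x + 0.01 * torch.randn(shape)  # re-inject a little noise
    return x
```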
It feels like this way of training networks is something profound, and underused.