Leveraging Catastrophic Forgetting to Transfer the Sub-Optimal to the Better-Sub-Optimal

Musings / Cognitive-Science Reflections

by Aesthetic Thinker 2021. 3. 19. 18:32

Forgetting: A Nature of Neural Learners

 

Catastrophic forgetting is a fundamental deficit of neural networks, rooted in the basic properties of their learning process. A neural network cannot retain knowledge over the long term if the things to be remembered are absent from the newly arriving training data. And yet, in some areas there is a genuine need to forget the past. For example, when we first learned the '+' operator, we counted on our fingers to solve problems like '3+7', but we soon discarded that inefficient counting strategy once simple arithmetic became second nature. Such intentional forgetting plays a critical role in progressive learning. So, rather than complaining about this uncorrectable nature of neural networks, why not leverage catastrophic forgetting to make our models more progressive?

 

Learner's Habit in Super Mario Bros.

 

If the target task is too hard to learn with a direct training strategy, we can adopt another strategy called curriculum learning. In curriculum learning, the agent first learns an easier version of the main task (or a set of sub-tasks), and slightly harder tasks are given as it moves to each subsequent phase. But the progressive learning described in this article is not curriculum learning. Rather, what I mean is 'behavior optimization'. Consider an agent that can already perform a small task, but in an inefficient way. Say there is a Super Mario agent that, before jumping over a pipe, always performs the 'down' action three times. The agent can clearly jump over the pipe, but these unnecessary actions make the jump slightly inefficient. We would like this habit to be wiped out.

 

Then how can we remove this awkward behavior and obtain a clean jump? At this point we could discuss techniques such as imposing a small penalty on the agent for each action it takes, to discourage unnecessary moves. But before considering technical countermeasures, let us understand why this awkward style of behavior developed and solidified in the first place. When the agent first encountered the Super Mario world, it actively explored the world in search of reward signals. While doing so, it suddenly ran into a pipe. The agent had to jump over the pipe to continue the game, but it did not yet know how. So it tried various action sequences in front of the pipe, and finally found one that got it over. But that particular sequence happened to contain three 'down' actions before the actual 'jump'. From then on, because the agent learned from its own past experience, it consolidated the 'three downs, then jump' skill and used it every time it needed to jump over a pipe.
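The penalty idea mentioned above can be sketched as reward shaping in a Gym-style environment wrapper. This is a minimal illustration of my own, not code from any Super Mario project: the `StepPenaltyWrapper`, `ToyEnv`, and penalty value are all assumed names for the sake of the example.

```python
class StepPenaltyWrapper:
    """Gym-style wrapper that charges a small cost for every action,
    so shorter action sequences (a clean jump with no redundant
    'down' presses) accumulate more total reward than longer ones."""

    def __init__(self, env, penalty=0.01):
        self.env = env
        self.penalty = penalty

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Subtract a per-step cost from the environment's reward.
        return obs, reward - self.penalty, done, info


class ToyEnv:
    """Stand-in environment: every step returns reward 1.0."""
    def reset(self):
        return 0

    def step(self, action):
        return 0, 1.0, False, {}


env = StepPenaltyWrapper(ToyEnv(), penalty=0.1)
env.reset()
_, shaped, _, _ = env.step(0)
print(round(shaped, 3))  # 0.9: the original 1.0 minus the per-step cost
```

With such shaping, the 'three downs, then jump' sequence costs three extra penalty units per pipe, so a policy that performs the clean jump earns strictly more return.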

 

The Virtue of Forgetting

 

What does this story suggest? Learning from one's own experience is prone to developing sub-optimal behaviors. Then how can we improve a sub-optimal behavior into a better sub-optimal one, and eventually into the optimal one? Forgetting. For the same arithmetic calculation, we can develop another, better skill only if we can forget the inefficient strategy of finger counting. Now we can exploit the very nature of neural networks, catastrophic forgetting, to make our agent progressive. We can simply let the agent forget its past sub-optimal behavior on its own and learn a new, better sub-optimal one. That is the basic and natural way to make it capable of the clean jump we always hoped for. Some unwelcome features of our nature might be rediscovered by taking another viewpoint.
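The mechanism being relied on here can be shown with a toy sketch (my own illustrative example, not from the post): a one-parameter model trained with SGD on a second target, with no replay of the first target's data, simply overwrites the old mapping, which is catastrophic forgetting in its simplest form.

```python
import numpy as np

rng = np.random.default_rng(0)


def sgd(w, xs, ys, lr=0.1, epochs=200):
    # Plain per-sample SGD on squared error for the linear model y = w * x.
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w -= lr * (w * x - y) * x
    return w


xs = rng.uniform(-1, 1, 50)

# Phase 1: learn the old (sub-optimal) mapping, target y = 2x.
w1 = sgd(0.0, xs, 2 * xs)
print(f"after phase 1, w ~ {w1:.3f}")  # converges near 2

# Phase 2: train only on the new (better) target y = -x, with no
# replay of phase-1 data -> the old mapping is overwritten.
w2 = sgd(w1, xs, -xs)
print(f"after phase 2, w ~ {w2:.3f}")  # converges near -1: phase 1 is forgotten
```

In a replay-based setup the old data would keep pulling `w` back toward 2; leaving replay out is exactly the 'let it forget' strategy the paragraph above proposes.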
