In this work, I investigated whether factual information is stored only in the feed-forward (FF) layers or also in the attention layers. I found that once the FF hidden dimension is large enough, factual information is rarely stored in the attention layers.
Bary Levy
mentaleap