We look for induction heads by feeding the model a random sequence of tokens repeated twice and identifying heads that attend from the second copy of a token back to the token just after that token's first occurrence. The aim is to test the generality of methods that detect induction heads by thresholding attention scores on a linear scale, which leaves open the question of whether the method itself is reproducible across models. We report observations on the mean attention score used to decide whether an attention head is an induction head in SoLU-8l-old compared to GPT2-small.
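Below is a minimal sketch of this test, assuming the TransformerLens library and its `gpt2` checkpoint (the SoLU-8l-old checkpoint name and the 0.4 threshold shown are illustrative assumptions, not the values used here). It builds a repeated random sequence and scores each head by its mean attention from the second copy of a token to the token just after the first copy.

```python
# Repeated-random-tokens induction-head test (sketch; model name and
# threshold are placeholders, not the exact settings of this write-up).
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # swap in the SoLU-8l-old checkpoint to compare

seq_len = 50
prefix = torch.tensor([[model.tokenizer.bos_token_id]])
rand = torch.randint(1000, 10000, (1, seq_len))      # random token ids
tokens = torch.cat([prefix, rand, rand], dim=1)      # BOS + sequence repeated twice
tokens = tokens.to(model.cfg.device)

_, cache = model.run_with_cache(tokens, remove_batch_dim=True)

induction_scores = torch.zeros(model.cfg.n_layers, model.cfg.n_heads)
for layer in range(model.cfg.n_layers):
    pattern = cache["pattern", layer]                # [n_heads, query_pos, key_pos]
    # For a query in the second copy, the "induction" key sits seq_len - 1
    # positions earlier: the token just after the same token's first occurrence.
    stripe = pattern.diagonal(dim1=-2, dim2=-1, offset=1 - seq_len)
    # Average over the query positions that actually lie in the second copy.
    induction_scores[layer] = stripe[:, -seq_len:].mean(dim=-1)

# Heads whose mean attention on this stripe exceeds a linear-scale threshold
# (0.4 here, an illustrative value) get labelled induction heads.
print((induction_scores > 0.4).nonzero())
```

Comparing the resulting score distributions across the two models is what motivates the question of how model-dependent the linear-scale threshold is.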
Brian Muhia