This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the Interpretability research sprint on July 17, 2023

One is 1- Analyzing Activations of Numerical Words vs Digits

Research in mechanistic interpretability has produced many techniques for uncovering circuits in language models. We apply these techniques to compare analogous ordered sequences: the digits “1, 2, 3, 4”, the words “one, two, three, four”, and the months “January, February, March, April”. Our findings provide preliminary evidence that these semantically related sequences share common activation patterns in GPT-2 Small.
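
One simple way to quantify whether two sequences "share activation patterns" is to compare their activation vectors with cosine similarity. The sketch below is only an illustration under stated assumptions: the summary does not say which similarity measure the project used, the function names are hypothetical, and the vectors are placeholder values rather than activations extracted from GPT-2 Small.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length activation vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pairwise_similarity(activations):
    """Map every unordered pair of sequence names to its cosine similarity."""
    names = list(activations)
    return {
        (p, q): cosine_similarity(activations[p], activations[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Placeholder vectors only (assumption): in a real analysis each entry would
# be, for example, a mean activation vector taken from one layer of the model.
toy_activations = {
    "digits": [1.0, 0.0, 2.0],
    "words": [2.0, 0.0, 4.0],
    "months": [0.0, 3.0, 0.0],
}

scores = pairwise_similarity(toy_activations)
for pair, score in scores.items():
    print(pair, round(score, 3))
```

With real activations, a score near 1 for a pair such as digits and words would be consistent with shared patterns, though a single similarity number cannot by itself establish a common circuit.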

By Mikhail L