This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the Interpretability 2.0 research sprint on May 10, 2023

OthelloScope

We introduce the OthelloScope (OS), a web app for easily and intuitively navigating the MLP neurons of Othello-GPT, the Transformer model developed by Kenneth Li et al. (2022) and trained to play random, legal moves in the game Othello. The tool has a separate page for each of the 14,336 neurons in the 7 MLP layers of Othello-GPT (2,048 neurons per layer). Each page shows: 1) a linear probe's activation directions for identifying own pieces and empty positions on the board, 2) the logit attribution of that neuron for each location on the board, and 3) the neuron's activation at specific game states in 50 example games from an Othello championship dataset. Using the OS, we qualitatively identify different types of MLP neurons and describe patterns of co-occurrence. The OS is available at kran.ai/othelloscope and the code is available at github.com/apartresearch/othelloscope.
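As a rough illustration of the first two per-neuron views, the following minimal sketch computes a probe-alignment board and a logit-attribution board for a single neuron. It is not the OthelloScope code. The weights are random stand-ins, and the names `w_in`, `w_out`, `w_u` and `probe_mine` are hypothetical. The residual width of 512, the 60 move logits (the four centre squares are never playable), the use of the neuron's input weights for the probe view, and the omission of LayerNorm are all assumptions. Only the 2,048 neurons per layer follows from the figures above (14,336 / 7).

```python
import numpy as np

N_MLP_LAYERS = 7
TOTAL_NEURONS = 14_336
D_MLP = TOTAL_NEURONS // N_MLP_LAYERS  # neurons per MLP layer, from the abstract
D_MODEL = 512                          # assumption: residual stream width
N_SQUARES = 64                         # 8x8 board
CENTER = [27, 28, 35, 36]              # assumption: start squares have no move logit
PLAYABLE = [s for s in range(N_SQUARES) if s not in CENTER]

# Random stand-ins for one layer's weights (hypothetical names, not real weights).
rng = np.random.default_rng(0)
w_in = rng.normal(size=(D_MODEL, D_MLP))             # MLP input weights
w_out = rng.normal(size=(D_MLP, D_MODEL))            # MLP output weights
w_u = rng.normal(size=(D_MODEL, len(PLAYABLE)))      # unembedding to move logits
probe_mine = rng.normal(size=(D_MODEL, N_SQUARES))   # "own piece" probe directions


def probe_alignment_board(neuron: int, probe: np.ndarray) -> np.ndarray:
    """Cosine similarity between a neuron's input weights and each square's probe direction."""
    v = w_in[:, neuron]
    cos = (v @ probe) / (np.linalg.norm(v) * np.linalg.norm(probe, axis=0))
    return cos.reshape(8, 8)


def logit_attribution_board(neuron: int) -> np.ndarray:
    """Direct effect of a neuron's output weights on each move logit, laid out on the board."""
    direct = w_out[neuron] @ w_u           # shape: (60,)
    board = np.full(N_SQUARES, np.nan)     # centre squares stay NaN
    board[PLAYABLE] = direct
    return board.reshape(8, 8)


alignment = probe_alignment_board(0, probe_mine)
attribution = logit_attribution_board(0)
```

With real weights, each 8x8 array could be rendered as a heatmap on a neuron's page. The third view (activations over example games) would additionally require running the model on recorded move sequences and is not sketched here.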

By Albert Garde, Esben Kran