I programmatically created a set of embeddings that can be used to perfectly reconstruct a binary classification function (“embedding synthesis”). I used these embeddings to programmatically set weights for a 1-layer transformer that can also perfectly reconstruct the classification function (“transformer synthesis”). With one change, this reconstruction matches my original hypothesis of how a pre-existing transformer works. I ran several experiments on my synthesized transformer to evaluate my synthetic model.
Anonymous: Team members hidden