Blog

April 19, 2024 – Events

Join us at the AI x Democracy research hackathon

Participate online or in person on the weekend of May 3 to 5 in an intense AI safety research hackathon focused on demonstrating and extrapolating risks to democracy from real-life threat models. We especially invite researchers, cybersecurity professionals, and governance experts, but the event is open to everyone, and we will provide starter code templates to help kickstart your team's projects. Join at apartresearch.com/event/ai-democracy.
March 18, 2024 – Events

Join the AI Evaluation Tasks Bounty Hackathon with METR

In this collaboration between METR and Apart, you get the chance to contribute directly to model evaluations research. Take part in the Code Red Hackathon, where you can earn money, connect with experts, and help create tasks to evaluate frontier AI systems.
February 1, 2024 – AI Security

For-profit AI Safety

AI development attracts more than $67 billion in yearly investments, contrasting sharply with the $250 million allocated to AI safety. This gap suggests there's a large opportunity for AI safety to tap into the commercial market. The big question then is, how do you close that gap?
July 13, 2023 – Guides

Updated quickstart guide for mechanistic interpretability

Written by Neel Nanda, who previously worked on mechanistic interpretability under Chris Olah at Anthropic and is currently a researcher on the DeepMind mechanistic interpretability team.
February 22, 2023 – Events

Results from the Scale Oversight hackathon

Check out the top projects from the "Scale Oversight" hackathon hosted in February 2023: Playing games with LLMs, scaling of prompt specificity, and more.
January 2, 2023 – Events

Results from the AI testing hackathon

See the winning projects from the AI testing hackathon held in December 2022: Trojan networks, unsupervised latent knowledge representation, and token loss trajectories to target interpretability methods.
November 21, 2022 – Events

Results from the language model hackathon

See the winning projects from the language model hackathon hosted in November 2022: GPT-3 shows sycophancy, OpenAI's flagging is biased, and truthfulness is sensitive to prompt design.
November 17, 2022 – Events

Results from the interpretability hackathon

Read the winning projects from the interpretability hackathon hosted in November 2022: Automatic interpretability, backup backup name mover heads, and "loud facts" in memory editing.