Preparing for AI
Your weekly CogX newsletter on AI Safety and Ethics
Alarm over AI summit, LLMs leak confidential information, Anthropic’s $100m fundraise
Time’s running out for the UK’s AI safety summit — but the guest list, dates and venues are all yet to be decided.
It’s hoped that the appointments of Matt Clifford and Jonathan Black as sherpas will turbocharge preparations and ensure the UK capitalises on the geopolitical opportunity.
Meanwhile, IBM researchers managed to ‘hypnotise’ LLMs to leak confidential information, and a supermarket app using AI suggested recipes for “poison sandwiches”, highlighting vulnerabilities in AI systems.
New research also analyses the limitations of RLHF, and polling shows a growing consensus in the US on emerging AI risks.
Explore these topics and more - from Anthropic’s $100m fundraise to the Department of Defense’s AI taskforce - in the CogX Must Reads.
CogX Must Reads
The UK’s AI summit is at a standstill
The summit is a core part of the Prime Minister’s vision to make the UK an AI superpower, but progress has been slow and whether the summit will be successful remains in doubt. There’s also a huge outstanding question over what role China will play. The UK needs to move fast.
Anthropic raises $100m
South Korean corporate SK Telecom made the investment as part of a partnership to build LLMs customised for telcos. The funding will enable Anthropic to continue to invest in safety efforts as it builds new versions of the Claude model.
Politics and Regulation
UK announces AI Summit sherpas
Tech expert and CEO of Entrepreneur First Matt Clifford and former senior diplomat Jonathan Black will spearhead UK preparations for the AI summit. The hires reflect the need to combine foreign affairs experience — to ensure countries sign up to commitments — with technical expertise.
US Department of Defense’s new generative AI taskforce
The taskforce will assess and deploy generative AI capabilities across the DoD to help safeguard national security. AI is becoming increasingly central to military capabilities, and the DoD is positioning itself at the forefront of change.
Americans don’t trust Big Tech
82% of Americans polled don’t trust tech executives to regulate AI effectively, preferring federal regulation, in a rare display of consensus in the politically fractured country. 72% favour slowing AI development down; 86% believe it could accidentally cause a catastrophic event.
New analysis of RLHF limitations
Research from MIT et al. analyses the flaws of RLHF and proposes auditing and disclosure standards to improve oversight of systems. The authors also suggest techniques that can be used to improve and complement RLHF in practice.
AI affects us all, so we should all be allowed to shape it
Tech author Afua Bruce argues that communities must play a bigger role in designing AI systems and deciding where we should use them. The potential of technology is determined by its creators, and big tech companies’ incentives don’t always align with what’s good for society.
Are AI labs trying to have it both ways?
Lora Kelly argues that voluntary safety pledges represent a dream scenario for AI labs. They ease the pressure for regulatory action without stringent enough restrictions to actually fix the problem. If AI models are so dangerous, why are labs still building them?
Would you like a “poison bread sandwich” for lunch?
Pak’nSave, a New Zealand supermarket chain, integrated AI into its app to help customers generate meal plans for leftover ingredients. The app recommended unusual recipes, from mosquito repellent roast potatoes to drinks mixed with chlorine gas, highlighting the risks companies take with public-facing AI today.
Hypnotised LLMs go rogue
IBM security researchers managed to hypnotise models including ChatGPT and Bard to leak confidential financial information and generate malicious code. They did this by convincing LLMs to take part in a “game” in which bots needed to generate wrong answers to prove they were ethical and fair.
In case you missed it
Anthropic CEO outlines his vision for alignment research and discusses China’s role
We'd love to hear your thoughts on this week’s issue and what you’d like to see more of.