Discussion about this post

David Manheim

I mostly agree with you on these topics, which is why I obviously needed to write a post disagreeing. It's on LessWrong here: https://www.lesswrong.com/posts/7CLL4Kpruf63GfBc3/treaties-regulations-and-research-can-be-complements and on my Substack here: https://davidmanheim.substack.com/p/treaties-regulations-and-research

Ron Bodkin

I don't agree, although I think it's worth pushing for multiple approaches to address the threat of superintelligence! Here's my article on why I think regulating is more promising: https://www.linkedin.com/feed/update/urn:li:ugcPost:7452164302137937920/

I don't agree with your argument that it's easier to coordinate a pause than to coordinate on safety testing with visibility into the results.

I view the risk of government-developed AI, or of lagging countries developing AI, as breaking in favor of safety testing: safety testing allows AI to be developed by visible commercial actors, with visibility into the results, and it creates incentives to improve safety. By contrast, in a pause regime, development will be done by rogue actors or in secret.

I believe there is an incentive for both the US and China to develop safe AI, and each will see the other doing so. This is also much more plausible than them agreeing to stop developing advanced semiconductors.

It's interesting that you question the ability to write standards, given that you were a co-author on the seminal paper on safety cases. The standard has to be independent review of safety cases, and I think both sides have an interest in this. But it's true that "There is still a lot of technical uncertainty about how to do AI assurance to a high standard." I think that's a great argument for mandatory high standards. A high-confidence safety case can't rest on a few tests: it's obvious that no test can prove safety with much confidence, especially given the evidence for eval awareness and sandbagging. Right now the industry treats "safety" as a joke and allocates it scraps. It can be different. What if 50% of AI lab resources went to building strong safety cases? What does that world look like?

In summary, I think our views contrast because you believe we need a decades-long pause and that there will never be a path to safe AI, whereas I think we should slow down and maximize the chances of developing it safely.

I also think a difference in worldview here is that you think both the US and China will really want to race ahead and "win", even unsafely. I think we can convince both countries that unsafe AI isn't a win for anyone.

I also believe that unsafe AI is emerging incrementally: we are seeing steady gains in capabilities, including developer productivity and AI research capabilities, so frontier models from the top labs are the main risk. Also, the US is racing ahead and driving the race, in spite of "missile gap" rhetoric that China is a threat. And with safety standards, the possibilities of fine-tuning and scaffolding these models have to be accounted for in safety cases.

In response to your conclusion, I think that mandatory safety standards are possible to enact in time, whereas support for a hard pause is far from mature enough. While stopping AI is possible with international coordination, it looks challenging to do so in time. I also think that increasing the stringency of safety standards is probably the best way to stop when there is clear evidence that AI is unsafe.
