#4: AI timelines, AGI risk, and existential risk from climate change

Aug 08, 2022

But if it is held that each generation can by its own deliberate acts determine for good or evil the destinies of the race, then our duties towards others reach out through time as well as through space, and our contemporaries are only a negligible fraction of the “neighbours” to whom we owe obligations. The ethical end may still be formulated, with the Utilitarians, as the greatest happiness of the greatest number [...] This extension of the moral code, if it is not yet conspicuous in treatises on Ethics, has in late years been obtaining recognition in practice.
— John Bagnell Bury

Future Matters is a newsletter about longtermism brought to you by Matthew van der Merwe and Pablo Stafforini. Each month we collect and summarize longtermism-relevant research and share news from the longtermism community. The version crossposted to the Effective Altruism Forum includes a bonus conversation with a prominent researcher. You can also listen on your favorite podcast platform and follow on Twitter.

Research

Jacob Steinhardt's AI forecasting: one year in reports and discusses the results of a forecasting contest on AI progress that the author launched a year ago. Steinhardt's main finding is that progress on all three capability benchmarks occurred much faster than the forecasters predicted. Moreover, although the forecasters performed poorly, they would—in Steinhardt's estimate—probably have outperformed the median AI researcher. That is, the forecasters in the tournament appear to have had more aggressive forecasts than the experts did, yet their forecasts turned out to be insufficiently, rather than excessively, aggressive. The contest is still ongoing; you can participate here.

Tom Davidson’s Social returns to productivity growth estimates the long-run welfare benefits of increasing productivity via R&D funding to determine whether it might be competitive with other global health and wellbeing interventions, such as cash transfers or malaria nets. Davidson’s toy model suggests that average returns to R&D are roughly 20 times lower than Open Philanthropy’s minimum bar for funding in this space. He emphasizes that only very tentative conclusions should be drawn from this work, given substantial limitations to his modelling.

Miles Brundage discusses Why AGI timeline research/discourse might be overrated. He suggests that more work on the issue has diminishing returns, and is unlikely to narrow our uncertainty or persuade many more relevant actors that AGI could arrive soon. Moreover, Brundage is somewhat skeptical of the value of timelines information for decision-making by important actors. In the comments, Adam Gleave reports finding such information useful for prioritizing within technical AI safety research, and Carl Shulman points to numerous large philanthropic decisions whose cost-benefit depends heavily on AI timelines.

In Two-year update on my personal AI timelines, Ajeya Cotra outlines how her forecasts for transformative AI (TAI) have changed since 2020. Her timelines have gotten considerably shorter: she now puts ~35% probability on TAI by 2036 (vs. 15% previously), and her median TAI date is now 2040 (vs. 2050). One of the drivers of this update is a somewhat lowered threshold for TAI. Whereas Cotra previously imagined that a TAI model would have to be able to automate most of scientific research, she now believes that AI systems able to automate most of AI/ML research specifically would be sufficient to set off an explosive feedback loop of accelerating capabilities.

Back in 2016, Katja Grace and collaborators ran a survey of machine learning researchers, the main results of which were published the following year. Grace's What do ML researchers think about AI in 2022? reports on the preliminary results of a new survey that relies mostly on the same questionnaire and thus sheds light on how views in the ML research community have shifted in the intervening period. Some relevant findings are that the aggregate forecast assigns a 50% chance to high-level machine intelligence by 2059 (down from 2061 in 2016); that 69% of respondents believe society should prioritize AI safety research “more” or “much more” (up from 49% in 2016); and that the median respondent thinks it's 5% likely that advanced AI will have “extremely bad” long-run consequences for humanity (no change from 2016).

Jan Leike’s On the windfall clause (EA Forum) poses a key challenge to a 2020 proposal for ensuring the benefits of advanced AI are broadly distributed. The proposal is for AI labs to put a “windfall clause” in their charters, committing them to redistribute all profits above some extremely high level, e.g. $1 trillion/year. Firms might be open to making such commitments today since they view such profits as vanishingly unlikely, and because it yields some PR benefit. However, Leike points out that if a windfall clause were ever triggered, the organization would be incentivized and resourced to spend trillions of dollars on lawyers to find a loophole. Crafting and implementing a windfall clause today that could meet this challenge in the future is akin to winning a legal battle with an adversary with many orders of magnitude more resources.

Consider a race among AI companies each attempting to train a neural network to master a wide variety of challenging tasks via reinforcement learning on human feedback and other metrics of performance. How will this process culminate if the companies involved in the race do not take appropriate safety precautions? Ajeya Cotra's answer is that the most likely result is an existential catastrophe. Her 25,000-word report is far too rich and detailed to be adequately summarized here, but we encourage readers to check it out: Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover. (Note that Cotra does not think that AI companies will necessarily race forward in this way, or that the companies will necessarily make only the most basic and obvious AI safety efforts. These assumptions are made to isolate and explore the implications of a particular AI scenario; she personally assigns a ~25% chance to doom from AI.)

Matthijs Maas’ Introduction to strategic perspectives on long-term AI governance summarizes 15 different perspectives on AI governance in slogan form, and places them along two axes: degree of optimism about the potential of technical solutions, and of governance solutions, for mitigating AI risk. We found this to be a useful mapping of the terrain.

How influential are moral arguments—such as arguments for longtermism—expected to be, relative to other ways of influencing world events? A natural approach to answering this question is to look at the relevant base rates, and consider how influential such arguments have been historically. Rose Hadshar's How moral progress happens: the decline of footbinding as a case study attempts to contribute to this effort by examining why the Chinese custom of tightly binding the feet of young girls disappeared over the first half of the 20th century. Hadshar's very tentative conclusion is that the moral campaign against footbinding, though not counterfactually necessary for the decline of the practice, expedited its decline in urban areas probably by years and perhaps by decades.

Lukas Trötzmüller's Why EAs are skeptical about AI safety summarizes arguments made in conversation with the author by two dozen or so effective altruists who believe that the EA community currently overrates the magnitude of existential risk from AI or the importance of AI safety as a cause. As Trötzmüller notes, the quality of the different arguments varies greatly. It might be valuable for future research to focus on the most promising arguments, attempt to develop them more rigorously, and seek feedback from those who do not share this skepticism.

Among ways to positively affect the long-term future, most longtermists focus on reducing risks of human extinction and unrecoverable civilisational collapse. Some, however, focus instead on moral circle expansion, that is, promoting concern for a wider set of moral patients (animals, future people, digital minds) to mitigate the risk of humanity inflicting great harms on such beings in the far future. Stefan Schubert argues that this Isn’t the key value change we need: the greatest threat to our potential in scenarios in which humanity survives long-term isn't that our descendants will fail to appreciate the moral patienthood of morally relevant beings, but rather that they will fail to make the world radically better than it is (both for humans and other morally relevant beings).

In Wild animal welfare in the far future, Saulius Šimčikas explores a variety of far-future scenarios potentially involving vast numbers of wild animals. He finds that scenarios related to terraforming other planets are by far the most significant: efforts by the wild animal welfare (WAW) movement to shape how such scenarios unfold are expected to alleviate about three orders of magnitude more suffering than efforts directed towards the next most significant class of scenarios, involving the spread of life to other planets for reasons other than space settlement (see his Guesstimate model). Šimčikas also considers several concrete interventions that the WAW movement could pursue, including (in decreasing order of priority) directly discussing far-future WAW scenarios; expanding laws and enforcement to prevent interplanetary contamination; and ensuring that there will be people in the future who care about WAW.

In The case for strong longtermism, Hilary Greaves and Will MacAskill argued that strong longtermism is robust to several plausible variations in axiology. Karri Heikkinen's Strong longtermism and the challenge from anti-aggregative moral views presents an objection to this argument. Even granting that the future could be vast, that future people matter, and that we can predictably influence these future lives, a proponent of certain non-aggregative or partially aggregative moral views may, according to Heikkinen, reject strong longtermism, if the path to influencing the long-term future involves either lots of small future benefits or a small probability of a huge future benefit.

Rational Animations, a YouTube channel that has produced excellent videos on longtermism, Grabby Aliens, and other important ideas, released a new video on Holden Karnofsky's Most Important Century. (The script, written mostly by Matthew Barnett, may be found here.) The video recapitulates, with engaging animations and clever visualizations, the main claims in Karnofsky's series, namely (1) that long-run trends in growth of gross world product suggest that the 21st century could be radically transformative; (2) that a hypothetical “duplicator”, allowing humans to make quick copies of themselves, could explain how this transformation occurs; (3) that AGI could have effects similar to such a duplicator; and (4) that expert surveys and sophisticated modeling tentatively suggest that AGI may in fact arrive in the 21st century.

Other research:

Eli Lifland critically examines several reasons why he feels skeptical of high levels of near-term AI risk (Lifland doesn’t endorse many of these reasons).
Will MacAskill makes The case for longtermism in a New York Times essay adapted from his forthcoming book, What We Owe the Future, which we will cover in the next issue of FM.
Zack M. Davis’s Comment on "Propositions concerning digital minds and society" discusses the working paper by Nick Bostrom and Carl Shulman (summarized in FM#2).
Maxwell Tabarrok’s Enlightenment values in a vulnerable world argues that, given certain assumptions, Nick Bostrom's vulnerable world hypothesis does not undermine the traditional Enlightenment values of technological progress and political liberty.
Robert Long’s Digital people: biology versus silicon considers whether it will be possible to create human-like minds digitally, and whether we should expect this to happen.
Thomas Moynihan’s How insect 'civilisations' recast our place in the universe chronicles how discoveries about the social complexity of ants and other insects in the late 1800s and early 1900s influenced thinking about humanity's long-term prospects.
Stefan Schubert’s Bystander effects regarding AI risk considers whether we should worry that people may be disinclined to invest resources into addressing AI risk because they observe that others are already doing so.

Venice from across the sea in stormy darkness as imagined by DALL·E 2

News

Rob Wiblin interviewed Max Tegmark about recent advances in AI capability and alignment for the 80,000 Hours Podcast. He also interviewed Ian Morris on lessons from “big picture” history.

Tim Ferriss interviewed Will MacAskill on What We Owe the Future.

Dwarkesh Patel released three relevant interviews for the Lunar Society podcast: Sam Bankman-Fried, Fin Moorhouse and Joseph Carlsmith.

Nick Bostrom was profiled in The Spectator by Sam Leith.

The Global Priorities Institute published a summary of Nick Beckstead and Teruji Thomas’s A paradox for tiny probabilities and enormous values.

The Fund for Alignment Research, a new organization that helps AI safety researchers pursue high-impact research by hiring contractors, is hiring research engineers and communication specialists.

The Future Forum, an experimental four-day conference on the long-term future of humanity, took place on August 4–7.

The launch of the Center for Space Governance, a non-profit research organization dedicated to exploring current, emerging, and future issues in space policy, was announced.

Radio Bostrom is a new podcast featuring high-quality narrations of Nick Bostrom’s written work.

Metaculus is hiring for several roles, including CTO and Chief of Staff.

The Berkeley Existential Risk Initiative (BERI) is hiring a Deputy Director.

The United Nations released a long-awaited update to its demographic projections. You can explore the dataset in Our World in Data's Population & Demography Data Explorer.

Open Philanthropy is seeking applicants for a US policy fellowship program focused on high-priority emerging technologies, especially AI and biotechnology. Apply by September 15th.

The Centre for the Governance of AI is setting up a Policy Team to explore ways influential actors could prepare the world for advanced AI.

Thomas Woodside, Dan Hendrycks and Oliver Zhang announced $20,000 in bounties for publicly-understandable explainers of AI safety concepts.

Conversation with John Halstead

To read our conversation with John Halstead on climate change, please go to the version of this issue crossposted on the Effective Altruism Forum.

We thank Leonardo Picón for editorial assistance and Thomas Moynihan for the epigraph.