
2019 AI Alignment Literature Review and Charity Comparison

As in 2016, 2017 and 2018, I have attempted to review the research that has been produced by various organisations working on AI safety, to help potential donors gain a better understanding of the landscape.

This is a similar role to that which GiveWell performs for global health charities, and somewhat similar to the role a securities analyst plays with regard to possible investments.

As ever I am painfully aware of the various corners I have had to cut due to time constraints from my job, as well as being distracted by 1) another existential risk capital allocation project, 2) the miracle of life and 3) computer games.

This document is fairly extensive, and some parts (particularly the methodology section) are the same as last year, so I don’t recommend reading from start to finish.

Here are the un-scientifically-chosen hashtags:

If you are new to the idea of General Artificial Intelligence as presenting a major risk to the survival of human value, I recommend this Vox piece by Kelsey Piper.

If you are already convinced and are interested in contributing technically, I recommend this piece by Jacob Steinhardt, as unlike this document Jacob covers pre-2019 research and organises by topic, not organisation.

Their research is more varied than MIRI's, including strategic work, work directly addressing the value-learning problem, and corrigibility work.

Drexler's Reframing Superintelligence: Comprehensive AI Services as General Intelligence is a massive document arguing that superintelligent AI will be developed as individual discrete services for specific finite tasks, rather than as general-purpose agents.

To some extent this seems to match what is happening - we do have many specialised AIs - but on the other hand there are teams working directly on AGI, and often in ML 'build an ML system that does it all'

While most books are full of fluff and should be blog posts, this is a super dense document - a bit like Superintelligence in this regard - and even more than most research I struggle to summarize it here - so I recommend reading it.

It derives some neat results, like that whether or not we almost certainly go extinct depends on whether safety investments scale faster than the risk from consumption, and that generally speeding things up is better, because if there is a temporary risky phase it gets us through it faster - whereas if risk never converges to zero we will go extinct anyway.
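
A toy version of the convergence point, in my own notation rather than the paper's model:

```latex
% Toy version only: my notation, not Aschenbrenner's model.  Let the per-period
% extinction hazard be driven up by consumption C_t and down by safety
% spending S_t:
\[
  \delta_t = \bar{\delta}\,\frac{C_t^{\,\beta}}{S_t^{\,\epsilon}},
  \qquad
  \Pr[\text{never go extinct}] = \prod_{t=0}^{\infty}\,(1-\delta_t).
\]
% The product is strictly positive iff \sum_t \delta_t converges, i.e. iff
% S_t^{\epsilon} eventually outgrows C_t^{\beta}; otherwise extinction is
% almost sure, and faster growth just shortens the dangerous transition.
```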

The paper first proves some additional results and then runs some empirical tests with plausible real-life scenarios, showing that the technique does a decent job improving true performance (by avoiding excessive optimisation on the imperfect proxy).
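
One standard way to avoid excessive optimisation on an imperfect proxy is quantilization (see Carey's paper in the reference list); a minimal sketch of that idea, not necessarily this paper's exact method, with the toy setting entirely my own:

```python
import random

def quantilize(base_sample, proxy_reward, q=0.2, n=1000):
    """Quantilizer sketch: rather than argmax-ing an imperfect proxy, draw n
    candidates from a trusted base distribution and return a uniform pick from
    the top q fraction by proxy reward.  Any rare catastrophe the proxy fails
    to penalise can then be chosen with probability at most (base rate) / q."""
    candidates = [base_sample() for _ in range(n)]
    candidates.sort(key=proxy_reward, reverse=True)
    return random.choice(candidates[: max(1, int(q * n))])

# Toy setting (purely illustrative): the proxy likes big numbers, but action
# 100 is secretly catastrophic and the proxy does not know.
proxy = lambda a: a
true_reward = lambda a: a if a < 100 else -1000
base = lambda: random.randint(0, 100)

argmax_pick = max((base() for _ in range(1000)), key=proxy)   # ~always 100
quantile_pick = quantilize(base, proxy, q=0.2)                # rarely 100
print(true_reward(argmax_pick), true_reward(quantile_pick))
```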

It is interesting but I am sceptical it is very helpful for AI Alignment, where forcing one group / AI that has suddenly become much more powerful to abide by their previous commitments seems like more of a challenge;

To avoid the impossibility of deducing values from behaviour, we build agents with accurate models of the way human minds represent the world, and extract (partial) preferences from there.

I don't really understand how it is getting at this though - the hazard (lava) is the same in train and test, and the poor catastrophe-avoidance seems to simply be the result of the weak penalty placed on it during training (-1).
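
To spell out the arithmetic with made-up numbers (only the -1 catastrophe penalty is from the paper):

```latex
% Illustrative numbers: only the -1 catastrophe penalty is from the paper.
% With a per-step cost of -1, a 5-step shortcut through the lava beats a
% 9-step safe detour:
\[
  \underbrace{-1}_{\text{lava}} + \underbrace{5\times(-1)}_{\text{shortcut}}
  = -6
  \;>\;
  \underbrace{9\times(-1)}_{\text{detour}} = -9 ,
\]
% so a return-maximising policy happily incurs the catastrophe unless the
% penalty exceeds the number of steps saved (here, 4).
```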

I'm not sure I see how these are directly helpful for the long-term problem while we don't yet have a technical solution - I generally think of these sorts of standards as mandating best practices, but in this case we need to develop those best practices.

It seems that, in their model, cyberhacking is basically the same as invasions with varying sparse defences (due to the very large number of possible zero-day 'attack beaches').

FHI researchers contributed to the following research led by other organisations:

FHI didn’t reply to my emails about donations, and seem to be more limited by talent than by money.

Rohin Shah, now with additional help, continues to produce the AI Alignment Newsletter, covering in detail a huge number of interesting new developments, especially new papers.

Shah et al.'s On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference argues that learning human values and biases at the same time, while impossible in theory, is actually possible in practice.

Attentive readers will recall Armstrong and Mindermann's paper arguing that it is impossible to co-learn human bias and values because any behaviour is consistent with any values - if we can freely vary the biases - and vice versa.

(They also discuss the potential of using some guaranteed-optimal behaviour as ground truth, but I am sceptical this would work, as I think humans are often at their most irrational when it comes to the most important topics, e.g.
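
A toy illustration of the Armstrong and Mindermann degeneracy mentioned above (my own example, not theirs):

```python
# Toy identifiability failure: identical behaviour from opposite values, once
# we are free to vary the "planner" (the bias model) as well as the reward.
actions = ["a", "b", "c"]
reward = {"a": 0.0, "b": 1.0, "c": 0.5}

rational_choice = max(actions, key=lambda a: reward[a])        # values R, rational planner
anti_rational_choice = min(actions, key=lambda a: -reward[a])  # values -R, minimising planner
assert rational_choice == anti_rational_choice == "b"
# Behaviour alone cannot distinguish (values R, rational) from (values -R,
# anti-rational), so extra assumptions about the bias model are needed.
```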

#CIRL Turner et al.'s Conservative Agency attempts to prevent agents from doing irreversible damage by making them consider a portfolio of randomly generated utility functions - the idea being that irreversible damage is probably bad according to at least one of them.

I find the result a little hard to understand - I initially assumed they were relying on clustering of plausible utility functions, but it seems that they actually sampled at random from the entire space of possible functions!
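
A minimal sketch of the penalty scheme as I understand it (an attainable-utility-style reconstruction; the names, environment interface and Q-values are my own placeholders, not the paper's algorithm):

```python
def conservative_value(state, action, main_q, aux_qs, noop, lam=1.0):
    """Score an action by its main-task value minus the average change it
    causes in how well the agent could still satisfy a portfolio of randomly
    generated auxiliary utility functions.  Irreversible damage tends to hurt
    at least one auxiliary utility, so it gets penalised even though the main
    task never mentions it."""
    penalty = sum(abs(q(state, action) - q(state, noop)) for q in aux_qs)
    return main_q(state, action) - lam * penalty / len(aux_qs)

def act(state, actions, main_q, aux_qs, noop, lam=1.0):
    return max(actions, key=lambda a: conservative_value(
        state, a, main_q, aux_qs, noop, lam))

# Toy usage: smashing a vase is slightly better for the main task, but it
# zeroes out one auxiliary utility, so the conservative agent declines.
acts = ["careful", "smash_vase"]
main_q = lambda s, a: {"careful": 1.0, "smash_vase": 1.1, "noop": 0.0}[a]
aux_qs = [lambda s, a: 0.0 if a == "smash_vase" else 1.0]
print(act("s0", acts, main_q, aux_qs, noop="noop"))   # -> "careful"
```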

#Overview Shah et al.'s Preferences Implicit in the State of the World attempts to use the fact that human environments are already semi-optimised to extract additional evidence about human preferences.

The core of the paper is a good insight - "it is easy to forget these preferences, since these preferences are already satisfied in our environment."

It seems quite plausible that some issues in AI safety will arise early on and in a relatively benign form for non-safety-orientated AI ventures (like autonomous cars or Minecraft helpers) –

Their agent foundations work is basically trying to develop the correct way of thinking about agents and learning/decision making by spotting areas where our current models fail and seeking to improve them.

However I assign little weight to this as I think most of the cross-sectional variation in organisation-reported subjective effectiveness comes from variance in how optimistic/salesy/aggressive they are, rather than actually indicating much about object-level effectiveness.

MIRI, in collaboration with CFAR, runs a series of four-day workshop/camps, the AI Risk for Computer Scientists workshops, which gather mathematicians/computer scientists who are potentially interested in the issue in one place to learn and interact.

A vague hand-wave of an example might be for-profit corporations rewarding their subsidiaries based on segment PnL, or indeed evolution creating humans, which then go on to create AI.

Necessarily theoretical, the paper motivates the idea, introduces a lot of terminology, and describes conditions that might make mesa-optimisers more or less likely - for example, more diverse environments make mesa-optimisation more likely.

The idea is basically that the human intervenes to prevent the really bad actions, but because the human has some chance of selecting the optimal action afterwards, the loss of exploration value is limited.
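
A cartoon of that mechanism (my own toy interface, not the paper's formalism):

```python
import random

def overseen_step(state, agent_action, is_catastrophic, optimal_action,
                  safe_action, p_optimal=0.1):
    """Cartoon of the oversight loop: the human vetoes really bad actions;
    when they do intervene they happen to pick the optimal action with
    probability p_optimal, and something merely safe otherwise.  Vetoes
    therefore cost the agent some exploration value, but not all of it."""
    if is_catastrophic(state, agent_action):
        if random.random() < p_optimal:
            return optimal_action(state)   # the human sometimes gets it right
        return safe_action(state)          # otherwise just avoid the trap
    return agent_action                    # defer to the agent when it is safe
```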

However, they did convincingly argue that MIRI researchers hadn’t properly understood the academic work they were critiquing, an isolation which has probably gotten worse with MIRI’s current secrecy.

MIRI researchers contributed to the following research led by other organisations:

Last year MIRI announced their policy of nondisclosure-by-default: [G]oing forward, most results discovered within MIRI will remain internal-only unless there is an explicit decision to release those results, based usually on a specific anticipated safety upside from their release.

This space is sufficiently full of infohazards that secrecy might be necessary, and in its absence researchers might prudently shy away from working on potentially risky things - in the same way that no-one in business sends sensitive information over email any more.

The fact that MIRI’s researchers appear intelligent suggests they at least think they are working on important and interesting issues, but history has many examples of talented reclusive teams spending years working on pointless stuff in splendid isolation.

Additionally, by hiding the highest quality work we risk impoverishing the field, making it look unproductive and unattractive to potential new researchers.

One possible solution would be for the research to be done by impeccably deontologically moral people, whose moral code you understand and trust.

this slowed down somewhat during 2019 despite taking on a second full-time staff member, which they attributed partly to timing issues (e.g.

CSER also participated in a lot of different outreach events, including to the UK parliament and by hosting various workshops, as well as submitting (along with other orgs) to the EU’s consultation, as summarised in this post.

I'm not sure how many people will be persuaded by this idea, but as a piece of philosophy I thought this was a clever idea, and it is definitely good to promote the idea that past generations have value (speaking as a future member of a past generation).

The paper makes the clever argument that the very large, barely-worth-living group might actually have more of these goods if they were offset by (lexicographically secondary) negative welfare.

They explain the specifics of the situation in extreme detail, and I was pleasantly surprised by their final recommendations, which mainly concerned removing barriers to insurance penetration.

Amadae's Autonomy and machine learning at the interface of nuclear weapons, computers and people discusses the potential dangers of incorporating narrow AI into nuclear weapon systems.

However, I am somewhat sceptical of the presentation of this as a *likely* risk, as both a future shortage of soybeans and a dramatically more efficient technology for feeding livestock would presumably be of interest to private actors, and show up in soybean futures prices.

The idea is basically that even though polluting might be in a company's best interest, it hurts the other companies the investor owns, so it is overall against the best interests of the investor.

However, it somehow fails to mention even cursorily the fact that the core issue has been well studied by economists: when all the companies in an industry try to coordinate for mutual benefit, it is called a cartel, and the #1 way of achieving mutual benefit is raising prices to near-monopoly levels.

It would be extremely surprising to me if someone, acting as a self-interested owner of all the world's shoe companies (for example) found it more profitable to protect biodiversity than to raise the price of shoes.

They research methods of breaking up complex, hard-to-verify tasks into simple, easy-to-verify tasks - to ultimately allow us effective oversight over AIs.

they mention the required trees being extremely large, and my experience is that organising volunteers and getting them to actually do what they said they would has historically been a great struggle for many organisations.

The projects are mathematical decomposition (which seems very natural), decomposition of computer programs (similar to how all programs can be decomposed into logic gates, although I don't really understand this one) and adaptive computation, where you figure out how much computation to dedicate to different issues.
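
As a toy analogy for the decomposition idea (my own example, not Ought's actual system):

```python
def answer(numbers, worker):
    """Toy stand-in for factored cognition: a big, hard-to-verify task is split
    into subtasks small enough that each worker step can be checked in
    isolation, then the results are recombined.  Here the 'task' is just
    summing a long list, so every step is trivially verifiable by an overseer."""
    if len(numbers) <= 2:                      # small enough for one worker
        return worker(sum, numbers)
    mid = len(numbers) // 2
    left = answer(numbers[:mid], worker)       # delegate each half
    right = answer(numbers[mid:], worker)
    return worker(sum, [left, right])          # easy-to-verify recombination

trusted_worker = lambda f, xs: f(xs)           # stand-in for a human/ML worker
print(answer(list(range(1, 101)), trusted_worker))   # -> 5050
```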

human text than previous attempts, that was notably good at generating text that seemed human-generated - good enough that it was indistinguishable to humans who weren’t concentrating.

I thought this was a noble effort to start conversations among ML researchers about release norms, though my impression is that many thought OpenAI was just grandstanding, and I personally was sceptical of the harm potential - though a GPT 2 based intelligence did go on to almost take over LW, proving that the ‘being a good LW commenter’

Basically the idea is that governments will contract with and set outcomes for a small number of private regulators, which will then devise specific rules that need to be observed by ML shops.

The first is basically that we develop better and better optimisation techniques, but due to our inability to correctly specify what we want, we end up with worse and worse Goodhart's Law situations, ending up in a Red Queen-style Moloch scenario.

At first they do so secretly, but eventually (likely in response to some form of catastrophe reducing humanity's capability to suppress them) their strategy abruptly changes towards world domination.

Given the strong funding situation at OpenAI, as well as their safety team’s position within the larger organisation, I think it would be difficult for individual donations to appreciably support their work.

The solutions they discuss seemed to me to often be fairly standard ideas from the AI safety community - things like teaching the AI to maximise the goal instantiated by its reward function at the start, rather than whatever happens to be in that box later, or using indifference results - but they introduce them to an RL setting, and the paper does a good job covering a lot of ground.
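
The 'reward function at the start' idea can be written compactly (my notation, not the paper's):

```latex
% My notation, not the paper's.  A standard agent maximises whatever its reward
% box contains at each future time,
\[
  \mathbb{E}\Big[\textstyle\sum_{t}\gamma^{t} R_t(s_t)\Big],
\]
% which makes tampering with the box (so that future R_t is trivially maxed
% out) instrumentally attractive.  The "reward function at the start" fix
% evaluates all future states with the current function R_0,
\[
  \mathbb{E}\Big[\textstyle\sum_{t}\gamma^{t} R_0(s_t)\Big],
\]
% so rewriting the box no longer changes the objective being optimised.
```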

#RL Everitt et al.'s Modeling AGI Safety Frameworks with Causal Influence Diagrams introduces the idea of using Causal Influence Diagrams to clarify thinking around AI safety proposals and make it easier to compare proposals with different conceptual backgrounds in a standard way.

#AI_Theory Sutton's The Bitter Lesson argues that history suggests massive amounts of compute and relatively general structures perform better than human-designed specialised systems.

He uses examples like the history of vision and chess, and it seems fairly persuasive, though I wonder a little if these are cherry-picked - e.g.

This seems like an interesting idea, but seems like it would struggle with cases where increasing agent capabilities lead to new failure modes - e.g.

They bring together people who want to start doing technical AI research, hosting a 10-day camp aiming to produce publishable research.

They model this using causal agent diagrams, suggest a possible solution (making rewards a function of world-beliefs, not observations) and show that this does not work using very simple gridworld AIXIjs implementations.
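
In symbols (my notation), the contrast being drawn is:

```latex
% My notation.  Rewarding raw observations,
\[
  \mathbb{E}\Big[\textstyle\sum_t r(o_t)\Big],
\]
% can be satisfied by corrupting the observation channel itself; the suggested
% alternative scores the agent's posterior belief b_t over world states,
\[
  \mathbb{E}\Big[\textstyle\sum_t\ \mathbb{E}_{s\sim b_t}[\,r(s)\,]\Big],
\]
% so self-delusion is only meant to pay if it also fools the agent's own
% world model.
```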

As well as possibly being good for its own sake, this might help build institutional capacity to ban potentially dangerous technologies that transfer autonomy away from humans.

They do various pieces of strategic background work, especially on AI Timelines - it seems their previous work on the relative rarity of discontinuous progress has been relatively influential.

Bergal's Evidence against current methods leading to human level artificial intelligence lists a variety of arguments for why current AI techniques are insufficient for AGI.

- "Extrapolating this model implies that at a time when the economy is growing 1% per year, growth will diverge to infinity after about 200 years".

#Forecasting AI Impacts's AI Conference Attendance plots attendance at the major AI conferences over time to show the recent rapid growth in the field using a relatively stable measure.

example seemed to be basically question-begging on Newcomb's problem, and his Scots vs English example (where Scottish people choose to one-box because of their ancestral memory of the Darien scheme) seems to me to be a case of people not actually employing FDT at all.

And some of his arguments - like that it is too complicated for humans to actually calculate - seem like the same arguments he would reject as criticisms of utilitarianism, and not relevant to someone working on AGI.

This reminded me of an argument from Wei Dai that agents who cared about total value, finding themselves in a small world, might acausally trade with average value agents in large worlds.

Presumably a practical implication might be that EAs should adhere to conventional moral standards with even higher than usual moral fidelity, in exchange for shutting up and multiplying on EA issues.

FRI researchers contributed to the following research led by other organisations:

EAF (of which they are a part) spent $836,622 in 2018 and $1,125,000 in 2019, and plan to spend around $995,000 in 2020.

One is that people who espouse short timelines tend to also argue for some amount of secrecy due to Infohazards, which makes their work hard for outsiders to audit.

I am more sympathetic to her second argument, but even there to the extent that 1) fields select for people who believe in them and 2) people believe what is useful for them to believe I think it is a bit harsh to call it a 'scam'.

For example, they argue that solving short-term issues can help with long-term ones, and that long-term issues will eventually become short-term issues.

However, I am inclined to agree with the review here by Habryka that a lot of the work here is being done by categorising unemployment and autonomous vehicles as long-term, and then arguing that they share many features with short-term issues.

They provide support to various university-affiliated (FHI, CSER, CHAI) existential risk groups to facilitate activities (like hiring engineers and assistants) that would be hard within the university context, alongside other activities - see their FAQ for more details.

A number of papers we reviewed this year were supported by BERI, for example:

Because this support tended not to be mentioned on the front page of the article (unlike direct affiliation), it is quite possible that I missed other papers they supported also.

Dimitri's The race for an artificial general intelligence: implications for public policy extends the model in Racing to the Precipice (Armstrong et al.). After a lengthy introduction to AI alignment, they make a formal model, concluding that a winner-take-all contest will have very few teams competing (which is good). Interestingly, if the teams are concerned about cost minimisation this result no longer holds, as the 'best'

(as a very minor aside, I was a little surprised to see the AI Impacts survey cited as a source for expected Singularity timing given that it does not mention the word.) Overall I thought this was an excellent paper.

I think this probably takes over from Amodei et al.'s Concrete Problems (on which Jacob was a co-author) as my favourite introduction to technical work, for helping new researchers locate themselves, with the one proviso that it is only in Google Docs form at the moment.

They do this by basically using an extremely myopic form of boxed oracle AIXI, that doesn't care about any rewards after the box has been opened - so all it cares about is getting rewards for answering the question well inside the box.
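
In symbols (my notation, not the paper's):

```latex
% My notation, not the paper's.  If the box opens at time T, the oracle's
% objective only counts in-box reward,
\[
  \mathbb{E}\Big[\textstyle\sum_{t \le T} r_t\Big]
  \qquad\text{rather than}\qquad
  \mathbb{E}\Big[\textstyle\sum_{t \ge 0}\gamma^{t} r_t\Big],
\]
% so outcomes after the box opens carry zero weight, and manipulating the
% outside world via its answer earns it nothing.
```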

#AI_Theory Snyder-Beattie et al.'s An upper bound for the background rate of human extinction uses a Laplace's law of succession-style approach to bound non-anthropogenic Xrisk.

Notably, they argue that these estimates are not significantly biased by anthropic issues, because high base extinction rates mean lucky human observers would be clustered in worlds where civilisation also developed very quickly, and hence also observe short histories.

Obviously they can only provide an upper bound using such methods, so I see the paper as mainly providing evidence we should instead focus on anthropogenic risks, for which no such bound can exist.
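
To give a flavour of this style of bound (illustrative estimator and numbers, not the paper's exact method):

```latex
% Flavour of the bound only: illustrative estimator and numbers, not the
% paper's exact method.  If humanity has survived n years of a natural hazard
% with constant annual probability \mu, then any \mu with
\[
  (1-\mu)^{n} < 0.05
  \quad\Longleftrightarrow\quad
  \mu \gtrsim \tfrac{3}{n}
\]
% sits poorly with the track record; n \approx 2\times 10^{5} years of Homo
% sapiens therefore caps the natural rate at very roughly 1.5\times 10^{-5}
% per year, while no comparable track record constrains anthropogenic risks.
```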

#Forecasting Agrawal et al.'s Scaling up Psychology via Scientific Regret Minimization: A Case Study in Moral Decision-Making suggests that, in cases with large amounts of data plus noise, human-interpretable models could be evaluated relative to ML predictions rather than the underlying data directly.

This suggests a multi-step program for friendliness: 1) gather data 2) train ML on data 3) evaluate simple human-evaluable rules on ML 4) have humans evaluate these rules.
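
A minimal sketch of steps 2 and 3 (the library choice, toy data and candidate rule are all mine, not the paper's):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Step 1: lots of noisy data (toy stand-in for the moral-judgement dataset).
X = rng.uniform(-1, 1, size=(10_000, 3))
true_signal = 2 * X[:, 0] - X[:, 1] ** 2
y = true_signal + rng.normal(scale=1.0, size=len(X))

# Step 2: fit a flexible ML model; its predictions average away the noise.
ml = GradientBoostingRegressor().fit(X, y)
denoised = ml.predict(X)

# Step 3: score a simple, human-readable rule against the ML predictions
# rather than against the raw noisy labels.
simple_rule = 2 * X[:, 0]                     # candidate interpretable model
mse_vs_raw = np.mean((simple_rule - y) ** 2)
mse_vs_ml = np.mean((simple_rule - denoised) ** 2)
print(f"vs raw labels: {mse_vs_raw:.2f}   vs ML predictions: {mse_vs_ml:.2f}")
# The second number isolates genuine model error from label noise, which is
# the quantity humans then inspect in step 4.
```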

The key thing I took away from it was that MIRI did not do a good job of locating their work within the broader literature - for example, he argues that FDT seems like it might actually be a special case of CDT as construed by some philosophers, which E&N should have addressed, and elsewhere he suggests E&N's criticisms of CDT and EDT present strawmen.

He also made some interesting points, for example that it seems 'FDT will sometimes recommend choosing a particular act because of the advantages of choosing a different act in a different kind of decision problem'.

Some seemed to almost beg the question, and at other times he essentially faulted FDT for addressing directly issues which any decision theory will ultimately have to address, like logical counterfactuals, or what is a 'Fair'

I think this is potentially very useful if brought to the notice of the relevant people, as the topics on the list seem useful things to work on, and I can easily imagine people not being aware of all of them.

The program doesn't appear to be directly motivated by AI alignment, but it does seem unusual in the degree to which alignment-type-issues would have to be solved for it to succeed - thereby hopefully incentivising mainstream ML guys to work on them.

from a natural language text channel, which is clearly linked to the Value Alignment problem, and similar issues like the higher optimisation power of the agent are likely to occur.

In the past I have been sceptical of the fund, as it was run by someone who already had access to far more capital (OpenPhil), and the grants were both infrequent and relatively conservative –

The fund managers are:

Oliver Habryka especially has been admirably open with lengthy write-ups about his thoughts on the different grants, and I admire his commitment to intellectual integrity (you might enjoy his comments here).

Ex post this didn’t matter as CEA rejected it on substantive grounds the second time, but it makes me somewhat concerned about a risk of some of the capital going towards giving sinecures to people who are in the community, rather than objective merit.

It lists only four AI Risk grants in 2019, though I think that their $500k grant to ESPR (The European Summer Program on Rationality) should be considered an AI Risk relevant grant also. In contrast there are 11 AI Risk grants listed for 2018, though the total dollar value is lower.

This is partly to help prevent AGI/synth bio knowledge falling into the hands of malicious hackers (though most ML research seems to be very open), and partly because the field teaches various skills that are useful for AI safety, both high-level like Eliezer's Security Mindset and technical like crypto.

The first ($880k in total) focused on large organisations:

The second round, requiring written applications, distributed money to a much wider variety of projects.

I suspect most people looking for AI jobs would find some on here they hadn't heard of otherwise, though of course for any given person many will not be appropriate.

I confess I haven't actually read the book, and have very low expectations for journalists in this regard, though Chivers is generally very good, and by all accounts this is a very fair and informative book.

In the tradition of active management, I hope to synthesise many individually well-known facts into a whole which provides new and useful insight to readers.

Advantages of this are that 1) it is relatively unbiased, compared to inside information which invariably favours those you are close to socially and 2) most of it is legible and verifiable to readers.

Many capital allocators in the bay area seem to operate under a sort of Great Man theory of investment, whereby the most important thing is to identify a guy who is really clever and ‘gets it’.

(I actually think that even those with close personal knowledge should use historical results more, to help overcome their biases.) This judgement involves analysing a large number of papers relating to Xrisk that were produced during 2019.

I also attempted to include papers from December 2018, to take into account the fact that I'm missing the last month's worth of output from 2019, but I can't be sure I did this successfully.

while there has been a large increase in interest in AI safety over the last year, it’s hard to work out who to credit for this, and partly because I think progress has to come by persuading AI researchers, which I think comes through technical outreach and publishing good work, not popular/political work.

My impression is that policy on most subjects, especially those that are more technical than emotional is generally made by the government and civil servants in consultation with, and being lobbied by, outside experts and interests.

Attempts to directly influence the government to regulate AI research seem very adversarial, and risk being pattern-matched to ignorant technophobic opposition to GM foods or other kinds of progress.

AI researchers who are dismissive of safety law, regarding it as an imposition and encumbrance to be endured or evaded, will probably be harder to convince of the need to voluntarily be extra-safe - especially as the regulations may actually be totally ineffective.

The only case I can think of where scientists are relatively happy about punitive safety regulations, nuclear power, is one where many of those initially concerned were scientists themselves.

(Is it even a good idea to publicise that someone else is doing secret research?) With regard to published research, in general I think it is better for it to be open access, rather than behind journal paywalls, to maximise impact.

Reducing this impact by a significant amount in order for the researcher to gain a small amount of prestige does not seem like an efficient way of compensating researchers to me.

Having gone to all the trouble of doing useful research it is a constant shock to me how many organisations don’t take this simple step to significantly increase the reach of their work.

Similarly, I generally think concerns about algorithmic bias are essentially political - I recommend this presentation - though there is at least some connection to the value learning problem there.

Ironically, despite this view being espoused by GiveWell (albeit in 2011), this is essentially OpenPhil’s policy of, at least in some cases, artificially limiting their funding to 50% or 60% of a charity’s need, which some charities have argued effectively provides a 1:1 match for outside donors.
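
The 1:1 claim is simple arithmetic under a stylised reading of the policy (actual grant mechanics vary):

```latex
% Stylised arithmetic only; actual grant mechanics vary.  If OpenPhil caps its
% grant at half of a charity's total funding,
\[
  \text{OpenPhil} = \tfrac{1}{2}\,\text{Total}
  \;\Rightarrow\;
  \text{Total} = 2 \times \text{Outside},
\]
% so, while the cap binds, every marginal outside dollar pulls in one extra
% OpenPhil dollar: an effective 1:1 match.
```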

While generally good, one side effect of this (perhaps combined with the fact that many low-hanging fruits of the insight tree have been plucked) is that a considerable amount of low-quality work has been produced.

So I think it is quite possible that many people will waste a lot of time as a result of this strategy, especially if they don’t happen to move in the right social circles.

and induces this in those who move there - though to be fair the people who are doing useful work in AI organisations seem to be drawn from a better distribution than the broader community.

Here is my eventual decision, rot13'd so you can come to your own conclusions first (which I strongly recommend): Qrfcvgr univat qbangrq gb ZVEV pbafvfgragyl sbe znal lrnef nf n erfhyg bs gurve uvtuyl aba-ercynprnoyr naq tebhaqoernxvat jbex va gur svryq, V pnaabg va tbbq snvgu qb fb guvf lrne tvira gurve ynpx bs qvfpybfher.

pbagvahr gb or vzcerffrq jvgu PUNV’f bhgchg, naq guvax gurl cbgragvnyyl qb n tbbq wbo vagrenpgvat jvgu znvafgernz ZY erfrnepuref.

Gurl unir n ybg bs pnfu erfreirf, juvpu frrzf yvxr vg zvtug erqhpr gur hetrapl bs shaqvat fbzrjung, naq n pbafvqrenoyr cbegvba bs gur jbex vf ba zber arne-grez vffhrf, ohg gurer ner eryngviryl srj bccbeghavgvrf gb shaq grpuavpny NV fnsrgl jbex, fb V vagraq gb qbangr gb PUNV ntnva guvf lrne.

Qrrczvaq naq BcraNV obgu qb rkpryyrag jbex ohg V qba’g guvax vg vf ivnoyr sbe (eryngviryl) fznyy vaqvivqhny qbabef gb zrnavatshyyl fhccbeg gurve jbex.

V fgvyy vagraq gb znxr n qbangvba, va pnfr guvf vf whfg na hasbeghangr gvzvat vffhr, ohg qrsvavgryl jbhyq jnag gb frr zber arkg lrne.

PFRE’f erfrnepu vf whfg abg sbphfrq rabhtu gb jneenag qbangvbaf sbe NV Evfx jbex va zl bcvavba.

Bhtug frrzf yvxr n irel inyhnoyr cebwrpg, naq yvxr PUNV ercerfragf bar bs gur srj bccbeghavgvrf gb qverpgyl shaq grpuavpny NV fnsrgl jbex.

gubhtug NV Vzcnpgf qvq fbzr avpr fznyy cebwrpgf guvf lrne, naq ba n abg ynetr ohqtrg.

Va n znwbe qvssrerapr sebz cerivbhf lrnef, V npghnyyl cyna gb qbangr fbzr zbarl gb gur Ybat Grez Shgher Shaq.

Juvyr V unira’g nterrq jvgu nyy gurve tenagf, V guvax gurl bssre fznyy qbabef npprff gb n enatr bs fznyy cebwrpgf gung gurl pbhyq abg bgurejvfr shaq, juvpu frrzf irel inyhnoyr pbafvqrevat gur fgebat svanapvny fvghngvba bs znal bs gur orfg ynetre betnavfngvbaf (BcraNV, Qrrczvaq rgp.) Bar guvat V jbhyq yvxr gb frr zber bs va gur shgher vf tenagf sbe CuQ fghqragf jub jnag gb jbex va gur nern.

Hasbeghangryl ng cerfrag V nz abg njner bs znal jnlf sbe vaqvivqhny qbabef gb cenpgvpnyyl fhccbeg guvf.

It is the nature of making decisions under scarcity that we must prioritize some over others, and I hope that all organisations will understand that this necessarily involves negative comparisons at times.

If you found this post helpful, and especially if it helped inform your donations, please consider letting me and any organisations you donate to as a result know.

Griffiths, Thomas - Scaling up Psychology via Scientific Regret Minimization: A Case Study in Moral Decision-Making - 2019-10-16 - https://arxiv.org/abs/1910.07581
AI Impacts - AI Conference Attendance - 2019-03-06 - https://aiimpacts.org/ai-conference-attendance/
AI Impacts - Historical Economic Growth Trends - 2019-03-06 - https://aiimpacts.org/historical-growth-trends/
Alexander, Scott - Noisy Poll Results And Reptilian Muslim Climatologists from Mars - 2013-04-12 - https://slatestarcodex.com/2013/04/12/noisy-poll-results-and-reptilian-muslim-climatologists-from-mars/
Armstrong, Stuart - Research Agenda v0.9: Synthesising a human's preferences into a utility function - 2019-06-17 - https://www.lesswrong.com/posts/CSEdLLEkap2pubjof/research-agenda-v0-9-synthesising-a-human-s-preferences-into#comments
Armstrong, Stuart; Mindermann, Sören - Occam's razor is insufficient to infer the preferences of irrational agents - 2017-12-15 - https://arxiv.org/abs/1712.05812
Aschenbrenner, Leopold - Existential Risk and Economic Growth - 2019-09-03 - https://leopoldaschenbrenner.github.io/xriskandgrowth/ExistentialRiskAndGrowth050.pdf
Avin, Shahar - Exploring Artificial Intelligence Futures - 2019-01-17 - https://www.shaharavin.com/publication/pdf/exploring-artificial-intelligence-futures.pdf
Avin, Shahar; Amadae, S - Autonomy and machine learning at the interface of nuclear weapons, computers and people - 2019-05-06 - https://www.sipri.org/sites/default/files/2019-05/sipri1905-ai-strategic-stability-nuclear-risk.pdf
Baum, Seth - Risk-Risk Tradeoff Analysis of Nuclear Explosives for Asteroid Deflection - 2019-06-13 - https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3397559
Baum, Seth - The Challenge of Analyzing Global Catastrophic Risks - 2019-07-15 - https://higherlogicdownload.s3.amazonaws.com/INFORMS/f0ea61b6-e74c-4c07-894d-884bf2882e55/UploadedImages/2019_July.pdf#page=20
Sandholm, Tuomas - Superhuman AI for multiplayer poker - 2019-07-17 - https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf
Caplan, Bryan - The Myth of the Rational Voter - 2008-08-24 - https://www.amazon.com/Myth-Rational-Voter-Democracies-Policies/dp/0691138737
Carey, Ryan - How useful is Quantilization for Mitigating Specification-Gaming - 2019-05-06 - https://www.fhi.ox.ac.uk/wp-content/uploads/SafeML2019_paper_40.pdf
Dragan, Anca - The Assistive Multi-Armed Bandit - 2019-01-24 - https://arxiv.org/abs/1901.08654
Chivers, Tom - The AI Does Not Hate You: Superintelligence, Rationality and the Race to Save the World - 2019-06-13 - https://www.amazon.com/Does-Not-Hate-You-Superintelligence-ebook/dp/B07K258VCV
Christiano, Paul - AI alignment landscape - 2019-10-12 - https://ai-alignment.com/ai-alignment-landscape-d3773c37ae38
Christiano, Paul - What failure looks like - 2019-03-17 - https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like
Cihon, Peter - Standards for AI Governance: International Standards to Enable Global Coordination in AI Research & Development
Shah, Rohin - Clarifying some key hypotheses in AI alignment - 2019-08-15 - https://www.lesswrong.com/posts/mJ5oNYnkYrd4sD5uE/clarifying-some-key-hypotheses-in-ai-alignment
CSER - Policy series Managing global catastrophic risks: Part 1 Understand - 2019-08-13 - https://www.gcrpolicy.com/understand-overview
Cummings, Dominic - On the referendum #31: Project Maven, procurement, lollapalooza results & nuclear/AGI safety - 2019-03-01 - https://dominiccummings.com/2019/03/01/on-the-referendum-31-project-maven-procurement-lollapalooza-results-nuclear-agi-safety/
Dai, Wei - Problems in AI Alignment that philosophers could potentially contribute to - 2019-08-17 - https://www.lesswrong.com/posts/rASeoR7iZ9Fokzh7L/problems-in-ai-alignment-that-philosophers-could-potentially
Dai, Wei - Two Neglected Problems in Human-AI Safety - 2018-12-16 - https://www.alignmentforum.org/posts/HTgakSs6JpnogD6c2/two-neglected-problems-in-human-ai-safety
Drexler, Eric - Reframing Superintelligence: Comprehensive AI Services as General Intelligence - 2019-01-08 - https://www.fhi.ox.ac.uk/wp-content/uploads/Reframing_Superintelligence_FHI-TR-2019-1.1-1.pdf?asd=sa
EU - Ethics Guidelines for Trustworthy Artificial Intelligence - 2019-04-08 - https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai
Beard, Simon - Human Extinction and Our Obligations to the Past - 2019-11-05 - https://sci-hub.tw/https://www.cambridge.org/core/journals/utilitas/article/human-extinction-and-our-obligations-to-the-past/C29A0406EFA2B43EE8237D95AAFBB580
Kaufman, Jeff - Uber Self-Driving Crash - 2019-11-07 - https://www.jefftk.com/p/uber-self-driving-crash
Kemp, Luke - Mediation Without Measures: Conflict Resolution in Climate Diplomacy - 2019-05-15 - https://www.cser.ac.uk/resources/mediation-without-measures/
Kenton, Zachary; Evans, Owain - Generalizing from a few environments in Safety-Critical Reinforcement Learning - 2019-07-02 - https://arxiv.org/abs/1907.01475
Korzekwa, Rick - The unexpected difficulty of comparing AlphaStar to humans - 2019-09-17 - https://aiimpacts.org/the-unexpected-difficulty-of-comparing-alphastar-to-humans/
Kosoy, Vanessa - Delegative Reinforcement Learning: Learning to Avoid Traps with a Little Help - 2019-07-19 - https://arxiv.org/abs/1907.08461
Singh, Alok - Detecting Spiky Corruption in Markov Decision Processes - 2019-06-30 - https://arxiv.org/abs/1907.00452
Marcus, Gary - Deep Learning: A Critical Appraisal - 2018-01-02 - https://arxiv.org/ftp/arxiv/papers/1801/1801.00631.pdf
McCaslin, Tegan - Investigation into the relationship between neuron count and intelligence across differing cortical architectures - 2019-02-11 - https://aiimpacts.org/investigation-into-the-relationship-between-neuron-count-and-intelligence-across-differing-cortical-architectures/
Mogensen, Andreas - ‘The only ethical argument for positive 𝛿’
Dimitri, Nicola - The race for an artificial general intelligence: implications for public policy - 2019-04-22 - https://link.springer.com/article/10.1007%2Fs00146-019-00887-x
Ngo, Richard - Technical AGI safety research outside AI - 2019-10-18 - https://forum.effectivealtruism.org/posts/2e9NDGiXt8PjjbTMC/technical-agi-safety-research-outside-ai
O'Keefe, Cullen - Stable Agreements in Turbulent Times: A Legal Toolkit for Constrained Temporal Decision Transmission - 2019-05-01 - https://www.fhi.ox.ac.uk/wp-content/uploads/Stable-Agreements.pdf
Uuk, Risto - AI Governance and the Policymaking Process: Key Considerations for Reducing AI Risk - 2019-05-08 - https://www.mdpi.com/2504-2289/3/2/26/pdf
Piper, Kelsey - The case for taking AI seriously as a threat to humanity - 2018-12-21 - https://www.vox.com/future-perfect/2018/12/21/18126576/ai-artificial-intelligence-machine-learning-safety-alignment
Quigley, Ellen - Universal Ownership in the Anthropocene - 2019-05-13 - https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3457205
Roy, Mati - AI Safety Open Problems - 2019-11-02 - https://docs.google.com/document/d/1J2fOOF-NYiPC0-J3ZGEfE0OhA-QcOInhlvWjr1fAsS0/edit
Russell, Stuart - Human Compatible
Dragan, Anca - Preferences Implicit in the State of the World - 2019-02-12 - https://arxiv.org/abs/1902.04198
Shulman, Carl - Person-affecting views may be dominated by possibilities of large future populations of necessary people - 2019-11-30 - http://reflectivedisequilibrium.blogspot.com/2019/11/person-affecting-views-may-be-dominated.html
Snyder-Beattie, Andrew - An upper bound for the background rate of human extinction - 2019-07-19 - https://arxiv.org/abs/1907.09273
Taylor, Jessica - Quantilizers: A Safer Alternative to Maximizers for Limited Optimization - 2016 - https://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12613
Taylor, Jessica - The AI Timelines Scam - 2019-07-11 - https://unstableontology.com/2019/07/11/the-ai-timelines-scam/
Kohli, Pushmeet - Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures - 2018-12-04 - https://arxiv.org/abs/1812.01647
USG - National Security Commission on Artificial Intelligence Interim Report - 2019-11-01 - https://drive.google.com/file/d/153OrxnuGEjsUvlxWsFYauslwNeCEkvUb/view
Walsh, Bryan - End Times: A Brief Guide to the End of the World - 2019-08-27 - https://smile.amazon.com/End-Times-Brief-Guide-World-ebook/dp/B07J52NW99/ref=tmm_kin_swatch_0?_encoding=UTF8&qid=&sr=
