• Ep 10 - Accelerated training to become an AI safety researcher w/ Ryan Kidd (Co-Director, MATS)

  • Nov 8 2023
  • Duration: 1 hr 17 min
  • Podcast

  • Summary

  • We speak with Ryan Kidd, Co-Director of the ML Alignment & Theory Scholars (MATS) program, previously known as "SERI MATS".

    MATS (https://www.matsprogram.org/) provides research mentorship, technical seminars, and connections to help new AI researchers get established and start producing impactful research towards AI safety & alignment.

    Prior to MATS, Ryan completed a PhD in Physics at the University of Queensland (UQ) in Australia.

    We talk about:

    * What the MATS program is
    * Who should apply to MATS (next *deadline*: Nov 17 midnight PT)
    * Research directions being explored by MATS mentors, now and in the past
    * Promising alignment research directions & ecosystem gaps, in Ryan's view

    Hosted by Soroush Pour. Follow me for more AGI content:
    * Twitter: https://twitter.com/soroushjp
    * LinkedIn: https://www.linkedin.com/in/soroushjp/

    == Show links ==

    -- About Ryan --

    * Twitter: https://twitter.com/ryan_kidd44
    * LinkedIn: https://www.linkedin.com/in/ryan-kidd-1b0574a3/
    * MATS: https://www.matsprogram.org/
    * LISA: https://www.safeai.org.uk/
    * Manifold: https://manifold.markets/

    -- Further resources --

    * Book: “The Precipice” - https://theprecipice.com/
    * Ikigai - https://en.wikipedia.org/wiki/Ikigai
    * Fermi paradox - https://en.wikipedia.org/wiki/Fermi_p...
    * Ajeya Cotra - Bio Anchors - https://www.cold-takes.com/forecastin...
    * Chomsky hierarchy & LLM transformers paper + external memory - https://en.wikipedia.org/wiki/Chomsky...
    * AutoGPT - https://en.wikipedia.org/wiki/Auto-GPT
    * BabyAGI - https://github.com/yoheinakajima/babyagi
    * Unilateralist's curse - https://forum.effectivealtruism.org/t...
    * Jeffrey Ladish & team - fine-tuning to remove LLM safeguards - https://www.alignmentforum.org/posts/...
    * Epoch AI trends - https://epochai.org/trends
    * The demon "Moloch" - https://slatestarcodex.com/2014/07/30...
    * AI safety fundamentals course - https://aisafetyfundamentals.com/
    * Anthropic sycophancy paper - https://www.anthropic.com/index/towar...
    * Promising technical alignment research directions
      * Scalable oversight
        * Recursive reward modelling - https://deepmindsafetyresearch.medium...
        * RLHF - may work for a while, but is unlikely to hold up indefinitely as models scale
      * Interpretability
        * Mechanistic interpretability
          * Paper: GPT-4 labelling GPT-2 neurons - https://openai.com/research/language-...
        * Concept-based interpretability
          * ROME paper - https://rome.baulab.info/
        * Developmental interpretability
          * devinterp.com - http://devinterp.com
          * Timaeus - https://timaeus.co/
        * Internal consistency
          * Colin Burns' research - https://arxiv.org/abs/2212.03827
      * Threat modelling / capabilities evaluation & demos
        * Paper: Can large language models democratize access to dual-use biotechnology? - https://arxiv.org/abs/2306.03809
        * ARC Evals - https://evals.alignment.org/
        * Palisade Research - https://palisaderesearch.org/
        * Paper: Situational awareness (with Owain Evans) - https://arxiv.org/abs/2309.00667
        * Gradient hacking - https://www.lesswrong.com/posts/uXH4r6MmKPedk8rMA/gradient-hacking
    * Past scholars' work
      * Apollo Research - https://www.apolloresearch.ai/
      * Leap Labs - https://www.leap-labs.com/
      * Timaeus - https://timaeus.co/
    * Other orgs mentioned
      * Redwood Research - https://redwoodresearch.org/

    Recorded Oct 25, 2023
