Industry Self-Flagging and the Insufficiency Critique of Alignment
DOI:
https://doi.org/10.55613/jeet.v36i2.251
Keywords:
AI alignment, Magnifica Humanitas, Catholic social teaching, AI safety, frontier AI, value sensitive design
Abstract
Pope Leo XIV’s encyclical Magnifica Humanitas (2026) advances what I call the insufficiency critique of alignment: the claim that aligning AI systems to a privately determined set of values is structurally insufficient, however well the alignment is executed, because the values themselves are decided outside the public deliberative process. This editorial argues that the insufficiency critique, often heard as theological externalism, has been independently and substantively articulated in a corpus of papers published by frontier AI labs and their affiliated research bodies during 2025-2026. I catalogue five such papers from Apple, Microsoft AI, and Anthropic, identify the methodological pattern they share, and read each as a structural finding about the limits of alignment as currently practiced. The convergence between magisterial framing and industry self-flagging is striking and citable. Three implications follow. First, the standard dismissal of insufficiency arguments as outside-the-tent commentary on a technical practice is harder to sustain when the labs themselves are publishing the same diagnosis. Second, alignment work remains necessary, but the framework needs to evolve to absorb the insufficiency critique. Third, several near-term moves, including value-sensitive design and public deliberative infrastructure, follow directly from taking the convergence seriously.
References
1. Bariach, B., Schoenegger, P., Bhaskar, M., & Suleyman, M. (2026). Seemingly Conscious AI Risks. SSRN Working Paper 6588659. Microsoft AI. https://ssrn.com/abstract=6588659
2. Butlin, P., Long, R., Elmoznino, E., Bengio, Y., Birch, J., Constant, A., Deane, G., Fleming, S. M., Frith, C., Ji, X., Kanai, R., Klein, C., Lindsay, G., Michel, M., Mudrik, L., Peters, M. A. K., Schwitzgebel, E., Simon, J., & VanRullen, R. (2023). Consciousness in Artificial Intelligence: Insights from the Science of Consciousness. arXiv:2308.08708v3. https://doi.org/10.48550/arXiv.2308.08708
3. Cloud, A., Le, M., Chua, J., Betley, J., Sztyber-Betley, A., Hilton, J., Marks, S., & Evans, O. (2025). Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data. arXiv:2507.14805v1. https://doi.org/10.48550/arXiv.2507.14805
4. Favaro, M., & Clark, J. (2026, June 4). When AI Builds Itself. The Anthropic Institute. https://www.anthropic.com/institute/recursive-self-improvement
5. Friedman, B., & Hendry, D. G. (2019). Value Sensitive Design: Shaping Technology with Moral Imagination. MIT Press.
6. Hicks, M. T., Humphries, J., & Slater, J. (2024). ChatGPT is bullshit. Ethics and Information Technology, 26(2), 38. https://doi.org/10.1007/s10676-024-09775-5
7. Humphries, J., Hicks, M. T., & Slater, J. (2026). LLMs bullshit by design: A reply to Licon. Philosophy & Technology, 39(2), 98. https://doi.org/10.1007/s13347-025-01016-x
8. Kazemi, H., Chegini, A., & Safi, M. (2026). A Single Neuron Is Sufficient to Bypass Safety Alignment in Large Language Models. arXiv:2605.08513. https://doi.org/10.48550/arXiv.2605.08513
9. Leo XIII. (1891). Rerum Novarum: Encyclical Letter on Capital and Labor. Vatican: Libreria Editrice Vaticana. https://www.vatican.va/content/leo-xiii/en/encyclicals/documents/hf_l-xiii_enc_15051891_rerum-novarum.html
10. Leo XIV. (2026). Magnifica Humanitas: Encyclical Letter on Safeguarding the Human Person in the Time of Artificial Intelligence. Vatican: Libreria Editrice Vaticana. https://www.vatican.va/content/leo-xiv/en/encyclicals/documents/20260515-magnifica-humanitas.html
11. Schoene, A. M., & Canca, C. (2025). ‘For Argument’s Sake, Show Me How to Harm Myself!’: Jailbreaking LLMs in Suicide and Self-Harm Contexts. arXiv:2507.02990. https://doi.org/10.48550/arXiv.2507.02990
12. Shojaee, P., Mirzadeh, I., Alizadeh, K., Horton, M., Bengio, S., & Farajtabar, M. (2025). The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity. Apple Machine Intelligence Research. arXiv:2506.06941. https://doi.org/10.48550/arXiv.2506.06941
13. Umbrello, S. (2024). Bernard Lonergan and a Nouvelle théologie for Artificial Intelligence. The Lonergan Review, 14, 13–44. https://doi.org/10.5840/lonerganreview2024/2025142
14. Umbrello, S., & van de Poel, I. (2021). Mapping value sensitive design onto AI for social good principles. AI and Ethics, 1(3), 283–296. https://doi.org/10.1007/s43681-021-00038-3
License
Copyright (c) 2026 Steven Umbrello

This work is licensed under a Creative Commons Attribution 4.0 International License.