C-044 | Value Alignment and AI Ethics | Confidence: Medium

The Hard Problem of AI Alignment: Value Forks in Moral Judgment

Markus Kneer & Juri Viehoff (2025)


One-Sentence Thesis

The type of agent making a decision, human or AI, influences moral judgments in complex trade-off situations: participants prefer that AI agents prioritize fairness over utility maximization.

Argument Outline

1. Introduction to the Hard Problem of AI Alignment
2. Background on value alignment, moral dilemmas, and public reflective equilibrium
3. Discussion of empirical findings on agent-type value forks in moral judgment
4. Analysis of the implications of these findings for AI value alignment

Key Distinctions

Human vs. AI agents in moral decision-making

Top-down vs. bottom-up approaches to value alignment

Consequentialist vs. non-consequentialist moral theories

Key Terms

Value alignment

The process of ensuring that AI systems' actions and decisions align with human values

Agent-type value forks

Differences in moral judgments based on the type of agent, human or AI, making the decision

Public reflective equilibrium

A method of value alignment that combines top-down and bottom-up approaches to achieve a more secure justificatory foundation
