Rahulsantra
On this page, you find all documents, package deals, and flashcards offered by seller rahulsantra.
1 item
y Kind Machines
Artificial Intelligence systems are rapidly evolving, integrating extrinsic and intrinsic motivations. While these frameworks offer benefits, they risk misalignment at the algorithmic level while appearing superficially aligned with human values. In this paper, we argue that an intrinsic motivation for kindness is crucial for making sure these models are intrinsically aligned with human values. We argue that kindness, defined as a form of altruism motivated to maximize the reward of others,...
- Thesis
- 8 pages