100% tevredenheidsgarantie Direct beschikbaar na betaling Zowel online als in PDF Je zit nergens aan vast
logo-home
Summary Overview of exam goals of the Multi-Agent Systems course (XM_0052) €5,99
In winkelwagen

Samenvatting

Summary Overview of exam goals of the Multi-Agent Systems course (XM_0052)

 137 keer bekeken  5 keer verkocht

A summary for the Multi-Agent Systems course given in the Artificial Intelligence master at the VU. Overview of exam goals of the Multi-Agent Systems course (XM_0052)

Voorbeeld 2 van de 8  pagina's

  • 12 april 2021
  • 8
  • 2020/2021
  • Samenvatting
Alle documenten voor dit vak (3)
avatar-seller
timdeboer
Multi Agent Systems Course VU
Exam Goals
April 12, 2021


1 Game Theory
1.1 Pay-off matrix, BR, Pareto, dominated strategies and IESDS
• We have a Pay-off matrix to determine payoff of simultaneous moves of 2 players. This can be put as
normal-form game which is a tuple of (N,A,u) with N being set of n players, A being actions (each
elements in A called action profile) and utility function (payoff u for each agent depends on the joint
actions of all agents).
• The best response in a pay-off game is the action agent i can choose to get the highest increase in
utility when the action of agent j is known. For a continuous state space, we find the maximum utility
for agent i by taking the partial derivative with respect to action of agent i and setting the derivative
to zero:
ui (x, y) = 2(x + y + bxy) − x2
Where y is the known action of other agent
ui
= 2(1 + by) − 2x = 0
x
xopt = 1 + by

• Pareto optimality is a solution property. A strategy a is Pareto dominated by another strategy a0
if the utility for a0 is higher or equal than for a with at least 1 instance which is strictly higher. A
Pareto dominant strategy is the highest mutual pay-off (but not necessarily the highest social welfare);
deviating from this strategy could increase utility for player 1 but will then decrease for player 2.
• Strictly vs weak dominance: if a set of strategies I can chose are all dominant over another set of
strategies then the first set is strictly dominant. Weakly dominant is for some point equal and other
points dominant.
• An elimination strategy we can use is IESDS: we look at what not to do; if a row or column is strictly
dominant over other row, we delete other row. We can do this recursively. By the nature of first looking
at a dominated column, than row, than column etc, we could end up with an equilibrium which does
not have to imply Pareto optimality, since the Pareto optimal strategy could be deleted.

1.2 Nash equilibrium, Vickrey auction
• A Nash equilibrium is a solution concept based on conditions instead of an algorithm. Nash can be
strict(>) or weak (≥); Nash supposes players have no regrets even though they know what other player
will do; best response to all other agents. This may not be the response with highest welfare (think
of prisoner’s dilemma). Pure Nash can be strict or weak; mixed Nash is always weak. For mixed NE,
agent makes opponent indifferent (for matrix games only). There alwyas exist a NE; but this NE may
be a mixed Nash.
• Computation of Mixed NE: assign p and (1-p) to choices of player 1, assign q and (1-q) to player 2.
Calculate utility via the pay-off matrix. For example with Battle of the Sexes: player 1 wants to choose
p so that player 2 is indifferent (which is the definition of mixed Nash), see fig. 1 for the calculation.


1

, Figure 1: Calculating utility for choosing mixed Nash equilibrium. Utility is 2/3, which is lower than choosing
the same movie; so mixed Nash is not always optimal.


• However, a mixed Nash does not always exists; for example in the Prisoner’s dilemma, the probability
of p and q end up to be higher than 1; this is not possible. Since for every game, a NE must exists,
this implies that the Prisoner’s dilemma has a Pure NE.
• Vickrey auction: n players do a bid for an item; the highest bid will get the item and only has to
pay the second-highest price. This is interesting since no players will have regrets afterwards and the
winner will have positive utility.

1.3 Repeated games, ultimatum game, extensive form, backward induction,
SPNE
• Repeated games: One-state games, states do not change (rock-paper-scissors). Maximize total reward
of finite known number of repetitions. Infinite number of repetitions; maximize discounted total rewards
(future rewards less valuable then present rewards):
inf
X
R= δ t−1 · rt
t=1

Using mathematical derivation, we can rewrite expected R to:
1
E(R) = (r1 + δ · r2 + δ 2 · r3 ...)
1−δ
Xinf
E(R) = δ t−1 · rt
t=1


• Repeated Prisoner’s Dilemma has an interesting strategy: Grim Trigger strategy.
Start by cooperating (best social welfare) until someone defects; then always defect.
If we look at table 1, we can derive the utility for always cooperating and always defecting:

Coop Defect
C 3,3 4,1
D 1,4 2,2

Table 1: Form of Prisoner’s dilemma


2

Voordelen van het kopen van samenvattingen bij Stuvia op een rij:

Verzekerd van kwaliteit door reviews

Verzekerd van kwaliteit door reviews

Stuvia-klanten hebben meer dan 700.000 samenvattingen beoordeeld. Zo weet je zeker dat je de beste documenten koopt!

Snel en makkelijk kopen

Snel en makkelijk kopen

Je betaalt supersnel en eenmalig met iDeal, creditcard of Stuvia-tegoed voor de samenvatting. Zonder lidmaatschap.

Focus op de essentie

Focus op de essentie

Samenvattingen worden geschreven voor en door anderen. Daarom zijn de samenvattingen altijd betrouwbaar en actueel. Zo kom je snel tot de kern!

Veelgestelde vragen

Wat krijg ik als ik dit document koop?

Je krijgt een PDF, die direct beschikbaar is na je aankoop. Het gekochte document is altijd, overal en oneindig toegankelijk via je profiel.

Tevredenheidsgarantie: hoe werkt dat?

Onze tevredenheidsgarantie zorgt ervoor dat je altijd een studiedocument vindt dat goed bij je past. Je vult een formulier in en onze klantenservice regelt de rest.

Van wie koop ik deze samenvatting?

Stuvia is een marktplaats, je koop dit document dus niet van ons, maar van verkoper timdeboer. Stuvia faciliteert de betaling aan de verkoper.

Zit ik meteen vast aan een abonnement?

Nee, je koopt alleen deze samenvatting voor €5,99. Je zit daarna nergens aan vast.

Is Stuvia te vertrouwen?

4,6 sterren op Google & Trustpilot (+1000 reviews)

Afgelopen 30 dagen zijn er 50064 samenvattingen verkocht

Opgericht in 2010, al 14 jaar dé plek om samenvattingen te kopen

Start met verkopen
€5,99  5x  verkocht
  • (0)
In winkelwagen
Toegevoegd