Part 1.
Intelligence: ability to deal with difficult and novel problems
(adaptivity and creativity)
A.D. de Groot: founder of psychological methodology in the Netherlands. Book: "Het denken van den schaker" ("Thought and Choice in Chess").
What is the chess player doing in his head when playing chess?
Herbert Simon (Nobel Prize winner) saw De Groot's dissertation as pioneering work and a
starting point of modern cognitive psychology.
Drosophila idea = chess as the "model organism" of AI: try to build an AI for chess in the hope
that we can find out how chess players think.
Initial attempts were unsuccessful
Then: Brute Force (very successful) = searching millions of positions with tree techniques to
find the best move; the program does not learn. A big memory is necessary and is used for
storing many positions. It uses simple evaluations: count pieces with weights.
Humans use complex evaluations: dynamic properties such as open lines, safety of the king, etc.
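A minimal sketch of such a "count pieces with weights" evaluation (the piece values and the dict-based board representation are my own conventional choices, not from the lecture):

PIECE_WEIGHTS = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}  # king not counted

def material_evaluation(position):
    """Positive scores favour White, negative scores favour Black."""
    score = 0
    for piece in position.values():
        weight = PIECE_WEIGHTS.get(piece.upper(), 0)
        # Assumed convention: uppercase = White piece, lowercase = Black piece.
        score += weight if piece.isupper() else -weight
    return score

# Example: White is a rook up against a pawn.
print(material_evaluation({"e1": "K", "a1": "R", "e8": "k", "b7": "p"}))  # 5 - 1 = 4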
Why is chess difficult?
1. Explosion of possibilities (novel positions)
2. What is a “good” position?
Easier problems:
The water jug problem: "how to get exactly 4 liters in the 5-liter bottle?" (see the search sketch after this list)
24 problem: 4 different numbers, make the number 24 out of them.
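A minimal breadth-first-search sketch of the water jug problem. The lecture only mentions the 5-liter bottle; this sketch assumes the classic second jug of 3 liters. States are (amount in the 5L jug, amount in the 3L jug):

from collections import deque

def solve_water_jug(target=4, cap_a=5, cap_b=3):
    start = (0, 0)
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (a, b), path = queue.popleft()
        if a == target:                            # exactly 4 liters in the 5-liter jug
            return path
        pour_ab = min(a, cap_b - b)                # how much can flow from A into B
        pour_ba = min(b, cap_a - a)
        for nxt in [(cap_a, b), (a, cap_b),        # fill one jug
                    (0, b), (a, 0),                # empty one jug
                    (a - pour_ab, b + pour_ab),    # pour A -> B
                    (a + pour_ba, b - pour_ba)]:   # pour B -> A
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None

print(solve_water_jug())  # e.g. [(0,0), (5,0), (2,3), (2,0), (0,2), (5,2), (4,3)]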
What are difficult problems in AI?
- P (Polynomial time): simple problems for which the time to solve them does not
increase fast as the size of the problem increases.
- NP (Non-deterministic Polynomial time): contains the HARD problems; a proposed
solution is easy and quick to verify, but no fast way of finding one is known, so the
solution time of the known algorithms grows very quickly (exponentially) with the size
of the problem. Examples: sudoku and the travelling salesman problem; games like
chess are at least as hard.
- If P = NP, there would be no gap between solving a problem and verifying the
solution once it is found. All NP problems could then be solved deterministically in
polynomial time.
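To illustrate the "quick to verify, hard to find" asymmetry, a small sketch (my own illustration): checking that a proposed travelling-salesman tour stays under a given length bound only needs one pass over the cities, whereas finding the best tour has no known fast algorithm.

import math

def tour_length(cities, tour):
    """cities: dict name -> (x, y); tour: list of city names, visited in order."""
    legs = zip(tour, tour[1:] + tour[:1])       # close the loop back to the start
    return sum(math.dist(cities[a], cities[b]) for a, b in legs)

def verify_tour(cities, tour, bound):
    # Verifies the claim "there is a tour of length <= bound" in polynomial time.
    return set(tour) == set(cities) and tour_length(cities, tour) <= bound

cities = {"A": (0, 0), "B": (0, 3), "C": (4, 3), "D": (4, 0)}
print(verify_tour(cities, ["A", "B", "C", "D"], bound=15))  # True: 3+4+3+4 = 14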
Travelling Salesman Problem
- Discover the shortest route for visiting a group of cities
- Greedy Algorithm: always choose the closest not-yet-visited city as the next stop.
- 2-opt Swap: select 2 edges of the route and reconnect them the other way around to
improve the outcome of the greedy algorithm. But which edges should be swapped? That
question matters mainly when there are many cities involved.
- Local Search: start with a random route, then apply 2-opt swaps to the generated route;
if a swap results in a shorter route, keep it. Continue until the route cannot be improved.
BUT: greedy choices early on can lead to getting stuck in a non-optimal configuration =
Local Minimum. A solution to this is:
- Simulated Annealing = attaching a temperature to the search; it starts "hot" and
easily accepts worsening solutions. As the temperature drops, the probability of
accepting bad moves decreases. (See the sketch after this list.)
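A compact sketch of the three ideas above: a greedy nearest-neighbour start, 2-opt swaps as local search, and a simulated-annealing acceptance rule to escape local minima (the coordinates and the temperature schedule are my own illustrative choices):

import math, random

def length(tour, pts):
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def greedy_tour(pts):
    # Always move to the closest not-yet-visited city.
    unvisited = list(range(1, len(pts)))
    tour = [0]
    while unvisited:
        nxt = min(unvisited, key=lambda c: math.dist(pts[tour[-1]], pts[c]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def two_opt_swap(tour, i, k):
    # Reverse the segment between positions i and k: this reconnects two edges.
    return tour[:i] + tour[i:k + 1][::-1] + tour[k + 1:]

def anneal(pts, start_temp=10.0, cooling=0.999, steps=20000):
    tour = greedy_tour(pts)
    temp = start_temp
    for _ in range(steps):
        i, k = sorted(random.sample(range(1, len(tour)), 2))
        candidate = two_opt_swap(tour, i, k)
        delta = length(candidate, pts) - length(tour, pts)
        # Always accept improvements; accept worsening moves with probability
        # exp(-delta / temp), which shrinks as the temperature drops.
        if delta < 0 or random.random() < math.exp(-delta / temp):
            tour = candidate
        temp *= cooling
    return tour

points = [(random.random() * 100, random.random() * 100) for _ in range(30)]
print(round(length(anneal(points), points), 1))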
How do humans solve these problems? (And how can we study that?)
- Error analysis
- Eye movements
- Thinking-aloud protocols. Advantages: less disturbance of the task flow compared to
asking questions; the observer does not need knowledge of the material; it is considered
more convincing.
Part 2.
Tree algorithms:
1. Set up a search tree
2. Search the tree to a certain depth
3. Decide
Minimax Algorithm in chess = e.g., Black tries to minimize the evaluation, White tries to
maximize it. The depth is chosen beforehand, and the algorithm looks ahead as far as that
depth to determine the most lucrative outcome. In chess its logic works from the bottom of
the tree to the top (the top being the move to make for the colour whose turn it is).
Alpha-Beta pruning: deals with the explosion of options by stopping the evaluation of a
move as soon as it turns out to be worse than a previously examined one.
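A minimal minimax-with-alpha-beta sketch over a toy game tree (my own illustration). Leaves hold evaluation scores; inner nodes are lists of child subtrees. In a real chess program the leaves would come from an evaluation function (e.g. the weighted piece count from Part 1) and the children from the legal moves:

import math

def alphabeta(node, depth, alpha=-math.inf, beta=math.inf, maximizing=True):
    if depth == 0 or not isinstance(node, list):     # leaf: return its evaluation
        return node
    if maximizing:                                   # e.g. White picks the highest value
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:      # prune: the opponent will never allow this branch
                break
        return value
    else:                                            # e.g. Black picks the lowest value
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

# White to move, looking 2 plies deep; Black answers each move as well as it can.
tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(tree, depth=2))  # 3: the best outcome White can guarantee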
Old-school chess programs use the Minimax algorithm.
Expert systems: an attempt to use AI to replace experts (such as doctors) with rules of the form "if X, then Y".
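A tiny sketch of the "if X, then Y" idea (the rules below are made-up toy examples, not real medical knowledge):

rules = [
    ({"fever", "cough"}, "possible flu"),
    ({"sneezing", "itchy eyes"}, "possible hay fever"),
]

def diagnose(symptoms):
    # Fire every rule whose conditions (X) are all present and collect its conclusion (Y).
    return [conclusion for conditions, conclusion in rules
            if conditions <= set(symptoms)]

print(diagnose(["fever", "cough", "headache"]))  # ['possible flu']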
LEARNING AI (deep learning with neural networks, inspired by the brain)
Reinforcement learning in AI: like operant conditioning (Skinner). The balance between
exploration and exploitation is very important.
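A minimal sketch of that exploration/exploitation balance with an epsilon-greedy rule on a toy multi-armed bandit (the reward probabilities are made up for illustration):

import random

true_reward_prob = [0.2, 0.5, 0.8]               # unknown to the learner
estimates = [0.0, 0.0, 0.0]
counts = [0, 0, 0]
epsilon = 0.1                                    # 10% of the time: explore

for step in range(10000):
    if random.random() < epsilon:
        action = random.randrange(3)             # explore: try a random arm
    else:
        action = estimates.index(max(estimates)) # exploit: best arm found so far
    reward = 1 if random.random() < true_reward_prob[action] else 0
    counts[action] += 1
    # Incremental average: the estimate moves toward the observed rewards.
    estimates[action] += (reward - estimates[action]) / counts[action]

print([round(e, 2) for e in estimates])          # roughly [0.2, 0.5, 0.8]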
Versus Deep Learning: Deep Learning learns from a training dataset and then tries to apply
what it learned to new data. Reinforcement Learning is dynamic and adjusts its actions based
on continuous feedback; there is no training dataset, just a task that can be perfected.
Sparse reward setting = only the moment right before the reward actually matters; the many
steps necessary to get there make the task very difficult for the Reinforcement Learning AI to
figure out. Solution for the sparse reward problem: "reward shaping" = extra rewards that
guide the AI in the direction of where it needs to go (see the sketch after this list). However:
- Alignment Problem: the AI gets stuck on simply collecting the most rewards, instead of
pursuing the actual goal and performing the intended behaviour.
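A small sketch of reward shaping on a hypothetical 1-D corridor task (the goal position and the shaping scale are my own illustrative choices): the agent starts at position 0 and only gets the "real" reward at the goal, so the shaped bonus gives it a direction to follow.

GOAL = 20

def sparse_reward(new_position):
    return 1.0 if new_position == GOAL else 0.0       # almost always zero

def shaped_reward(position, new_position):
    # Extra reward for moving closer to the goal; the scale 0.01 is arbitrary.
    guidance = 0.01 * (abs(GOAL - position) - abs(GOAL - new_position))
    return sparse_reward(new_position) + guidance

print(sparse_reward(6))        # 0.0: no signal at all far from the goal
print(shaped_reward(5, 6))     # 0.01: a small gradient the agent can follow

A badly chosen shaping reward is exactly where the alignment problem shows up: the agent can learn to collect the extra rewards without ever reaching the intended goal.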
Modern state-of-the-art reinforcement learning techniques can beat humans in chess, video
games, poker, etc.
If reward is too far away, the AI will not be able to find its way to the reward (too many steps
for random trying).
Brute force in chess was great because it worked and beat a human. But disappointing
because it did not show us how humans play chess.