Part 1.
Intelligence: ability to deal with difficult and novel problems
(adaptivity and creativity)
A.D. de Groot: pioneer of psychological methodology. Book: Het denken van den schaker ("The thinking of the chess player", published in English as Thought and Choice in Chess).
What is the chess player doing in his head when playing chess?
Herbert Simon (Nobel Prize winner) saw De Groot's dissertation as pioneering work and the start of modern
cognitive psychology.
Drosophila idea = chess as the "fruit fly" of AI: try to build an AI for chess in the hope that we can find out how chess
players think.
Initial attempts were unsuccessful
Then: brute force (very successful) = searching millions of positions with tree techniques to
find the best move; these programs do not learn. A large memory is needed and is used for storing many
positions. They use simple evaluations: count the pieces with weights (a sketch of such an evaluation follows below).
Humans use complex evaluations based on dynamic properties: open lines, safety of the king, etc.
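A minimal sketch of such a "count pieces with weights" evaluation; the 1/3/3/5/9 piece values are the conventional textbook weights and the position format is made up for illustration, not taken from the course:

```python
# Minimal sketch of a "count pieces with weights" evaluation.
# The weights (pawn=1, knight=3, bishop=3, rook=5, queen=9) are the
# conventional textbook values, assumed here for illustration.
PIECE_WEIGHTS = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def evaluate(position):
    """Return the material balance from White's point of view.

    `position` is assumed to be a list of (piece, color) tuples,
    e.g. [("Q", "white"), ("R", "black"), ...].
    """
    score = 0
    for piece, color in position:
        value = PIECE_WEIGHTS[piece]
        score += value if color == "white" else -value
    return score

# Example: White has a queen and a pawn, Black has a rook -> +5
print(evaluate([("Q", "white"), ("P", "white"), ("R", "black")]))
```

Note how crude this is compared to the human evaluation above: it sees only material, not open lines or king safety.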
Why is chess difficult?
1. Explosion of possibilities (novel positions)
2. What is a “good” position?
Easier problems:
The water jug problem: how do you get exactly 4 liters in the 5-liter jug? (A breadth-first search sketch follows below.)
The 24 problem: given 4 different numbers, make the number 24 out of them.
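A minimal breadth-first-search sketch for the water jug problem, assuming the classic 3-liter and 5-liter jugs (the jug sizes are an assumption here, the target of 4 liters is from the notes):

```python
from collections import deque

# Breadth-first search over jug states (a, b) = liters in the 3-l and 5-l jug.
# The 3- and 5-liter sizes are assumed (classic version of the puzzle).
CAP_A, CAP_B, TARGET = 3, 5, 4

def successors(a, b):
    yield (CAP_A, b)            # fill the small jug
    yield (a, CAP_B)            # fill the big jug
    yield (0, b)                # empty the small jug
    yield (a, 0)                # empty the big jug
    pour = min(a, CAP_B - b)    # pour small -> big
    yield (a - pour, b + pour)
    pour = min(b, CAP_A - a)    # pour big -> small
    yield (a + pour, b - pour)

def solve():
    start = (0, 0)
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (a, b), path = queue.popleft()
        if b == TARGET:
            return path
        for nxt in successors(a, b):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))

# Prints a shortest sequence of states ending with 4 liters in the 5-liter jug.
print(solve())
```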
What are difficult problems in AI?
- P (Polynomial time): simple problems for which the time to solve it does not
increase fast when the size of the problem increases.
- NP (Non-deterministic Polynomial time): HARD problems; the solution time
increases very quickly (exponential growth in time): chess, sudoku, the travelling
salesman problem. A proposed solution is easy and quick to verify, but no polynomial-time
algorithm is known for finding it (see the verification sketch after this list).
- If P = NP, there would be no gap between solving a problem and recognizing the
solution once it’s found. All the NP problems can then be solved deterministically in
polynomial time.
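A small illustration of "easy to verify, hard to find": checking whether a proposed travelling-salesman tour stays under a given length bound only takes a quick sum over its edges, while finding such a tour is the hard part. The city coordinates and the bound below are made up:

```python
from math import dist

# Verifying a candidate solution is quick: summing the edge lengths of a
# proposed tour takes time linear in the number of cities.
# The city coordinates and the length bound are made up for illustration.
cities = {"A": (0, 0), "B": (3, 0), "C": (3, 4), "D": (0, 4)}

def tour_length(tour):
    closed = tour + [tour[0]]                    # return to the starting city
    return sum(dist(cities[closed[i]], cities[closed[i + 1]])
               for i in range(len(closed) - 1))

proposed = ["A", "B", "C", "D"]
print(tour_length(proposed) <= 14.0)   # True: a quick polynomial-time check
```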
Travelling Salesman Problem
- Discover the shortest route for visiting a group of cities
- Greedy Algorithm: always choose the next closest city
- 2-opt swap: select 2 edges of the route and reconnect them the other way round to get a more
efficient outcome than the greedy algorithm gives. But which edges should be swapped? This question
matters mainly when many cities are involved.
- Local search: start with a random route, then apply 2-opt swaps to the generated route; if
this results in a shorter route -> keep it. Continue until it cannot be improved. BUT:
greedy selections early on lead to getting stuck in a non-optimal configuration = local
minimum. A solution to this is:
- Simulated annealing = attaching a temperature to the search; it starts "hot" and
easily accepts degrading solutions. As the temperature drops, the probability of
accepting bad moves decreases. (A sketch combining the greedy construction, 2-opt local search and simulated annealing follows below.)
How do humans solve these problems? (And how can we study that?)
- Error analysis
- Eye movements
- Thinking-aloud protocols. Advantages: less disturbance of the task flow compared to
asking questions, the observer does not need knowledge of the material, and the results are
considered more convincing.
Part 2.
Tree algorithms:
1. Set up a search tree
2. Search the tree to a certain depth
3. Decide
Minimax algorithm in chess = e.g., Black tries to minimize the evaluation, White tries to maximize it. The depth
is chosen beforehand. The algorithm can look as far ahead as the given depth to determine what
would be the most lucrative outcome. In chess, its logic runs from the bottom of the tree to the top (the top being the
move to make for that color). (A minimal minimax sketch follows below.)
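A minimal minimax sketch on a hand-built game tree rather than real chess positions; the tree and its leaf values are made up:

```python
# Minimal minimax sketch on a hand-built game tree (not a real chess engine).
# A leaf is a number (the evaluation of a position at the chosen depth);
# an inner node is a list of child nodes. Values are made up for illustration.
def minimax(node, maximizing):
    if isinstance(node, (int, float)):      # leaf: the search depth is reached
        return node
    child_values = [minimax(child, not maximizing) for child in node]
    return max(child_values) if maximizing else min(child_values)

# White (maximizing) moves at the root; Black (minimizing) answers one level down.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree, True))   # 3: White picks the branch whose worst case is best
```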
Alpha-beta pruning: deals with the explosion of options by stopping the evaluation of a move as soon
as it is shown to be worse than a previously examined one (see the sketch below).
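The same sketch extended with alpha-beta pruning: a branch is abandoned as soon as it is proven worse than an option already found.

```python
import math

# Minimax with alpha-beta pruning on the same hand-built tree as above.
def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:        # the opponent will never allow this branch
                break                # -> prune the remaining children
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

tree = [[3, 5], [2, 9], [0, 7]]
print(alphabeta(tree, True))   # 3, but with fewer positions examined than plain minimax
```

The result is the same as plain minimax; pruning only saves work, it never changes the chosen move.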
Old-school chess programs use the minimax algorithm.
Expert systems: an attempt to use AI to replace experts (such as doctors); rules of the form "if X, then Y".
LEARNING AI (deep learning neural networks, like the brain)
Reinforcement learning in AI: like operant conditioning (Skinner). The balance between
exploration and exploitation is very important (a sketch of this balance follows below).
Versus deep learning: deep learning learns from a training dataset and then tries to
apply this to a new dataset. Reinforcement learning is dynamic and adjusts its actions based on
continuous feedback; there is no training dataset, just a task that can be perfected.
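One common way to balance exploration and exploitation is epsilon-greedy action selection. This is a toy sketch on a 3-armed bandit, not necessarily the course's method; the reward probabilities and the epsilon value are made up:

```python
import random

# Epsilon-greedy exploration/exploitation on a toy 3-armed bandit.
# Reward probabilities and epsilon are made up for illustration.
random.seed(1)
true_reward_prob = [0.2, 0.5, 0.8]          # unknown to the learner
q_estimates = [0.0, 0.0, 0.0]               # learned value of each action
counts = [0, 0, 0]
epsilon = 0.1                                # fraction of exploratory moves

for step in range(5000):
    if random.random() < epsilon:
        action = random.randrange(3)                     # explore: try anything
    else:
        action = q_estimates.index(max(q_estimates))     # exploit: best so far
    reward = 1.0 if random.random() < true_reward_prob[action] else 0.0
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    q_estimates[action] += (reward - q_estimates[action]) / counts[action]

print([round(q, 2) for q in q_estimates])   # estimates move toward 0.2, 0.5, 0.8
```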
Sparse reward setting = only the moment right before the reward actually gives feedback; the
many steps needed to get there make it very difficult for the reinforcement-learning AI to learn.
A solution to the sparse reward problem is "reward shaping" = extra rewards that guide the AI
in the direction it needs to go (a toy sketch follows below). However:
- Alignment problem: the AI gets stuck simply collecting the most reward, not
pursuing the actual goal and performing the intended behavior.
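A toy sketch of reward shaping on a corridor task; the task, goal position and bonus size are illustrative assumptions, not from the course:

```python
# Toy corridor task: the agent moves along positions 0..GOAL and only the
# goal position gives a real reward (sparse). The shaping bonus for moving
# closer to the goal illustrates the idea; the numbers are made up.
GOAL = 10

def sparse_reward(new_pos):
    return 1.0 if new_pos == GOAL else 0.0          # feedback only at the goal

def shaped_reward(old_pos, new_pos):
    guidance = 0.01 * (abs(GOAL - old_pos) - abs(GOAL - new_pos))
    return sparse_reward(new_pos) + guidance        # small nudge toward the goal

print(sparse_reward(3))        # 0.0 -> no signal, hard to learn from
print(shaped_reward(3, 4))     # 0.01 -> moving closer is already rewarded
# Alignment risk: a badly designed bonus can teach the agent to collect the
# shaping reward instead of actually reaching the goal.
```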
Modern state-of-the-art reinforcement learning techniques can beat humans in chess, video
games, poker, etc.
If the reward is too far away, the AI will not be able to find its way to it (too many steps
for random trying).
Brute force in chess was great because it worked and beat a human, but disappointing
because it did not show us how humans play chess.