Knowledge Representation
Blok 2 | 2022-2023
,Inhoud
_________________________________________________________________________
Summary 1
Boolean Satisfiability Problem 1
Boolean Constraint Propagation 2
Davis-Putnam-Logemann-Loveland Algorithm 4
Conflict Driven Clause Learning 4
Clause Learning 4
Book: SAT 5
1. Introduction to KR, the Course and "Logic Engineering" 5
1. Artificial Intelligence 5
2. Knowledge representation 6
3. Logic Engineering 6
2. Propositional Logic 9
1. Syntax 9
2. Semantics 10
3. Propositionals 11
4. Propositional logic to solve sudokus 11
3. Propositional Logic, SAT-Solving, DPLL 12
1. CNF 12
2. DPLL 13
3. Heuristics for splitting 16
4. The first project (SAT for Sudoku) 17
4. Decidability, Hard & Incomplete Methods 17
1. Soundness and completeness of DPLL 17
2. Entailment 19
3. MaxSAT 20
4. What is a hard SAT-problem? 21
5. Incomplete SAT-solving 22
5. Clause Learning, Planning & Applications 23
1. Conflict-driven clause learning 23
2. SAT-Planning 24
3. Applications of SAT 24
Book: Bayesian Networks 24
Syntax & Semantics 24
6. Bayesian Networks: Foundations 26
1. Introduction 26
2. Belief Dynamics 28
3. Independence 30
4. Further properties of belief 31
7. Bayesian Networks 33
1. Directed Acyclic Graphs (DAG) 33
2. Bayesian Networks 33
3. Independence in DAGs 35
4. d-Separation 37
1
,8. Bayesian Networks: Tools for Inference 40
1. Factors 40
2. Summing out (or Marginalization of) factors 40
3. Multiplication of factors 40
4. Variable elimination 41
5. Elimination order 42
6. Interaction graphs 43
7. Graph pruning 46
9. Bayesian Networks: Exact Inference in Bayesian Networks 48
1. Computing Posterior Marginals 48
2. Maximizing-out factors 49
3. Most Likely Instantiations 49
4. Extended factors 51
5. Maximum A-Posteriori 51
6. Most Probable Explanation 51
7. Computation 51
10. Knowledge Graphs & Ontology Languages 51
1. Inference rules 52
2. Ontologies & Knowledge Graphs 54
3. Syntax of RDF 56
4. Semantics of RDF 56
5. Calculus of RDF 56
6. Real-life Ontologies 56
7. Logic for Ontologies 56
8. Restrictions for Ontologies 56
9. Properties of Properties 57
Book: Description Logic 58
11. Description Logics 1 59
1. Engineering an Ontology Logic 59
2. Syntax of Description Logic (Concepts) 60
3. Syntax of Description Logic (Knowledge Bases) 63
11. Description Logics 2 64
1. Semantics of concepts and roles 64
2. Semantics of Knowledge bases 66
3. A word on model theory 68
4. Reduction of reasoning tasks 69
2
,Summary
_________________________________________________________________________
Examples: https://cse442-17f.github.io/Conflict-Driven-Clause-Learning/
Boolean Satisfiability Problem
The Boolean Satisfiability Problem is one in which we have to decide if there exists an assignment for
some boolean variables that would make a boolean formula evaluate to true.
● If a formula has an assignment under which the formula evaluates to True, then it is
satisfiable.
● If no such assignment exists for a formula, then it is unsatisfiable.
In general, the Boolean Satisfiability Problem can take a very long time to solve (in fact, it’s
NP-Complete).
● If a boolean formula has n variables, then there are 2^n possible assignments. All known
algorithms, in the worst case, will have to do a full search on that exponentially sized search
space.
● However, in real world problems, there are often logical structures in the problem that an
algorithm can utilize to search better.
Two algorithms that try to find a satisfying assignment for a formula (SAT solvers):
● Davis-Putnam-Logemann-Loveland Algorithm, which forms the basis for modern SAT
solving algorithms today.
● Conflict-Driven Clause Learning Algorithm, which improves on the first algorithm.
Both algorithms will only work on boolean formulas that are in Conjunctive Normal Form (CNF).
● In this form:
○ Boolean formulas are composed of the ANDs (∧) of clauses
○ Clauses are the ORs (v) of literals
○ Literals are a variable or its negation
Formulas in CNF have some nice properties that make reasoning quick.
● Since the formula is just the clauses ANDed together, for our formula to be satisfied every
clause must be satisfied.
● Since each clause is made of the ORs of literals, only one literal needs to be satisfied for the
whole clause to be satisfied.
Boolean Constraint Propagation
When an algorithm searches for a satisfying assignment for a CNF formula, one optimization is to look
for unit clauses.
● A clause is unit under a partial assignment when that assignment makes every literal in the
clause unsatisfied but leaves a single literal undecided.
Because the clause must be satisfied for the formula to be satisfied, there’s no point in checking any
assignment that would make that undecided literal false. The algorithm can just make its next guess
such that the literal will be true and thus the clause true.
3
,This process is known as Boolean Constraint Propagation* (BCP). Algorithms will run BCP to
assign variables that logically only have one option due to unit clauses. Sometimes, doing BCP for
unit clauses will cause other clauses to become unit! Algorithms will run BCP repeatedly until there
are no unit clauses.
*BCP equals the simplification steps that follow from a unit literal.
Davis-Putnam-Logemann-Loveland Algorithm
The DPLL algorithm is a SAT solver based on recursive backtracking that makes use of BCP. While a
naive recursive backtracker would check every possible assignment until it found one that satisfies the
formula, DPLL makes BCP-driven inferences to cut down on the size of the search tree – removing
assignments that can’t satisfy the formula. When it encounters a conflict, it backtracks to the last
non-BCP decision it made, and makes the other choice.
Here’s the pseudo-code:
DPLL:
Run BCP on the formula.
If the formula evaluates to True, return True.
If the formula evaluates to False, return False.
If the formula is still Undecided:
Choose the next unassigned variable.
Return (DPLL with that variable True) || (DPLL with that variable False)
See how DPLL works with the example below. On the left, we have a visualization of the decision tree
which shows the history of choices that the algorithm makes. Edges that are solid represent the
guesses/decisions the algorithm makes, while edges that are dashed represent that the algorithm
used BCP to get that choice. The path in blue leads to where the algorithm currently is at.
Conflict Driven Clause Learning
DPLL has three shortcomings.
● It makes naive decisions.
● When it encounters a conflict, it doesn’t learn anything else from the conflict besides the fact
that the current partial assignment led to the conflict.
● It can only backtrack one level which can cause it to keep exploring a search space that is
doomed to fail.
Conflict Driven Clause Learning (CDCL) improves on all three aspects! We’ll focus on the
improvements that CDCL makes on the last two shortcomings:
● clause learning from conflicts
● non-chronological backtracking
Clause Learning
When CDCL reaches a conflict, it looks at the guesses it has made, and all of the assignments it was
forced to infer via BCP which eventually led to the conflict.
4
, ● If we graph out these guesses and their implications, we get a graph called the implication
graph. This is the graph of the decisions it made, the literals that were BCP’d and why, and
how it got to the conflict.
● By looking at this graph, CDCL is able to learn a clause that is potentially more useful than
just the knowledge that the current partial assignment was bad.
● This way, CDCL can avoid making the same mistake over and over and skip over large
chunks of bad partial assignments DPLL will get stuck in.
What’s great about the learned clause is that it allows CDCL to learn from its mistakes and ignore
huge sections of the search space that will never satisfy the formula.
Once CDCL has learned a clause, CDCL will be able to backtrack potentially more than one level, and
BCP using the learned clause to put what it learned into action immediately. The ability to backtrack
more than one level with the learned clause is what we call non-chronological backtracking.
5