Natural Language Processing - slides
Lecture 1
Natural Language
● Natural language is quite complex
● Yet, children acquire language easily and can understand and produce utterances
they have never heard, despite limited input (the Poverty of the Stimulus argument)
● How is this possible?
○ Language is innate?
○ Learn through imitation?
○ Learn through interaction?
○ Language just like any cognitive faculty, but with more input?
Language Acquisition
Natural language is:
● Compositional
○ The meaning of a sentence is determined by the meanings of its individual words
and the way they are combined/composed
■ "The cat is on the mat." → the animal referred to as a "cat" is located "on"
top of the object referred to as a "mat."
● Arbitrary
○ There is no inherent or logical relationship between the form of a word or
expression and its meaning
■ "Dog" → refers to the domesticated four-legged animal we commonly
associate with the word "dog," but there is no inherent reason why the
sounds "d", "o", and "g" arranged in that particular order should convey
that meaning
● Creative
○ Ability of speakers of a natural language to generate new and meaningful
expressions that may not have been previously encountered or explicitly learned
■ “Selfie”
● Displaced
○ Ability of speakers to refer to things that are not directly perceivable or present
■ "Yesterday, I went to the store and bought some groceries."
What does an NLP system need to know?
● Language consists of many levels of structure.
● Humans fluently integrate all of these in producing and understanding language.
● Ideally, so would a computer!
● Morphology
○ Study of words and their smallest meaningful parts, or morphemes
■ prefixes, suffixes and base words
● Parts of speech
○ Word classes or grammatical categories such as noun, verb, adjective, adverb,
pronoun, preposition, conjunction, and interjection (see the tagging sketch after
this list)
● Syntax
○ Rules that govern the arrangement of words and phrases in a sentence, including
rules for word order, word agreement (e.g., subject-verb agreement), and the
formation of phrases (such as noun phrases, verb phrases, and adjective
phrases)
● Semantics
○ Meaning of words, phrases, sentences
● Pragmatics/discourse
○ Analysis of extended stretches of language use, such as conversations, texts,
and narratives, in their social and cultural contexts
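
A minimal sketch of two of these levels in practice: tokenizing a sentence and tagging its parts of speech with NLTK. The toolkit choice and resource names are assumptions, not something the slides prescribe.

    import nltk

    # One-time model downloads (resource names can vary across NLTK versions)
    nltk.download("punkt", quiet=True)
    nltk.download("averaged_perceptron_tagger", quiet=True)

    sentence = "The cat is on the mat."
    tokens = nltk.word_tokenize(sentence)  # split the sentence into word tokens
    tagged = nltk.pos_tag(tokens)          # assign a word class to each token

    print(tagged)
    # e.g. [('The', 'DT'), ('cat', 'NN'), ('is', 'VBZ'), ('on', 'IN'),
    #       ('the', 'DT'), ('mat', 'NN'), ('.', '.')]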
What is Natural Language Processing?
● Core technologies:
○ Language modeling / text generation
○ Sequence / POS tagging
○ Syntactic parsing
○ Named Entity Recognition (NER)
○ Coreference resolution
○ Word sense disambiguation
○ Semantic role labeling
● Natural language processing (NLP) refers to the branch of computer science—and more
specifically, the branch of artificial intelligence or AI—concerned with giving computers
the ability to understand text and spoken words in much the same way human beings
can.
● NLP combines computational linguistics—rule-based modeling of human language—with
statistical, machine learning, and deep learning models. Together, these technologies
enable computers to process human language in the form of text or voice data and to
‘understand’ its full meaning, complete with the speaker or writer’s intent and sentiment.
What is Natural Language Processing?
● Represent language in a way that a computer can process it
○ representing input
● Process language in a way that is useful for humans
○ generating output
● Understanding language structure and language use
○ computational modelling
Why is NLP hard?
1. Ambiguity
2. Sparse data due to Zipf’s Law
3. Variation
4. Expressivity
5. Context dependence
6. Unknown representation
Ambiguity at many levels
● Word senses: bank (noun: place where people deposit money or verb: to bounce off of
something)
● Part of speech: chair (noun: seat, person in charge of an organization or verb: act as
chairperson)
● Syntactic structure: I saw a man with a telescope (either I had the telescope or the man
did; see the toy grammar sketch below)
● Quantifier scope: Every child loves some movie (every child loves at least one movie or
every child loves one particular movie)
● Multiple: I saw her duck ("saw" as the past tense of "see" or as cutting with a hand tool;
"duck" as the animal or the act of ducking)
● How can we model ambiguity, and choose the correct analysis in context?
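
To make the attachment ambiguity concrete, here is a sketch using a toy grammar with NLTK's chart parser; the grammar is illustrative and invented for this example, not taken from the slides.

    import nltk

    # Toy grammar in which a PP can attach to either a VP or an NP
    grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> Pro | Det N | NP PP
    VP  -> V NP | VP PP
    PP  -> P NP
    Pro -> 'I'
    Det -> 'a'
    N   -> 'man' | 'telescope'
    V   -> 'saw'
    P   -> 'with'
    """)

    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("I saw a man with a telescope".split()):
        print(tree)
    # Prints two trees: one where the PP attaches to the VP (I used the
    # telescope) and one where it attaches to the NP (the man had it)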
What can we do about ambiguity?
● Non-probabilistic methods (FSMs for morphology, CKY parsers for syntax)
○ Return all possible analyses
● Probabilistic models (HMMs for POS tagging, PCFGs for syntax) and algorithms (Viterbi,
probabilistic CKY)
○ Return the single best analysis (see the Viterbi sketch below)
● But the “best” analysis is only good if our probabilities are accurate. Where do they come
from?
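
As an illustration of picking the single best analysis, here is a minimal Viterbi sketch for HMM POS tagging. The two-tag model and all probabilities below are toy values invented for this example; in practice they are estimated from tagged corpora, which is exactly where the accuracy question above bites.

    from math import log

    states = ["NOUN", "VERB"]
    start = {"NOUN": 0.6, "VERB": 0.4}                 # P(first tag)
    trans = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},       # P(tag_i | tag_{i-1})
             "VERB": {"NOUN": 0.8, "VERB": 0.2}}
    emit  = {"NOUN": {"chair": 0.7, "acts": 0.1},      # P(word | tag)
             "VERB": {"chair": 0.2, "acts": 0.9}}

    def viterbi(words):
        # v[t][s]: log-probability of the best tag sequence ending in s at step t
        v = [{s: log(start[s]) + log(emit[s][words[0]]) for s in states}]
        back = []
        for w in words[1:]:
            col, ptr = {}, {}
            for s in states:
                prev = max(states, key=lambda p: v[-1][p] + log(trans[p][s]))
                col[s] = v[-1][prev] + log(trans[prev][s]) + log(emit[s][w])
                ptr[s] = prev
            v.append(col)
            back.append(ptr)
        # Recover the best sequence by following back-pointers from the end
        tag = max(states, key=lambda s: v[-1][s])
        tags = [tag]
        for ptr in reversed(back):
            tag = ptr[tag]
            tags.append(tag)
        return list(reversed(tags))

    print(viterbi(["chair", "acts"]))  # -> ['NOUN', 'VERB']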
Statistical NLP
● Like most other parts of AI, NLP is dominated by statistical methods
○ Typically more robust than earlier rule-based methods
○ Relevant statistics/probabilities are learned from data
○ Normally requires lots of data about any particular phenomenon
Sparse data due to Zipf’s Law
● Word frequencies in large text corpora are highly skewed: a few words are very
frequent, most are rare
○ Takeaway: a word’s frequency is roughly inversely proportional to its rank in the
frequency list
■ To really see what’s going on, plot rank against frequency on logarithmic
axes: the curve is then approximately a straight line
● Assume “word” is a string of letters separated by spaces (a great oversimplification…)
● Zipf’s law
○ Summarizes the behaviour above: the frequency of the r-th most frequent word
is proportional to 1/r
○ Implications
■ Regardless of how large our corpus is, there will be a lot of infrequent
(and zero-frequency) words
■ In fact, the same holds for many other levels of linguistic structure (e.g.,
syntactic rules in a CFG)
■ This means we need to find clever ways to estimate probabilities for
things we have rarely or never seen (a quick empirical check is sketched
below)
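
With the whitespace-separated definition of “word” above, the skew is easy to check empirically. A minimal sketch (the corpus filename is a placeholder assumption):

    from collections import Counter

    with open("corpus.txt", encoding="utf-8") as f:
        counts = Counter(f.read().lower().split())

    for rank, (word, freq) in enumerate(counts.most_common(10), start=1):
        # Under Zipf's law, rank * freq stays roughly constant
        print(f"{rank:>4} {word:<15} {freq:>8}  rank*freq = {rank * freq}")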
Variation