College notes Natural Language Processing Technology (L_AAMAALG005)


All notes for the NLPT course, got a 7.6 for the exam itself. All the necessary slides are also included with necessary explanations.

Preview 4 out of 41  pages

  • March 2, 2022
  • 41
  • 2020/2021
  • Lecture notes
  • Lisa Beinborn
  • All classes

Natural Language Processing Technology
Created @March 24, 2021 2:40 PM

Class S5

Type S5

Materials



Lecture 1
Introduction
NLP:

represents language in a way that a computer can process → representing input

processes language in a way that is useful for humans → generating output

models language structure and language use → computational modelling



Analyzing Language

linguistic pre-processing steps

standardizing the input

normalization and cleaning

remove layout (paragraphs, underlined, bold, italics)

remove/replace emojis and URLs (e.g. replace every URL with a placeholder such as URL)

replace numbers with NUM

anonymization: replace phone numbers/passwords

unless you need them!

casing: uppercase vs. lowercase vs. true case (for example, keeping uppercase for names but
not for sentence beginnings)

sentence segmentation: What are indicators for sentence boundaries?
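
The normalization steps and the sentence boundary question above can be sketched in a few lines of Python. This is a minimal sketch, not the method from the lecture: the regular expressions, the placeholder PHONE and the function names `normalize` and `split_sentences` are assumptions made for illustration, and real data usually needs more careful rules.

```python
import re
from typing import List

# Illustrative patterns only (assumptions, not taken from the lecture).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
PHONE_RE = re.compile(r"\+?\d[\d\s\-]{7,}\d")
NUM_RE = re.compile(r"\d+(?:[.,]\d+)*")


def normalize(text: str) -> str:
    """Replace URLs, phone numbers and numbers with placeholder tokens."""
    text = URL_RE.sub("URL", text)
    # Phone numbers go first: afterwards NUM would already have eaten the digits.
    text = PHONE_RE.sub("PHONE", text)
    text = NUM_RE.sub("NUM", text)
    # Collapse layout whitespace (line breaks, tabs, repeated spaces).
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> List[str]:
    """Naive rule: a boundary is . ! or ? followed by whitespace and an uppercase letter."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)


print(normalize("Call +31 20 123 4567 or visit https://example.org today. It costs 12.50 euros."))
print(split_sentences("It rained. We stayed home."))
print(split_sentences("Dr. Smith arrived."))
```

The last call splits after "Dr.", which shows why punctuation plus capitalization is not a sufficient indicator for sentence boundaries: abbreviations need extra handling.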

Linguistic pre-processing

fast developments: what used to be huge research projects can now be done with a single Python package.

performance: very good for generic language, but problematic for domain-specific data or
small (low-resource) languages.





word segmentation: how can I decompose a sentence into its words?

tokenization: tokens are all occurrences in the text; types are the distinct tokens (the number of types is the number of different tokens)

morphological analysis: lemmatization, sub-words, ... [read chapter 2]
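
The difference between tokens and types can be checked with a short sketch. The regular expression and the choice to lowercase before counting types are assumptions for illustration; a real tokenizer handles many more cases.

```python
import re


def tokenize(sentence):
    # A token is a word (optionally with an internal apostrophe)
    # or a single punctuation mark.
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence)


tokens = tokenize("The cat sat on the mat, and the dog sat too.")
# Assumption: "The" and "the" count as the same type.
types = set(token.lower() for token in tokens)

print(len(tokens), "tokens")
print(len(types), "types")
```

Here "the" occurs three times and "sat" twice, so the sentence has more tokens than types.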

morphological analysis

we want to decompose a word into its morphemes (as small as possible): unhappier → un-happy-er
(difficult in Turkish, for example)

highly challenging, because most languages contain many exceptions and morpheme
boundaries can be ambiguous

subwords:

frequent tokens are kept whole (each one gets its own entry in the vocabulary)

less frequent tokens are decomposed into subwords

these approaches are purely statistical, not based on linguistics!
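
One common statistical approach of this kind is byte-pair encoding (BPE); the notes do not name a specific algorithm, so BPE is used here as an assumption. The sketch below is a toy version: the word frequencies are made up, and the function names `learn_bpe`, `merge_pair` and `segment` are hypothetical.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` by one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Learn merge rules from a {word: frequency} dict, starting from characters."""
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Purely frequency-based: merge the most frequent adjacent pair.
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {merge_pair(s, best): f for s, f in vocab.items()}
    return merges


def segment(word, merges):
    """Split a word into subwords by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Made-up frequencies: "happy" is frequent, the other words are rare.
merges = learn_bpe({"happy": 20, "unhappy": 2, "happier": 2}, num_merges=4)
print(segment("happy", merges))
print(segment("unhappy", merges))
```

With these counts the frequent word "happy" ends up as a single unit, while the rarer "unhappy" is split into several pieces. The split follows frequency, not morpheme boundaries.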




lemmatization: map a word to its dictionary form (lemma): happier/happiest → happy. There are
ambiguities: saw → see or saw?
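
The saw → see/saw ambiguity can be resolved if the part of speech is known. A minimal sketch, assuming a hand-written toy lexicon; the dictionary `LEMMAS`, the tag names and the function `lemmatize` are hypothetical and only illustrate the idea.

```python
# Hypothetical toy lexicon: (word form, part of speech) -> lemma
LEMMAS = {
    ("happier", "ADJ"): "happy",
    ("happiest", "ADJ"): "happy",
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
}


def lemmatize(word, pos):
    """Look up the lemma; fall back to the lowercased word form if unknown."""
    key = (word.lower(), pos)
    return LEMMAS.get(key, word.lower())


print(lemmatize("saw", "VERB"))   # "I saw a bird"
print(lemmatize("saw", "NOUN"))   # "a rusty saw"
print(lemmatize("Happiest", "ADJ"))
```

Without the part of speech, the lookup for "saw" has no way to choose between the two lemmas.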

now we analyze the lemma





Penn Treebank: 36 part-of-speech labels





note on image above: the left one is more complex, with a deeper structure




error propagation: an error made in one of the steps carries over into the next step
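
A rough way to see the effect of error propagation is to multiply the accuracies of the steps. This sketch rests on two strong assumptions that are not from the lecture: errors in different steps are independent, and any upstream error makes the final result wrong. The accuracy values are made up.

```python
def pipeline_accuracy(step_accuracies):
    """Chance that every step is correct, assuming independent errors."""
    accuracy = 1.0
    for step_accuracy in step_accuracies:
        accuracy *= step_accuracy
    return accuracy


# Hypothetical pipeline of four steps that are each 95% accurate.
print(pipeline_accuracy([0.95, 0.95, 0.95, 0.95]))
```

Under these assumptions four steps at 95% each leave roughly 81% of the items correct at the end, so small errors early in the pipeline add up.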

corpora and shared tasks

1) How did automated linguistic preprocessing become so good? Tools were trained on manually
annotated corpora, tuned on development data and evaluated on test data. Machine learning and
neural networks boosted the performance and facilitated transfer across languages.
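
The train/development/test setup can be sketched as a simple split of an annotated corpus. The 80/10/10 proportions, the fixed seed and the function name `split_corpus` are assumptions for illustration; shared tasks usually publish their own fixed splits.

```python
import random


def split_corpus(sentences, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle once, then cut into disjoint train, development and test sets."""
    items = list(sentences)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_dev = int(len(items) * dev_fraction)
    n_test = int(len(items) * test_fraction)
    dev = items[:n_dev]
    test = items[n_dev:n_dev + n_test]
    train = items[n_dev + n_test:]
    return train, dev, test


# Stand-in for a manually annotated corpus.
corpus = ["sentence %d" % i for i in range(100)]
train, dev, test = split_corpus(corpus)
print(len(train), len(dev), len(test))
```

The model is fitted on `train`, settings are tuned on `dev`, and `test` is only used for the final evaluation, so the reported score is not influenced by the tuning.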

nlpprogress.com → a good place to look up which package performs best for which task





Notes sold on Stuvia by seller MeldaMalkoc.