Exam (elaborations)

NLP Exam Questions and Answers


Module
NLP

Institution
NLP


Uploaded on October 27, 2024
Number of pages 17
Written in 2024/2025
Type Exam (elaborations)
Contains Questions & answers


Zanaya

NLP Exam Questions and Answers

types and tokens - Answer-• Types: The unique words or n-grams in the text (i.e. the vocabulary)

• Tokens: Every occurrence of a word or n-gram in the text, counting repeats

• Relevancy: When designing NLP systems, we need to consider the relationship between types and
tokens, e.g. for Information Retrieval systems, the size of the vocabulary determines how much space it
takes to store the inverted index
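The distinction can be checked with a few lines of Python. This is a minimal sketch that assumes lower-casing and whitespace tokenisation; the example sentence and the function name `types_and_tokens` are illustrative and not part of the exam answer.

```python
def types_and_tokens(text):
    """Return (tokens, types) for a lower-cased, whitespace-tokenised text."""
    tokens = text.lower().split()   # every occurrence counts as a token
    types = set(tokens)             # unique tokens form the vocabulary
    return tokens, types

# Illustrative sentence (an assumption, not taken from the exam)
tokens, types = types_and_tokens("the cat sat on the mat and the dog sat too")
print(len(tokens), "tokens")
print(len(types), "types")
```

A real system would normally use a proper tokeniser, which changes both counts.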

Zipf's Law - Answer-• If you rank words in decreasing order of frequency, then plot word rank (r) on the
x axis against frequency (f) on the y axis using log scales, you get an approximately straight line. This
relationship is called Zipf's Law

• i.e. f × r = k (for some constant k) (2 marks for correct formula)

• Relevancy: A small number of words will be very frequent (good for training purposes), but a large
number of words will occur only once or twice (not so good)
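The rank and frequency columns behind such a plot can be computed as follows. This is a hedged sketch: the function name `rank_frequency` and the toy token list are assumptions, and on a text this small the product f × r is only roughly constant.

```python
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    counts = Counter(tokens)
    # Sort by decreasing frequency; ties are broken alphabetically
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ranked, start=1)]

# Toy data for illustration only
rows = rank_frequency("a a a a b b c".split())
for rank, word, freq, product in rows:
    print(rank, word, freq, product)
```

On a large corpus, plotting log(rank) against log(frequency) from these rows should give the near-straight line described above.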

Heaps' Law - Answer-• The vocabulary (number of types) keeps growing as the number of tokens grows, but
more slowly than the text itself

• Typically V = kN^B where V is the number of types, N is the number of tokens, and k and B are
constants, with B between 0 and 1 (2 marks for correct formula)

• Relevancy: There will always be new words no matter how large the training text
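Vocabulary growth can be measured directly and compared with the formula, reading it as V = k × N raised to the power B. The sketch below is illustrative: the function names are hypothetical, and the values k = 10 and B = 0.5 are assumptions chosen for easy arithmetic, not figures from the exam.

```python
def vocabulary_growth(tokens):
    """Return (N, V) pairs: tokens read so far and types seen so far."""
    seen = set()
    growth = []
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        growth.append((n, len(seen)))
    return growth

def heaps_estimate(n_tokens, k, b):
    """Predicted vocabulary size V = k * N**B."""
    return k * n_tokens ** b

growth = vocabulary_growth("a man a plan".split())
print(growth)

# k and b are illustrative assumptions, not fitted values
print(heaps_estimate(10_000, k=10, b=0.5))
```

In practice k and B would be fitted to the measured (N, V) pairs for a given corpus.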

Zero frequency problem - Answer-• High chance that many words will never occur in the training corpus

• Mention of Zipf's Law and/or vocabulary growth (typically V = kN^B)

• Relevancy: There may not be examples in the training data, so we have to do something to cope with
OOV (Out of Vocabulary) words
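One common way of coping with unseen words is add-one (Laplace) smoothing. The answer above does not name a specific technique, so this is only one possible illustration. The sketch assumes a unigram model and reserves a single extra vocabulary slot for all OOV words; the function names are hypothetical.

```python
from collections import Counter

def train_unigram(tokens):
    """Count word frequencies in the training tokens."""
    return Counter(tokens), len(tokens)

def laplace_prob(word, counts, total):
    """Add-one smoothed unigram probability; unseen words get a non-zero share."""
    vocab_size = len(counts) + 1   # +1 reserves one slot for all OOV words
    return (counts.get(word, 0) + 1) / (total + vocab_size)

counts, total = train_unigram("a man a plan".split())
print(laplace_prob("a", counts, total))       # seen word
print(laplace_prob("canal", counts, total))   # OOV word, still above zero
```

Without smoothing, the OOV word would receive probability zero, which would zero out the probability of any sequence containing it.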

Sparse data problem - Answer-• Many words occur very infrequently

• i.e. high chance that they will never occur in the training corpus

• mention of Zipf's Law and/or vocabulary growth (typically V = kN^B)

• Relevancy: We need lots of training data to overcome this problem to ensure we have enough to train
our statistical models

N-grams - Answer-• A sequence of N words, characters or symbols

• Relevancy: Most statistical NLP systems use an N-gram approach
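A minimal sketch of N-gram extraction, which works for words or characters because it only assumes a sequence. The function name `ngrams` is illustrative.

```python
def ngrams(sequence, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

words = "now is the time".split()
print(ngrams(words, 2))          # word bigrams
print(ngrams(list("time"), 3))   # character trigrams
```

If the sequence is shorter than n, the result is an empty list.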

Discuss the adequacies and inadequacies of using N-grams for Natural Language Processing - Answer-
Common objections and strengths:

• Since we do not ourselves assign probabilities when using language, why should machines? (But it is not so clear that this is the case)

• Models are crude word-counting affairs

• But one must distinguish between statistical models and statistical methods

• Problems with sparse data

• Inability to take account of burstiness of words

• Failure to capture unbounded dependencies

• But are mathematically well-grounded

• Provide empirical means for predicting the most likely interpretations

• Have a learning component

• Require little or no knowledge of the semantics of the domain

• Rule-based approaches are too brittle to deal with a variety of constructions

In the following sentence, identify the nouns, verbs, prepositions, and the noun phrases:

• Now is the time for all good men to come to the aid of their party. - Answer-Now/RB is/VBZ the/DT
time/NN for/IN all/DT good/JJ men/NNS to/TO come/VB to/TO the/DT aid/NN of/IN their/PRP$
party/NN

So, parts of speech are as follows:

Nouns: time, men, aid, party (2 marks)

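Prepositions: for, of (1 mark); the second "to" (in "to the aid") is also a preposition, although it carries the same TO tag as the infinitive marker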

Verbs: is, come (1 mark)

Noun phrases: the time, all good men, the aid, their party (2 marks)
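The noun, verb and preposition lists can be read off the tagged sentence mechanically. The sketch below assumes the word/TAG notation used in the answer, whose tags appear to follow the Penn Treebank convention; the helper name `words_with_tag_prefix` is hypothetical.

```python
tagged = ("Now/RB is/VBZ the/DT time/NN for/IN all/DT good/JJ men/NNS "
          "to/TO come/VB to/TO the/DT aid/NN of/IN their/PRP$ party/NN")

def words_with_tag_prefix(tagged_text, prefix):
    """Return the words whose tag starts with the given prefix."""
    words = []
    for item in tagged_text.split():
        word, tag = item.rsplit("/", 1)   # split on the last slash only
        if tag.startswith(prefix):
            words.append(word)
    return words

print("Nouns:", words_with_tag_prefix(tagged, "NN"))          # NN and NNS
print("Verbs:", words_with_tag_prefix(tagged, "VB"))          # VB and VBZ
print("Prepositions:", words_with_tag_prefix(tagged, "IN"))   # IN only
```

Noun phrases are not recoverable from tags alone this way; they would need a chunking step on top of the tag sequence.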

Describe two different approaches to automatically annotating text with parts of speech. - Answer-
Approach 1: Rule-based tagging (1 mark)

- Uses a hand-written set of rules to assign the tags to words (1 mark).

- Typically more than 1000 hand-written rules, but the rules may also be machine-learned.

Approach 2: Stochastic or statistical / n-gram-based tagging (1 mark)

- Based on the probability of a certain tag occurring given the previous tags and words as context (0.5 marks)

- Requires a training corpus (0.5 marks)
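Approach 2 can be illustrated with a very small greedy bigram tagger. This is a sketch under stated assumptions, not a full tagger: the three-sentence training corpus is invented for illustration, the function names are hypothetical, and a greedy left-to-right choice stands in for the Viterbi search a real tagger would normally use.

```python
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Collect tag-transition and word-emission counts from a tagged corpus."""
    transitions = defaultdict(Counter)   # previous tag -> counts of next tag
    emissions = defaultdict(Counter)     # tag -> counts of words
    for sentence in tagged_sentences:
        prev = "<s>"                     # sentence-start marker
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            prev = tag
    return transitions, emissions

def tag(words, transitions, emissions):
    """Greedily pick the tag maximising P(tag | previous tag) * P(word | tag)."""
    tagged, prev = [], "<s>"
    for word in words:
        best_tag, best_score = None, 0.0
        for t, word_counts in emissions.items():
            p_trans = transitions[prev][t] / max(sum(transitions[prev].values()), 1)
            p_emit = word_counts[word] / sum(word_counts.values())
            if p_trans * p_emit > best_score:
                best_tag, best_score = t, p_trans * p_emit
        tagged.append((word, best_tag))  # None marks a word the model cannot tag
        prev = best_tag if best_tag is not None else prev
    return tagged

# Invented training corpus (an assumption for illustration)
training = [
    [("the", "DT"), ("can", "NN"), ("rusts", "VBZ")],
    [("she", "PRP"), ("can", "MD"), ("swim", "VB")],
    [("the", "DT"), ("dog", "NN"), ("can", "MD"), ("swim", "VB")],
]
transitions, emissions = train(training)
print(tag("the can rusts".split(), transitions, emissions))
print(tag("she can swim".split(), transitions, emissions))
```

The word "can" receives a different tag in each sentence because the previous tag changes the transition probability, which is the contextual behaviour the answer describes.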

The following text is being used as a training corpus:

• a man a man a plan a plan a canal a canal panama panama

Build a word-based bigram model from this training corpus. Show all the bigrams (the word and its
prediction), their frequency (token) counts and their type counts. - Answer-

bigram (word → prediction)   count   types

a → man                      2       3

a → plan                     2       3

a → canal                    2       3
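The bigram table can be rebuilt programmatically from the training corpus. The sketch below assumes that "types" means the number of distinct words predicted after a given context word, which is consistent with the rows listed for "a"; the function name `bigram_model` is illustrative.

```python
from collections import Counter, defaultdict

corpus = "a man a man a plan a plan a canal a canal panama panama".split()

def bigram_model(tokens):
    """Map each context word to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for context, prediction in zip(tokens, tokens[1:]):
        model[context][prediction] += 1
    return model

model = bigram_model(corpus)
for context, predictions in model.items():
    for prediction, count in predictions.items():
        # types = number of distinct predictions seen after this context
        print(f"{context} -> {prediction}  count={count}  types={len(predictions)}")
```

Running this prints one row for every bigram in the corpus, including the rows for the other context words.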
