NLP - Chapter 11

Transformer decoder - answer takes its own previous output and the encoder's output as input
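
A minimal sketch of this input pattern, using PyTorch's built-in decoder layer (an assumption; the notes do not prescribe a library): the decoder attends over its own previous outputs (tgt) and over the encoder's output (memory) via cross-attention.

import torch
import torch.nn as nn

# Illustrative sizes only: model dimension 512, 8 attention heads.
layer = nn.TransformerDecoderLayer(d_model=512, nhead=8, batch_first=True)

tgt = torch.randn(1, 10, 512)     # decoder's own previous outputs (shifted right)
memory = torch.randn(1, 20, 512)  # encoder output for the source sequence

out = layer(tgt, memory)          # self-attention over tgt, cross-attention over memory
print(out.shape)                  # torch.Size([1, 10, 512])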

BERT - answer Bidirectional Encoder Representations from Transformers

BERT Paper - answer "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (Devlin et al., 2019)

How are input embeddings calculated in BERT? - answer the sum of the token embeddings, the segment embeddings, and the position embeddings
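
A minimal sketch of that sum, assuming illustrative BERT-base sizes (30,522-token vocabulary, 512 maximum positions, hidden size 768); the real model also applies LayerNorm and dropout after the sum.

import torch
import torch.nn as nn

vocab_size, max_len, num_segments, hidden = 30522, 512, 2, 768

token_emb = nn.Embedding(vocab_size, hidden)
segment_emb = nn.Embedding(num_segments, hidden)
position_emb = nn.Embedding(max_len, hidden)

token_ids = torch.tensor([[101, 7592, 2088, 102]])           # e.g. [CLS] hello world [SEP]
segment_ids = torch.zeros_like(token_ids)                     # all tokens belong to sentence A
position_ids = torch.arange(token_ids.size(1)).unsqueeze(0)   # positions 0, 1, 2, ...

# The input representation is the element-wise sum of the three embeddings.
input_embeddings = token_emb(token_ids) + segment_emb(segment_ids) + position_emb(position_ids)
print(input_embeddings.shape)  # torch.Size([1, 4, 768])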

BERT_BASE and BERT_LARGE – answer BERT_BASE has 12 layers, hidden size of
768, and 110M total parameters. BERT_LARGE has 24 layers, hidden size of 1024,
and 340M total parameters

difference between BERT and DistilBERT - answer DistilBERT is a distilled version of BERT, with a reduced number of layers and parameters. It aims to retain 97% of BERT's performance while reducing model size by about 40%. Both use a vocabulary of 30K subwords.
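
A quick way to check the size difference, assuming the Hugging Face transformers library and access to the model hub (not part of the notes): load both checkpoints and count parameters; expect roughly 110M for BERT-base versus roughly 66M for DistilBERT.

from transformers import AutoModel

def count_params(model):
    return sum(p.numel() for p in model.parameters())

bert = AutoModel.from_pretrained("bert-base-uncased")
distil = AutoModel.from_pretrained("distilbert-base-uncased")

print(f"BERT-base:  {count_params(bert):,} parameters")
print(f"DistilBERT: {count_params(distil):,} parameters")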

Bidirectional encoders - answer have access to tokens both before and after the current one, allowing for greater contextual understanding

masked language model (MLM) - answer Instead of next-word prediction, a cloze task is used: a word is masked and the model is asked which word fits best. This is how BERT is trained.
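
A small illustration of the cloze task, assuming the Hugging Face transformers fill-mask pipeline with a pretrained BERT checkpoint (an assumption, not part of the notes): the model ranks candidate words for the masked position.

from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# The model predicts which tokens best fill the [MASK] slot.
for pred in fill("The capital of France is [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))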

What is the purpose of the next sentence prediction (NSP) task in training BERT? - answer The NSP task determines whether a given sentence naturally follows another sentence.
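
A hedged sketch of the NSP head using the Hugging Face transformers library (an assumption; any BERT implementation with an NSP head would do). Label 0 means "sentence B follows sentence A".

import torch
from transformers import BertTokenizer, BertForNextSentencePrediction

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")

sent_a = "The cat sat on the mat."
sent_b = "It then fell asleep in the sun."

inputs = tokenizer(sent_a, sent_b, return_tensors="pt")
logits = model(**inputs).logits        # shape [1, 2]
probs = torch.softmax(logits, dim=-1)
print(f"P(sentence B follows A) = {probs[0, 0].item():.3f}")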

Explain how contextual embeddings are extracted from BERT. - answer We feed the input tokens into a bidirectional self-attention model like BERT, and the output layer provides the contextual vector for each token.
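
A minimal sketch of that extraction, assuming the Hugging Face transformers library: the last hidden state holds one contextual vector per input token.

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Time flies like an arrow.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual embedding per token: shape [1, num_tokens, 768]
contextual = outputs.last_hidden_state
print(contextual.shape)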

pretraining - answer spend a long time training the LM on massive corpora; roughly 40 epochs (passes over the full training set) at a minimum

domain-adaptive pretraining - answer after general pretraining, continue pretraining on a domain-specific corpus
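
A hedged sketch of continued (domain-adaptive) pretraining with the Hugging Face Trainer, continuing masked-language-model training on a domain corpus. The file name "domain_corpus.txt" and the hyperparameters are illustrative placeholders, not values from the notes.

from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Placeholder domain corpus: one document per line in a plain-text file.
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
tokenized = corpus.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens, as in standard MLM pretraining.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(output_dir="bert-dapt", num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()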

keep pretraining - answer many studies show that continuing pretraining beyond standard limits still yields downstream benefits
