NLP. Week 3-4

Why is language a temporal phenomenon? - answer Spoken language is a sequence of
acoustic events over time

We comprehend and produce both spoken and written language as a continuous input
stream.

What is the problem with a simple feedforward sliding-window? - answer Limited context
-> anything outside the context window has no impact on the decision
-> yet many language tasks require access to information that can be arbitrarily distant from the current word (see the sketch below)

- Windows also make it difficult for neural networks to learn systematic patterns arising from constituency or compositionality
-> the way the meanings of words in phrases combine together
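As a concrete illustration of the limited-context problem, here is a minimal Python sketch; the sentence (a standard long-distance agreement example) and the window size are illustrative choices, not from the notes.

```python
# Sketch: why a fixed context window loses distant information.
tokens = "the flights the airline was cancelling were full".split()

window = 3
target_index = tokens.index("were")               # word we want to predict
context = tokens[target_index - window:target_index]

print(context)  # ['airline', 'was', 'cancelling']
# The subject "flights", which determines the verb form "were",
# lies outside the window, so a window-based model cannot use it.
```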

Which deep learning architectures can deal with the problems associated with the
temporal nature of language? - answer 1) Recurrent neural networks
- offer a new way to represent the prior context
-> allow the model's decision to depend on information from hundreds of words in the
past.

2) Transformer networks
- help represent time and focus on how words relate to each other over long distances

What are probabilistic language models? - answer Probabilistic language models predict
the next word in a sequence given some preceding context.

Example: if the preceding context is "Thanks for all the" and we want to know how likely
the next word is to be "fish"
-> P(fish|Thanks for all the)

They assign such a conditional probability to every possible next word
-> giving us a distribution over the entire vocabulary

We evaluate language models by examining how well they predict unseen text.
-> We use perplexity to measure the quality of a language model (see the sketch below).
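A minimal sketch of how perplexity could be computed from a model's per-word probabilities; the probability values below are invented for illustration.

```python
import math

# Perplexity: the inverse probability the model assigns to unseen
# text, normalized by its length (lower is better).
word_probs = [0.2, 0.1, 0.05, 0.3]   # P(w_i | preceding context), invented

n = len(word_probs)
avg_log_prob = sum(math.log(p) for p in word_probs) / n
perplexity = math.exp(-avg_log_prob)

print(perplexity)   # the model's average per-word "surprise"
```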

What is a recurrent neural network (RNN)? - answer Any network that contains a cycle
within its network connections
-> the value of some unit depends, directly or indirectly, on its own earlier outputs
as an input.

Powerful, but difficult to reason about and to train.
-> However, some constrained RNN architectures are extremely effective when
applied to language
->> Elman networks, or simple recurrent networks.

Elman networks or simple recurrent networks - answer The basis for more complex
approaches like LSTM networks.

Sequences are processed by presenting one item at a time to the network
-> different from the window-based approach

The key difference from a feedforward network lies in the recurrent link
-> it augments the input to the hidden layer with the value of the hidden layer from the
preceding point in time
->> a form of memory, or context

Critically, this approach does not impose a fixed-length limit on the prior context
-> the context embodied in the previous hidden layer can include information extending
back to the beginning of the sequence.

How does recurrence in RNNs factor into the computation at the hidden layer? -
answer The hidden layer at time t depends on both the current input and the previous
hidden state: h_t = g(U h_{t-1} + W x_t), with output y_t = f(V h_t).

A new set of weights, U, connects the hidden layer from the previous time step to the
current hidden layer.
-> These weights determine how the network makes use of past context in calculating the
output for the current input.

- As with the other weights in the network, these connections are trained via
backpropagation.

The hidden layers will have differing values over time.
-> However, the various weight matrices are shared across time (see the sketch below).
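A minimal NumPy sketch of one forward step through a simple recurrent network, using the weight names from the notes (W: input to hidden, U: previous hidden to current hidden, V: hidden to output); the dimensions and activation functions are illustrative assumptions.

```python
import numpy as np

# Toy dimensions (illustrative): input, hidden, and output sizes.
d_in, d_h, d_out = 4, 3, 5
rng = np.random.default_rng(0)
W = rng.normal(size=(d_h, d_in))    # input  -> hidden
U = rng.normal(size=(d_h, d_h))     # hidden -> hidden (recurrent link)
V = rng.normal(size=(d_out, d_h))   # hidden -> output

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(x_t, h_prev):
    # The current hidden state depends on the input AND the previous
    # hidden state -- the recurrent link that acts as memory.
    h_t = np.tanh(U @ h_prev + W @ x_t)
    y_t = softmax(V @ h_t)
    return h_t, y_t

h = np.zeros(d_h)                         # initial context is empty
for x in rng.normal(size=(6, d_in)):      # a 6-step input sequence
    h, y = step(x, h)
# Note: the same W, U, V are used at every step (weights shared across time).
```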

How are RNNs trained? - answer Obtain the gradients that adjust the weights by using a
training set, a loss function, and backpropagation.

3 sets of weights to update:
- W, the weights from the input layer to the hidden layer
- U, the weights from the previous hidden layer to the current hidden layer, and finally
- V, the weights from the hidden layer to the output layer

Two considerations we didn't have to worry about with backpropagation in feedforward
networks:

1) to compute the loss function for the output at time t we need the hidden layer from
time t − 1.

2) the hidden layer at time t influences both the output at time t and the hidden layer at
time t + 1
-> (and hence the output and loss at t + 1).

To train RNNs under these constraints, we use backpropagation through time.

Backpropagation through time - answer A two-pass algorithm for training the weights in
RNNs.

1) We perform forward inference, computing h_t and y_t, accumulating the loss at each
step in time, and saving the value of the hidden layer at each step for use at the next
time step.

2) We process the sequence in reverse, computing the required gradients as we go, and
computing and saving the error term for use in the hidden layer at each step backward
in time.

With modern computational frameworks and adequate computing resources, there is no
need for a specialized approach to training RNNs
-> unrolling a recurrent network into a feedforward computational graph eliminates the
recurrences
->> allows the weights to be trained directly (see the sketch below)
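A minimal sketch of this unrolled training in PyTorch, assuming a next-word-prediction setup; the vocabulary size, dimensions, and toy batch are illustrative.

```python
import torch
import torch.nn as nn

# Illustrative sizes: vocabulary, embedding dim, hidden dim.
vocab, d_emb, d_h = 100, 16, 32
embed = nn.Embedding(vocab, d_emb)
rnn = nn.RNN(d_emb, d_h, batch_first=True)   # Elman-style RNN
out = nn.Linear(d_h, vocab)
loss_fn = nn.CrossEntropyLoss()

params = list(embed.parameters()) + list(rnn.parameters()) + list(out.parameters())
opt = torch.optim.SGD(params, lr=0.1)

tokens = torch.randint(0, vocab, (1, 21))    # toy token sequence
inputs, targets = tokens[:, :-1], tokens[:, 1:]   # predict the next word

opt.zero_grad()
h_seq, _ = rnn(embed(inputs))        # forward pass: unrolled over time
logits = out(h_seq)
loss = loss_fn(logits.reshape(-1, vocab), targets.reshape(-1))
loss.backward()                      # gradients flow back through time
opt.step()
```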

When input sequences are very long (e.g., character-level processing), unrolling the
entire input sequence may not be feasible.
-> In these cases, unroll the input into manageable fixed-length segments
->> treat each segment as a distinct training item (see the sketch below).
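A minimal sketch of this segment-wise (truncated) unrolling, continuing the PyTorch setup from the previous sketch; the segment length is an illustrative choice.

```python
# Continues the vocab/embed/rnn/out/loss_fn/opt objects defined above.
seg_len = 5
long_seq = torch.randint(0, vocab, (1, 101))   # toy "long" sequence
h_prev = None

for start in range(0, long_seq.size(1) - 1, seg_len):
    inp = long_seq[:, start:start + seg_len]
    tgt = long_seq[:, start + 1:start + seg_len + 1]
    if inp.size(1) != tgt.size(1):
        break  # drop a ragged final segment for simplicity
    opt.zero_grad()
    h_seq, h_prev = rnn(embed(inp), h_prev)    # carry context across segments
    loss = loss_fn(out(h_seq).reshape(-1, vocab), tgt.reshape(-1))
    loss.backward()
    opt.step()
    h_prev = h_prev.detach()   # stop gradients at the segment boundary,
                               # treating each segment as its own training item
```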

Distributional hypothesis - answer The link between similarity in how words are
distributed and similarity in what they mean.

Words that occur in similar contexts tend to have similar meanings.

Words that are synonyms (like oculist and eye-doctor) tend to occur in the same
environments
-> (e.g., near words like eye or examined)
-> with the amount of meaning difference between two words "corresponding roughly to
the amount of difference in their environments"

What are embeddings? How do they relate to the distributional hypothesis? -
answer Vector semantics builds on this linguistic hypothesis by learning representations
of the meanings of words, called embeddings, directly from their distributions in texts
(see the sketch below).
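A minimal sketch of the distributional idea behind embeddings; the "embeddings" here are toy co-occurrence count vectors with invented counts, whereas real embeddings are learned from large corpora.

```python
import numpy as np

# Toy context words and invented co-occurrence counts.
contexts = ["eye", "examined", "computer", "data"]
oculist    = np.array([10, 7, 0, 1])
eye_doctor = np.array([12, 6, 1, 0])
programmer = np.array([0, 1, 9, 8])

def cosine(u, v):
    # Cosine similarity: high when two words share contexts.
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(oculist, eye_doctor))   # high: similar distributions, similar meaning
print(cosine(oculist, programmer))   # low: different contexts, different meaning
```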
