Summary: Case Studies & Trends in Data Science

Institution: Universiteit Antwerpen (UA)

This document contains the answers to the professor's possible exam questions.

Preview: 3 of the 18 pages

Uploaded on 4 June 2023
File last updated on 7 June 2023
Number of pages: 18
Written in 2022/2023
Type: Summary

Seller: audreyvanlierde

Case Studies and Trends in Data Science
Lecture 1 – James Hinns (UA)
1. Explain the problem of mode collapse within GANs, and using an example, describe how the architecture can allow this to happen. Discuss how the CycleGAN architecture is more resistant to such problems.
2. In the context of CNNs (Convolutional Neural Networks), describe the following concepts and how they interact with one another: Kernel, Feature Map, Activation Function and (Max) Pooling. Referring to these concepts, discuss why CNNs are much more prevalent than MLPs for most vision (image-based) machine learning tasks.
3. Identify and briefly explain three reasons why deep learning techniques are much more prevalent now than when they were conceptualised. (2 lines answer)
Lecture 2 – Véronique Van Vlasselaer (Customs fraud detection)
4. Draw an analytical decision process for fraud detection at customs. Incorporate the following steps: data enriching, whitelist/allowed-list, blacklist, business knowledge, taking action (intercepting/inspecting a package). For each of the processes, mention if it should be done in real-time, or can be done at a different time (also mention why you would or wouldn't do this in real-time).
5. Should an AI take over the job of human customs inspection? Talk about ethical implications, as well as technical drawbacks. What is the most fruitful reconciliation between the two?
6. What are some of the techniques used to monitor the model performance?
7. Fill in the following table: which step(s) can be done in batch, and which need to be done in real-time?
Lecture 3 – Tim Waegeman (Robovision)
8. Describe 3 use cases of deep learning in retail & agriculture, as described by Tim Waegeman.
9. What are the data related challenges and solutions for these challenges in the Robovision smart scale case?
10. Explain the need for dynamic AI vision.
11. What are 5 cases, given by Tim Waegeman, where Robovision AI solutions can be used?
Lecture 4 – Galit Shmueli
12. Explain the business advantage of behavioral manipulation. Why would a company do this?
Lecture 5 – Walter Daelemans (UA)
13. What are the potential tasks of computer-generated transitions between speech, text and images? Also list at least one application (real-life use case) for each transition.
14. What are the main criticisms of Large Language Models such as GPT, as discussed by Walter Daelemans?
15. What is the main driver of the popularity of deep learning in NLP?
16. What is an autoregressive Language Model? What is an example of a known autoregressive language model?
17. What is emergence in the context of training large language models?
Lecture 6 – Kris Laukens (UA)
18. Explain the difference between an experimental and computational approach to learn protein interactions.
19. Why is it beneficial to use AI in order to predict protein interactions? Which problem does this solve? What are the current challenges with this? How was this done previously?
20. Discuss how bioinformatics can improve vaccine development and discuss some ethical implications.
21. Explain the following figure:
22. What is sequencing?
23. Why is sequencing so much puzzle work?
24. What are some biases in health AI and how can that be solved?
25. What is bioinformatics?
26. Do you think algorithms will replace human doctors in the future?
Lecture 7 – Vinayak Javaly (CUNY)
27. How was a Large Language Model used in one of the case studies by Vinayak Javaly? (small)
28. What are some useful skills a data scientist has to have, according to Vinayak Javaly? (small)
Lecture 8 – Annelies De Corte (KPMG)
29. What are the six discussed critical success factors in becoming a more data-driven organisation, according to Annelies De Corte? For at least three of them, provide a business example.
30. Explain the difference between data usage and data management.
31. What was the issue with the case from Vlaamse Waterwegen as explained by Annelies?
32. What is the difference between data and IT, as explained by Annelies De Corte?
Lecture 9 – Agata Bak-Geerinck (Telenet)
33. How does Telenet use data to make predictions about football supporters?
34. Discuss the advanced advertising use case with data science at Telenet. What data is used, which methods?
35. What kind of data does Telenet have about their customers to make predictions on?
36. How did Telenet check the accuracy of their football team prediction model?
Lecture 10 – Steven Latré (UA)
37. Discuss the safety use cases for the use of AI for cycling, as discussed by Steven Latré. Provide details on the data used, what the output is, and how it helps safety for the UCI.
38. Explain how AI powers the “Weekly lactate test”, as discussed by Steven Latré.
39. What is the use case and data used for AI in hockey? (small)
Lecture 11 – Kevin Mets (UA)
40. What is Reinforcement learning, discuss the components of a MDP, and apply to the case of Deep Q-Networks?
41. Discuss the value-based, policy-based and actor-critic methods with their advantages and disadvantages, and provide an example for each category.
42. How does Reinforcement Learning differ from Supervised and Unsupervised Learning? (small)
43. What is a delayed reward?
44. Why are simulators often used in Reinforcement Learning?


Lecture 1 – James Hinns (UA)
1. Explain the problem of mode collapse within GANs, and using an example, describe how the
architecture can allow this to happen. Discuss how the CycleGAN architecture is more
resistant to such problems.

GANs (Generative Adversarial Networks) are trained in a semi self-supervised way and use two neural
networks, a Generator and a Discriminator. The Generator creates fake samples and the Discriminator
classifies samples as real or fake. Both are trained together until the Generator is good enough to fool
the Discriminator, that is, until the Discriminator evaluates generated samples as real.
GANs can generate realistic data, but a common problem is mode collapse, where the Generator
produces repetitive outputs. This happens when the Generator fails to capture the full diversity of the
data distribution and collapses into a single mode or a small set of samples. The architecture allows this
because the Generator is only rewarded for fooling the Discriminator, not for producing varied outputs,
so it keeps reusing the one result that works.
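
The adversarial training signal described above can be sketched with the usual binary cross-entropy losses. This is a minimal illustration, not taken from the lecture: the function names are hypothetical, `d_real` stands for the Discriminator's output on a real sample and `d_fake` for its output on a generated one, and the non-saturating form of the Generator loss is assumed.

```python
import math

def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Binary cross-entropy for one real and one generated sample.

    d_real = D(x) and d_fake = D(G(z)) are probabilities in (0, 1).
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake: float) -> float:
    """Non-saturating Generator loss: small when D is fooled (d_fake near 1)."""
    return -math.log(d_fake)

# The Discriminator is confident and correct: low loss for D, high loss for G.
print(discriminator_loss(d_real=0.9, d_fake=0.1))
print(generator_loss(d_fake=0.1))

# The Generator fools the Discriminator: the Generator loss drops.
print(generator_loss(d_fake=0.9))
```

Note that nothing in the Generator loss rewards variety: a single output that fools the Discriminator is enough to minimise it, which is the opening for mode collapse.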

An example is generating digits: instead of presenting the Discriminator with a variety of different
digits, the Generator only produces the digit 6 and still manages to fool the Discriminator. Because
the Discriminator never sees or penalises the missing variety, the full range of possibilities is never
learned, leading to mode collapse.
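
One simple way to spot the situation in this example is to count which digits actually show up among the generated samples. The sketch below assumes the generated images have already been labelled (for instance by a separate digit classifier); the helper names and the toy label lists are hypothetical.

```python
from collections import Counter

def mode_coverage(generated_labels, num_modes=10):
    """Fraction of the possible classes that appear at least once in the samples."""
    return len(set(generated_labels)) / num_modes

def dominant_share(generated_labels):
    """Share of the samples taken by the single most frequent class."""
    counts = Counter(generated_labels)
    return counts.most_common(1)[0][1] / len(generated_labels)

healthy = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] * 10  # every digit appears
collapsed = [6] * 100                           # the Generator only emits 6

print(mode_coverage(healthy), dominant_share(healthy))      # 1.0 0.1
print(mode_coverage(collapsed), dominant_share(collapsed))  # 0.1 1.0
```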

The CycleGAN architecture is more resistant to this. CycleGAN uses two GANs working together on
unpaired datasets (no image in one domain has a matching counterpart in the other), such as horses in
one domain and zebras in another.
- Generator 1 generates a zebra image based on a horse image. Discriminator 1 then decides
whether it is a real zebra or a generated one.
- Generator 2 translates the zebra image generated by Generator 1 back to a horse image.
We call this the cyclic image. We can now calculate the cycle-consistency loss by comparing the
original horse image and the cyclic image.
- This loss forces Generator 1 to preserve the content of the original input (the horse image) and
prevents it from simply generating the same image that fools the discriminator for every input.
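
The comparison between the original image and the cyclic image (usually called the cycle-consistency loss) can be sketched as follows. This is a toy illustration under stated assumptions: images are flat lists of pixel values, the L1 distance is used for the comparison, and the three "generators" are hypothetical stand-in functions rather than trained networks.

```python
def l1_distance(image_a, image_b):
    """Mean absolute difference between two equally sized flat pixel lists."""
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)

def cycle_loss(horse, horse_to_zebra, zebra_to_horse):
    """Translate horse -> zebra -> horse and compare with the original."""
    fake_zebra = horse_to_zebra(horse)
    cyclic_horse = zebra_to_horse(fake_zebra)
    return l1_distance(horse, cyclic_horse)

# Toy stand-ins for the two Generators (hypothetical, not real networks).
def add_stripes(image):          # Generator 1: keeps the content of its input
    return [p + 0.5 for p in image]

def remove_stripes(image):       # Generator 2: inverts Generator 1
    return [p - 0.5 for p in image]

def collapsed_generator(image):  # ignores its input, always the same "zebra"
    return [1.0 for _ in image]

horse = [0.1, 0.4, 0.7, 0.2]
print(cycle_loss(horse, add_stripes, remove_stripes))          # close to 0
print(cycle_loss(horse, collapsed_generator, remove_stripes))  # clearly larger
```

A generator that ignores its input cannot be inverted back to the original horse, so its cycle loss stays high even if its single output fools the Discriminator.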

2. In the context of CNNs (Convolutional Neural Networks), describe the following concepts
and how they interact with one another: Kernel, Feature Map, Activation Function and
(Max) Pooling. Referring to these concepts, discuss why CNNs are much more prevalent than
MLPs for most vision (image-based) machine learning tasks.

Kernel or filter: a small matrix of weights that scans the input data and looks for specific
patterns/features by multiplying its weights with the input values underneath it and summing the results.

Feature Map: the output of applying the kernel over the picture. Each value in the feature map
indicates how strongly a certain feature or pattern is present at that position in the input data.

(We slide the kernel over the input image, calculate the number that goes into the feature map at each
position, and then repeat this until the kernel has covered the whole image.)
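
The sliding procedure can be sketched in plain Python. This is a minimal illustration assuming a stride of 1 and no padding; the function name, the toy image and the edge-detecting kernel are made up for the example.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel weights with the patch under them and sum.
            value = sum(
                kernel[a][b] * image[i + a][j + b]
                for a in range(k_h)
                for b in range(k_w)
            )
            row.append(value)
        feature_map.append(row)
    return feature_map

# A 4x4 image with a vertical edge between the dark (0) and bright (1) halves.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A 2x2 kernel that responds to a dark-to-bright transition from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

print(convolve2d(image, vertical_edge))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is high only in the column where the edge sits, which is what "presence of a feature at a position" means in practice.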
Activation Function: introduces non-linearity into the CNN and allows the network to learn complex
patterns and make more accurate predictions based on the input data. It is applied to each value of the
feature map. For example, in image recognition, an activation function helps a neuron decide whether a
certain visual feature, like a shape, is present in an image: the neuron activates when it detects the
desired feature and remains inactive when it doesn't.
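
As an illustration of this activate-or-stay-inactive behaviour, the sketch below applies ReLU to every value of a small feature map. ReLU is an assumption here (the text does not name a specific activation function), and the helper names and values are hypothetical.

```python
def relu(x):
    """Rectified linear unit: pass positive values, zero out the rest."""
    return max(0, x)

def apply_activation(feature_map, activation=relu):
    """Apply the activation function to every value of a feature map."""
    return [[activation(v) for v in row] for row in feature_map]

feature_map = [
    [-2, 3, 0],
    [1, -1, 4],
]
print(apply_activation(feature_map))  # [[0, 3, 0], [1, 0, 4]]
```

Positions where the kernel found its feature keep their value, and positions with a negative response are switched off.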