Data Engineering exams answers 2024 Unantwerpen

Course
Data engineering

Institution
Data Engineering

100 pages and 76 questions with carefully worked answers and detailed explanations, from the Data Engineering class at the University of Antwerp, Faculty of Business and Economics, Master of Business Economics. Professor: Len Feremans. New 2024 version.

Preview: 4 of 109 pages

Published: 22 June 2024
Number of pages: 109
Written in: 2023/2024
Type: Exam
Contains: Questions and answers

Seller: thaboty

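DE
Marking guidelines:
Sequence number
1. Green means fairly well understood
1. Magenta means not clearly understood; needs more time

Whole heading
Memorized
Not yet memorized well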

Lec 1 intro
1. What is a data pipeline? What are the different types of data processing and what is the
role of data engineering in its development? Give an example of a data pipeline in
e-commerce.
• A data pipeline is a method in which raw data is extracted from various data sources,
transformed and then loaded to a data store (ETL), such as a data lake or data
warehouse, acting as a central repository where data is stored and made available for
analysis. Data pipelines can incorporate machine learning models to enhance data
processing and analysis, providing more advanced insights and predictive capabilities.

• The three types of data processing are:
○ Real-time processing = online processing
○ Streaming processing = near real-time
○ Offline/batch processing
Data engineering ensures that processing is:
○ Scalable := supports huge amounts of data
○ Reliable / Available := minimizes downtime and is operationally robust
○ Maintainable := supports continuous changes
• Example e-commerce pipeline:
- Data Sources:
The e-commerce site gathers data from various sources such as sales transactions,
user interactions, inventory systems, and customer feedback.

- Extraction:
The data engineer sets up connectors to pull data (users, transactions, views, clicks…)
from databases, APIs, and log files. For example, extracting transaction data from an
online store's database and user behavior data from web logs.
- Transformation:
Data is cleaned and transformed to ensure consistency. For instance, transforming
date formats, aggregating sales data by product category, and filtering out incomplete
records.
- Loading:
The transformed data is loaded into a data warehouse.
- Analysis and Reporting:
Data analysts and business intelligence tools access the data warehouse to generate
reports, dashboards, and insights, such as sales trends, customer behavior analysis,
and inventory forecasting.
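The extract, transform, load, and report steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the records in RAW_TRANSACTIONS, the table name daily_sales, and all function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3
from datetime import datetime

# Hypothetical raw transactions, standing in for rows pulled from a shop database.
RAW_TRANSACTIONS = [
    {"order_id": 1, "category": "books", "amount": "12.50", "date": "22/06/2024"},
    {"order_id": 2, "category": "books", "amount": "7.50", "date": "23/06/2024"},
    {"order_id": 3, "category": "toys", "amount": "20.00", "date": "23/06/2024"},
    {"order_id": 4, "category": None, "amount": "5.00", "date": "23/06/2024"},  # incomplete
]


def extract():
    """Extraction: return raw records (a hard-coded list here, not a real connector)."""
    return list(RAW_TRANSACTIONS)


def transform(records):
    """Transformation: drop incomplete records, convert dates, aggregate per category and day."""
    totals = {}
    for rec in records:
        if rec["category"] is None:
            continue  # filter out incomplete records
        # Convert DD/MM/YYYY into ISO format (YYYY-MM-DD) for consistency.
        iso_date = datetime.strptime(rec["date"], "%d/%m/%Y").date().isoformat()
        key = (rec["category"], iso_date)
        totals[key] = totals.get(key, 0.0) + float(rec["amount"])
    return [(category, day, total) for (category, day), total in sorted(totals.items())]


def load(rows, conn):
    """Loading: write the transformed rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales (category TEXT, sale_date TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)

# Analysis and reporting: total sales per category, as a dashboard might query it.
report = conn.execute(
    "SELECT category, SUM(total) FROM daily_sales GROUP BY category ORDER BY category"
).fetchall()
print(report)
```

Keeping each stage as a separate function makes it possible to change one stage (for example, swapping the hard-coded list for a real connector) without touching the others.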

2. What is the three-tier architecture? Describe the function and common technologies used
in each layer. Give an example of a three-tier architecture pipeline in e-commerce.

(Mainly product, user, and transaction information…)
Three-tier architecture is a well-established software application architecture that organizes
applications into three logical and physical computing tiers.

• Presentation Tier: This is the top-most layer of the application, often referred to as
the user interface (UI).
- Main function: present data to users and interpret the commands users provide
through the interface.
- Tech: HTML, Java
- Example: In an e-commerce site, this layer would be the web pages where users
browse products, add items to their cart, and check out.
• Business layer / application logic tier: This tier sits in the middle. It manages the
application’s operations by processing commands, making logical decisions, and
performing calculations.
- Function: coordinates the layers above and below it: retrieves and processes data,
sends results back to the presentation tier, and passes data on to the data tier for storage.
- Tech: Python
- Example: handles operations such as adding items to the shopping cart, processing
payments, and managing user authentication.
• Data tier: The lowest layer in this architecture. Information is stored in databases or
file systems and is accessed by the logic tier. This tier is responsible for maintaining
data integrity and security. It provides the logic tier with data so it can process and
then eventually return results to the user.
- Main function: data storage and retrieval; maintaining data integrity and security.
- Tech: SQL
- Example: In an e-commerce site, this layer would store product details, user
information, order history, and other transactional data.
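
The separation between the three tiers can be illustrated with a small Python sketch. It is an assumption-laden toy, not a reference design: the products table, the function names, and the prices are hypothetical, SQLite stands in for the data tier, and a formatted string stands in for the web page of the presentation tier.

```python
import sqlite3


# --- Data tier: storage and retrieval (SQLite stands in for the production database) ---
def create_data_tier():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
        [(1, "notebook", 3.0), (2, "pen", 1.5)],
    )
    conn.commit()
    return conn


def fetch_price(conn, product_id):
    """Return the price of a product, or None if the id is unknown."""
    row = conn.execute("SELECT price FROM products WHERE id = ?", (product_id,)).fetchone()
    return None if row is None else row[0]


# --- Logic tier: business rules, sitting between presentation and data ---
def cart_total(conn, cart):
    """cart maps product id -> quantity; unknown products raise ValueError."""
    total = 0.0
    for product_id, quantity in cart.items():
        price = fetch_price(conn, product_id)
        if price is None:
            raise ValueError(f"unknown product id {product_id}")
        total += price * quantity
    return total


# --- Presentation tier: turns results into something the user sees ---
def render_cart(total):
    return f"Cart total: EUR {total:.2f}"


conn = create_data_tier()
page = render_cart(cart_total(conn, {1: 2, 2: 4}))
print(page)
```

Note that the presentation function never touches the database directly: it only receives what the logic tier hands back, which is the point of the layering.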

3. Give three reasons why an organization would collect large datasets. Briefly discuss the
strengths (personalization, optimization of the supply chain, data-driven decision-making)
and challenges (big data, latency) of data-intensive applications. Give an example in e-
commerce.
• Why collect large datasets?
- Enhance customer engagement: algorithms analyze data online and make
personalized recommendations.
> Data collected on the user side can answer questions like “Which products are
highly popular or trending? Which products are relevant for a specific
customer?” → good for personalization and customer engagement.
- Optimization of the supply chain and daily operational activities.
> Data on administration, stock management, shipping, payments, delivery… can
be leveraged to generate value for supply chain management and daily
operations.
- Support data-driven decision-making by management.
> With the help of dashboards, OLAP, and data mining approaches, managers
can make better decisions.
• Strengths: see the three reasons above.
• Challenges:
- How to store and manage the data well? Billions of products, users, transactions, and
media items require strong data management capabilities.
- How to maintain rapid response times (low latency)?
• In an e-commerce case, the main strengths of collecting user data are: enhancing user
engagement and satisfaction through personalized recommendations; data on
administration, stock management, shipping, payments, delivery… can generate
insights and facilitate cooperation with supply chain partners; and with a dashboard at
hand, managers can answer questions such as “how do the sales volumes of a specific
product compare across areas or time periods?” and decide on promotions accordingly.
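
The dashboard question in the e-commerce example (comparing the sales volume of one product across areas or over time) boils down to a group-and-sum over sales records. The sketch below shows that idea in plain Python; the SALES records, the product and region names, and the function units_by are all hypothetical illustration data, not taken from the course.

```python
from collections import defaultdict

# Hypothetical sales records: (product, region, month, units sold).
SALES = [
    ("headphones", "Antwerp", "2024-05", 30),
    ("headphones", "Antwerp", "2024-06", 45),
    ("headphones", "Ghent", "2024-05", 20),
    ("headphones", "Ghent", "2024-06", 10),
    ("keyboard", "Antwerp", "2024-06", 5),
]


def units_by(sales, product, dimension):
    """Sum the units sold for one product, grouped by 'region' or 'month'."""
    index = {"region": 1, "month": 2}[dimension]
    totals = defaultdict(int)
    for record in sales:
        if record[0] == product:
            totals[record[index]] += record[3]
    return dict(totals)


by_region = units_by(SALES, "headphones", "region")
by_month = units_by(SALES, "headphones", "month")
print(by_region)
print(by_month)
```

At the scale mentioned under the challenges (billions of records), the same aggregation would typically run inside a data warehouse or OLAP system rather than in application memory.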
4. What is a relational database? Define and explain the following terms and give an example
of each: Entity-relation diagram; 1-1, 1-n and n-n relations; Relational model; Online
transactional processing; SQL.

• A relational database is a type of database that stores and provides access to data
points that are related to one another. Data in a relational database is organized into
tables (also known as relations) which consist of rows and columns. Each table has a
unique key that identifies its records.
• Entity-Relation Diagram (ERD): An ERD is a visual representation of the entities within
a system and the relationships between those entities. Example: An ERD might show
entities such as Employee, Department, Project, and Location, with relationships
indicating how employees work on projects, departments are located at locations, etc.

• Relational model: The relational model is a theoretical framework for organizing data
into collections of tables (relations) with rows (tuples) and columns (attributes). It
defines how data should be structured and how relationships between data should be
handled. Each table has a unique key, and tables are linked to each other.
• OLTP: OLTP systems are designed to manage transaction-oriented applications. They
handle large numbers of short online transactions, ensuring data integrity in multi-
access environments. Example: An e-commerce system where customers browse
products, add items to the cart, and complete purchases. Each action (browsing,
adding to cart, purchasing) is a transaction processed by the OLTP system.
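
The question also asks about 1-n and n-n relations and SQL. As a hedged sketch, the Python snippet below uses SQLite to model one customer with many orders (1-n) and orders linked to products through a junction table (n-n), and runs a purchase as a single OLTP-style transaction. The schema, table names, and the place_order function are hypothetical; a production system would use a server database rather than in-memory SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    stock INTEGER NOT NULL CHECK (stock >= 0)
);
-- 1-n: one customer can have many orders
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id)
);
-- n-n: orders and products are linked through a junction table
CREATE TABLE order_items (
    order_id INTEGER NOT NULL REFERENCES orders(id),
    product_id INTEGER NOT NULL REFERENCES products(id),
    quantity INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO customers VALUES (1, 'Alice');
INSERT INTO products VALUES (1, 'notebook', 3);
""")


def place_order(conn, customer_id, product_id, quantity):
    """Create an order and decrement stock as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute(
                "INSERT INTO orders (customer_id) VALUES (?)", (customer_id,)
            )
            conn.execute(
                "INSERT INTO order_items VALUES (?, ?, ?)",
                (cur.lastrowid, product_id, quantity),
            )
            conn.execute(
                "UPDATE products SET stock = stock - ? WHERE id = ?",
                (quantity, product_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # e.g. the CHECK (stock >= 0) constraint was violated


first = place_order(conn, 1, 1, 2)   # enough stock: 3 -> 1
second = place_order(conn, 1, 1, 2)  # not enough stock: the whole transaction is undone
stock = conn.execute("SELECT stock FROM products WHERE id = 1").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(first, second, stock, order_count)
```

The failed second order leaves no half-written order row behind, which is the data-integrity property the OLTP description refers to.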
