Summary of Interactive Data Transforming, based on the lectures given for the Master Data Science & Society at Tilburg University.


  • 21 December 2024
  • 7 pages
  • 2024/2025
  • Summary
Interactive Data Transforming | Lecture 4


In the early days, data was stored using file systems, but this approach had many issues (explained in Lecture 1), which motivated database management systems (DBMS).

Following developments → RDBMS.
The RDBMS was designed to address the drawbacks and inefficiencies of the DBMS. Data is stored in the form of tables, and the relationships among tables are maintained.
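As a minimal, hedged sketch of what "tables plus maintained relationships" looks like in practice (the table and column names are illustrative, not from the lecture), using Python's built-in sqlite3:

```python
import sqlite3

# In-memory relational database: data lives in tables, and the
# relationship between the tables is declared with a foreign key.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id))
""")
conn.execute("INSERT INTO authors VALUES (1, 'Ada')")
conn.execute("INSERT INTO articles VALUES (1, 'Big Data basics', 1)")

# A JOIN uses the maintained relationship to combine the tables.
row = conn.execute("""
    SELECT a.name, t.title
    FROM articles t JOIN authors a ON t.author_id = a.id
""").fetchone()
print(row)  # ('Ada', 'Big Data basics')
```

The relationship lives in the schema itself, which is exactly what the file-system approach lacked.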




Big Data refers to large and complex data sets that require advanced methods of processing to
uncover valuable insights and help with decision-making. Here’s a breakdown of the key
characteristics of Big Data, known as the “Vs”:

1. Volume
Amount of generated and stored data. For example, Wikipedia has millions of articles across
different languages, contributing to massive data storage needs.
2. Velocity
This describes the speed at which data is created, collected, and processed. For example,
Wikipedia has thousands of editors making constant updates, adding to the continuous flow
of new data.
3. Variety
Data comes in many forms, from structured data (like RDBMS tables) to unstructured data (like
social media posts, which can include text, images, and videos).

Big Data consists of high-volume, high-velocity, and high-variety information assets that require new
forms of processing to enable enhanced decision-making and insight discovery.

For smaller datasets we simply use an RDBMS. Big Data analytics is about dealing with massive amounts of
data in ways that traditional methods, such as SQL databases, cannot handle efficiently. The solution is
to compromise: instead of perfect answers, we focus on patterns, trends, and the most important
information (e.g., top results or partial answers). Integral parts of Big Data analytics:

 • Interactive Processing
Users are involved in the data analysis. They give feedback or opinions during the process.
This helps the system make decisions, as users understand the problem and guide the
analysis.

 • Approximate Processing
Instead of analyzing all the data, the system looks at a sample that represents the whole. This
method gives approximate answers, not exact ones, but it’s much faster. For example, if 95%
of people have similar behavior, the system will assume this is representative.
 • Crowdsourcing Processing
Complex tasks are given to groups of people to solve. Humans fill in surveys, for example, in
exchange for small payments. Challenges: deciding what to ask, how to ask, and how to
handle different or conflicting answers.
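The approximate-processing idea above can be sketched in a few lines (the dataset is synthetic and the numbers are invented): estimate a statistic from a small random sample instead of scanning everything.

```python
import random

random.seed(0)
# A "large" dataset we would rather not scan in full.
data = [random.gauss(50, 10) for _ in range(100_000)]

# Approximate processing: look only at a representative sample.
sample = random.sample(data, 1_000)
approx_mean = sum(sample) / len(sample)

# The exact answer, for comparison; the full scan is 100x the work.
exact_mean = sum(data) / len(data)
print(f"approx={approx_mean:.1f} exact={exact_mean:.1f}")
```

The answer is not exact, but it is close and far cheaper, which is precisely the compromise described above.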




 • Progressive Processing
The system starts showing results as soon as possible, even if it hasn’t processed all the data.
This is useful when time or computing power is limited, so users can see early results and
work with them.
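A hedged sketch of progressive processing (the stream and the reporting interval are invented for illustration): emit partial results while the data is still being consumed.

```python
def progressive_mean(stream, report_every=3):
    """Yield a running mean while the stream is still being read,
    instead of waiting for all the data to arrive."""
    total, count = 0.0, 0
    for value in stream:
        total += value
        count += 1
        if count % report_every == 0:
            yield total / count  # early, refinable result

# Users see a usable answer after 3 values, refined again after 6.
results = list(progressive_mean([4, 8, 6, 10, 2, 6]))
print(results)  # [6.0, 6.0]
```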




 • Incremental Processing
Since Big Data is constantly changing, results from earlier processing can quickly become
outdated. This method allows systems to update their results when new data comes in,
correcting or completing earlier analyses.
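Incremental processing can be sketched as a result that is updated in constant time per new value instead of recomputed from scratch (the class name and data are illustrative, not from the lecture):

```python
class IncrementalMean:
    """Keep an up-to-date mean as new data arrives, without
    re-processing everything seen so far."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental update: fold one new value into the result.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean

m = IncrementalMean()
for value in [10, 20, 30]:
    m.update(value)
print(m.mean)  # 20.0
```

Each new observation corrects the earlier answer; nothing already processed is touched again.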

Ensuring transparency, providing clear explanations, and making results interpretable are key to
getting users to trust and accept new technologies, even when data challenges affect the results.

RDBMSs have limitations when it comes to handling today’s massive amounts of data and
demands. Here’s why people are moving beyond the traditional RDBMS:

1. Data Growth: The amount of data keeps growing rapidly, and RDBMSs often struggle
to keep up.
2. User Expectations: Users expect faster access to more complex data, which puts
pressure on databases.
3. Scaling Limitations: An RDBMS can only add more resources, such as memory or
storage, to a single server; it cannot add more servers, which limits its ability to
handle large amounts of data.
(Scaling is the ability of a system to handle increasing amounts of data, for example
by adding more servers or storage.)

Here are some alternatives to Traditional RDBMS:
