Databricks Competition Study
What does the Lakehouse consist of? - answer-The lakehouse brings the scalability and cost
effectiveness of data lakes with the reliability and performance of data warehouses
How does Databricks differentiate from Snowflake? - answer-Data Science/Machine Learning
Data Ingestion
ETL
Streaming Capabilities
Data Sharing
With Snowflake, you face high costs, limited capabilities, limited unstructured data
support, inefficient engineering, and vendor lock-in
For Data Science- how does Databricks differ from Snowflake? - answer-Snowflake is built for
SQL workloads. Its users either use Snowpark or rely on 3rd-party tools for DS and ML, like Dataiku
or DataRobot
Databricks lets your data scientists use any libraries/languages in the same platform as
their data engineers, and manage the end-to-end ML pipeline with MLflow
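As a minimal sketch of what MLflow experiment tracking looks like (the model, metric, and synthetic data here are illustrative assumptions, not part of the source notes):

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative synthetic data; in practice this would come from a Delta table
X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    # Log parameters, metrics, and the model itself so runs can be compared and served later
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))
    mlflow.sklearn.log_model(model, "model")
```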
For Data Ingestion- how does Databricks differ from Snowflake - answer-With Snowflake:
1) Ingestion goes through various Snowflake stages, often relying on limited SQL pipelines or
Snowpipe, which is inefficient and expensive
2) Snowflake tax - egress charges for moving data in and out, plus the cost of storing it
Databricks - ingestion is simple with Auto Loader, where data is automatically loaded into
Delta tables
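A minimal Auto Loader sketch, assuming illustrative paths, file format, and table name (on Databricks, `spark` is the pre-defined session):

```python
# Incremental ingestion with Auto Loader (cloudFiles); paths below are placeholders
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")                          # source file format
    .option("cloudFiles.schemaLocation", "/tmp/ingest/_schema")   # where the inferred schema is tracked
    .load("s3://example-bucket/raw/events/")
)

(
    df.writeStream
    .option("checkpointLocation", "/tmp/ingest/_checkpoint")  # exactly-once progress tracking
    .trigger(availableNow=True)                                # process available files, then stop
    .toTable("bronze.events")                                  # land the stream as a Delta table
)
```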
For ETL- how does Databricks differ from Snowflake - answer-Snowflake has no native ETL tools and relies
heavily on 3rd-party vendors, which increases cost and complexity
Databricks - Delta Live Tables
What is Delta Live Tables? - answer-Delta Live Tables is the first ETL framework that allows you
to build reliable data pipelines (accelerate ETL development)
- automatically manages your infrastructure at scale so data analysts and engineers can spend
less time on tooling and focus on getting value from data.
DLT - fully supports Python and SQL and works with both batch and streaming
DLT - manages task orchestration, cluster management, monitoring, data quality and error
handling
supports CDC (change data capture)
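A minimal DLT pipeline sketch in Python (runs only inside a DLT pipeline; the table names, source path, key, and sequence column are illustrative assumptions), showing a bronze streaming table plus a CDC target built with apply_changes:

```python
import dlt
from pyspark.sql.functions import col

# Bronze layer: stream raw change events in with Auto Loader (path is a placeholder)
@dlt.table(comment="Raw customer change events")
def customers_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3://example-bucket/raw/customers/")
    )

# Silver layer: apply CDC changes into an up-to-date target table
dlt.create_streaming_table("customers_silver")

dlt.apply_changes(
    target="customers_silver",
    source="customers_bronze",
    keys=["customer_id"],           # primary key(s) in the source feed
    sequence_by=col("updated_at"),  # ordering column that resolves out-of-order events
)
```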
Streaming Capabilities- how does Databricks differ from Snowflake - answer-Databricks can read
from any streaming source, while Snowflake only supports Kafka and was designed for
structured data at rest rather than high-velocity data
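For illustration, a minimal Structured Streaming read from Kafka into a Delta table (broker address, topic, and paths are placeholder assumptions):

```python
# Read a Kafka topic with Spark Structured Streaming and land it in a Delta table
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")  # placeholder broker
    .option("subscribe", "clickstream")                   # placeholder topic
    .option("startingOffsets", "latest")
    .load()
    .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp")
)

(
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/clickstream/_checkpoint")
    .toTable("bronze.clickstream")
)
```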
What Databricks features help out with observability and governance? - answer-Expectations -
help prevent bad data from flowing into tables and track data quality over time, while granular
pipeline observability provides tools to troubleshoot bad data: a high-fidelity lineage
diagram of your pipeline, dependency tracking, and aggregated data quality metrics across all of
your pipelines.
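A small sketch of DLT expectations (the table, rule names, and columns are illustrative assumptions), showing how violations can be logged, dropped, or made to fail the update:

```python
import dlt

@dlt.table(comment="Cleansed orders with data-quality rules enforced")
@dlt.expect("non_negative_amount", "amount >= 0")               # violation is logged, row is kept
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")   # violating rows are dropped
@dlt.expect_or_fail("valid_date", "order_date IS NOT NULL")     # violation stops the update
def orders_silver():
    return dlt.read("orders_bronze")
```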
For Data Sharing- how does Databricks differ from Snowflake? - answer-Snowflake's data
format is proprietary, and users can only share data with other Snowflake accounts (vendor lock-in)
-Snowflake takes the data from your cloud storage, runs transformations, and pushes the data
back, so you pay an egress tax in both directions as well as for storing that data
-you also pay for compute to send data
Delta Sharing - an open standard for data sharing, no replication of datasets
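As an illustration of the open standard, the delta-sharing Python connector can read a shared table directly (the profile path and share/schema/table names are placeholders):

```python
import delta_sharing

# The provider supplies a profile file holding the sharing-server endpoint and a bearer token;
# the path and table coordinates below are placeholders.
profile = "/path/to/config.share"
table_url = profile + "#example_share.example_schema.example_table"

# Load the shared table into pandas without replicating the dataset
df = delta_sharing.load_as_pandas(table_url)
print(df.head())
```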
How does Delta Lake differ from Iceberg? - answer-1) Overall performance - loading and querying
data is 3.5x faster
2) Load performance - loading from Parquet into the intended format is faster with Delta
3) Query performance - Delta is 4.5x faster
What are Data cleanrooms? - answer-It's a secure environment to run computations on joint
data
Run any computation in Python, SQL, R, or Java
No data replication
Scalability
What is Databricks Marketplace? - answer-open marketplace for data solutions, built on Delta
Sharing
Consists of:
Notebooks
Data files, Data Tables
Solution Accelerators
ML Models
Dashboards
Why have data marketplaces seen limited use? - answer-Closed platforms (one per vendor)
Limited to just datasets
What is Project Lightspeed? - answer-Faster and simpler stream processing
1) predictable low latency
2) enhanced functionality
3) Operations and Troubleshooting
4) Connectors & Ecosystem