Data Engineering

157 bookmarks
Create a Delta Table in S3 using Rust
See how to write Rust code to create a Delta Table in S3 and add data to it, using your local Windows development environment.
·blog.det.life·
The Hitchhiker's Guide to Delta Lake Streaming
This session will provide answers for some of the biggest questions in the universe: namely, how to take full advantage of Delta Lake streaming. You will be ...
·m.youtube.com·
Navigating the data lake using Rust - Part One | Cuusoo
Most data engineers associate the Delta format only with Spark and Databricks, but Delta can be used by many other tools, and most cloud providers have added Delta support to their analytics tools. In this post we will see how to use Delta from a Rust client.
·cuusoo.com.au·
Deploy a Delta Sharing Server on Azure
If you’ve been following along in this series, we’ve previously deployed a Delta Sharing server on AWS. Providing a similar tutorial for…
·medium.com·
Data Modeling for Mere Mortals – Part 1: What is Data Modeling?! | LinkedIn
In recent years, I’ve run dozens of training sessions on various data platform topics, for all kinds of audiences. When teaching data platform concepts and techniques, I find one concept particularly intimidating for many business analysts, especially those who are just starting their journey…
·linkedin.com·
Add Jar to standalone pyspark
I'm launching a pyspark program: $ export SPARK_HOME= $ export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.9-src.zip $ python And the py code: from pyspark import SparkContext,
.config('spark.jars.packages', 'org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1')
·stackoverflow.com·
Optimizing Apache Spark™ on Databricks - Databricks
In this course, we will explore the vast majority of performance problems in an Apache Spark application: skew, spill, shuffle, storage, and serialization.
·databricks.com·