About: editor

Posts by editor:

Scalable Partition Handling for Cloud-Native Architecture in Apache Spark 2.1

Posted on: 20 Jan 2018

Apache Spark 2.1 is just around the corner: the community is going through the voting process for the release candidates. This […]

Apache Spark @Scale: A 60 TB+ production use case from Facebook

Posted on: 20 Jan 2018

This is a guest Apache Spark community blog from Facebook Engineering. In this technical blog, Facebook shares their usage of Apache […]

Apache Spark 2.0 Preview: Machine Learning Model Persistence

Posted on: 20 Jan 2018

An ability to save and load models across languages. Introduction: Consider these Machine Learning (ML) use cases: A data scientist […]

Structured Streaming In Apache Spark

Posted on: 20 Jan 2018

A new high-level API for streaming. Apache Spark 2.0 adds the first version of a new higher-level API, Structured Streaming, […]

Introducing Apache Spark 2.0

Posted on: 20 Jan 2018

Now generally available on Databricks. Today, we’re excited to announce the general availability of Apache Spark 2.0 on Databricks. This release builds […]

Introducing GraphFrames

Posted on: 20 Jan 2018

We would like to thank Ankur Dave from UC Berkeley AMPLab for his contribution to this blog post. Databricks is excited […]

Introducing Apache Spark Datasets

Posted on: 20 Jan 2018

To learn more about Apache Spark, attend Spark Summit East in New York in Feb 2016. Developers have always loved Apache […]

A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets

Posted on: 20 Jan 2018

When to use them and why. Of all the developers’ delight, none is more attractive than a set of APIs […]

Apache Spark as a Compiler: Joining a Billion Rows per Second on a Laptop

Posted on: 20 Jan 2018

When our team at Databricks planned our contributions to the upcoming Apache Spark 2.0 release, we set out with an […]

Apache Kafka for Beginners

Posted on: 20 Jan 2018

When used in the right way and for the right use case, Kafka has unique attributes that make it a […]

KPL Java Sample Application

Posted on: 20 Jan 2018

Setup: You will need the following: A stream to put into (with any number of shards). There should be no […]

Amazon Kinesis tutorial – a getting started guide

Posted on: 20 Jan 2018

Of all the developments on the Snowplow roadmap, the one that we are most excited about is porting the Snowplow […]

Transactions in Apache Kafka

Posted on: 20 Jan 2018

In a previous blog post, we introduced exactly once semantics for Apache Kafka®. That post covered the various message delivery semantics, […]