Deep Learning on Databricks

Integrating with TensorFlow, Caffe, MXNet, and Theano
We are excited to announce the general availability...

Scalable Partition Handling for Cloud-Native Architecture in Apache Spark 2.1

Apache Spark 2.1 is just around the corner: the community is going through the voting process...

Apache Spark @Scale: A 60 TB+ production use case from Facebook

This is a guest Apache Spark community blog from Facebook Engineering. In this technical blog, Facebook...

Apache Spark 2.0 Preview: Machine Learning Model Persistence

An ability to save and load models across languages
Introduction: Consider these Machine Learning (ML)...

Structured Streaming In Apache Spark

A new high-level API for streaming
Apache Spark 2.0 adds the first version of a...

Introducing Apache Spark 2.0

Now generally available on Databricks
Today, we’re excited to announce the general availability of Apache Spark...

Introducing GraphFrames

We would like to thank Ankur Dave from UC Berkeley AMPLab for his contribution to this...

Introducing Apache Spark Datasets

To learn more about Apache Spark, attend Spark Summit East in New York in Feb 2016....

A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets

When to use them and why
Of all the developers’ delight, none is more attractive...

Apache Spark as a Compiler: Joining a Billion Rows per Second on a Laptop

When our team at Databricks planned our contributions to the upcoming Apache Spark 2.0 release,...