Big Data Analytics

1 - Apache Spark Adoption Fuels Big Data Analytics Growth

Rising demand for faster big data analytics apps is driving the adoption of Apache Spark, a core technology for modernizing data warehouses.

2 - Most Important Attributes of Apache Spark

A full 91% of respondents cited performance as the most important attribute, followed by ease of programming, at 77%; ease of deployment, at 71%; and advanced analytics, at 64%.

3 - Other Important Apache Spark Features

Real-time streaming (52%) and DataFrames (47%) are additional important features. A full 64% of respondents are running the latest version of Apache Spark.

4 - Use Cases for Apache Spark

Business intelligence was ranked highest, at 68%, followed by data warehousing (52%), recommendation systems (44%) and log processing (40%).

5 - Most Common Apache Spark Deployment Models

Nearly half (48%) deploy Apache Spark in stand-alone mode, followed by YARN running on Hadoop, at 40%. Just over half of respondents (51%) are running Apache Spark on public cloud.

6 - Most Common Apache Spark Platforms

A full 75% are running Apache Spark on a Linux/Unix platform, while 47% are running on OS X. The fastest-growing platform is Windows (23%), which grew 17 percentage points from 2014.

7 - Most Used Apache Spark Components

Nearly seven in 10 (69%) are using Spark SQL, followed by DataFrames (62%), MLlib + GraphX (58%) and streaming (58%). Three-quarters (75%) are using two or more Apache Spark components.

8 - Programming Languages Used With Apache Spark

At 71%, Scala is the most widely used programming language, followed by Python (58%), SQL (36%) and Java (31%). Python use is up 49% year-over-year.

9 - Apache Spark Components Used in Production

Nearly a quarter (24%) cite SQL, followed by DataFrames and advanced analytics (at 15% each), and streaming (14%). SQL use grew 380% from 2014.
