Learning Big Data Development from Scratch: Spark Learning Resources

This series starts from Spark 1.6.0, the latest release at the time of writing. Spark is evolving quickly, so it is worth recording the version number. Source: segmentfault



1. Books

  • Learning Spark
  • Mastering Apache Spark

2. Websites

  • Official site
  • User mailing list
  • Spark channel on YouTube
  • Spark Summit
  • Meetup
  • Spark third-party packages
  • Databricks blog
  • Databricks docs
  • Databricks training
  • Cloudera blog about Spark
  • https://0x0fff.com
  • http://techsuppdiva.github.io/
  • CSDN Spark knowledge base
  • 過往記憶 (Chinese-language blog)

3. Articles and blogs

  • RDD paper (English version)
  • RDD paper (Chinese version)
  • An Architecture for Fast and General Data Processing on Large Clusters
  • How-to: Tune Your Apache Spark Jobs (Part 1)
  • How-to: Tune Your Apache Spark Jobs (Part 2)
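  • Speeding up Spark 45x with Redis
  • 量化派's big data risk-control architecture based on Hadoop, Spark, and Storm
  • A heterogeneous distributed deep learning platform based on Spark
  • How well do you know the Hadoop and Spark ecosystems?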
  • Hadoop vs Spark
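  • Yahoo open-sources CaffeOnSpark: distributed deep learning on Hadoop/Spark
  • Second Shanghai Spark meetup of 2016: 1. spark_meetup.pdf
  • Second Shanghai Spark meetup of 2016: 2. Flink_ An unified stream engine.pdf
  • Second Shanghai Spark meetup of 2016: 3. Spark在計算廣告領域的應用實踐.pdf (Spark in practice in computational advertising)
  • Second Shanghai Spark meetup of 2016: 4. splunk_spark.pdf
  • Healthcare and financial big data on Spark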

4. Videos

  • YouTube: what is apache spark
  • Introduction to Spark Architecture
  • Top 5 Mistakes When Writing Spark Applications
  • slide Top 5 mistakes when writing Spark applications
  • Tuning and Debugging Apache Spark
  • slide Tuning and Debugging Apache Spark
  • A Deeper Understanding of Spark Internals - Aaron Davidson (Databricks)
  • slide A Deeper Understanding of Spark Internals - Aaron Davidson (Databricks)
  • Building, Debugging, and Tuning Spark Machine Learning Pipelines - Joseph Bradley (Databricks)
  • slide Building, Debugging, and Tuning Spark Machine Learning Pipelines
  • Spark DataFrames Simple and Fast Analysis of Structured Data - Michael Armbrust (Databricks)
  • slide Spark DataFrames Simple and Fast Analysis of Structured Data - Michael Armbrust (Databricks)
  • Spark Tuning for Enterprise System Administrators
  • slide Spark Tuning for Enterprise System Administrators
  • Structuring Spark: DataFrames, Datasets, and Streaming
  • slide Structuring Spark: DataFrames, Datasets, and Streaming
  • Spark in Production: Lessons from 100+ Production Users
  • slide Spark in Production: Lessons from 100+ Production Users
  • Production Spark and Tachyon use Cases
  • slide Production Spark and Tachyon use Cases
  • SparkUI Visualization
  • slide SparkUI Visualization
  • Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San Jose 2015
  • slide Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San Jose 2015
  • Large Scale Distributed Machine Learning on Apache Spark
  • Securing your Spark Applications
  • slide Securing your Spark Applications
  • Building a REST Job Server for Interactive Spark as a Service
  • slide Building a REST Job Server for Interactive Spark as a Service
  • Exploiting GPUs for Columnar DataFrame Operations
  • slide Exploiting GPUs for Columnar DataFrame Operations
  • Easy JSON Data Manipulation in Spark - Yin Huai (Databricks)
  • slide Easy JSON Data Manipulation in Spark - Yin Huai (Databricks)
  • Sparkling: Speculative Partition of Data for Spark Applications - Peilong Li
  • slide Sparkling: Speculative Partition of Data for Spark Applications - Peilong Li
  • Advanced Spark Internals and Tuning – Reynold Xin
  • slide Advanced Spark Internals and Tuning – Reynold Xin
  • The Future of Real Time in Spark

Original article: https://segmentfault.com/a/1190000005020672

