In a technical book club at the meetup "San Diego Technology Immersion Group", we are reading a book about Spark. Here is the complete reference:
"Spark: The Definitive Guide" by Bill Chambers and Matei Zaharia (O'Reilly). Copyright 2018 Databricks, Inc., ISBN 978-1-491-91221-8
I was especially interested in the Machine Learning part of the book. Unfortunately, I learned that the code provided in the book's GitHub repository is missing some lines compared with the book, and some of it does not work. So I made my own source files and notebooks for chapters 24, 25 and 26. They are here:
https://github.com/Mathemilda/SparkTheDefinitiveGuideBook/tree/master/Part_IV_AdvAn%26ML