Ming Chen & Wenqiang Feng


This repository mainly contains notes by Ming Chen & Wenqiang Feng from learning Apache Spark. We use detailed demo code and examples to show how to use PySpark for big data mining. If you find that your work is not cited in these notes, please let us know.

Cheat Sheets


Feedback and suggestions

Your comments and suggestions are highly appreciated. We are more than happy to receive corrections or any other feedback for improvement.