Spark & Jupyter Notebooks (taught by Dr. Scott Jensen)
Date and time
Location
SJSU BBC 305
Description
Why we hope to see YOU at the seminar!
Apache Spark and Jupyter notebooks are currently two of the hottest tools in data science, and this seminar gives you the opportunity to work hands-on with both, even if you have no prior experience in programming or data science. You don’t even need your own computer!
Jupyter notebooks and Apache Spark are used by data scientists at some of the largest web-based companies in Silicon Valley. Apache Spark lets data scientists explore large datasets in varied formats and quickly identify patterns in the data. Jupyter notebooks let them not only visualize and document their results, but also easily share their research with colleagues and even generate publications, webpages, and presentations. Together, through a web-based interface, these tools allow you to explore and experiment with large datasets, quickly ask questions about your data, generate visualizations, and share your work, all without extensive coding! With a couple of clicks you can even publish your notebook to the web and share a link with family, friends, or recruiters, or include it on your LinkedIn profile.
After participating in the seminar and completing the post-seminar assessment, you will be able to:
- Load data into Spark DataFrames and ask basic questions of your data using PySpark
- Understand the importance of documenting your work and use Markdown to do so in Jupyter notebooks
- Create basic visualizations in Jupyter
- Share and publish your results
Please register to get access to the seminar and optional pre-seminar materials.