Updated March 2023
The following is a non-exhaustive list of work I completed during my time at Google.
Highlights:
Videos:
Writing:
- Dataproc Serverless codelab
- Dataproc Serverless in-console tutorial
- Improve the data science experience using scalable Python data processing (blog)
- Spark job tuning tips (GCP documentation)
- Dataproc Workflow using Cloud Composer (GCP documentation tutorial)
Talks:
- Spark Serverless at Google Open Source Live
- Next 2021 Developer Demo
- Intro to TensorFlow 2.0 at Spark Summit 2019
- Python Distributed Machine Learning at PyGotham 2019
Samples:
- Dataproc client library quickstart samples (in Python, Java, Node.js, Go)
- Machine Learning initialization action for Dataproc
Longer list:
Ebooks:
- RAPIDS + Spark 3.0 on Dataproc (partnership with NVIDIA)
Videos:
Blogs:
- Faster machine learning on Dataproc with new initialization action
- Improve the data science experience using scalable Python data processing
- Presto optional component now available on Dataproc
Codelabs:
Talks:
- Spark + Jupyter Notebooks: JupyterCon 2020
- ML Engineer Advice: chat with HiCounselor
- Intro to TensorFlow 2.0: Spark Summit 2019
- Intro to TensorFlow 2.0: EuroPython 2019
- Intro to TensorFlow 2.0: AI Camp 2019
- Python Distributed Machine Learning: PyGotham 2019
- Machine Learning with TensorFlow and PyTorch on Apache Hadoop using Cloud Dataproc: Google Next 2019