Data Science on the Google Cloud Platform

Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning

Second edition

Paperback (08 Apr 2022)

Save $21.50

  • RRP $81.18
  • $59.68

Includes delivery to the United States

10+ copies available online - Usually dispatched within two working days

Publisher's Synopsis

Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud-native tools on GCP.

Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way.

You'll learn how to:

  • Employ best practices in building highly scalable data and ML pipelines on Google Cloud
  • Automate and schedule data ingest using Cloud Run
  • Create and populate a dashboard in Data Studio
  • Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
  • Conduct interactive data exploration with BigQuery
  • Create a Bayesian model with Spark on Cloud Dataproc
  • Forecast time series and do anomaly detection with BigQuery ML
  • Aggregate within time windows with Dataflow
  • Train explainable machine learning models with Vertex AI
  • Operationalize ML with Vertex AI Pipelines

Book information

ISBN: 9781098118952
Publisher: O'Reilly Media
Imprint: O'Reilly
Pub date: 08 Apr 2022
Edition: Second edition
DEWEY: 004.6782
DEWEY edition: 23
Language: English
Number of pages: xvii, 440 pages; illustrations; 24 cm
Weight: 802g
Height: 234mm
Width: 178mm
Spine width: 27mm