Designing Machine Learning Systems Chip Huyen

By: Huyen, Chip [author].
Publisher: O'Reilly Media, Inc., ©2022
Edition: 1st edition.
Description: 350 p.
Content type: text
Media type: computer
Carrier type: online resource
Subject(s): Machine learning -- Development
Genre/Form: Print books.
Contents:
Cover -- Copyright -- Table of Contents -- Preface -- Who This Book Is For -- What This Book Is Not -- Navigating This Book -- GitHub Repository and Community -- Conventions Used in This Book -- Using Code Examples -- O'Reilly Online Learning -- How to Contact Us -- Acknowledgments -- Chapter 1. Overview of Machine Learning Systems -- When to Use Machine Learning -- Machine Learning Use Cases -- Understanding Machine Learning Systems -- Machine Learning in Research Versus in Production -- Machine Learning Systems Versus Traditional Software -- Summary
Chapter 2. Introduction to Machine Learning Systems Design -- Business and ML Objectives -- Requirements for ML Systems -- Reliability -- Scalability -- Maintainability -- Adaptability -- Iterative Process -- Framing ML Problems -- Types of ML Tasks -- Objective Functions -- Mind Versus Data -- Summary -- Chapter 3. Data Engineering Fundamentals -- Data Sources -- Data Formats -- JSON -- Row-Major Versus Column-Major Format -- Text Versus Binary Format -- Data Models -- Relational Model -- NoSQL -- Structured Versus Unstructured Data -- Data Storage Engines and Processing
Transactional and Analytical Processing -- ETL: Extract, Transform, and Load -- Modes of Dataflow -- Data Passing Through Databases -- Data Passing Through Services -- Data Passing Through Real-Time Transport -- Batch Processing Versus Stream Processing -- Summary -- Chapter 4. Training Data -- Sampling -- Nonprobability Sampling -- Simple Random Sampling -- Stratified Sampling -- Weighted Sampling -- Reservoir Sampling -- Importance Sampling -- Labeling -- Hand Labels -- Natural Labels -- Handling the Lack of Labels -- Class Imbalance -- Challenges of Class Imbalance -- Handling Class Imbalance
Data Augmentation -- Simple Label-Preserving Transformations -- Perturbation -- Data Synthesis -- Summary -- Chapter 5. Feature Engineering -- Learned Features Versus Engineered Features -- Common Feature Engineering Operations -- Handling Missing Values -- Scaling -- Discretization -- Encoding Categorical Features -- Feature Crossing -- Discrete and Continuous Positional Embeddings -- Data Leakage -- Common Causes for Data Leakage -- Detecting Data Leakage -- Engineering Good Features -- Feature Importance -- Feature Generalization -- Summary
Chapter 6. Model Development and Offline Evaluation -- Model Development and Training -- Evaluating ML Models -- Ensembles -- Experiment Tracking and Versioning -- Distributed Training -- AutoML -- Model Offline Evaluation -- Baselines -- Evaluation Methods -- Summary -- Chapter 7. Model Deployment and Prediction Service -- Machine Learning Deployment Myths -- Myth 1: You Only Deploy One or Two ML Models at a Time -- Myth 2: If We Don't Do Anything, Model Performance Remains the Same -- Myth 3: You Won't Need to Update Your Models as Much
Summary: Many tutorials show you how to develop ML systems from ideation to deployed models. But with constant changes in tooling, those systems can quickly become outdated. Without an intentional design to hold the components together, these systems will become a technical liability, prone to errors and quick to fall apart. In this book, Chip Huyen provides a framework for designing real-world ML systems that are quick to deploy, reliable, scalable, and iterative. These systems have the capacity to learn from new data, improve on past mistakes, and adapt to changing requirements and environments. You'll learn everything from project scoping, data management, model development, deployment, and infrastructure to team structure and business analysis. Learn the challenges and requirements of an ML system in production. Build training data with different sampling and labeling methods. Leverage the best techniques to engineer features for your ML models to avoid data leakage. Select, develop, debug, and evaluate ML models that are best suited for your tasks. Deploy different types of ML systems for different hardware. Explore major infrastructural choices and hardware designs. Understand the human side of ML, including integrating ML into business, user experience, and team structure.
Current location: On Shelf
Call number: Q325.5 .H89 2022
Status: Available
Barcode: AU00000000018603
Total holds: 0

Available to OhioLINK libraries

Made available through: Safari, an O'Reilly Media Company

Copyright © 2020 Alfaisal University Library. All Rights Reserved.
Tel: +966 11 2158948 Fax: +966 11 2157910 Email: librarian@alfaisal.edu