Practical Web Scraping for Data Science : Best Practices and Examples with Python / Seppe vanden Broucke, Bart Baesens.
By: Broucke, Seppe vanden [author.].
Contributor(s): Baesens, Bart [author.].
Publisher: Berkeley, CA : Springer Science and Business Media : Apress, ©2018
Description: xvi, 306 pages : illustration; 25 cm.
Content type: text
Media type: unmediated
Carrier type: volume
ISBN: 9781484235812.
Subject(s): Automatic data collection systems | Data mining | Python (Computer program language)
Genre/Form: Print books.

Current location | Call number | Status | Date due | Barcode | Item holds |
---|---|---|---|---|---|
On Shelf | QA76.55 .B76 2018 | Available | | | |
Includes bibliographical references and index.
Intro; Table of Contents; About the Authors; About the Technical Reviewer; Introduction; Part I: Web Scraping Basics; Chapter 1: Introduction; 1.1 What Is Web Scraping?; 1.1.1 Why Web Scraping for Data Science?; 1.1.2 Who Is Using Web Scraping?; 1.2 Getting Ready; 1.2.1 Setting Up; 1.2.2 A Quick Python Primer; Chapter 2: The Web Speaks HTTP; 2.1 The Magic of Networking; 2.2 The HyperText Transfer Protocol: HTTP; 2.3 HTTP in Python: The Requests Library; 2.4 Query Strings: URLs with Parameters; Chapter 3: Stirring the HTML and CSS Soup; 3.1 Hypertext Markup Language: HTML.
3.2 Using Your Browser as a Development Tool; 3.3 Cascading Style Sheets: CSS; 3.4 The Beautiful Soup Library; 3.5 More on Beautiful Soup; Part II: Advanced Web Scraping; Chapter 4: Delving Deeper in HTTP; 4.1 Working with Forms and POST Requests; 4.2 Other HTTP Request Methods; 4.3 More on Headers; 4.4 Dealing with Cookies; 4.5 Using Sessions with Requests; 4.6 Binary, JSON, and Other Forms of Content; Chapter 5: Dealing with JavaScript; 5.1 What Is JavaScript?; 5.2 Scraping JavaScript; 5.3 Scraping with Selenium; 5.4 More on Selenium; Chapter 6: From Web Scraping to Web Crawling.
6.1 What Is Web Crawling?; 6.2 Web Crawling in Python; 6.3 Storing Results in a Database; Part III: Managerial Concerns and Best Practices; Chapter 7: Managerial and Legal Concerns; 7.1 The Data Science Process; 7.2 Where Does Web Scraping Fit In?; 7.3 Legal Concerns; Chapter 8: Closing Topics; 8.1 Other Tools; 8.1.1 Alternative Python Libraries; 8.1.2 Scrapy; 8.1.3 Caching; 8.1.4 Proxy Servers; 8.1.5 Scraping in Other Programming Languages; 8.1.6 Command-Line Tools; 8.1.7 Graphical Scraping Tools; 8.2 Best Practices and Tips; Chapter 9: Examples; 9.1 Scraping Hacker News.
9.2 Using the Hacker News API; 9.3 Quotes to Scrape; 9.4 Books to Scrape; 9.5 Scraping GitHub Stars; 9.6 Scraping Mortgage Rates; 9.7 Scraping and Visualizing IMDB Ratings; 9.8 Scraping IATA Airline Information; 9.9 Scraping and Analyzing Web Forum Interactions; 9.10 Collecting and Clustering a Fashion Data Set; 9.11 Sentiment Analysis of Scraped Amazon Reviews; 9.12 Scraping and Analyzing News Articles; 9.13 Scraping and Analyzing a Wikipedia Graph; 9.14 Scraping and Visualizing a Board Members Graph; 9.15 Breaking CAPTCHA's Using Deep Learning; Index.
Access limited to UNC Chapel Hill-authenticated users. Unlimited simultaneous users.
Including many larger, fully worked out examples, this book provides a complete and modern guide to web scraping, using Python as the programming language, without glossing over important details or best practices.