Text as data : a new framework for machine learning and the social sciences / Justin Grimmer, Margaret E. Roberts, Brandon M. Stewart.
By: Grimmer, Justin [author.].
Contributor(s): Roberts, Margaret E [author.] | Stewart, Brandon M [author.].
Publisher: Princeton, New Jersey : Princeton University Press, ©2022
Description: xix, 336 pages : illustrations ; 26 cm.
Content type: text
Media type: unmediated
Carrier type: volume
ISBN: 9780691207544; 0691207542; 9780691207551; 0691207550.
Subject(s): Text data mining | Social sciences -- Data processing | Machine learning
Genre/Form: Print books.

Current location | Call number | Status | Date due | Barcode | Item holds |
---|---|---|---|---|---|
On Shelf | QA76.9.D343 G758 2022 | Available | | AU00000000018753 | |
Includes bibliographical references (pages [307]-329) and index.
Part I. Preliminaries. Chapter 1: Introduction -- Chapter 2: Social science research and text analysis --
Part II. Selection and representation. Chapter 3: Principles of selection and representation -- Chapter 4: Selecting documents -- Chapter 5: Bag of words -- Chapter 6: The multinomial language model -- Chapter 7: The vector space model and similarity metrics -- Chapter 8: Distributed representations of words -- Chapter 9: Representations from language sequences --
Part III. Discovery. Chapter 10: Principles of discovery -- Chapter 11: Discriminating words -- Chapter 12: Clustering -- Chapter 13: Topic models -- Chapter 14: Low-dimensional document embeddings --
Part IV. Measurement. Chapter 15: Principles of measurement -- Chapter 16: Word counting -- Chapter 17: An overview of supervised classification -- Chapter 18: Coding a training set -- Chapter 19: Classifying documents with supervised learning -- Chapter 20: Checking performance -- Chapter 21: Repurposing discovery methods --
Part V. Inference. Chapter 22: Principles of inference -- Chapter 23: Prediction -- Chapter 24: Causal inference -- Chapter 25: Text as outcome -- Chapter 26: Text as treatment -- Chapter 27: Text as confounder --
Part VI. Conclusion. Chapter 28: Conclusion.
"From social media posts and text messages to digital government documents and archives, researchers are bombarded with a deluge of text reflecting the social world. This textual data gives unprecedented insights into fundamental questions in the social sciences, humanities, and industry. Meanwhile new machine learning tools are rapidly transforming the way science and business are conducted. Text as Data shows how to combine new sources of data, machine learning tools, and social science research design to develop and evaluate new insights. Text as Data is organized around the core tasks in research projects using text--representation, discovery, measurement, prediction, and causal inference. The authors offer a sequential, iterative, and inductive approach to research design. Each research task is presented complete with real-world applications, example methods, and a distinct style of task-focused research. Bridging many divides--computer science and social science, the qualitative and the quantitative, and industry and academia--Text as Data is an ideal resource for anyone wanting to analyze large collections of text in an era when data is abundant and computation is cheap, but the enduring challenges of social science remain." -- publisher.