
Fundamentals of data science : theory and practice / Jugal K. Kalita, Dhruba K. Bhattacharyya, Swarup Roy

Resource type: Book (Online)
Language: English
Publisher: London : Academic Press, [2024]
Copyright date: © 2024
Description: 1 online resource
ISBN:
  • 0323972632
  • 9780323972635
Additional physical formats: 9780323917780 | Also published as: Fundamentals of data science. Print edition. London : Academic Press, 2024. xxvi, 307 pages
DDC classification:
  • 005.7
LOC classification:
  • QA76.9.B45
Contents:
Front Cover -- Fundamentals of Data Science -- Copyright -- Contents -- Preface -- Acknowledgment -- Foreword -- Foreword
1 Introduction -- 1.1 Data, information, and knowledge -- 1.2 Data Science: the art of data exploration -- 1.2.1 Brief history -- 1.2.2 General pipeline -- 1.2.2.1 Data collection and integration -- 1.2.2.2 Data preparation -- 1.2.2.3 Learning-model construction -- 1.2.2.4 Knowledge interpretation and presentation -- 1.2.3 Multidisciplinary science -- 1.3 What is not Data Science? -- 1.4 Data Science tasks -- 1.4.1 Predictive Data Science -- 1.4.2 Descriptive Data Science -- 1.4.3 Diagnostic Data Science -- 1.4.4 Prescriptive Data Science -- 1.5 Data Science objectives -- 1.5.1 Hidden knowledge discovery -- 1.5.2 Prediction of likely outcomes -- 1.5.3 Grouping -- 1.5.4 Actionable information -- 1.6 Applications of Data Science -- 1.7 How to read the book? -- References
2 Data, sources, and generation -- 2.1 Introduction -- 2.2 Data attributes -- 2.2.1 Qualitative -- 2.2.1.1 Nominal -- 2.2.1.2 Binary -- 2.2.1.3 Ordinal -- 2.2.2 Quantitative -- 2.2.2.1 Discrete -- 2.2.2.2 Continuous -- 2.2.2.3 Interval -- 2.2.2.4 Ratio -- 2.3 Data-storage formats -- 2.3.1 Structured data -- 2.3.2 Unstructured data -- 2.3.3 Semistructured data -- 2.4 Data sources -- 2.4.1 Primary sources -- 2.4.2 Secondary sources -- 2.4.3 Popular data sources -- 2.4.4 Homogeneous vs. heterogeneous data sources -- 2.5 Data generation -- 2.5.1 Types of synthetic data -- 2.5.2 Data-generation steps -- 2.5.3 Generation methods -- 2.5.4 Tools for data generation -- 2.5.4.1 Software tools -- 2.5.4.2 Python libraries -- 2.6 Summary -- References
3 Data preparation -- 3.1 Introduction -- 3.2 Data cleaning -- 3.2.1 Handling missing values -- 3.2.1.1 Ignoring and discarding data -- 3.2.1.2 Parameter estimation -- 3.2.1.3 Imputation -- 3.2.2 Duplicate-data detection -- 3.2.2.1 Knowledge-based methods -- 3.2.2.2 ETL method -- 3.3 Data reduction -- 3.3.1 Parametric data reduction -- 3.3.2 Sampling -- 3.3.3 Dimensionality reduction -- 3.4 Data transformation -- 3.4.1 Discretization -- 3.4.1.1 Supervised discretization -- 3.4.1.2 Unsupervised discretization -- 3.5 Data normalization -- 3.5.1 Min-max normalization -- 3.5.2 Z-score normalization -- 3.5.3 Decimal-scaling normalization -- 3.5.4 Quantile normalization -- 3.5.5 Logarithmic normalization -- 3.6 Data integration -- 3.6.1 Consolidation -- 3.6.2 Federation -- 3.7 Summary -- References
4 Machine learning -- 4.1 Introduction -- 4.2 Machine Learning paradigms -- 4.2.1 Supervised learning -- 4.2.2 Unsupervised learning -- 4.2.3 Semisupervised learning -- 4.3 Inductive bias -- 4.4 Evaluating a classifier -- 4.4.1 Evaluation steps -- 4.4.1.1 Validation -- 4.4.1.2 Testing -- 4.4.1.3 K-fold crossvalidation -- 4.4.2 Handling unbalanced classes -- 4.4.3 Model generalization -- 4.4.3.1 Underfitting -- 4.4.3.2 Overfitting -- 4.4.3.3 Accurate fittings
PPN: 1907961984
Package identifier: ZDB-4-NLEBK | BSZ-4-NLEBK-KAUB
No physical items for this record