Python Data Science Handbook: Essential Tools for Working with Data


For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all: IPython, NumPy, Pandas, Matplotlib, Scikit-Learn and other related tools.

Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.

With this handbook, you'll learn how to use:
IPython and Jupyter: provide computational environments for data scientists using Python
NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python
Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python
Matplotlib: includes capabilities for a flexible range of data visualizations in Python
Scikit-Learn: provides efficient and clean Python implementations of the most important and established machine learning algorithms


Table of Contents


Preface

1. IPython: Beyond Normal Python

  • Help and Documentation in IPython
  • Keyboard Shortcuts in the IPython Shell
  • IPython Magic Commands
  • Input and Output History
  • IPython and Shell Commands
  • Errors and Debugging
  • Profiling and Timing Code
  • More IPython Resources

2. Introduction to NumPy

  • Understanding Data Types in Python
  • The Basics of NumPy Arrays
  • Computation on NumPy Arrays: Universal Functions
  • Aggregations: Min, Max, and Everything In Between
  • Computation on Arrays: Broadcasting
  • Comparisons, Masks, and Boolean Logic
  • Fancy Indexing
  • Sorting Arrays
  • Structured Data: NumPy's Structured Arrays

3. Data Manipulation with Pandas

  • Introducing Pandas Objects
  • Data Indexing and Selection
  • Operating on Data in Pandas
  • Handling Missing Data
  • Hierarchical Indexing
  • Combining Datasets: Concat and Append
  • Combining Datasets: Merge and Join
  • Aggregation and Grouping
  • Pivot Tables
  • Vectorized String Operations
  • Working with Time Series
  • High-Performance Pandas: eval() and query()
  • Further Resources

4. Visualization with Matplotlib

  • Simple Line Plots
  • Simple Scatter Plots
  • Visualizing Errors
  • Density and Contour Plots
  • Histograms, Binnings, and Density
  • Customizing Plot Legends
  • Customizing Colorbars
  • Multiple Subplots
  • Text and Annotation
  • Customizing Ticks
  • Customizing Matplotlib: Configurations and Stylesheets
  • Three-Dimensional Plotting in Matplotlib
  • Geographic Data with Basemap
  • Visualization with Seaborn
  • Further Resources

5. Machine Learning

  • What Is Machine Learning?
  • Introducing Scikit-Learn
  • Hyperparameters and Model Validation
  • Feature Engineering
  • In Depth: Naive Bayes Classification
  • In Depth: Linear Regression
  • In Depth: Support Vector Machines
  • In Depth: Decision Trees and Random Forests
  • In Depth: Principal Component Analysis
  • In Depth: Manifold Learning
  • In Depth: k-Means Clustering
  • In Depth: Gaussian Mixture Models
  • In Depth: Kernel Density Estimation
  • Application: A Face Detection Pipeline
  • Further Machine Learning Resources