Machine Learning for Hackers
Case Studies and Algorithms to Get You Started
Publisher: O'Reilly Media
Released: February 2012
Pages: 324
If you’re an experienced programmer interested in crunching data, this book will get you started with machine learning—a toolkit of algorithms that enables computers to train themselves to automate useful tasks. Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation.
Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you’ll learn how to analyze sample datasets and write simple machine learning algorithms. Machine Learning for Hackers is ideal for programmers from any background, including business, government, and academic research.
- Develop a naïve Bayesian classifier to determine if an email is spam, based only on its text
- Use linear regression to predict the number of page views for the top 1,000 websites
- Learn optimization techniques by attempting to break a simple letter cipher
- Compare and contrast U.S. Senators statistically, based on their voting records
- Build a “whom to follow” recommendation system from Twitter data
Chapter 1 Using R
R for Machine Learning
Chapter 2 Data Exploration
Exploration versus Confirmation
What Is Data?
Inferring the Types of Columns in Your Data
Inferring Meaning
Numeric Summaries
Means, Medians, and Modes
Quantiles
Standard Deviations and Variances
Exploratory Data Visualization
Visualizing the Relationships Between Columns
Chapter 3 Classification: Spam Filtering
This or That: Binary Classification
Moving Gently into Conditional Probability
Writing Our First Bayesian Spam Classifier
Chapter 4 Ranking: Priority Inbox
How Do You Sort Something When You Don’t Know the Order?
Ordering Email Messages by Priority
Writing a Priority Inbox
Chapter 5 Regression: Predicting Page Views
Introducing Regression
Predicting Web Traffic
Defining Correlation
Chapter 6 Regularization: Text Regression
Nonlinear Relationships Between Columns: Beyond Straight Lines
Methods for Preventing Overfitting
Text Regression
Chapter 7 Optimization: Breaking Codes
Introduction to Optimization
Ridge Regression
Code Breaking as Optimization
Chapter 8 PCA: Building a Market Index
Unsupervised Learning
Chapter 9 MDS: Visually Exploring US Senator Similarity
Clustering Based on Similarity
How Do US Senators Cluster?
Chapter 10 kNN: Recommendation Systems
The k-Nearest Neighbors Algorithm
R Package Installation Data
Chapter 11 Analyzing Social Graphs
Social Network Analysis
Hacking Twitter Social Graph Data
Analyzing Twitter Networks
Chapter 12 Model Comparison
SVMs: The Support Vector Machine
Comparing Algorithms
Works Cited
Colophon
- Title: Machine Learning for Hackers
- By: Drew Conway, John Myles White
- Publisher: O'Reilly Media
- Formats: Ebook, Safari Books Online
- Print: February 2012
- Ebook: February 2012
- Pages: 324
- Print ISBN: 978-1-4493-0371-6 (ISBN 10: 1-4493-0371-4)
- Ebook ISBN: 978-1-4493-0378-5 (ISBN 10: 1-4493-0378-1)