Adventures in DM

Applied data mining, predictive modeling and statistical analysis - largely focused on marketing analytics.

Wednesday, February 6, 2013

Imputation with Random Forest : Miss Forest

Random Forest Imputation: One of the nice "extras" of the random forest algorithm (Breiman, 2001) is its use for mixed ...
Monday, January 7, 2013

Partial Dependency Plots and GBM

My favorite "go-to-first" modeling algorithm for a classification or regression task is Gradient Boosted Regression Trees (Fri...
Wednesday, January 2, 2013

Need for Repeated Hold Outs in Predictive Models

Many models are built and deployed using a training/validation partitioning approach. Under this construct a data set is randomly split in t...

Delta Method in Logistic Regression

In the last post we looked at how to construct and evaluate a simple linear hypothesis test within the framework of logistic regression (or...
Monday, December 31, 2012

Linear Hypothesis Tests

Returning to our example from the last post, within the framework of a generalized linear model, one way to statistically test if there are ...
Sunday, December 30, 2012

Simple Logistic Regression (GLM and Optim)

›
We will be using the data from the 1998 KDD Cup in the next couple of posts (or at least a couple of its columns, which we will re-purpose). ...
Thursday, December 27, 2012

Introduction

This blog will be an outlet to write about data mining and predictive modeling. I plan to explore and comment on the use of these discipline...

About Me

Jeff
www.linkedin.com/in/jeffreymallard/