
Data Mining Project
Daniel Brockman
Dan@DanielBrockman.com
Pursuant to
Course CIS366, SAS Data Mining
Summer 2006
University of California Berkeley Extension
Jianmin Liu, Instructor
jiliu@ggu.edu

Contents

Report
Lift Charts for Final Models
cumulative
noncumulative
Notes on the project
Conclusion
On working with fellow students
Candidate Neural Network Models
Model NND2
Model NND3
Lift Chart  cumulative
Lift Chart  noncumulative
Project Assignment
Data
Review by
Jaison K. Joseph



Cumulative Lift Chart for Final Models




Noncumulative Lift Chart for Final Models
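The cumulative and noncumulative charts differ only in how the ranked cases are aggregated. The sketch below illustrates the general idea and is not the SAS implementation. It assumes a binary target coded 0/1, a predicted score per case, and equal-sized bins (deciles by default). The function name and parameters are hypothetical.

```python
def lift_by_bin(scores, targets, n_bins=10):
    """Return (noncumulative, cumulative) lift lists, best-scored bin first.

    Assumes targets are 0/1 and a higher score means a more likely response.
    """
    if len(scores) != len(targets) or not scores:
        raise ValueError("scores and targets must be non-empty and equal in length")
    base_rate = sum(targets) / len(targets)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no responders")

    # Rank cases from highest to lowest predicted score.
    ranked = [t for _, t in sorted(zip(scores, targets),
                                   key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    noncumulative, cumulative = [], []
    seen = hits = 0
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than cases
        seen += len(chunk)
        hits += sum(chunk)
        # Noncumulative: response rate in this bin alone, relative to the base rate.
        noncumulative.append(sum(chunk) / len(chunk) / base_rate)
        # Cumulative: response rate in all bins so far, relative to the base rate.
        cumulative.append(hits / seen / base_rate)
    return noncumulative, cumulative
```

By construction the cumulative lift ends at 1.0 once every case has been included, while the noncumulative lift can fall below 1.0 in the low-scoring bins.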




Notes on the project

SAS Enterprise Miner impresses me
with how easily one can produce useful
results, which makes it appear simple.
At the same time, it offers many
opportunities to insert controls on the
modeling process, which makes it appear
complex. Constrained by time, I focused
on the specifically assigned tasks, but
did allow myself a bit of exploration.
1. Conclusion: The Tree model gives
results as good as or better than the
other models, while running more quickly
than the Neural Networks and considering
more data than the Logit model. If the
cost of data processing becomes
significant, the Tree model should
outperform the others in return on cost.
If the cost of data processing isn't
significant, then the ensemble of three
models, TreeD2, LogitD2 and NND3, gives
the best results.
2. The Neural Network Models:
Interested in Neural Networks, I created
several of these models and retained the
two that proved interesting, NND2 and NND3.
I used an Assessment node to choose NND3
as the best Neural Network model.
3. Variable Transformation: The Neural
Networks weren't interesting until I used
the Variable Transformation node to
create some binned variables, which
showed up as "Ordinal" in the
network diagrams.
4. Discarded Models: In all, I
created two Logit models, two Tree models
and four Neural Network models, and
discarded half of them because they had
no predictive value and no interest.
5. Working with fellow students: I had
initially agreed to collaborate with
Chookij Vanatham, but he had scheduling
conflicts with unrelated activities and
had to withdraw from the collaboration.
At that point, no opportunity remained
to plan a joint project with someone
else. Jaison Joseph reviewed my work at
an intermediate stage. Nicholas Zemstov
and Chookij asked me about structuring
the project. They and Jaison also asked
me some technical questions, and I
offered what suggestions I could. Jaison,
Chookij and I worked together on the
technical task of getting copies of the
SAS software for our use.
6. Considerations for the Future:
At the end of a project, I always notice
things that might have been done
differently but for some constraint.
These serve to guide future projects.
One is the assessments' use of "profit"
to judge the goodness of a model.
In the future, I want to find the control
on this behavior and investigate
alternatives. Also, I've given little
attention to CHAID, CART and C4.5 models,
all of which have virtues about which I
want to learn more. Further, the Neural
Network models' capabilities interest me
immensely, and I want to explore them more.
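The binning mentioned in note 3 turns a continuous input into a small set of ordered categories. The sketch below shows one way to do this in plain Python, assuming equal-frequency (quantile) bins. The notes do not say which binning method or how many bins the Variable Transformation node used, so the method, the function name, and the default of four bins are all assumptions for illustration.

```python
def quantile_bin(values, n_bins=4):
    """Map each value to an ordinal bin index in 0..n_bins-1.

    Assumption: equal-frequency binning by rank. Tied values may land in
    adjacent bins, which a production tool would normally avoid.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    n = len(values)
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        # The rank of a value determines its bin, so bins hold equal counts.
        bins[i] = rank * n_bins // n
    return bins
```

The resulting indices keep the ordering of the original values but discard their spacing, which is why such a variable is treated as ordinal rather than interval.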



Candidate Neural Network Models
Model NND2




Candidate Neural Network Models
Model NND3




Candidate Neural Network Models
Lift Chart  Cumulative




Candidate Neural Network Models
Lift Chart  Noncumulative



