Design Patterns Applied to CHAID, QUEST and other Classification Schemes/Algorithms
Generalized Factor Tree Patterns
The following use-case model explains the general approach to factor analysis:
Use-case model for factor analysis
The following activity diagrams explain the mechanics of a factor analysis. The analysis starts by running through Learn Mode, which produces the rules used for subsequent analysis. When a new data set is obtained, it is run through Analyze Mode, where it is compared against the rules produced in Learn Mode.
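The two modes can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `learn_mode` and `analyze_mode`, the sample records, and the very simple rule form (one rule per factor level, predicting the most common class seen for that level). Real tree algorithms produce far richer rules.

```python
from collections import Counter, defaultdict


def learn_mode(records, factor, target):
    """Hypothetical Learn Mode: derive one rule per level of `factor`.

    Each rule maps a factor level to the most common target class
    observed for that level in the training records.
    """
    counts = defaultdict(Counter)
    for row in records:
        counts[row[factor]][row[target]] += 1
    return {level: c.most_common(1)[0][0] for level, c in counts.items()}


def analyze_mode(records, factor, rules, default=None):
    """Hypothetical Analyze Mode: compare new records against the rules.

    Levels that never appeared in Learn Mode fall back to `default`.
    """
    return [rules.get(row[factor], default) for row in records]


# Illustrative training data (made up for this sketch).
training = [
    {"region": "north", "bought": "yes"},
    {"region": "north", "bought": "yes"},
    {"region": "north", "bought": "no"},
    {"region": "south", "bought": "no"},
    {"region": "south", "bought": "no"},
]

rules = learn_mode(training, "region", "bought")

# A new data set is run through Analyze Mode using the learned rules.
new_data = [{"region": "south"}, {"region": "north"}, {"region": "east"}]
predictions = analyze_mode(new_data, "region", rules, default="unknown")
```

Keeping the rules as a plain data structure returned by Learn Mode means Analyze Mode needs no access to the training data, which mirrors the separation described above.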
The following activity diagram explains how Ross Quinlan architected the primary routines of C4.5:
C45 Activity Diagram
The following diagrams decompose the algorithms associated with CHAID and QUEST and describe them using design patterns.
Diagram of Analyze Process
Class Diagram of Interaction between Factor Analysis Components
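As a rough illustration of how a design pattern might apply to these algorithms, the sketch below uses the Strategy pattern to make the factor-selection criterion interchangeable. This is an assumption, not a transcription of the diagrams: the text does not name the patterns used, and the class and function names (`SplitSelector`, `ChiSquareSelector`, `choose_factor`) are hypothetical. The chi-square strategy is a simplification of CHAID, which compares adjusted p-values and merges categories rather than ranking raw statistics.

```python
from abc import ABC, abstractmethod
from collections import Counter


class SplitSelector(ABC):
    """Hypothetical Strategy interface: scores a factor against the target."""

    @abstractmethod
    def score(self, rows, factor, target):
        """Return a number; larger means a stronger association."""


class ChiSquareSelector(SplitSelector):
    """Simplified CHAID-style criterion: the raw Pearson chi-square statistic."""

    def score(self, rows, factor, target):
        n = len(rows)
        joint = Counter((r[factor], r[target]) for r in rows)
        factor_totals = Counter(r[factor] for r in rows)
        target_totals = Counter(r[target] for r in rows)
        stat = 0.0
        for f, f_count in factor_totals.items():
            for t, t_count in target_totals.items():
                expected = f_count * t_count / n
                stat += (joint[(f, t)] - expected) ** 2 / expected
        return stat


def choose_factor(rows, factors, target, selector):
    """Pick the factor that the supplied strategy scores highest."""
    return max(factors, key=lambda f: selector.score(rows, f, target))


# Illustrative data: "region" separates the classes, "colour" does not.
rows = [
    {"region": "north", "colour": "red", "bought": "yes"},
    {"region": "north", "colour": "blue", "bought": "yes"},
    {"region": "south", "colour": "red", "bought": "no"},
    {"region": "south", "colour": "blue", "bought": "no"},
]

best = choose_factor(rows, ["region", "colour"], "bought", ChiSquareSelector())
```

A QUEST-style criterion could in principle be added as another `SplitSelector` subclass without changing `choose_factor`, which is the point of the pattern.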
Selected References
- For overviews of classification trees and their basic purpose, please see: www.cs.cornell.edu/johannes/papers/2001/kdd2001-tutorial-final.pdf
- For a basic but detailed description of the CHAID algorithm, please see: http://www.cbs.nl/en/service/autimp/Appendix1-Tree-(AUTIMP).pdf
- For a comparative study of variable selection methods in data mining, please see the work published by Wei-Yin Loh and Young Joo at: http://stats.snu.ac.kr/~youngjoo/Document/MINING2.PDF
- For executable binary files to run QUEST and an extensive list of other resources related to classification trees, please see Wei-Yin Loh’s site at: http://www.stat.wisc.edu/~loh/quest.html