Techniques

Exploratory Data Analysis (EDA) is an approach to, or philosophy of, data analysis that employs a variety of techniques (mostly graphical) to:

  1. maximize insight into a data set
  2. uncover underlying structure
  3. extract important variables
  4. detect outliers and anomalies
  5. test underlying assumptions
  6. develop parsimonious models
  7. determine optimal factor settings
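As a small illustration of goal 4, the sketch below flags values that fall outside the fences used in a box plot (1.5 times the interquartile range beyond the quartiles). The function name, the cutoff k = 1.5, and the sample values are assumptions chosen for illustration; they are not prescribed by EDA itself, and other outlier rules are equally valid.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements; 42.0 is a planted anomaly.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 42.0, 10.1, 9.9]
print(tukey_outliers(sample))
```

A flagged value is a prompt to look more closely at the data, not an automatic reason to discard the observation.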

EDA vs statistical graphics

EDA is not identical to statistical graphics, although the two terms are used almost interchangeably. Statistical graphics is a collection of techniques, all graphically based and all focusing on one aspect of data characterization.

Philosophy

EDA is broader in scope: it is an approach to data analysis that postpones the usual assumptions about what kind of model the data follow in favor of the more direct approach of allowing the data themselves to reveal their underlying structure and model. EDA is not a mere collection of techniques; it is a philosophy of how we dissect a data set, what we look for, how we look, and how we interpret what we find.

Techniques

The particular graphical techniques employed in EDA are often quite simple. They consist of:

  1. plotting the raw data (such as data traces, histograms, bihistograms, probability plots, lag plots, block plots, and Youden plots),
  2. plotting simple statistics of the raw data (such as mean plots, standard deviation plots, box plots, and main effects plots),
  3. positioning such plots so as to maximize our natural pattern-recognition abilities, for example by placing multiple plots on one page.
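To make two of the raw-data plots in item 1 concrete, the sketch below computes the numbers behind a histogram and a lag plot and prints the histogram as text. It uses only the Python standard library; the function names, the bin width, and the sample series are assumptions made for illustration. In practice a plotting library would draw these as graphics and arrange them as several panels on one page.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count observations per bin; each bin is labeled by its left edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def lag_pairs(values, lag=1):
    """Pair each observation with the one `lag` steps before it (lag >= 1).

    Plotting these pairs as points gives a lag plot.
    """
    return list(zip(values[:-lag], values[lag:]))

def text_histogram(counts):
    """Render histogram counts as rows of asterisks."""
    return "\n".join(f"{edge:>4} | {'*' * n}" for edge, n in counts.items())

# Hypothetical integer-valued series in time order.
series = [12, 15, 11, 14, 18, 13, 16, 12, 17, 14]

print(text_histogram(histogram_counts(series, bin_width=5)))
print(lag_pairs(series)[:3])
```

Visible structure in the lag pairs suggests that successive observations are not independent, while the histogram summarizes the shape of the distribution. Viewing both together is a simple instance of item 3.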

Contact information
Mr Coos Bosma
Statistical Consultant and Analyst
Tel: 27 41 504 9902
coos.bosma@mandela.ac.za