The two estimators used in Z-scores, the sample mean and the sample standard
deviation, can be strongly affected by a few extreme values, or even by a
single one. To avoid this problem, the modified Z-score replaces the mean with
the median and the standard deviation with the median of the absolute
deviations about the median (MAD) (Iglewicz and Hoaglin, 1993).
$MAD = median\{\left| x_i - median(x) \right|\}$
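The modified Z-score ($M_i$) is computed as
$M_i = \frac{0.6745\,(x_i - median(x))}{MAD}$
where the constant 0.6745 is used because the expected value of MAD is approximately $0.6745\sigma$ for large normal samples.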
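As a small worked illustration of the two formulas above, take the five values x = (1, 2, 3, 4, 100). These numbers are made up for this purpose and do not come from Iglewicz and Hoaglin.
$median(x) = 3$
$\left| x_i - median(x) \right| = (2, 1, 0, 1, 97)$
$MAD = median\{0, 1, 1, 2, 97\} = 1$
$M_5 = 0.6745 \times (100 - 3) / 1 = 65.4265$
$M_1 = 0.6745 \times (1 - 3) / 1 = -1.349$
The single extreme value 100 barely moves the median or the MAD, so its own score stands out clearly while the scores of the other points stay small.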
Iglewicz and Hoaglin (1993) suggested that observations be labeled as outliers
when $|M_i| > 3.5$, a cutoff obtained through simulation based on pseudo-normal
observations for sample sizes of 10, 20 and 40. The modified Z-score is
effective for normal data in the same way as the Z-score.
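The labeling rule can be sketched in R as follows. This is only a minimal illustration: the function name `modified_z` and the data vector are hypothetical, and the cutoff of 3.5 is the value commonly attributed to Iglewicz and Hoaglin (1993).

```r
# Hypothetical helper (not from the original post): modified Z-scores
# for a numeric vector x.
modified_z <- function(x) {
  m <- median(x)               # sample median
  MAD <- median(abs(x - m))    # median absolute deviation about the median
  0.6745 * (x - m) / MAD       # not usable when MAD is 0
}

x <- c(10, 12, 11, 13, 12, 11, 40)   # made-up data with one extreme value
Zm <- modified_z(x)
outliers <- x[abs(Zm) > 3.5]         # assumed cutoff of 3.5
outliers
```

Note that the score is undefined when more than half of the observations are identical, because the MAD is then 0.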
-----------------------------------------------------------------------------------------------
Example: R code for Modified Z-Scores
-----------------------------------------------------------------------------------------------
> median(x)
> m <- median(x)
> m
> abs(x - m)
> MAD = median(abs(x - m))
> MAD
> Zm = ((0.6745*(x - median(x))) / MAD)
> Zm
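As a cross-check, the hand-computed MAD can be compared with R's built-in `mad()` function from the stats package. The sketch below uses a made-up vector `x`. By default `mad()` multiplies the raw MAD by 1.4826, which is approximately 1/0.6745, so `constant = 1` is needed to recover the unscaled value.

```r
x <- c(10, 12, 11, 13, 12, 11, 40)   # made-up data, for illustration only

# Raw MAD computed by hand and with the built-in function
MAD_manual  <- median(abs(x - median(x)))
MAD_builtin <- mad(x, constant = 1)  # constant = 1 turns off the default scaling
all.equal(MAD_manual, MAD_builtin)

# With the default constant, (x - median) / mad(x) is approximately
# the modified Z-score, since 1 / 1.4826 is close to 0.6745
Zm_manual  <- 0.6745 * (x - median(x)) / MAD_manual
Zm_builtin <- (x - median(x)) / mad(x)
all.equal(Zm_manual, Zm_builtin, tolerance = 1e-3)
```

The two versions of the score agree only up to the rounding of the constant, which is why the comparison uses a loose tolerance.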