Wednesday, 4 February 2015

Modified Z-Scores using R code

  Manoj

The two estimators used in Z-scores, the sample mean and the sample standard deviation, can be strongly affected by a few extreme values, or even by a single one. To avoid this problem, the modified Z-score replaces the sample mean with the median and the sample standard deviation with the median of the absolute deviations about the median (MAD) (Iglewicz and Hoaglin, 1993).
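To make the sensitivity concrete, here is a small sketch comparing the two pairs of estimators on a sample with and without one extreme value. The data and the variable names (`clean`, `contaminated`, `raw_mad`) are made up for illustration and are not from the original post.

```r
# Made-up data: seven ordinary values, then the same values plus one extreme value
clean        <- c(2.1, 2.3, 1.9, 2.0, 2.2, 2.4, 1.8)
contaminated <- c(clean, 25)

# Hypothetical helper: unscaled median absolute deviation about the median
raw_mad <- function(x) median(abs(x - median(x)))

# Mean and standard deviation move a lot when the extreme value is added
print(c(mean_clean = mean(clean), mean_contaminated = mean(contaminated)))
print(c(sd_clean = sd(clean), sd_contaminated = sd(contaminated)))

# Median and MAD barely move
print(c(median_clean = median(clean), median_contaminated = median(contaminated)))
print(c(mad_clean = raw_mad(clean), mad_contaminated = raw_mad(contaminated)))
```

In this toy sample a single value is enough to more than double the mean, while the median shifts only slightly.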
$MAD = median\{\left| x_i - S \right|\}$, where $S$ is the sample median.
The modified Z-score ($M_i$) is computed as
$M_i = \frac{0.6745\,(x_i - S)}{MAD}$
 where $E(MAD) = 0.675\sigma$ for large samples of normal data.
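The constant can be checked numerically. The sketch below simulates a large normal sample and compares the ratio MAD / sigma with the 0.75 quantile of the standard normal distribution, which is where the value 0.6745 comes from. The seed, sample size, and sigma = 2 are arbitrary assumptions chosen for illustration.

```r
# Assumed simulation settings (illustrative, not from the original post)
set.seed(1)
sigma <- 2
x <- rnorm(1e5, mean = 0, sd = sigma)

# Unscaled MAD: median of absolute deviations about the sample median
raw_mad <- median(abs(x - median(x)))

# For normal data, MAD / sigma should be close to qnorm(0.75)
ratio       <- raw_mad / sigma
theoretical <- qnorm(0.75)   # approximately 0.6745

print(c(ratio = ratio, theoretical = theoretical))
```

Multiplying by 0.6745 in the numerator therefore puts $M_i$ on roughly the same scale as an ordinary Z-score when the data are normal.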
Iglewicz and Hoaglin (1993) suggested that observations be labeled outliers when $|M_i| > 3.5$, a cutoff they arrived at through simulation based on pseudo-normal observations for sample sizes of 10, 20 and 40. The $M_i$ score is effective for normal data in the same way as the Z-score.
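As a rough sketch of the labeling rule, the snippet below flags observations whose absolute modified Z-score exceeds 3.5, the cutoff usually attributed to Iglewicz and Hoaglin (1993). The helper name `modified_z`, the data, and the variable names are hypothetical and chosen for illustration only.

```r
# Hypothetical helper: modified Z-score, 0.6745 * (x - median) / MAD
modified_z <- function(x) {
  s <- median(x)                   # sample median
  mad_raw <- median(abs(x - s))    # unscaled median absolute deviation
  0.6745 * (x - s) / mad_raw       # note: undefined if mad_raw is 0
}

x <- c(2.1, 2.3, 1.9, 2.0, 2.2, 2.4, 1.8, 25)   # made-up data, one extreme value

z <- (x - mean(x)) / sd(x)   # classical Z-scores, for comparison
m <- modified_z(x)           # modified Z-scores

cutoff   <- 3.5              # assumed cutoff (Iglewicz and Hoaglin, 1993)
outliers <- which(abs(m) > cutoff)

print(round(z, 2))
print(round(m, 2))
print(outliers)
```

In this small sample the extreme value inflates the mean and standard deviation so much that its classical Z-score stays below 3, while its modified Z-score is far above the cutoff. If more than half of the values are identical, the MAD is zero and the score cannot be computed this way.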
-----------------------------------------------------------------------------------------------
Example: R code for Modified Z-Scores
-----------------------------------------------------------------------------------------------

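># x is assumed to be a numeric vector of observations defined beforehand
>median(x)
>m <- median(x)
>m
># absolute deviations from the median
>abs(x-m)
>MAD = median(abs(x-m))
>MAD
># modified Z-scores
>Zm = ((0.6745*(x - m)) / MAD)
>Zm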
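For comparison, base R's `stats` package already provides `mad()`. The sketch below, using made-up data, suggests how it relates to the manual calculation: with `constant = 1` it returns the unscaled MAD, and with its default constant of 1.4826 (roughly 1 / 0.6745) dividing by `mad(x)` gives nearly the same values as the modified Z-score. Variable names here are illustrative assumptions.

```r
x <- c(2.1, 2.3, 1.9, 2.0, 2.2, 2.4, 1.8, 25)   # made-up data

# Manual unscaled MAD, computed as in the example above
mad_manual <- median(abs(x - median(x)))

# stats::mad() with constant = 1 returns the same unscaled MAD
mad_builtin <- mad(x, constant = 1)

# Modified Z-scores computed both ways
Zm_manual  <- 0.6745 * (x - median(x)) / mad_manual
Zm_builtin <- (x - median(x)) / mad(x)   # default constant = 1.4826

print(c(mad_manual = mad_manual, mad_builtin = mad_builtin))
print(round(cbind(Zm_manual, Zm_builtin), 3))
```

The two versions differ only by the rounding of the constant, so either can be used as long as the choice is applied consistently.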