Outlier

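In statistics, an outlier is a data point that differs significantly from other observations[1][2]. An outlier may be merely an extreme manifestation of the random variability inherent in the data, or it may be the result of experimental error; outliers of the latter kind are sometimes excluded from the data set[3]. Outliers can cause serious problems in statistical analyses.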

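An estimator that copes well with outliers is said to be robust. For example, the median is a robust statistic of central tendency, whereas the mean is not[4].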
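The contrast between the median and the mean can be illustrated with a minimal sketch. The sample values and the helper name `shift` below are hypothetical, chosen only for illustration, and the single extreme value stands in for a gross recording error.

```python
from statistics import mean, median

# Hypothetical sample of five plausible readings (illustrative values only).
clean = [9.8, 10.1, 10.0, 9.9, 10.2]

# The same sample plus one grossly deviant value, standing in for an outlier.
contaminated = clean + [1000.0]


def shift(estimator):
    """Absolute change in an estimate when the outlier is added to the sample."""
    return abs(estimator(contaminated) - estimator(clean))


mean_shift = shift(mean)      # the mean is pulled far toward the outlier
median_shift = shift(median)  # the median barely moves

print(f"mean:   {mean(clean):.2f} -> {mean(contaminated):.2f}")
print(f"median: {median(clean):.2f} -> {median(contaminated):.2f}")
```

In this sketch one extreme value out of six moves the mean by more than a hundred units, while the median changes by only a fraction of a unit, which is the sense in which the median is called robust.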

References

  1. ^ Grubbs, F. E. Procedures for detecting outlying observations in samples. Technometrics. February 1969, 11 (1): 1–21. doi:10.1080/00401706.1969.10490657. An outlying observation, or "outlier," is one that appears to deviate markedly from other members of the sample in which it occurs. 
  2. ^ Maddala, G. S. Introduction to Econometrics (2nd ed.). New York: MacMillan. 1992: 89. ISBN 978-0-02-374545-4. https://books.google.com/books?id=nBS3AAAAIAAJ&pg=PA89. An outlier is an observation that is far removed from the rest of the observations. 
  3. ^ Grubbs 1969 stating "An outlying observation may be merely an extreme manifestation of the random variability inherent in the data. ... On the other hand, an outlying observation may be the result of gross deviation from prescribed experimental procedure or an error in calculating or recording the numerical value."
  4. ^ Ripley, Brian D. Robust statistics (PDF). 2004. (Archived from the original (PDF) on 2012-10-21).