
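Amitabha

Street trees cast drifting shadows yet raise no dust; the moon sinks into the pool without a sound; in prajna contemplation the mind is empty and still...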

 
 
 


 
 
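About me

I have long worked on weather forecasting and on the practical application of service modeling. My focus is on building meteorological physical fields, observation fields, geographic information, ontology knowledge bases, and distributed meteorological content management systems. I have some experience with Barnes objective analysis, wavelets, computational neural networks, belief propagation, Bayesian inference, expert systems, and the Web Ontology Language. I program in Java, Delphi, Prolog, and SQL.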


Spark StatCounter

2015-07-01 10:57:51 | Category: Spark
package dataModel

/**
 * A class for tracking the statistics of a set of numbers (count, mean and variance) in a
 * numerically robust way. Includes support for merging two StatCounters. Based on Welford
 * and Chan's [[http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance algorithms]]
 * for running variance.
 *
 * @constructor Initialize the StatCounter with the given values.
 */
class StatCounter(values: TraversableOnce[Float]) extends Serializable {
  private var n: Long = 0 // Running count of our values
  private var mu: Double = 0 // Running mean of our values
  private var m2: Double = 0 // Running variance numerator (sum of (x - mean)^2)
  private var maxValue: Double = Double.NegativeInfinity // Running max of our values
  private var minValue: Double = Double.PositiveInfinity // Running min of our values

  merge(values)

  /** Initialize the StatCounter with no values. */
  def this() = this(Nil)

  /** Add a value into this StatCounter, updating the internal statistics. */
  def merge(value: Float): StatCounter = {
    val delta = value - mu
    n += 1
    mu += delta / n
    m2 += delta * (value - mu)
    maxValue = math.max(maxValue, value)
    minValue = math.min(minValue, value)
    this
  }

  /** Add multiple values into this StatCounter, updating the internal statistics. */
  def merge(values: TraversableOnce[Float]): StatCounter = {
    values.foreach(v => merge(v))
    this
  }

  /** Merge another StatCounter into this one, adding up the internal statistics. */
  def merge(other: StatCounter): StatCounter = {
    if (other == this) {
      merge(other.copy()) // Avoid overwriting fields in a weird order
    } else {
      if (n == 0) {
        mu = other.mu
        m2 = other.m2
        n = other.n
        maxValue = other.maxValue
        minValue = other.minValue
      } else if (other.n != 0) {
        val delta = other.mu - mu
        if (other.n * 10 < n) {
          mu = mu + (delta * other.n) / (n + other.n)
        } else if (n * 10 < other.n) {
          mu = other.mu - (delta * n) / (n + other.n)
        } else {
          mu = (mu * n + other.mu * other.n) / (n + other.n)
        }
        m2 += other.m2 + (delta * delta * n * other.n) / (n + other.n)
        n += other.n
        maxValue = math.max(maxValue, other.maxValue)
        minValue = math.min(minValue, other.minValue)
      }
      this
    }
  }

  /** Clone this StatCounter */
  def copy(): StatCounter = {
    val other = new StatCounter
    other.n = n
    other.mu = mu
    other.m2 = m2
    other.maxValue = maxValue
    other.minValue = minValue
    other
  }

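  /** Return the number of values seen so far. */
  def count: Long = n

  /** Return the running mean of the values. */
  def mean: Double = mu

  /** Return the sum of the values, reconstructed as count * mean. */
  def sum: Double = n * mu

  /** Return the largest value seen (negative infinity if empty). */
  def max: Double = maxValue

  /** Return the smallest value seen (positive infinity if empty). */
  def min: Double = minValue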

  /** Return the variance of the values. */
  def variance: Double = {
    if (n == 0) {
      Double.NaN
    } else {
      m2 / n
    }
  }

  /**
   * Return the sample variance, which corrects for bias in estimating the variance by dividing
   * by N-1 instead of N.
   */
  def sampleVariance: Double = {
    if (n <= 1) {
      Double.NaN
    } else {
      m2 / (n - 1)
    }
  }

  /** Return the standard deviation of the values. */
  def stdev: Double = math.sqrt(variance)

  /**
   * Return the sample standard deviation of the values, which corrects for bias in estimating the
   * variance by dividing by N-1 instead of N.
   */
  def sampleStdev: Double = math.sqrt(sampleVariance)

  override def toString: String = {
    "(count: %d, mean: %f, stdev: %f, max: %f, min: %f)".format(count, mean, stdev, max, min)
  }
}

object StatCounter {
  /** Build a StatCounter from a list of values. */
  def apply(values: TraversableOnce[Float]): StatCounter = new StatCounter(values)

  /** Build a StatCounter from a list of values passed as variable-length arguments. */
  def apply(values: Float*): StatCounter = new StatCounter(values)
}
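
The two ideas the class combines, Welford's one-value update and Chan's merge of two partial summaries, can be seen in isolation in the sketch below. It is not part of the listing above: `RunningStats`, `RunningStatsDemo`, and `summarize` are hypothetical names chosen for illustration, the sketch is immutable and omits the min/max tracking, and it always uses the plain merge formula for the mean rather than the three-way branch in `merge(other: StatCounter)`.

```scala
// Hypothetical, simplified stand-in for StatCounter (illustrative names only).
final case class RunningStats(n: Long = 0L, mean: Double = 0.0, m2: Double = 0.0) {

  /** Welford update: fold one value into the running count, mean and m2. */
  def add(x: Double): RunningStats = {
    val n1 = n + 1
    val delta = x - mean
    val mean1 = mean + delta / n1
    // m2 accumulates the sum of squared deviations from the current mean.
    RunningStats(n1, mean1, m2 + delta * (x - mean1))
  }

  /** Chan-style merge of two partial summaries computed independently. */
  def merge(other: RunningStats): RunningStats =
    if (n == 0) other
    else if (other.n == 0) this
    else {
      val total = n + other.n
      val delta = other.mean - mean
      RunningStats(
        total,
        mean + delta * other.n / total,
        m2 + other.m2 + delta * delta * n * other.n / total
      )
    }

  /** Population variance (divide by n); NaN when no values were added. */
  def variance: Double = if (n == 0) Double.NaN else m2 / n
}

object RunningStatsDemo {
  /** Summarize a sequence in a single pass using the Welford update. */
  def summarize(xs: Seq[Double]): RunningStats =
    xs.foldLeft(RunningStats())((acc, x) => acc.add(x))

  def main(args: Array[String]): Unit = {
    val data = Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
    val (left, right) = data.splitAt(3)

    val whole = summarize(data)
    // Summarize each half separately, then combine, as a partitioned job would.
    val merged = summarize(left).merge(summarize(right))

    println(s"single pass: n=${whole.n}, mean=${whole.mean}, variance=${whole.variance}")
    println(s"merged:      n=${merged.n}, mean=${merged.mean}, variance=${merged.variance}")
  }
}
```

Both lines should report the same count, mean, and variance up to floating-point rounding, which is the property that lets per-partition summaries be combined without revisiting the data.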
 