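Amitabha

Street trees cast drifting shadows, yet no dust is stirred; the moon sinks into the pool without a sound; in the light of prajna the mind is empty and still...
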
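About me

I have long worked on the practical application of weather forecasting and service modeling, with a focus on building meteorological physical fields, observation fields, geographic information, ontology knowledge bases, and distributed meteorological content management systems. I have some experience with Barnes objective analysis, wavelets, computational neural networks, belief propagation, Bayesian inference, expert systems, and the Web Ontology Language. I program in Java, Delphi, Prolog, and SQL.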

efficient-nearest-neighbour-search-in-Spark  

2015-03-02 17:37:50 | Category: Spark

package mytest

import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by 何险峰 on 15-3-8.
 */

// Orders points by their squared Euclidean distance to the query point x.
class SDOrdering(x: Array[Double]) extends Ordering[Array[Double]] {
  private def sqDist(p: Array[Double]): Double =
    p.zip(x).map { case (pi, xi) => val d = pi - xi; d * d }.sum

  def compare(a: Array[Double], b: Array[Double]): Int =
    sqDist(a) compare sqDist(b)
}

object TakeOrderedBreeze {
  val conf = new SparkConf().setMaster("local").setAppName("TakeOrdered")
  val sc = new SparkContext(conf)

  def euclidean_test(): Unit = {
    // 50,000 random points in the square [0, 5) x [0, 5)
    val grid = (1 to 50000).map(_ => Array(Math.random() * 5, Math.random() * 5))
    // The 10 grid points nearest to x; note that the RDD is rebuilt on every call.
    def se2(x: Array[Double]) = sc.parallelize(grid).takeOrdered(10)(new SDOrdering(x))

    val t1 = System.currentTimeMillis()
    for (i <- 0 until 100) {
      val x = Array(Math.random() * 5, Math.random() * 5)
      val near = se2(x)
      println("x=" + x.mkString(","))
      // foreach rather than map: the println is run only for its side effect
      near.foreach(f => println("near:" + f.mkString(",")))
    }
    val t2 = System.currentTimeMillis()
    println("mm" + (t2 - t1)) // elapsed milliseconds for the 100 queries
  }

  def main(args: Array[String]): Unit = {
    euclidean_test()
    sc.stop()
  }
}
=======================================
...
x=3.7393394173522756,1.8935961549662106
near:3.7598957581777097,1.907454970374115
near:3.709672102195803,1.8852274424796356
near:3.7753726700162176,1.892142957094503
near:3.7601756854336834,1.8629983422973173
near:3.7498270890213172,1.930354983763577
near:3.7770181118111186,1.8811535525785483
near:3.7711713530256437,1.868497451927193
near:3.749567360289073,1.854365998028829
near:3.7071992473093762,1.8683069393464669
near:3.7293872651173054,1.8532527010624071
mm10572
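
The ranking that SDOrdering produces can be checked on a small data set without starting Spark. The following is a minimal plain-Scala sketch of the same brute-force idea (sort by squared distance to the query, keep the first k). The names LocalNearest, sqDist, byDistanceTo and nearest are illustrative assumptions and do not appear in the original listing.

```scala
object LocalNearest {
  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Array[Double], b: Array[Double]): Double = {
    var s = 0.0
    var i = 0
    while (i < a.length) { val d = a(i) - b(i); s += d * d; i += 1 }
    s
  }

  // Ordering that ranks points by squared distance to the query x
  // (intended to give the same ranking as SDOrdering).
  def byDistanceTo(x: Array[Double]): Ordering[Array[Double]] =
    new Ordering[Array[Double]] {
      def compare(a: Array[Double], b: Array[Double]): Int =
        java.lang.Double.compare(sqDist(a, x), sqDist(b, x))
    }

  // Brute-force k nearest neighbours on a local collection.
  def nearest(grid: Seq[Array[Double]], x: Array[Double], k: Int): Seq[Array[Double]] =
    grid.sorted(byDistanceTo(x)).take(k)

  def main(args: Array[String]): Unit = {
    val grid = Seq(Array(0.0, 0.0), Array(1.0, 1.0), Array(2.0, 2.0), Array(5.0, 5.0))
    val near = nearest(grid, Array(1.2, 0.9), 2)
    near.foreach(p => println("near:" + p.mkString(",")))
  }
}
```

Comparing squared distances avoids a square root per point and leaves the ranking unchanged, since the square root is monotonic.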
==========================================
Old version


package mytest
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by 何险峰 on 15-3-2.
 */
// Wraps a point and exposes its squared Euclidean distance to another point.
case class SquareEuclidean(a: Array[Double]) {
  def dist(b: Array[Double]): Double =
    a.zip(b).map { case (ai, bi) => val d = ai - bi; d * d }.sum
}

// Orders wrapped points by squared distance to the query point x.
class SquareEuclideanOrdering(x: Array[Double]) extends Ordering[SquareEuclidean] {
  def compare(a: SquareEuclidean, b: SquareEuclidean): Int = a.dist(x) compare b.dist(x)
}

object TakeOrdered {
  val conf = new SparkConf().setMaster("local").setAppName("TakeOrdered")
  val sc = new SparkContext(conf)

  def seq_test(): Unit = {
    // Smallest two elements under the natural Int ordering: prints 2,3
    val se2 = sc.parallelize(Seq(2, 3, 4, 5, 6)).takeOrdered(2)
    println(se2.mkString(","))
  }

  def euclidean_test(): Unit = {
    // 50,000 random points in the square [0, 5) x [0, 5), each wrapped in SquareEuclidean
    val grid = (1 to 50000).map { _ => SquareEuclidean(Array(Math.random() * 5, Math.random() * 5)) }
    // The 10 grid points nearest to x; note that the RDD is rebuilt on every call.
    def se2(x: Array[Double]) = sc.parallelize(grid).takeOrdered(10)(new SquareEuclideanOrdering(x))

    val t1 = System.currentTimeMillis()
    for (i <- 0 until 100) {
      val x = Array(Math.random() * 5, Math.random() * 5)
      val near = se2(x)
      println("x=" + x.mkString(","))
      near.foreach(f => println("near:" + f.a.mkString(",")))
    }
    val t2 = System.currentTimeMillis()
    println("mm" + (t2 - t1)) // elapsed milliseconds for the 100 queries
  }

  def main(args: Array[String]): Unit = {
    euclidean_test()
    sc.stop()
  }
}
======================================
.....
x=3.206772554623151,4.130511004150846
near:3.209260286412739,4.11326883401359
near:3.2177701383549135,4.111863591582609
near:3.2031367432352726,4.105879169801421
near:3.1813726021429014,4.144485846628721
near:3.234661047791947,4.139370017598104
near:3.178528106079503,4.140502127066933
near:3.1751313415460833,4.145834221496672
near:3.1820688064046654,4.157916575782007
near:3.185732874034893,4.161649268788747
near:3.213310877220475,4.168033442512988
mm15733
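Spark's RDD.takeOrdered method takes over the role of a k-d tree for nearest-neighbour lookup. It will be applied in objective analysis and graph computation.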
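
For context on why takeOrdered can stand in for an index here: as far as I understand it, Spark keeps a bounded priority queue of size k per partition and then merges the partial results, so each query is a full scan rather than a tree lookup. The following is a simplified local simulation of that strategy in plain Scala, not Spark's actual code. The names BoundedTopK, smallestK and takeOrderedLocal are hypothetical, and the partition count is assumed to be positive.

```scala
import scala.collection.mutable

object BoundedTopK {
  // Keep the k smallest elements of `items` under `ord`,
  // using a max-heap that never holds more than k elements.
  def smallestK[T](items: Iterator[T], k: Int)(implicit ord: Ordering[T]): List[T] = {
    val heap = mutable.PriorityQueue.empty[T](ord) // head is the largest kept element
    items.foreach { item =>
      if (heap.size < k) heap.enqueue(item)
      else if (k > 0 && ord.lt(item, heap.head)) {
        heap.dequeue()      // drop the current worst candidate
        heap.enqueue(item)  // keep the better one
      }
    }
    heap.toList.sorted(ord)
  }

  // Simulate partitions: take the k smallest per chunk,
  // then the k smallest of the merged candidates.
  def takeOrderedLocal[T](data: Seq[T], k: Int, partitions: Int)
                         (implicit ord: Ordering[T]): List[T] = {
    val chunkSize = math.max(1, math.ceil(data.size.toDouble / partitions).toInt)
    val candidates = data.grouped(chunkSize).flatMap(chunk => smallestK(chunk.iterator, k))
    smallestK(candidates, k)
  }

  def main(args: Array[String]): Unit = {
    val result = takeOrderedLocal(Seq(9, 2, 7, 4, 6, 1, 8, 3), k = 3, partitions = 3)
    println(result.mkString(",")) // prints 1,2,3
  }
}
```

Passing a distance-based Ordering instead of the natural Int ordering gives the k nearest neighbours. The cost per query stays linear in the number of points, which is the trade-off against a k-d tree.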
Machine-learning algorithms for radar-based precipitation retrieval.