A strange Scala problem?

勇敢的少年
import scala.collection.mutable.ArrayBuffer

var calcUsers = new ArrayBuffer[Int]()
nRDD.foreach(item => {
  val arr = item.split(' ')
  val currUserId = arr(1).toInt
  calcUsers += currUserId
  println("calcUsers", currUserId, calcUsers.length)
})
println("calcUsers", calcUsers.length)

The first println shows the buffer's length growing steadily.

The second println, however, prints 0.

Why?

How do I fix it?

1 Answer

An RDD is a distributed data structure. The RDD does not actually live on the driver node (the node on which your code is running). All RDD operations (map, foreach, etc.) are performed on the executor nodes. So Spark creates a closure of the operating function (you can think of it as an object containing copies of all required variables plus the function itself) and sends this object to each executor node, where it executes against that executor's partition of the actual RDD.

In simpler words, Spark creates multiple copies of your calcUsers buffer and sends them to the executor nodes along with the function. Each executor then runs the function against its own copy of calcUsers. The calcUsers instance on the driver is never touched, which is why its length is still 0 when the second println runs.
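Two common fixes follow from this: either move the data to the driver with collect, or use an accumulator, which is the mechanism Spark provides for aggregating values from executors back to the driver. A sketch of both, assuming a live SparkContext named sc and the same space-separated input as in the question (not runnable outside a Spark application):

```scala
import org.apache.spark.util.CollectionAccumulator

// Option 1: transform on the executors, then bring the results back.
// Unlike foreach, collect() returns the data to the driver.
val calcUsers: Array[Int] = nRDD
  .map(item => item.split(' ')(1).toInt)
  .collect()
println("calcUsers", calcUsers.length)

// Option 2: a CollectionAccumulator. Executors call add(); the merged
// result is only readable on the driver via acc.value.
val acc: CollectionAccumulator[Int] = sc.collectionAccumulator[Int]("calcUsers")
nRDD.foreach(item => acc.add(item.split(' ')(1).toInt))
println("calcUsers", acc.value.size) // acc.value is a java.util.List[Int]
```

Option 1 is the simplest when the result fits in driver memory; Option 2 avoids shipping the whole dataset back when you only need a side-channel aggregate.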
