import org.apache.spark.sql.functions.{array, collect_set, first}
import spark.implicits._

val originInfoDF = spark.sql("select col1, col2, col3, col4 from table_T")
val aggData = originInfoDF
  .groupBy("col1")
  .agg(collect_set(array("col2", "col3")), first("col4"))
  .toDF("c1", "c2", "c4")
  .map(line => {
    val x = line.get... // processing logic
    // If the result has 22 fields or fewer, I can simply return a tuple (x, y, z).
    // If it has more than 22 fields, what should I return instead of a tuple?
  })
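For the more-than-22-fields case, one common workaround is to return a case class instead of a tuple: Scala 2.11+ no longer caps case classes at 22 fields, and Spark's product encoder can serialize wide case classes (worth verifying on your Spark version). The sketch below assumes that setup; WideRecord and its fields f1..f23 are hypothetical stand-ins for your real output shape, and the placeholder values in the map body stand in for your actual processing.

import org.apache.spark.sql.functions.{array, collect_set, first}
import spark.implicits._ // provides the product encoder used by .map below

// Hypothetical output shape with more than 22 fields; real names and
// types depend on your processing logic.
case class WideRecord(
  f1: String,  f2: String,  f3: String,  f4: String,  f5: String,
  f6: String,  f7: String,  f8: String,  f9: String,  f10: String,
  f11: String, f12: String, f13: String, f14: String, f15: String,
  f16: String, f17: String, f18: String, f19: String, f20: String,
  f21: String, f22: String, f23: String
)

val wideData = spark.sql("select col1, col2, col3, col4 from table_T")
  .groupBy("col1")
  .agg(collect_set(array("col2", "col3")).as("c2"), first("col4").as("c4"))
  .map { line =>
    // ... derive the 23 output values from the aggregated Row here ...
    val c1 = line.getString(0) // col1, assumed to be a string
    WideRecord(
      c1, "", "", "", "",
      "", "", "", "", "",
      "", "", "", "", "",
      "", "", "", "", "",
      "", "", ""
    )
  }
// wideData is a Dataset[WideRecord]; its columns take the case class field names.

If you would rather not define a wide case class, the other standard route is to map to org.apache.spark.sql.Row values on the underlying RDD and rebuild the DataFrame with spark.createDataFrame(rowRdd, schema), where schema is an explicit StructType listing every output column.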
Resolved (found the answer on an external site).