Memory allocation error when running a Spark program on EMR

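Spark beginner here, looking for some guidance:
I've been working on a course assignment recently. I packaged my Spark program as a JAR and uploaded it to EMR, but the job fails after running for about 8 seconds with this error: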


17/05/06 13:44:12 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-30-132.us-west-2.compute.internal/172.31.30.132:8032
17/05/06 13:44:13 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
17/05/06 13:44:13 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (1024 MB per container)
Exception in thread "main" java.lang.IllegalArgumentException: Required executor memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.

at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:283)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:139)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1016)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1076)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Command exiting with ret '1'


EMR configuration:
2 core instances (c1.medium), 1 master instance (c1.medium)
JAR location: command-runner.jar
Main class: none
Arguments: spark-submit --deploy-mode cluster --class target/guo-0.0.1-SNAPSHOT.jar s3://finalprojectprogram/guo-0.0.1-SNAPSHOT.jar s3://projectdata-amazonproductrate/relative_small_data.txt s3://finalprojectoutputfile/mimic30
Action on failure: Continue

Thanks a lot!

2 answers

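It looks like the job ran out of memory at resource allocation time: according to the log, each executor needs 1024 MB plus 384 MB of overhead (1408 MB in total), which is above the 1024 MB per-container maximum of this cluster.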
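The numbers in the error message can be checked with a bit of arithmetic. Below is a rough sketch in Python, assuming the Spark 1.x default rule that a YARN container needs the JVM heap plus max(384 MB, 10% of the heap) as overhead. The function names are made up for illustration, and the constants should be checked against the Spark version actually installed on the cluster.

```python
# Assumed Spark-on-YARN defaults (Spark 1.x); verify for your version.
MIN_OVERHEAD_MB = 384      # minimum memory overhead per container
OVERHEAD_FACTOR = 0.10     # overhead as a fraction of the heap


def container_mb(heap_mb):
    """Memory YARN must grant for a JVM with the given heap size (hypothetical helper)."""
    overhead = max(MIN_OVERHEAD_MB, int(heap_mb * OVERHEAD_FACTOR))
    return heap_mb + overhead


def largest_heap_mb(max_container_mb):
    """Largest heap whose container still fits under the YARN limit (hypothetical helper)."""
    heap = max(0, max_container_mb - MIN_OVERHEAD_MB)
    # Step down until heap + overhead no longer exceeds the limit.
    while heap > 0 and container_mb(heap) > max_container_mb:
        heap -= 1
    return heap


if __name__ == "__main__":
    # Default executor memory is 1024 MB, matching "1024+384 MB" in the log.
    print(container_mb(1024))     # 1408, above the 1024 MB threshold
    print(largest_heap_mb(1024))  # 640
```

Under these assumptions, something like `--executor-memory 640m` on the spark-submit command line should get past this particular check. In cluster deploy mode the driver also runs inside a YARN container, so `--driver-memory` probably needs the same treatment. The other route is to raise `yarn.scheduler.maximum-allocation-mb` and `yarn.nodemanager.resource.memory-mb`, as the error message suggests, if the instances have enough physical memory for that.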

Compared with setting up Spark yourself, what advantages does Spark on EMR offer beyond saving time and making resource management easier?
