    Problem: Container killed by YARN for exceeding memory limits (or job hung) in Spark on Amazon EMR



    Resolution: 


    The solution varies according to your use case. In my case, I was not able to run multiple Spark jobs: the first job ate up all of the memory and never released it until we killed it, and killing jobs is not a real solution. Before tuning anything, it helps to confirm from the YARN logs whether the driver container or an executor container is the one being killed.
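
    A quick way to do that is to pull the aggregated container logs for the application. This is only a sketch: the application ID below is a placeholder, and the exact wording of the kill message can vary between EMR and Hadoop versions.

    # Find the application ID
    yarn application -list -appStates ALL

    # Search the container logs for the YARN kill message
    yarn logs -applicationId application_1234567890123_0001 | grep -i "exceeding memory limits"

    Once you know which container is affected, work through the possible resolutions below.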

    Reduce the number of executor cores

    This reduces the maximum number of tasks that the executor can run in parallel, which reduces the amount of memory the container needs. Depending on whether it is the driver container or an executor container that is being killed, decrease the cores for the driver or for the executor accordingly.
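
    As a rough, back-of-the-envelope example (assuming default settings and ignoring overhead): with spark.executor.memory 4g and spark.executor.cores 4, up to four tasks share the 4 GB heap at the same time, roughly 1 GB per concurrent task; with spark.executor.cores 1, a single task has the whole 4 GB to itself.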

    On a running cluster:

    Modify spark-defaults.conf on the master node. 
    Example:

    #vi /etc/spark/conf/spark-defaults.conf
    spark.driver.cores  1
    spark.executor.cores  1        ---------------> I've decreased this value to 1; the default value varies with your instance type

    On a new cluster:

    Add a configuration object similar to the following when you launch a cluster:

    [
      {
        "Classification": "spark-defaults",
        "Properties": {
          "spark.driver.cores": "1",
          "spark.executor.cores": "1"
        }
      }
    ]
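
    If you launch the cluster from the AWS CLI, the same classification can be passed with the --configurations option. The command below is only a sketch: the cluster name, release label, instance type, instance count, and the spark-config.json file name are placeholders to adapt to your environment.

    aws emr create-cluster --name "spark-tuned" --release-label emr-6.9.0 --applications Name=Spark --use-default-roles --instance-type m5.xlarge --instance-count 3 --configurations file://spark-config.json

    Here spark-config.json is a local file containing the JSON object shown above.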

    For a single job:

    Use the --executor-cores and --driver-cores options to reduce the number of cores when you run spark-submit. 

    Example:


    spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --executor-cores 1 --driver-cores 1 /usr/lib/spark/examples/jars/spark-examples.jar 100


    Increase memory overhead

    Memory overhead is the amount of off-heap memory allocated to each executor (and to the driver). By default, memory overhead is set to either 10% of the executor memory or 384 MB, whichever is higher.
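
    For example, assuming that default rule and no explicit override: with spark.executor.memory 4g, the default overhead is max(0.10 x 4096 MB, 384 MB), roughly 410 MB, so each executor container asks YARN for about 4.4 GB in total; with spark.executor.memory 1g, the 384 MB floor applies and the container asks for about 1408 MB.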

    Consider making gradual increases in memory overhead, up to 25%. Be sure that the sum of the driver or executor memory plus the driver or executor memory overhead is always less than the value of yarn.nodemanager.resource.memory-mb for your Amazon Elastic Compute Cloud (Amazon EC2) instance type:

    Formula:


    spark.driver/executor.memory + spark.driver/executor.memoryOverhead < yarn.nodemanager.resource.memory-mb
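
    To see what that limit actually is on your cluster, you can read it straight from the YARN configuration on a node. This is a sketch that assumes the standard EMR location of yarn-site.xml; the value differs by instance type.

    grep -A 1 "yarn.nodemanager.resource.memory-mb" /etc/hadoop/conf/yarn-site.xml

    On an m5.xlarge, for instance, this is typically around 12288 MB, so 1g of executor memory plus 512 MB of overhead fits well within the limit.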

    If the error occurs in the driver container or executor container, consider increasing memory overhead for that container only. You can increase memory overhead while the cluster is running, when you launch a new cluster, or when you submit a job.

    On a running cluster:

    Modify spark-defaults.conf on the master node. Example:


    #vi /etc/spark/conf/spark-defaults.conf

    spark.driver.memoryOverhead 512
    spark.executor.memoryOverhead 512

    On a new cluster:

    Add a configuration object similar to the following when you launch a cluster:

    [
      {
        "Classification": "spark-defaults",
        "Properties": {
          "spark.driver.memoryOverhead": "512",
          "spark.executor.memoryOverhead": "512"
        }
      }
    ]

    For a single job:

    Use the --conf option to increase memory overhead when you run spark-submit. 

    Example:

    spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --conf spark.driver.memoryOverhead=512 --conf spark.executor.memoryOverhead=512 /usr/lib/spark/examples/jars/spark-examples.jar 100


    Increase driver and executor memory

    If the error occurs in either a driver container or an executor container, consider increasing memory for either the driver or the executor, but not both. Be sure that the sum of driver or executor memory plus driver or executor memory overhead is always less than the value of yarn.nodemanager.resource.memory-mb for your EC2 instance type:


    spark.driver/executor.memory + spark.driver/executor.memoryOverhead < yarn.nodemanager.resource.memory-mb

    On a running cluster:

    Modify spark-defaults.conf on the master node. Example:


    #vi /etc/spark/conf/spark-defaults.conf

    spark.executor.memory  1g  ---------------> I've increased this value to 1g; the default value varies with your instance type.
    spark.driver.memory  1g

    On a new cluster:

    Add a configuration object similar to the following when you launch a cluster:


    [
      {
        "Classification": "spark-defaults",
        "Properties": {
          "spark.executor.memory": "1g",
          "spark.driver.memory":"1g",
        }
      }
    ]

    For a single job:

    Use the --executor-memory and --driver-memory options to increase memory when you run spark-submit. 

    Example:


    spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --executor-memory 1g --driver-memory 1g /usr/lib/spark/examples/jars/spark-examples.jar 100
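
    To confirm that the job actually picked up the new values, one option is to add --verbose, which makes spark-submit print the parsed arguments and Spark properties before the application starts (you can also check the Environment tab in the Spark UI). A sketch:

    spark-submit --verbose --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --executor-memory 1g --driver-memory 1g /usr/lib/spark/examples/jars/spark-examples.jar 100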
