Currently, org.jetbrains.kotlinx.spark.api.withSpark(props, master, appName, logLevel, func) provides a default value for the parameter master. This overrides the value from the external Spark config provided with the spark-submit command.
In my case, I'm running Spark on YARN, so my submit command looks like ./spark-submit --master yarn .... But when I start the SparkSession in my code with:
withSpark(
    props = mapOf("some_key" to "some_value"),
) {
    // ... my Spark code
}
the default "local[*]" for the parameter master is used. This leads to a local Spark session on the application master (which kind of works, but does not make use of the executors provided by YARN), and to YARN complaining after my job finishes that I never started a Spark session, because it does not recognize the local one.
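To confirm which master is actually in effect, the session can be inspected inside the block (this snippet assumes spark inside the block exposes the underlying SparkSession):
withSpark(
    props = mapOf("some_key" to "some_value"),
) {
    // Assumption: `spark` is the underlying SparkSession exposed inside the block.
    // On YARN this should print "yarn"; with the hard-coded default it prints "local[*]".
    println("Effective master: " + spark.sparkContext().master())
}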
I think the value for master should by default be taken from the external config loaded by SparkConf(). As a workaround, I load the value myself:
withSpark(
    master = SparkConf().get("spark.master", "local[*]"),
    props = mapOf("some_key" to "some_value"),
) {
    // ... my Spark code
}
This works, but is not nice and duplicates the default value "local[*]".
@christopherfrieler thanks for the idea! @Jolanrensen it looks like Christopher provided us with a complete implementation; we should just change the default value :)
Sure! Just wanted to check whether this is the best way to go, since they did call it a "workaround" and "not nice", haha. But creating a new SparkConf() does seem the best way to access system/JVM variables.
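For reference, a minimal sketch of the proposed change (hypothetical function name and simplified body, not the library's actual source; the only relevant part is the default value for master):
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Hypothetical sketch, not the actual withSpark implementation: the point is the
// default for `master`, which prefers the externally provided spark.master
// (e.g. from `spark-submit --master yarn`) and only then falls back to "local[*]".
fun withSparkSketch(
    props: Map<String, Any> = emptyMap(),
    master: String = SparkConf().get("spark.master", "local[*]"),
    appName: String = "sample-app",
    func: SparkSession.() -> Unit,
) {
    val builder = SparkSession.builder().master(master).appName(appName)
    props.forEach { (key, value) -> builder.config(key, value.toString()) }
    val spark = builder.getOrCreate()
    try {
        spark.func()
    } finally {
        spark.stop()
    }
}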