</code>
  
We’re now ready to deploy a Spark cluster. Since the resources of the HPC system are managed by [[en:services:application_services:high_performance_computing:running_jobs_slurm|Slurm]], the entire setup has to be submitted as a job. This can be conveniently done by running the script ''scc_spark_deploy.sh'', which accepts the same arguments as the sbatch command used to submit generic batch jobs. The default setup is:

<code>
#SBATCH --partition fat
#SBATCH --time=0-02:00:00
#SBATCH --qos=short
#SBATCH --nodes=4
#SBATCH --job-name=Spark
#SBATCH --output=scc_spark_job-%j.out
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=24
</code>

If you would like to override these default values, you can do so by handing over the corresponding Slurm parameters to the script:
  
<code>
Submitted batch job 872699
</code>

In particular, if you do not want to share the nodes' resources with other jobs, you need to add ''%%--%%exclusive''.
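As an illustration, a deployment call that overrides the defaults and requests exclusive node access could look like the following. The concrete resource values are examples only, and the sketch assumes ''scc_spark_deploy.sh'' is on your ''PATH''; any option accepted by sbatch can be forwarded in the same way:

<code>
# Deploy a Spark cluster on two exclusively allocated nodes for two hours.
# The script forwards these options unchanged to sbatch.
scc_spark_deploy.sh --nodes=2 --time=02:00:00 --exclusive
</code>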
  
In this case, the ''%%--%%nodes'' parameter has been set to request a total of two worker nodes, and ''%%--%%time'' is used to request a job runtime of two hours. If you would like a longer runtime, add the ''%%--%%qos=normal'' parameter in addition to ''%%--%%time''. The job ID is reported back. We can use it to check whether the job is running yet and, if so, on which nodes: