===== Creating a Spark Cluster on the SCC =====
<WRAP center round important 60%>
We assume that you have access to the HPC system already and are logged in to one of the frontend nodes ''gwdu101'' or ''gwdu102''.\\ If that's not the case, please check out our [[en:services:application_services:high_performance_computing:running_jobs_slurm|introductory documentation]] first.
</WRAP>
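If you still need to connect, logging in to a frontend is done via SSH. The host name below is only an assumption for illustration; please check the introductory documentation for the exact addresses.

<code>
# replace <username> with your HPC account name; the frontend host name is assumed here
ssh <username>@gwdu101.gwdg.de
</code>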
  
  
We’re now ready to deploy a Spark cluster. Since the resources of the HPC system are managed by [[en:services:application_services:high_performance_computing:running_jobs_slurm|Slurm]], the entire setup has to be submitted as a job. This can be conveniently done by running the script ''scc_spark_deploy.sh'', which accepts the same arguments as the sbatch command used to submit generic batch jobs. The default setup is:

<code>
#SBATCH --partition fat
#SBATCH --time=0-02:00:00
#SBATCH --qos=short
#SBATCH --nodes=4
#SBATCH --job-name=Spark
#SBATCH --output=scc_spark_job-%j.out
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=24
</code>

If you would like to override these default values, you can do so by passing the desired Slurm parameters to the script, for example to request two worker nodes for a runtime of two hours:
  
<code>
# example: two worker nodes for two hours (values as described below)
scc_spark_deploy.sh --nodes=2 --time=02:00:00
Submitted batch job 872699
</code>

In particular, if you do not want to share the nodes' resources with other jobs, you need to add ''%%--%%exclusive''.
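As a sketch, the same deployment with exclusive access to the nodes could look like the following; the flags are simply passed through to sbatch, and the values are only illustrative:

<code>
scc_spark_deploy.sh --nodes=2 --time=02:00:00 --exclusive
</code>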
  
In this case, the ''%%--%%nodes'' parameter has been set to specify a total of two worker nodes and ''%%--%%time'' is used to request a job runtime of two hours. If you would like to request a longer runtime, add the ''%%--%%qos=normal'' parameter in addition to ''%%--%%time''. The job ID is reported back. We can use it to check whether the job is running yet and, if so, on which nodes:
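A minimal sketch, using the job ID from above with the standard Slurm tools (the output columns depend on your squeue configuration):

<code>
# check the job's state and the allocated nodes
squeue --jobs 872699
</code>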