</code>

We’re now ready to deploy a Spark cluster. Since the resources of the HPC system are managed by [[en:services:application_services:high_performance_computing:running_jobs_slurm|Slurm]], the entire setup has to be submitted as a job. This can be done conveniently by running the script ''scc_spark_deploy.sh'', which accepts the same arguments as the ''sbatch'' command used to submit generic batch jobs. The default setup is:

<code>
#SBATCH --partition fat
#SBATCH --time=0-02:00:00
#SBATCH --qos=short
#SBATCH --nodes=4
#SBATCH --job-name=Spark
#SBATCH --output=scc_spark_job-%j.out
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=24
</code>

If you would like to override these default values, you can do so by passing the corresponding Slurm parameters to the script:

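For example, a sketch of such an invocation (the exact command is not shown in this revision; the flag values here are assumptions based on the description below, requesting two worker nodes for two hours):

<code>
scc_spark_deploy.sh --nodes=2 --time=0-02:00:00
</code>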
<code>
Submitted batch job 872699
</code>

In particular, if you do not want to share the node's resources, you need to add the ''%%--%%exclusive'' parameter.

In this case, the ''%%--%%nodes'' parameter has been set to request a total of two worker nodes, and ''%%--%%time'' is used to request a job runtime of two hours. If you would like to set a longer runtime, add the ''%%--%%qos=normal'' parameter in addition to ''%%--%%time''. The job ID is reported back; we can use it to check whether the job is already running and, if so, on which nodes: