Source
yaml
id: git-spark
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/scripts
branch: main
- id: spark_job
type: io.kestra.plugin.spark.SparkCLI
commands:
- spark-submit --name Pi --master spark://localhost:7077
etl/spark_pi.py
About this blueprint
CLI Data Git
This flow clones a git repository and runs a Spark job. Make sure to expose the port 7077 on your Spark master in order for this flow to work.
More Related Blueprints