Source
id: python-docker-artifact-registry-gcp
namespace: company.team

tasks:
  - id: wdir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: download_csv
        type: io.kestra.plugin.core.http.Download
        uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

      - id: fetch_auth_token
        type: io.kestra.plugin.gcp.auth.OauthAccessToken
        projectId: YOUR_GCP_PROJECT_NAME
        serviceAccount: "{{ secret('GCP_CREDS') }}"

      - id: analyze_sales
        type: io.kestra.plugin.scripts.python.Script
        inputFiles:
          data.csv: "{{ outputs.download_csv.uri }}"
        script: |
          import pandas as pd
          from kestra import Kestra

          df = pd.read_csv("data.csv")
          sales = df.total.sum()
          med = df.quantity.median()
          Kestra.outputs({"total_sales": sales, "median_quantity": med})
          top_sellers = df.sort_values(by="total", ascending=False).head(3)
          print(f"Top 3 orders: {top_sellers}")
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
          config: >
            {
              "auths": {
                "yourGcpRegion-docker.pkg.dev": {
                  "username": "oauth2accesstoken",
                  "password": "{{ outputs.fetch_auth_token.accessToken.tokenValue }}"
                }
              }
            }
        containerImage: yourGcpRegion-docker.pkg.dev/YOUR_GCP_PROJECT_NAME/REPO_NAME/python:latest
About this blueprint
Tags: Docker, Python, Software Engineering, GCP
This flow downloads a CSV file from a public URL, analyzes it with a Python script, and outputs the results. The Python script runs in a Docker container whose image is pulled from Google Artifact Registry, so the flow authenticates to the registry before the task starts.
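To sanity-check the analysis before running it in the container, you can execute essentially the same pandas code locally against the public dataset. This is a minimal sketch, assuming pandas is installed and that orders.csv keeps the total and quantity columns used by the analyze_sales task:

import pandas as pd

# Same public dataset the download_csv task fetches.
URL = "https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv"

df = pd.read_csv(URL)

# Mirrors the metrics the flow publishes via Kestra.outputs().
total_sales = df["total"].sum()
median_quantity = df["quantity"].median()
top_sellers = df.sort_values(by="total", ascending=False).head(3)

print(f"total_sales={total_sales}, median_quantity={median_quantity}")
print(top_sellers)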
To push an image to Google Artifact Registry, you need to:
- Create a Google Cloud Platform service account with the Artifact Registry Writer role.
- Create a JSON key for the service account.
- Create a secret with the contents of the JSON key (see the sketch after this list).
- Build the Docker image: docker build -t yourGcpRegion-docker.pkg.dev/YOUR_GCP_PROJECT_NAME/REPO_NAME/python:latest .
- Push the image to Google Artifact Registry: docker push yourGcpRegion-docker.pkg.dev/YOUR_GCP_PROJECT_NAME/REPO_NAME/python:latest
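For the secret step: in the open-source edition, Kestra reads secrets from environment variables prefixed with SECRET_ whose values are base64-encoded. A minimal sketch of preparing the GCP_CREDS secret from the downloaded key file (the sa-key.json file name is an assumption):

import base64

KEY_FILE = "sa-key.json"  # hypothetical name of the downloaded JSON key

with open(KEY_FILE, "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# Set this in the environment of the Kestra server so that
# {{ secret('GCP_CREDS') }} resolves to the JSON key at runtime.
print(f"SECRET_GCP_CREDS={encoded}")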
Note that the OauthAccessToken task is necessary to securely fetch a short-lived access token; the Docker task runner then uses that token as the registry password (with the fixed username oauth2accesstoken) to pull the container image.
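Under the hood, that token is a standard OAuth2 access token minted from the service-account key. A rough standalone equivalent using the google-auth package (the sa-key.json path is an assumption) shows what ends up in the Docker registry credentials:

from google.oauth2 import service_account
from google.auth.transport.requests import Request

KEY_FILE = "sa-key.json"  # hypothetical path to the same JSON key stored in GCP_CREDS

credentials = service_account.Credentials.from_service_account_file(
    KEY_FILE,
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Exchange the long-lived key for a short-lived OAuth2 access token.
credentials.refresh(Request())

# This value plays the role of outputs.fetch_auth_token.accessToken.tokenValue:
# the registry username is "oauth2accesstoken" and the password is the token itself.
print(credentials.token)
print(f"expires at: {credentials.expiry}")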