---
layout: page
parent: Custom resources overview
title: ArangoMLExtension
---
# ArangoMLExtension Custom Resource

Enterprise Edition only

Full CustomResourceDefinition reference ->
You can spin up the ArangoML engine on an existing ArangoDeployment. This lets you train ML models and use them for predictions based on the data in your database.

This guide covers only the steps to run ArangoML in a Kubernetes cluster with an already running ArangoDeployment. If you don't have one yet, see the kube-arangodb installation guide and the ArangoDeployment CR description.
To start ArangoML in your cluster, follow these steps:

1. Enable the ML operator. For example, if you are using the Helm package, add the `--set "operator.features.ml=true"` option to the Helm command.
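As a sketch, enabling the feature on an existing Helm release could look like this (the release name `operator` and the chart reference are placeholders for your own installation):

```shell
# Upgrade an existing kube-arangodb release with the ML feature enabled
helm upgrade operator <kube-arangodb-chart> --set "operator.features.ml=true"
```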
2. Create an `ArangoMLStorage` CR. This resource gives ArangoML access to object storage. Currently, only S3 API-compatible storage is supported. This example uses the MinIO object storage. Install MinIO and make sure its endpoint is reachable from inside the cluster running ArangoML.
   - Create a Kubernetes Secret containing the MinIO credentials for the S3 API. The secret data must contain two fields: `accessKey` and `secretKey`.
   - If your MinIO installation uses an encrypted connection, create a Kubernetes Secret containing the CA certificates used to validate the connection to the endpoint. The secret data must contain two fields: `ca.crt` and `ca.key` (both PEM-encoded).
   - Create the ArangoMLStorage resource. Example:
```yaml
apiVersion: ml.arangodb.com/v1alpha1
kind: ArangoMLStorage
metadata:
  name: myarangoml-storage
spec:
  backend:
    s3: # defines access to the S3 API
      caSecret: # skip this field if you are not using an HTTPS connection to MinIO
        name: ml-storage-s3-ca
      credentialsSecret:
        name: ml-storage-s3-creds
      allowInsecure: false # set to true if you want to skip the certificate check
      endpoint: https://minio.my-minio-tenant.svc.cluster.local
  bucketName: my-arangoml-bucket # the bucket is created if it does not exist
  mode: # defines how the storage proxy is deployed to the cluster. Currently, only 'sidecar' mode is supported.
    sidecar: {} # various parameters for the sidecar container can be configured here. See the full CRD reference for details.
```
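The two Secrets referenced above (`ml-storage-s3-creds` and `ml-storage-s3-ca`) can be created with `kubectl`, for example as follows. The key values, file paths, and the manifest file name `ml-storage.yaml` are placeholders; substitute your own MinIO credentials:

```shell
# Secret with the S3 API credentials (field names must be accessKey and secretKey)
kubectl create secret generic ml-storage-s3-creds \
  --from-literal=accessKey=<your-access-key> \
  --from-literal=secretKey=<your-secret-key>

# Secret with the PEM-encoded CA certificate and key (only needed for HTTPS endpoints)
kubectl create secret generic ml-storage-s3-ca \
  --from-file=ca.crt=<path-to-ca.crt> \
  --from-file=ca.key=<path-to-ca.key>

# Apply the ArangoMLStorage manifest
kubectl apply -f ml-storage.yaml
```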
3. Create an `ArangoMLExtension` CR. The name of the extension must be the same as the name of the `ArangoDeployment`, and it must be created in the same namespace. Assuming you have an ArangoDeployment named `myarangodb`, create the CR:
```yaml
apiVersion: ml.arangodb.com/v1alpha1
kind: ArangoMLExtension
metadata:
  name: myarangodb
spec:
  storage:
    name: myarangoml-storage # name of the ArangoMLStorage created in the previous step
  deployment:
    # tolerations, nodeSelector, nodeAffinity, scheduler, and many other parameters can be added here. See the full CRD reference for details.
    replicas: 1 # by default, a single pod runs, containing one container for each component (prediction, training, project). You can scale it up or down.
    prediction:
      image: <prediction-image>
      # various parameters for the container running this component can be configured here. See the full CRD reference for details.
    project:
      image: <projects-image>
      # various parameters for the container running this component can be configured here. See the full CRD reference for details.
    training:
      image: <training-image>
      # various parameters for the container running this component can be configured here. See the full CRD reference for details.
  init: # configuration for the Kubernetes Job that runs the initial bootstrap of ArangoML for your cluster
    image: <init-image>
    # tolerations, nodeSelector, nodeAffinity, scheduler, and many other parameters can be added here. See the full CRD reference for details.
  jobsTemplates:
    prediction:
      cpu:
        image: <prediction-job-cpu image>
        # various parameters for the pod and container running this component can be configured here. See the full CRD reference for details.
      gpu:
        image: <prediction-job-gpu image>
        # various parameters for the pod and container running this component can be configured here. See the full CRD reference for details.
        resources: # ensures the pod is scheduled on a GPU-enabled node. Adjust for your environment if necessary.
          limits:
            nvidia.com/gpu: "1"
          requests:
            nvidia.com/gpu: "1"
    training:
      cpu:
        image: <training-cpu-image>
        # various parameters for the pod and container running this component can be configured here. See the full CRD reference for details.
      gpu:
        image: <training-gpu-image>
        # various parameters for the pod and container running this component can be configured here. See the full CRD reference for details.
        resources: # ensures the pod is scheduled on a GPU-enabled node. Adjust for your environment if necessary.
          limits:
            nvidia.com/gpu: "1"
          requests:
            nvidia.com/gpu: "1"
```
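Assuming the manifest above is saved as `ml-extension.yaml` (a placeholder file name), it can be applied with:

```shell
kubectl apply -f ml-extension.yaml
```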
4. After creating the CR, wait a few minutes for the ArangoML initialization to complete. You can check the status of the ArangoMLExtension to see the current state. Wait for the `Ready` condition to become `True`:
```shell
kubectl describe arangomlextension myarangodb
```

The output should include:

```yaml
# ...
status:
  conditions:
    - name: Ready
      value: True
```
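For a scripted check, a JSONPath query along these lines could poll the condition. This assumes the condition layout shown in the example output above (`name`/`value` fields), which may differ between CRD versions:

```shell
# Queries the value of the Ready condition of the extension
kubectl get arangomlextension myarangodb \
  -o jsonpath='{.status.conditions[?(@.name=="Ready")].value}'
```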
5. ArangoML is now ready to use! Head to the ArangoML documentation for more details on usage.

Note that ArangoML creates a new database in your ArangoDB cluster to store meta-information about model training and predictions. Editing or removing this database can cause ArangoML to fail or behave unpredictably.