Deploying a New Greenplum Cluster
This section describes how to use the Greenplum Operator to deploy a Greenplum cluster to your Kubernetes system. You can use these instructions either to deploy a brand-new cluster (provisioning new, empty Persistent Volume Claims in Kubernetes) or to re-deploy an earlier cluster, re-using existing Persistent Volumes if available.
Prerequisites
This procedure requires that you first install the Greenplum for Kubernetes docker images and create the Greenplum Operator in your Kubernetes system. See Installing Greenplum for Kubernetes for more information.
Verify that the Greenplum Operator is installed and running in your system before you continue:
$ helm list
NAME                 REVISION   UPDATED                    STATUS     CHART            APP VERSION   NAMESPACE
greenplum-operator   1          Thu Oct 11 15:38:54 2018   DEPLOYED   operator-0.1.0   1.0           default
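If the release is listed but you also want to confirm that the operator pod itself is running (the pod name suffix is generated by Kubernetes, so it will differ on your system), a plain pod listing is enough; this is only an optional sanity check:

$ kubectl get pods

Look for a pod whose name begins with greenplum-operator- and whose STATUS is Running.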
Procedure
Go to the workspace subdirectory where you unpacked the Greenplum for Kubernetes distribution:

$ cd ./greenplum-for-kubernetes-*/workspace

If necessary, create a Kubernetes manifest file to specify the configuration of your Greenplum cluster. A sample file is provided in workspace/my-gp-instance.yaml. my-gp-instance.yaml contains the minimal set of instructions necessary to create a demonstration cluster named "my-greenplum" with a single segment (a single primary and mirror segment) and default storage, memory, and CPU settings:

apiVersion: "greenplum.pivotal.io/v1"
kind: "GreenplumCluster"
metadata:
  name: my-greenplum
spec:
  masterAndStandby:
    hostBasedAuthentication: |
      # host   all   gpadmin   1.2.3.4/32   trust
      # host   all   gpuser    0.0.0.0/0    md5
    memory: "800Mi"
    cpu: "0.5"
    storageClassName: standard
    storage: 1G
  segments:
    primarySegmentCount: 1
    memory: "800Mi"
    cpu: "0.5"
    storageClassName: standard
    storage: 2G
Most non-trivial clusters require configuration changes to specify additional segments, CPU, memory, pg_hba.conf entries, and Storage Class resources. See Greenplum Operator Manifest File for information about these configuration parameters, and change them as necessary before you continue.
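As a sketch only (not a configuration shipped with the distribution), a larger cluster would reuse the same manifest fields as the sample above while raising primarySegmentCount and the per-pod resources; the specific values below are placeholders you would size for your own environment:

apiVersion: "greenplum.pivotal.io/v1"
kind: "GreenplumCluster"
metadata:
  name: my-greenplum
spec:
  masterAndStandby:
    hostBasedAuthentication: |
      # host all gpadmin 1.2.3.4/32 trust
      # host all gpuser 0.0.0.0/0 md5
    memory: "4Gi"              # example value, size for your workload
    cpu: "2"                   # example value
    storageClassName: standard
    storage: 10G               # example value
  segments:
    primarySegmentCount: 3     # each primary segment also gets a mirror segment
    memory: "4Gi"              # example value
    cpu: "2"                   # example value
    storageClassName: standard
    storage: 50G               # example value

Because every primary segment is paired with a mirror, primarySegmentCount: 3 in this sketch would result in six segment pods in addition to the master and standby master pods.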
If you want to re-deploy a Greenplum cluster that you previously deployed, simply locate the existing configuration file.

Use the kubectl apply command and specify your manifest file to send the deployment request to the Greenplum Operator. For example, to use the sample my-gp-instance.yaml file:

$ kubectl apply -f ./my-gp-instance.yaml
greenplumcluster.greenplum.pivotal.io/my-greenplum created
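As an optional check immediately after the apply (not a required step in this procedure), the GreenplumCluster custom resource can be listed like any other Kubernetes resource:

$ kubectl get greenplumClusters

The my-greenplum entry should appear in the listing; the kubectl describe step later in this procedure shows its phase and events in detail.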
The Greenplum Operator deploys the necessary Greenplum resources according to your specification, and also populates a gpinitsystem configuration file necessary to initialize the Greenplum cluster. If there are no existing Persistent Volume Claims for the cluster, new PVCs are created and used for the deployment. If PVCs for the cluster already exist, they are used as-is with the available data.

Use watch kubectl get all and wait until all Greenplum cluster pods have the status Running:

$ watch kubectl get all
NAME                                      READY   STATUS    RESTARTS   AGE
pod/greenplum-operator-58dd68b9c5-frrbz   1/1     Running   3          3h
pod/master-0                              1/1     Running   0          1m
pod/master-1                              1/1     Running   0          1m
pod/segment-a-0                           1/1     Running   0          1m
pod/segment-b-0                           1/1     Running   0          1m

NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/agent        ClusterIP      None            <none>        22/TCP           1m
service/greenplum    LoadBalancer   10.110.26.184   <pending>     5432:32686/TCP   1m
service/kubernetes   ClusterIP      10.96.0.1       <none>        443/TCP          22h

NAME                                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/greenplum-operator   1         1         1            1           3h

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/greenplum-operator-58dd68b9c5   1         1         1       3h

NAME                         DESIRED   CURRENT   AGE
statefulset.apps/master      2         2         1m
statefulset.apps/segment-a   1         1         1m
statefulset.apps/segment-b   1         1         1m
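If you want to confirm which Persistent Volume Claims the deployment created (or re-used), plain kubectl commands are sufficient; the claim names are generated for you, so no specific names are assumed here, and <claim-name> is a placeholder:

$ kubectl get pvc
$ kubectl describe pvc <claim-name>

A STATUS of Bound indicates that a claim is attached to a Persistent Volume.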
Describe your Greenplum cluster to verify that it was created successfully:
$ kubectl describe greenplumClusters/my-greenplum
Name:         my-greenplum
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"greenplum.pivotal.io/v1","kind":"GreenplumCluster",
                "metadata":{"annotations":{},"name":"my-greenplum",
                "namespace":"default"...
API Version:  greenplum.pivotal.io/v1
Kind:         GreenplumCluster
Metadata:
  Creation Timestamp:  2019-01-10T22:15:40Z
  Generation:          1
  Resource Version:    28403
  Self Link:           /apis/greenplum.pivotal.io/v1/namespaces/default/greenplumclusters/my-greenplum
  UID:                 43842f53-1525-11e9-941d-080027530600
Spec:
  Master And Standby:
    Cpu:                        0.5
    Host Based Authentication:  # host all gpadmin 1.2.3.4/32 trust
                                # host all gpuser 0.0.0.0/0 md5
    Memory:              800Mi
    Storage:             1G
    Storage Class Name:  standard
    Worker Selector:
  Segments:
    Cpu:                    0.5
    Memory:                 800Mi
    Primary Segment Count:  1
    Storage:                2G
    Storage Class Name:     standard
    Worker Selector:
Status:
  Instance Image:  greenplum-for-kubernetes:latest
  Phase:           Running
Events:
  Type    Reason   Age   From               Message
  ----    ------   ----  ----               -------
  Normal  created  12s   greenplumOperator  greenplumCluster created
  Normal  updated  12s   greenplumOperator  greenplumCluster updated successfully
The Phase should be Running, and the Events should match the output above.

If you redeployed a previously-deployed Greenplum cluster, it uses any Persistent Volume Claims that were available. In this case, the master and segment data directories will already exist in their former state, and you can skip the initialization step below.

If you are deploying a brand new cluster, wait until all Greenplum pods show the status Running, and then execute the wrap_initialize_cluster.bash script on the master-0 pod to initialize all Greenplum cluster segments:

$ kubectl exec -it master-0 /home/gpadmin/tools/wrap_initialize_cluster.bash
Key scanning started
*******************************
Initializing Greenplum for Kubernetes Cluster
*******************************

*******************************
SSH KeyScan started
*******************************

*******************************
Generating gpinitsystem_config
*******************************

Sub Domain for the cluster is: agent.default.svc.cluster.local

*******************************
Running gpinitsystem
*******************************

20181005:23:30:45:000101 gpinitsystem:master-0:gpadmin-[INFO]:-Environment variable $LOGNAME unset, will set to gpadmin
20181005:23:30:45:000101 gpinitsystem:master-0:gpadmin-[INFO]:-Checking configuration parameters, please wait...
...

*******************************
Running createdb
*******************************
At this point, you can work with the deployed Greenplum cluster by executing Greenplum utilities from within Kubernetes, or by using a locally-installed tool, such as psql, to access the Greenplum instance running in Kubernetes. For example, to run the psql utility on the master-0 pod:

$ kubectl exec -it master-0 bash -- -c "source /opt/gpdb/greenplum_path.sh; psql"
psql (8.3.23)
Type "help" for help.

gpadmin=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname                                  | address                                       | replication_port
------+---------+------+----------------+------+--------+-------+-------------------------------------------+-----------------------------------------------+------------------
    1 |      -1 | p    | p              | s    | u      |  5432 | master-0                                  | master-0.agent.default.svc.cluster.local      |
    2 |       0 | p    | p              | s    | u      | 40000 | segment-a-0                               | segment-a-0.agent.default.svc.cluster.local   |             6000
    3 |       0 | m    | m              | s    | u      | 50000 | segment-b-0                               | segment-b-0.agent.default.svc.cluster.local   |             6001
    4 |      -1 | m    | m              | s    | u      |  5432 | master-1.agent.default.svc.cluster.local  | master-1.agent.default.svc.cluster.local      |
(4 rows)
(Enter \q to exit the psql utility.)
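Because the kubectl get all output earlier shows a greenplum service of type LoadBalancer on port 5432, you may also be able to connect with a locally-installed psql once that service has been assigned an external address. The following is a sketch only; it assumes your environment provides an external IP for the service and that your pg_hba.conf entries allow the gpadmin role to connect from your address, and <EXTERNAL-IP> and <database-name> are placeholders:

$ kubectl get service greenplum
$ psql -h <EXTERNAL-IP> -p 5432 -U gpadmin -d <database-name>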