Deploying GPText with Greenplum (Beta)
This section describes procedures for deploying a Greenplum for Kubernetes cluster that includes Pivotal GPText.
Note: Pivotal GPText in Greenplum for Kubernetes is a Beta feature.
About GPText on Greenplum for Kubernetes
When you deploy GPText to Greenplum for Kubernetes, the Greenplum Operator creates multiple dedicated pods to host the Apache Solr Cloud and ZooKeeper instances necessary for using GPText. ZooKeeper can be deployed to multiple replica pods as needed for redundancy. Currently, Apache Solr Cloud can only be deployed to a single pod.
Note that you cannot place the ZooKeeper instances on the Greenplum segment hosts (the ‘binding’ ZooKeeper cluster described in the Pivotal Greenplum Text Documentation).
Deploying GPText with Greenplum for Kubernetes
Follow these steps to deploy GPText with a new Greenplum for Kubernetes cluster.
Use the procedure described in Deploying a New Greenplum Cluster to deploy the cluster, but use samples/my-gp-with-gptext-instance.yaml as the basis for your deployment. Copy the file into your /workspace directory. For example:
$ cd ./greenplum-for-kubernetes-*/workspace
$ cp ./samples/my-gp-with-gptext-instance.yaml .
Edit the file as necessary for your deployment.
samples/my-gp-with-gptext-instance.yaml includes additional properties to configure Greenplum Text in the new cluster:
apiVersion: "greenplum.pivotal.io/v1"
kind: "GreenplumCluster"
metadata:
  name: my-greenplum
spec:
  masterAndStandby:
    hostBasedAuthentication: |
      # host   all   gpadmin   1.2.3.4/32   trust
      # host   all   gpuser    0.0.0.0/0    md5
    memory: "800Mi"
    cpu: "0.5"
    storageClassName: standard
    storage: 1G
    antiAffinity: "yes"
    workerSelector: {}
  segments:
    primarySegmentCount: 1
    memory: "800Mi"
    cpu: "0.5"
    storageClassName: standard
    storage: 1G
    antiAffinity: "yes"
    workerSelector: {}
  gptext:
    serviceName: "my-greenplum-gptext"
---
apiVersion: "greenplum.pivotal.io/v1beta1"
kind: "GreenplumTextService"
metadata:
  name: my-greenplum-gptext
spec:
  solr:
    replicas: 1
    cpu: "0.5"
    memory: "1Gi"
    workerSelector: {}
    storageClassName: standard
    storage: 100M
  zookeeper:
    replicas: 3
    cpu: "0.5"
    memory: "1Gi"
    workerSelector: {}
    storageClassName: standard
    storage: 100M
The entry:
gptext:
  serviceName: "my-greenplum-gptext"
indicates that the cluster will use the GPText service configuration named my-greenplum-gptext, which follows at the end of the YAML file. The sample configuration creates a single Solr pod (required) and three ZooKeeper replica pods (the minimum required for Apache Solr Cloud). Minimal settings for CPU and memory are defined for each pod. You can customize these values as needed, as well as the workerSelector value if you want to constrain the replica pods to labeled nodes in your cluster. You can also customize the storageClassName if necessary to provide dedicated storage for GPText indexes, separate from Greenplum Database.
Use the kubectl apply command with your modified GPText manifest file to send the deployment request to the Greenplum Operator. For example:
$ kubectl apply -f ./my-gp-with-gptext-instance.yaml
greenplumcluster.greenplum.pivotal.io/my-greenplum created
greenplumtextservice.greenplum.pivotal.io/my-greenplum-gptext created
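Because the cluster refers to the GPText service by name, a typo in either place is easy to miss after editing the manifest. The following is a minimal sketch of a check that could be run before kubectl apply. The function name check_gptext_service_name is hypothetical, and the check uses plain text matching with awk rather than a YAML parser, so it assumes the manifest keeps the same one-key-per-line layout as the sample file.

```shell
# Sketch only: confirm that gptext.serviceName names a GreenplumTextService
# defined in the same manifest. Uses text matching, not a real YAML parser.
check_gptext_service_name() {
  manifest="$1"

  # Value of the first "serviceName:" key, with surrounding quotes removed.
  wanted=$(awk '$1 == "serviceName:" { gsub(/"/, "", $2); print $2; exit }' "$manifest")

  # First "name:" key that appears after the GreenplumTextService "kind:" line.
  defined=$(awk '
    $1 == "kind:" { gsub(/"/, "", $2); in_svc = ($2 == "GreenplumTextService") }
    in_svc && $1 == "name:" { gsub(/"/, "", $2); print $2; exit }
  ' "$manifest")

  if [ -n "$wanted" ] && [ "$wanted" = "$defined" ]; then
    echo "ok: $wanted"
  else
    echo "mismatch: serviceName='$wanted' GreenplumTextService name='$defined'"
    return 1
  fi
}

# Usage:
#   check_gptext_service_name ./my-gp-with-gptext-instance.yaml
```

The function only reads the file; it does not contact the Kubernetes cluster.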
If you are deploying another instance of a Greenplum cluster, specify the Kubernetes namespace where you want to deploy the new cluster. For example, if you previously deployed a cluster in the namespace gpinstance-1, you could deploy a second Greenplum cluster in the gpinstance-2 namespace using the command:
$ kubectl apply -f ./my-gp-with-gptext-instance.yaml -n gpinstance-2
The Greenplum Operator deploys the necessary Greenplum and GPText resources according to your specification, and also initializes the Greenplum cluster.
Execute the following command to monitor the deployment of the cluster. While the cluster is initializing, the status will be Pending:
$ watch kubectl get all
NAME                                      READY   STATUS    RESTARTS   AGE
pod/greenplum-operator-79cddcf586-ctftb   1/1     Running   0          11m
pod/master-0                              1/1     Running   0          15s
pod/master-1                              1/1     Running   0          15s
pod/my-greenplum-gptext-solr-0            0/1     Running   0          17s
pod/my-greenplum-gptext-zookeeper-0       1/1     Running   0          17s
pod/my-greenplum-gptext-zookeeper-1       1/1     Running   0          12s
pod/my-greenplum-gptext-zookeeper-2       0/1     Pending   0          0s
pod/segment-a-0                           1/1     Running   0          15s
pod/segment-b-0                           1/1     Running   0          15s

NAME                                                            TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
service/agent                                                   ClusterIP      None           <none>        22/TCP                       15s
service/greenplum                                               LoadBalancer   10.100.229.5   <pending>     5432:32275/TCP               15s
service/greenplum-validating-webhook-service-79cddcf586-ctftb   ClusterIP      10.105.7.189   <none>        443/TCP                      11m
service/kubernetes                                              ClusterIP      10.96.0.1      <none>        443/TCP                      28m
service/my-greenplum-gptext-solr                                ClusterIP      None           <none>        8983/TCP                     17s
service/my-greenplum-gptext-zookeeper                           ClusterIP      None           <none>        2888/TCP,3888/TCP,2181/TCP   17s

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/greenplum-operator   1/1     1            1           11m

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/greenplum-operator-79cddcf586   1         1         1       11m

NAME                                             READY   AGE
statefulset.apps/master                          2/2     15s
statefulset.apps/my-greenplum-gptext-solr        0/1     17s
statefulset.apps/my-greenplum-gptext-zookeeper   2/3     17s
statefulset.apps/segment-a                       1/1     15s
statefulset.apps/segment-b                       1/1     15s

NAME                                                 STATUS    AGE
greenplumcluster.greenplum.pivotal.io/my-greenplum   Pending   17s

NAME                                                            AGE
greenplumtextservice.greenplum.pivotal.io/my-greenplum-gptext   17s
Note that the Solr and ZooKeeper services are created along with the Greenplum Database cluster.
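In the listing above, the Solr pod and one ZooKeeper pod are not yet ready (READY shows 0/1). As a rough sketch, the READY column can be filtered so that only pods still starting are printed. The helper name pods_not_ready is hypothetical; it assumes the default column layout of kubectl get pods --no-headers, where the first column is the pod name and the second is READY.

```shell
# Sketch only: print the name of every pod whose READY column (for example
# "0/1") shows fewer ready containers than expected.
# Reads the text output of `kubectl get pods --no-headers` on stdin.
pods_not_ready() {
  awk '{ split($2, ready, "/"); if (ready[1] != ready[2]) print $1 }'
}

# Usage (requires access to the cluster):
#   kubectl get pods --no-headers | pods_not_ready
```

An empty result suggests that every listed pod reports all of its containers as ready.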
Describe your Greenplum cluster to verify that it was created successfully. The Phase should eventually transition to Running:
$ kubectl describe greenplumClusters/my-greenplum
Name:         my-greenplum
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"greenplum.pivotal.io/v1","kind":"GreenplumCluster","metadata":{"annotations":{},"name":"my-greenplum","namespace":"default"},"spec":{"gp...
API Version:  greenplum.pivotal.io/v1
Kind:         GreenplumCluster
Metadata:
  Creation Timestamp:  2019-10-02T23:43:05Z
  Finalizers:
    stopcluster.greenplumcluster.pivotal.io
  Generation:        2
  Resource Version:  7399
  Self Link:         /apis/greenplum.pivotal.io/v1/namespaces/default/greenplumclusters/my-greenplum
  UID:               b25e90e5-3ac2-40d6-94cb-a8b159b8134a
Spec:
  Gptext:
    Service Name:  my-greenplum-gptext
  Master And Standby:
    Anti Affinity:              no
    Cpu:                        0.5
    Host Based Authentication:  # host   all   gpadmin   1.2.3.4/32   trust
                                # host   all   gpuser    0.0.0.0/0    md5
    Memory:                     800Mi
    Storage:                    1G
    Storage Class Name:         standard
    Worker Selector:
  Segments:
    Anti Affinity:          no
    Cpu:                    0.5
    Memory:                 800Mi
    Primary Segment Count:  1
    Storage:                1G
    Storage Class Name:     standard
    Worker Selector:
Status:
  Instance Image:    greenplum-for-kubernetes:v1.7.2.dev.51.g4530ad36
  Operator Version:  greenplum-operator:v1.7.2.dev.51.g4530ad36
  Phase:             Pending
Events:
  Type    Reason                    Age  From               Message
  ----    ------                    ---- ----               -------
  Normal  CreatingGreenplumCluster  4m   greenplumOperator  Creating Greenplum cluster my-greenplum in default
  Normal  CreatedGreenplumCluster   8s   greenplumOperator  Successfully created Greenplum cluster my-greenplum in default
If you are deploying a brand new cluster, the Greenplum Operator automatically initializes the Greenplum cluster. The Phase should eventually transition from Pending to Running, and the Events should match the output above.
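Rather than re-running kubectl describe by hand, the phase can be polled in a loop. The following is a sketch under one important assumption: that the phase is exposed at the JSONPath .status.phase of the GreenplumCluster resource, which is inferred from the Status and Phase lines in the describe output and is not confirmed by this documentation. The function name wait_for_running and its default retry values are hypothetical.

```shell
# Sketch only: poll the GreenplumCluster phase until it is Running, or give up.
# ASSUMPTION: the phase lives at .status.phase (inferred from the
# "Status: ... Phase:" lines printed by kubectl describe).
wait_for_running() {
  cluster="$1"
  attempts="${2:-60}"   # number of polls before giving up
  delay="${3:-5}"       # seconds to sleep between polls
  i=0
  while [ "$i" -lt "$attempts" ]; do
    phase=$(kubectl get "greenplumclusters/$cluster" -o jsonpath='{.status.phase}')
    echo "phase: ${phase:-unknown}"
    if [ "$phase" = "Running" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (requires access to the cluster):
#   wait_for_running my-greenplum
```

The function returns a non-zero status if the phase never reaches Running within the allowed attempts, so it can gate later steps in a script.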
Note: If you redeployed a previously-deployed Greenplum cluster, the phase initially stays at Pending. The cluster reuses the previous Persistent Volume Claims if they are available, so the master and segment data directories already exist in their former state. In this case, the master-0 pod automatically starts the Greenplum cluster, and the phase should then transition to Running.
At this point, you can work with the deployed Greenplum cluster by executing Greenplum utilities from within Kubernetes, or by using a locally-installed tool, such as psql, to access the Greenplum instance running in Kubernetes. To validate the initial GPText service deployment configuration, execute:
$ kubectl exec -it master-0 -- bash
$ source /opt/gptext/greenplum-text_path.sh
$ gptext-state configs
20191008:23:52:39:002812 gptext-state:master-0:gpadmin-[INFO]:-Execute GPText state ...
20191008:23:52:40:002812 gptext-state:master-0:gpadmin-[INFO]:-Check zookeeper cluster state ...
20191008:23:52:40:002812 gptext-state:master-0:gpadmin-[WARNING]:-object of type 'NoneType' has no len()
20191008:23:52:40:002812 gptext-state:master-0:gpadmin-[INFO]:-Cluster Configurations.
...
Ensure that the ZooKeeper and Solr nodes are available for use.
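The same check can be wrapped in a single non-interactive call, which may be convenient in scripts. This is a sketch that reuses only the commands shown above; the function name gptext_state_configs is hypothetical, and it assumes that sourcing greenplum-text_path.sh in a non-interactive bash shell sets up everything gptext-state needs, which may not hold if the interactive login shell loads additional environment settings.

```shell
# Sketch only: run the GPText configuration check in one non-interactive call.
# ASSUMPTION: sourcing greenplum-text_path.sh is sufficient outside an
# interactive shell; adjust if your image relies on login-shell settings.
gptext_state_configs() {
  pod="${1:-master-0}"
  kubectl exec "$pod" -- bash -c \
    'source /opt/gptext/greenplum-text_path.sh && gptext-state configs'
}

# Usage (requires access to the cluster):
#   gptext_state_configs            # runs in master-0
#   gptext_state_configs master-0   # same, with the pod named explicitly
```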
To begin working with GPText, see Working With GPText Indexes in the Pivotal GPText documentation.