Expanding a Greenplum Deployment

To expand a Greenplum cluster, you first use the Greenplum Operator to create the new segment pods in Kubernetes. Afterwards, you configure and run gpexpand within the cluster to initialize the new segments and redistribute data.

Note: You cannot shrink a cluster to a smaller number of segments. To reduce the segment count, you must delete and re-create the cluster.

Follow these steps to expand a Greenplum for Kubernetes cluster:

  1. Go to the workspace subdirectory where you unpacked the Greenplum for Kubernetes distribution, or to the directory where you created your Greenplum cluster deployment manifest. For example:

    $ cd ./greenplum-for-kubernetes-*/workspace
    
  2. Edit the Kubernetes manifest that you used to deploy your Greenplum cluster, increasing the primarySegmentCount value. This example expands the default Greenplum deployment (configured in my-gp-instance.yaml) to three primary segments:

    primarySegmentCount: 3
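
    Optionally, you can preview the change before applying it. The command below assumes the example manifest name used above and a kubectl release that includes the diff subcommand:

    $ kubectl diff -f my-gp-instance.yaml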
    
  3. After modifying the file, use kubectl apply -f <updated-manifest.yaml> to apply the change. For example:

    $ kubectl apply -f my-gp-instance.yaml
    
    greenplumcluster.greenplum.pivotal.io/my-greenplum configured
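
    You can also confirm that the operator accepted the change by querying the GreenplumCluster resource directly. The resource and cluster names below assume the default my-greenplum example:

    $ kubectl get greenplumcluster my-greenplum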
    
  4. Execute watch kubectl get all and wait until the new Greenplum pods reach the Running state:

    $ watch kubectl get all
    
    NAME                                     READY     STATUS    RESTARTS   AGE
    pod/greenplum-operator-7cc5ff7dc-7r6ng   1/1       Running   0          24m
    pod/master-0                             1/1       Running   0          22m
    pod/master-1                             1/1       Running   0          22m
    pod/segment-a-0                          1/1       Running   0          22m
    pod/segment-a-1                          0/1       Pending   0          16s
    pod/segment-a-2                          0/1       Pending   0          16s
    pod/segment-b-0                          1/1       Running   0          22m
    pod/segment-b-1                          0/1       Pending   0          16s
    pod/segment-b-2                          0/1       Pending   0          16s
    
    NAME                 TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
    service/agent        ClusterIP      None          <none>        22/TCP           22m
    service/greenplum    LoadBalancer   10.98.44.85   <pending>     5432:30365/TCP   22m
    service/kubernetes   ClusterIP      10.96.0.1     <none>        443/TCP          29m
    
    NAME                                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/greenplum-operator   1         1         1            1           24m
    
    NAME                                           DESIRED   CURRENT   READY     AGE
    replicaset.apps/greenplum-operator-7cc5ff7dc   1         1         1         24m
    
    NAME                         DESIRED   CURRENT   AGE
    statefulset.apps/master      2         2         22m
    statefulset.apps/segment-a   3         3         22m
    statefulset.apps/segment-b   3         3         22m
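
    Instead of watching interactively, you can block until the new pods report Ready. The pod names below correspond to the two-segment expansion in this example, and the command assumes a kubectl release that includes the wait subcommand:

    $ kubectl wait --for=condition=Ready pod/segment-a-1 pod/segment-a-2 pod/segment-b-1 pod/segment-b-2 --timeout=300s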
    
  5. After the new pods are running, open a bash shell to the Greenplum master pod:

    $ kubectl exec -it master-0 bash
    
    gpadmin@master-0:~$ 
    
  6. Use the vi editor to create a new input file for the Greenplum gpexpand utility:

    gpadmin@master-0:~$ vi expand-my-cluster
    
  7. Add entries to the file to define each of the new segment pods. (See Creating an Input File for System Expansion in the Greenplum documentation for more information about the file entries.) In this example, expanding the segment count by two added two new primary segment pods (segment-a-1 and segment-a-2) and two new mirror segment pods (segment-b-1 and segment-b-2). The corresponding gpexpand input file contains:

    segment-a-1.agent.default.svc.cluster.local:segment-a-1.agent.default.svc.cluster.local:40000:/greenplum/data:5:1:p:6000
    segment-b-1.agent.default.svc.cluster.local:segment-b-1.agent.default.svc.cluster.local:50000:/greenplum/mirror/data:6:1:m:6001
    segment-a-2.agent.default.svc.cluster.local:segment-a-2.agent.default.svc.cluster.local:40000:/greenplum/data:7:2:p:6000
    segment-b-2.agent.default.svc.cluster.local:segment-b-2.agent.default.svc.cluster.local:50000:/greenplum/mirror/data:8:2:m:6001
    

    Save your file and exit vi.
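
    If you are adding many segments, you can generate these entries with a small shell loop in the gpadmin shell on the master pod instead of typing them by hand. This is only a sketch: it assumes the default namespace, the ports and data directories shown above, and that the existing cluster's highest dbid is 4 and its only existing primary has content ID 0. Verify the actual values in gp_segment_configuration before using the generated file:

    dbid=5       # next unused dbid in gp_segment_configuration (assumed)
    content=1    # next unused content ID (assumed)
    for i in 1 2; do
      host_a="segment-a-${i}.agent.default.svc.cluster.local"
      host_b="segment-b-${i}.agent.default.svc.cluster.local"
      # write one primary entry and one mirror entry per new content ID
      echo "${host_a}:${host_a}:40000:/greenplum/data:${dbid}:${content}:p:6000" >> expand-my-cluster
      echo "${host_b}:${host_b}:50000:/greenplum/mirror/data:$((dbid + 1)):${content}:m:6001" >> expand-my-cluster
      dbid=$((dbid + 2))
      content=$((content + 1))
    done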

  8. Execute gpexpand with the new input file, and specify a database in which to store the expansion schema. For example:

    gpadmin@master-0:~$ gpexpand -i expand-my-cluster -D gpadmin
    
    20181110:00:09:52:005209 gpexpand:master-0:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.12.0 build dev'
    20181110:00:09:52:005209 gpexpand:master-0:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.12.0 build dev) on x86_64-pc-linux-gnu, compiled by GCC gcc (Ubuntu 6.4.0-17ubuntu1~16.04) 6.4.0 20180424, 64-bit compiled on Oct 19 2018 16:56:25'
    20181110:00:09:55:005209 gpexpand:master-0:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
    20181110:00:09:55:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database template1 for unalterable tables...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database postgres for unalterable tables...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database gpadmin for unalterable tables...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database template1 for tables with unique indexes...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database postgres for tables with unique indexes...
    20181110:00:09:56:005209 gpexpand:master-0:gpadmin-[INFO]:-Checking database gpadmin for tables with unique indexes...
    20181110:00:09:58:005209 gpexpand:master-0:gpadmin-[INFO]:-Heap checksum setting consistent across cluster
    ...
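
    When gpexpand finishes, you can verify that the new segments were registered by querying the gp_segment_configuration catalog table from the master pod. For example:

    gpadmin@master-0:~$ psql -d gpadmin -c "SELECT dbid, content, role, port, hostname FROM gp_segment_configuration ORDER BY content, role;"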
    
  9. Follow the instructions in Redistributing Tables in the Greenplum Database documentation to redistribute data to the new segments.
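
    At a high level, the redistribution phase is a second run of gpexpand against the same database, optionally bounded by a duration, followed by a cleanup pass that removes the expansion schema. The duration below is only an example, and the exact options may differ between Greenplum versions, so check the gpexpand reference for your release:

    gpadmin@master-0:~$ gpexpand -d 60:00:00 -D gpadmin
    gpadmin@master-0:~$ gpexpand -c -D gpadmin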