Greenplum Operator Manifest File

This section describes each of the properties that you can define in a Greenplum Operator manifest file.

Synopsis

apiVersion: "greenplum.pivotal.io/v1"
kind: "GreenplumCluster"
metadata:
  name: <string>
  namespace: <string>
spec:
  masterAndStandby:
    hostBasedAuthentication: |
      [ host  <database>  <role>  <address>  <authentication-method> ]
      [ ... ]
    memory: <memory-limit>
    cpu: <cpu-limit>
    storageClassName: <storage-class>
    storageSize: <size>
    workerSelector: {
      <label>: "<value>"
      [ ... ]
    }
    antiAffinity: <yes|no>
  segments:
    primarySegmentCount: <int>
    memory: <memory-limit>
    cpu: <cpu-limit>
    storageClassName: <storage-class>
    storageSize: <size>
    workerSelector: {
      <label>: "<value>"
      [ ... ]
    }
    antiAffinity: <yes|no>
    mirrors: <yes|no>

Description

You specify Greenplum cluster configuration properties to the Greenplum Operator via a YAML-formatted manifest file. A sample manifest file is provided in workspace/my-greenplum-cluster.yaml. The current version of the manifest supports configuring the cluster name, the number of segments, and the memory, CPU, and storage settings for the master and segment pods. See also Deploying a New Greenplum Cluster for information about deploying a new cluster using a manifest file.
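
For reference, the following is a minimal manifest that sets only the properties described in this section; the cluster name, storage class, and sizing values are illustrative and should be adjusted for your environment:

apiVersion: "greenplum.pivotal.io/v1"
kind: "GreenplumCluster"
metadata:
  name: my-greenplum
spec:
  masterAndStandby:
    memory: "4.5Gi"
    cpu: "0.5"
    storageClassName: standard
    storageSize: 5G
  segments:
    primarySegmentCount: 1
    memory: "4.5Gi"
    cpu: "0.5"
    storageClassName: standard
    storageSize: 5G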

Keywords and Values

Cluster Metadata

name: <string>
(Required) Sets the name of the Greenplum cluster instance resources. You can filter the output of kubectl commands using this name to list all of the resources, or the persistent volume claims (PVCs), associated with the Greenplum cluster instance, as shown below.
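
For example, if you set the name to my-greenplum:

$ kubectl get all -l greenplum-cluster=my-greenplum
$ kubectl get pvc -l greenplum-cluster=my-greenplum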

This value cannot be dynamically changed for an existing cluster. If you attempt to change this value and re-apply it to an existing cluster, the Operator interprets the change as a new deployment and rejects it, because only one Greenplum cluster instance is allowed per namespace.

namespace: <string>
(Optional) Specifies the namespace in which the Greenplum cluster is deployed. If not specified, the current kubectl context’s namespace will be used for cluster deployment. To set kubectl’s current context to a specific namespace, use the command:

$ kubectl config set-context $(kubectl config current-context) --namespace=<NAMESPACE>

This value cannot be dynamically changed for an existing cluster. To deploy an existing cluster to a different namespace, first delete the cluster instance and then deploy it using the new namespace value.

Segment Configuration

masterAndStandby:, segments:
These sections share many of the same properties to configure memory, CPU, and storage for Greenplum pods. masterAndStandby: settings apply to both the master and standby master pods; all Greenplum for Kubernetes clusters include a standby master. The segments: settings apply to each primary segment pod and, if mirroring is enabled, to each mirror segment pod. Segment mirroring is enabled by default (see the mirrors: property).

hostBasedAuthentication:
(Optional) Entries to add to the pg_hba.conf file generated for the Greenplum cluster. Each entry (multiple entries are possible) must include the items host <database> <role> <address> <authentication-method>, in that order, to enable a role to access the indicated database (or all databases) from the specified CIDR address using the specified authentication method. See Allowing Connections to Greenplum Database in the Greenplum Database documentation for more information about pg_hba.conf file entries.

This value cannot be dynamically changed for an existing cluster. The Operator only uses this value to populate the initial pg_hba.conf file that is created with a new cluster. You cannot use this property to change the existing generated file; instead, modify it directly on the master pod using a text editor. See the section on Editing the pg_hba.conf File in Allowing Connections to Greenplum Database.
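
For example, a manifest excerpt that adds two pg_hba.conf entries might look like the following; the database, role, and address values are illustrative:

masterAndStandby:
  hostBasedAuthentication: |
    host  all     gpadmin  1.2.3.4/32  trust
    host  testdb  gpuser   0.0.0.0/0   md5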

primarySegmentCount: <int>
(Required) The number of primary segment pods (and corresponding mirror segment pods, if mirroring is enabled) to create in the Greenplum cluster. Segment pods use the naming format segment-<type>-<number>, where <type> is a for primary segments or b for mirror segments, and numbering starts at zero. If you omit this property, the Operator fails to create the Greenplum cluster, because at least one primary/mirror segment pair is required.
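
For example, setting primarySegmentCount: 2 creates two primary segment pods and, with mirroring enabled, two mirror segment pods:

segments:
  primarySegmentCount: 2   # creates segment-a-0 and segment-a-1 (primaries),
                           # plus segment-b-0 and segment-b-1 (mirrors)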

You can increase this value and re-apply it to an existing cluster, but you may need to manually run gpexpand to incorporate the additional segments. See the section on gpexpand.

Note: You cannot decrease this value for an existing cluster unless you first delete both the deployed cluster and the PVCs that were created for that cluster. See Deleting Greenplum Persistent Volume Claims.

memory: <memory-limit>
(Optional) The amount of memory allocated to each Greenplum pod. This value defines a memory limit; if a pod attempts to exceed the limit, it is removed and replaced by a new pod. You can specify a suffix to define the memory units (for example, 4.5Gi). If omitted or left empty, the pod has no upper bound on the memory it can use, or it inherits the default limit if one is defined in the namespace where it is deployed. See Assign Memory Resources to Containers and Pods in the Kubernetes documentation for more information.

This value cannot be dynamically changed for an existing cluster. If you attempt to make changes to this value and re-apply it to an existing cluster, the change will be rejected. If you wish to update this value, you must delete the existing cluster and recreate the cluster for the new value to take effect. See Upgrade a Greenplum Cluster.

Note: If you do not want to specify a memory limit, comment out or remove the memory: keyword from the YAML file, or specify an empty string for its value (memory: ""). If the keyword appears in the YAML file, you must assign it a valid string value.

cpu: <cpu-limit>
(Optional) The amount of CPU resources allocated to each Greenplum pod, specified as a Kubernetes CPU unit (for example, cpu: "1.2"). If omitted or left empty, the pod has no upper bound on the CPU it can use, or it inherits the default limit if one is defined in the namespace where it is deployed. See Assign CPU Resources to Containers and Pods in the Kubernetes documentation for more information.

This value cannot be dynamically changed for an existing cluster. If you attempt to make changes to this value and re-apply it to an existing cluster, the change will be rejected. If you wish to update this value, you must delete the existing cluster and recreate the cluster for the new value to take effect. See Upgrade a Greenplum Cluster.

Note: If you do not want to specify a CPU limit, comment out or remove the cpu: keyword from the YAML file, or specify an empty string for its value (cpu: ""). If the keyword appears in the YAML file, you must assign it a valid string value.
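
For example, a segments: excerpt that sets both limits, followed by one that explicitly disables them; the values shown are illustrative:

segments:
  memory: "4.5Gi"   # memory limit, with a unit suffix
  cpu: "1.2"        # CPU limit, in Kubernetes CPU units

segments:
  memory: ""        # no memory limit (or the namespace default, if defined)
  cpu: ""           # no CPU limit (or the namespace default, if defined)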

storageClassName: <storage-class>
(Required) The Storage Class name to use for dynamically provisioning Persistent Volumes (PVs) for a Greenplum pod. If the PVs already exist, either from a previous deployment of the Greenplum instance or because you manually provisioned the PVs, then the Greenplum Operator uses the existing PVs. You can configure the Storage Class according to your performance needs. See Storage Classes in the Kubernetes documentation to understand the different configuration options.

For best performance, Pivotal recommends using persistent volumes that are backed by a local SSD with the XFS filesystem and readahead caching enabled. Use the mount options rw,nodev,noatime,nobarrier,inode64 to mount the volume. See Creating Local Persistent Volumes for Greenplum for information about manually provisioning local persistent volumes. See Optimizing Persistent Disk and Local SSD Performance in the Google Cloud documentation for information about the performance characteristics of different storage types.
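
If you provision local persistent volumes manually, the recommended mount options can be specified in the PersistentVolume definition. The following is a minimal sketch; the volume name, capacity, storage class, disk path, and node name are illustrative assumptions:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: gpdb-local-pv-0            # illustrative name
spec:
  capacity:
    storage: 50G                   # illustrative size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gpdb-storage   # illustrative Storage Class
  mountOptions:                    # recommended options from above
    - rw
    - nodev
    - noatime
    - nobarrier
    - inode64
  local:
    path: /mnt/disks/ssd0          # illustrative local SSD mount point
  nodeAffinity:                    # local volumes must be pinned to a node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1    # illustrative node name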

You cannot change this value for an existing cluster unless you first delete both the deployed cluster and the PVCs that were created for that cluster. See Deleting Greenplum Persistent Volume Claims.

storageSize: <size>
(Required) The storage size of the Persistent Volume Claim (PVC) for a Greenplum pod. Specify a suffix for the units (for example: 100G, 1T).

You cannot change this value for an existing cluster unless you first delete both the deployed cluster and the PVCs that were created for that cluster. See Deleting Greenplum Persistent Volume Claims.

workerSelector: <map of key-value pairs>
(Optional) One or more selector labels to use for choosing the nodes on which Greenplum pods are scheduled. Specify one or more label-value pairs to constrain Greenplum pods to nodes having the matching labels. Define the selector labels as you would for a pod's nodeSelector attribute. You can define the workerSelector attribute for the Greenplum master and standby pods and/or for segment pods. If a workerSelector is not desired, remove the workerSelector attribute from the manifest file.

For example, consider the case where you assign the label gpdb-worker=master to one or more nodes using the command:

$ kubectl label node <node_name> gpdb-worker=master

Similarly, nodes reserved for Greenplum segments might be assigned the gpdb-worker=segments label:

$ kubectl label node <node_name> gpdb-worker=segments

With the above labels present in your cluster, you would edit the Greenplum Operator manifest file to specify the same key-value pairs in the workerSelector attribute, as in this excerpt from the manifest file:

  masterAndStandby:
    storageClassName: gpdb-storage
    storageSize: 5G
    workerSelector: {
      gpdb-worker: "master"
    }
  segments:
    primarySegmentCount: 6
    storageClassName: gpdb-storage
    storageSize: 50G
    workerSelector: {
      gpdb-worker: "segments"
    }


This value cannot be dynamically changed for an existing cluster. If you want to update this value, you must first delete the existing cluster and then recreate the cluster for the new value to take effect. See Upgrade a Greenplum Cluster.

antiAffinity: <yes or no>
(Optional) Enables or disables anti-affinity when deploying a Greenplum cluster. Specifying “yes” means that the Operator guarantees that each mirror segment is scheduled to a different worker node than its corresponding primary segment (and, similarly, that the standby is scheduled to a different worker node than the master). If anti-affinity cannot be achieved, the Operator aborts the Greenplum cluster deployment. Specifying “no” means that the Operator uses the native Kubernetes scheduler, which places primary and mirror segments (and the master and standby) on available worker nodes without regard to one another's placement. Defaults to “yes” if omitted or left empty.

Notes:
  • If you are using virtual machines, always ensure that there is only one Kubernetes worker per physical server, to preserve high availability.
  • Set antiAffinity to “yes” unless the deployment is to a single Kubernetes worker node.
  • The antiAffinity values for masterAndStandby and segments must be the same. Otherwise, the cluster will fail to deploy.

This value cannot be dynamically changed for an existing cluster. If you wish to update this value, you must delete the existing cluster and recreate the cluster for the new value to take effect.
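
For example, the following excerpt enables anti-affinity; both sections must specify the same value:

masterAndStandby:
  antiAffinity: "yes"
segments:
  antiAffinity: "yes"   # must match the masterAndStandby setting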

mirrors: <yes or no>
(Optional) Enables or disables the use of segment mirroring when deploying a Greenplum cluster. Defaults to “yes” if omitted or left empty. Keep in mind that segment mirroring is required for all production clusters. This value cannot be dynamically changed for an existing cluster. If you wish to update this value, you must delete the existing cluster and recreate the cluster for the new value to take effect.

Note: If mirrors is set to “no”, antiAffinity must also be set to “no”.
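
For example, the following excerpt deploys a cluster without segment mirroring (for non-production use, since mirroring is required for all production clusters); because mirrors is “no”, antiAffinity must also be “no” in both sections:

masterAndStandby:
  antiAffinity: "no"
segments:
  mirrors: "no"
  antiAffinity: "no"    # required when mirrors is "no"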

Examples

See the workspace/my-greenplum-cluster.yaml file for an example manifest.

See Also

Deploying a New Greenplum Cluster, Deleting a Greenplum Cluster, Installing Greenplum for Kubernetes.