Greenplum GPText Service Properties (Beta)

This section describes each of the properties that you can define for a GreenplumTextService configuration in the Pivotal Greenplum manifest file.

Synopsis

apiVersion: "greenplum.pivotal.io/v1beta1"
kind: "GreenplumTextService"
metadata:
  name: <string>
spec:
  solr:
    replicas: <integer>
    cpu: <cpu-limit>
    memory: <memory-limit>
    workerSelector: {
        <label>: "<value>"
        [ ... ]
    }
    storageClassName: <storage-class>
    storageSize: <size>
  zookeeper:
    replicas: <integer>
    cpu: <cpu-limit>
    memory: <memory-limit>
    workerSelector: {
        <label>: "<value>"
        [ ... ]
    }
    storageClassName: <storage-class>
    storageSize: <size>

Description

You specify Greenplum Text configuration properties to the Greenplum Operator via the YAML-formatted Greenplum manifest file. A sample manifest file is provided in workspace/samples/my-gp-with-gptext-instance.yaml. The current version of the manifest supports configuring memory and CPU limits for the ZooKeeper and Apache Solr Cloud pods. See also Deploying GPText with Greenplum for information about deploying a new Greenplum cluster with GPText using a manifest file.

Note: As a best practice, keep the Greenplum Text configuration properties in the same manifest file as Greenplum Database, to simplify upgrades or changes to the related service objects.

Keywords and Values

Cluster Metadata

name: <string>
(Required) Sets the name of the GPText instance resources. You can filter the output of kubectl commands using this name.

This value cannot be dynamically changed for an existing cluster. If you attempt to change this value and re-apply it to an existing cluster, the Operator will create a new deployment.
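
For example, a manifest excerpt that names the instance my-gptext (the name shown is illustrative):

```yaml
apiVersion: "greenplum.pivotal.io/v1beta1"
kind: "GreenplumTextService"
metadata:
  name: my-gptext
```

With that name set, you can address the instance directly in kubectl commands, such as kubectl describe greenplumtextservice my-gptext.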

Greenplum Text Configuration

replicas: <integer>
(Optional) The number of Apache Solr Cloud or ZooKeeper replica pods to create in the Greenplum cluster. For Solr, the default is 2. For ZooKeeper, replicas is optional, but must be 3 if specified.

Note: If the replicas property is set to 1 for Apache Solr Cloud, Greenplum Text displays the following warning during certain operations: WARNING: There are not enough Solr nodes to match the replica HA policy, you need to either expand your cluster or change your policy settings. (SearchingService.cpp:640). This warning is expected, and you can eliminate it by specifying more than one Solr replica.
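
For example, a spec excerpt (values shown are illustrative) that avoids the single-replica warning and pins ZooKeeper to its required size:

```yaml
spec:
  solr:
    replicas: 2      # default; use more than 1 to satisfy the replica HA policy
  zookeeper:
    replicas: 3      # optional, but must be 3 if specified
```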

memory: <memory-limit>
(Optional) The amount of memory allocated to each Solr or ZooKeeper pod. You can specify a suffix to define the memory units (for example, 4.5Gi). This value defines a memory limit; if a pod attempts to exceed the limit, it is removed and replaced by a new pod. For information about computing the memory required for Solr nodes, see Calculating the Memory Configuration for Solr Nodes in the Pivotal Greenplum Text documentation.

If you do not want to specify a memory limit, comment out or remove the memory: keyword from the YAML file. If this value is omitted, the pod has no upper bound on the memory resource it can use, or it inherits the default limit if one is specified in its deployed namespace. See Assign Memory Resources to Containers and Pods in the Kubernetes documentation.

This value cannot be dynamically changed for an existing cluster. If you change this value and re-apply the manifest to an existing cluster, the Operator re-creates the existing pods using a rolling update strategy.

Note: Pivotal Greenplum for Kubernetes automatically configures each Solr and ZooKeeper JVM to use 75% of the memory available to its pod. This generally requires no further configuration or tuning. For Solr nodes, the percentage is configured automatically with SOLR_JAVA_MEM="-XX:MaxRAMPercentage=75.0" in /opt/solr/bin/solr.in.sh. For ZooKeeper pods, it is configured automatically with export JVMFLAGS="-XX:MaxRAMPercentage=75.0" in /opt/zookeeper/conf/java.env.
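
As a rough sketch of what the 75% setting means in practice, the following illustrative Python helper (not part of the product) converts a Kubernetes memory limit into the resulting JVM heap ceiling:

```python
def max_heap_bytes(pod_memory_limit: str, ram_percentage: float = 75.0) -> int:
    """Return the JVM max heap in bytes for a Kubernetes memory limit string."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}
    for suffix, factor in units.items():
        if pod_memory_limit.endswith(suffix):
            amount = float(pod_memory_limit[: -len(suffix)])
            return int(amount * factor * ram_percentage / 100)
    # No recognized suffix: treat the value as plain bytes
    return int(float(pod_memory_limit) * ram_percentage / 100)

# A Solr pod with a 4Gi memory limit leaves the JVM roughly 3Gi of heap:
print(max_heap_bytes("4Gi"))  # 3221225472
```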

cpu: <cpu-limit>
(Optional) The amount of CPU resources allocated to each Solr or ZooKeeper pod, specified as a Kubernetes CPU unit (for example, cpu: "1.2"). If omitted, the pod has no upper bound on the CPU resource it can use, or it inherits the default limit if one is specified in its deployed namespace. See Assign CPU Resources to Containers and Pods in the Kubernetes documentation for more information.

This value cannot be dynamically changed for an existing cluster. If you change this value and re-apply the manifest to an existing cluster, the Operator re-creates the existing pods using a rolling update strategy.

Note: If you do not want to specify a cpu limit, comment out or remove the cpu: keyword from the YAML file.
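
Putting the two limits together, a spec excerpt might look like the following (values shown are illustrative):

```yaml
spec:
  solr:
    cpu: "1.2"       # Kubernetes CPU units
    memory: "4.5Gi"  # the pod is replaced if it exceeds this limit
```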

workerSelector: <map of key-value pairs>
(Optional) Specify one or more label-value pairs to constrain GPText pods to nodes having the matching labels. Define the selector label-value pairs as you would for a pod’s nodeSelector attribute.

For example, consider the case where you assign the label worker=gpdb-gptext to one or more nodes using the command:

$ kubectl label node <node_name> worker=gpdb-gptext

With the above label present in your cluster, you would edit the Greenplum Operator manifest file to specify the same key-value pair in the workerSelector attribute. This shows the relevant excerpt from the manifest file:

    ...
    workerSelector: {
      worker: "gpdb-gptext"
    }
    ...


This value cannot be dynamically changed for an existing cluster. Applying an update to this field will recreate the Solr and ZooKeeper pods with a rolling-update strategy for the new value to take effect.

storageClassName: <storage-class>
(Required) The Storage Class to use for dynamically provisioning Persistent Volumes (PVs) for the Greenplum Text Solr and ZooKeeper pods. If the PVs already exist, either from a previous deployment of the Greenplum instance or because you manually provisioned the PVs, then the Greenplum Operator uses the existing PVs. You can configure the Storage Class according to your performance needs. See Storage Classes in the Kubernetes documentation to understand the different configuration options.

You cannot change this value for an existing cluster unless you first delete both the deployed cluster and the PVCs that were created for that cluster. See Deleting Greenplum Persistent Volume Claims.

storageSize: <size>
(Required) The storage size of the Persistent Volume Claim (PVC) for Greenplum Text Solr and ZooKeeper pods. Specify a suffix for the units (for example: 100G, 1T).

You cannot change this value for an existing GPText instance unless you first delete both the deployed GPText instance and the PVCs that were created for that cluster. See Deleting Greenplum Persistent Volume Claims.
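
For example, a spec excerpt configuring storage for both pod types (the storage class name and sizes shown are illustrative; use a StorageClass defined in your cluster):

```yaml
spec:
  solr:
    storageClassName: standard
    storageSize: 100G
  zookeeper:
    storageClassName: standard
    storageSize: 10G
```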

Examples

See the workspace/samples/my-gp-with-gptext-instance.yaml file for an example manifest.

See Also

Deploying GPText with Greenplum (Beta)