Skip to main content

google_dataproc_cluster resource

Syntax

A google_dataproc_cluster is used to test a Google Cluster resource

Beta Resource

This resource has beta fields available. To retrieve these fields, include beta: true in the constructor for the resource

Examples

describe google_dataproc_cluster(project: 'chef-gcp-inspec', region: 'europe-west2', cluster_name: 'inspec-dataproc-cluster') do
  it { should exist }
  its('labels') { should include('label' => 'value') }
  its('config.master_config.num_instances') { should cmp '1' }
  its('config.worker_config.num_instances') { should cmp '2' }
  its('config.master_config.machine_type_uri') { should match 'n1-standard-1' }
  its('config.worker_config.machine_type_uri') { should match 'n1-standard-1' }
  its('config.software_config.properties') { should include('dataproc:dataproc.allow.zero.workers' => 'true') }
end

describe google_dataproc_cluster(project: 'chef-gcp-inspec', region: 'europe-west2', cluster_name: 'nonexistent') do
  it { should_not exist }
end

Properties

Properties that can be accessed from the google_dataproc_cluster resource:

cluster_name
The name of the cluster, unique within the project and region.
labels
Labels to apply to this cluster. A list of key->value pairs.
config
Configuration for the cluster
config_bucket
The Cloud Storage staging bucket used to stage files, such as Hadoop jars, between client machines and the cluster.
gce_cluster_config
Common config settings for resources of Google Compute Engine cluster instances, applicable to all instances in the cluster.
zone_uri
The zone where the Compute Engine cluster will be located
network_uri
The Compute Engine network to be used for machine communications
subnetwork_uri
The Compute Engine subnetwork to be used for machine communications
internal_ip_only
If true, all instances int he cluster will only have internal IP addresses
service_account_scopes
The URIs of service account scopes to be included in Compute Engine instances The following base set of scopes is always included: https://www.googleapis.com/auth/cloud.useraccounts.readonly https://www.googleapis.com/auth/devstorage.read_write https://www.googleapis.com/auth/logging.write
tags
The Compute Engine tags to add to all instances
metadata
The map of metadata entries to add to all instances
master_config
The config settings for Compute Engine resources in an instance group, such as a master or worker group.
num_instances
The number of VM instances in the instance group. For master instance groups, must be set to 1.
instance_names
The list of instance names.
image_uri
The Compute Engine image resource used for cluster instances.
machine_type_uri
The Compute Engine machine type used for cluster instances
disk_config
Disk option config settings
boot_disk_type
Type of the boot disk. Valid values are “pd-ssd” or “pd-standard”
boot_disk_size_gb
Size in GB of the boot disk.
num_local_ssds
Number of attached SSDs, from 0 to 4.
is_preemptible
Specifies if this instance group contains preemptible instances.
managed_group_config
The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.
instance_template_name
The name of the Instance Template used for the Managed Instance Group.
instance_group_manager_name
The name of the Instance Group Manager for this group
worker_config
The config settings for Compute Engine resources in an instance group, such as a master or worker group.
num_instances
The number of VM instances in the instance group. For master instance groups, must be set to 1.
instance_names
The list of instance names.
image_uri
The Compute Engine image resource used for cluster instances.
machine_type_uri
The Compute Engine machine type used for cluster instances
disk_config
Disk option config settings
boot_disk_type
Type of the boot disk. Valid values are “pd-ssd” or “pd-standard”
boot_disk_size_gb
Size in GB of the boot disk.
num_local_ssds
Number of attached SSDs, from 0 to 4.
is_preemptible
Specifies if this instance group contains preemptible instances.
managed_group_config
The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.
instance_template_name
The name of the Instance Template used for the Managed Instance Group.
instance_group_manager_name
The name of the Instance Group Manager for this group
secondary_worker_config
The config settings for Compute Engine resources in an instance group, such as a master or worker group.
num_instances
The number of VM instances in the instance group. For master instance groups, must be set to 1.
instance_names
The list of instance names.
image_uri
The Compute Engine image resource used for cluster instances.
machine_type_uri
The Compute Engine machine type used for cluster instances
disk_config
Disk option config settings
boot_disk_type
Type of the boot disk. Valid values are “pd-ssd” or “pd-standard”
boot_disk_size_gb
Size in GB of the boot disk.
num_local_ssds
Number of attached SSDs, from 0 to 4.
is_preemptible
Specifies if this instance group contains preemptible instances.
managed_group_config
The config for Compute Engine Instance Group Manager that manages this group. This is only used for preemptible instance groups.
instance_template_name
The name of the Instance Template used for the Managed Instance Group.
instance_group_manager_name
The name of the Instance Group Manager for this group
software_config
Specifies the selection and config of software inside the cluster
image_version
The version of software inside the cluster. It must be one of the supported Cloud Dataproc Versions, such as “1.2” (including a subminor version, such as “1.2.29”), or the “preview” version.
properties
The properties to set on daemon config files. Property keys are specified in the prefix:property format, for example core:hadoop.tmp.dir
optional_components
The set of optional components to activate on the cluster.

Possible values:

  • COMPONENT_UNSPECIFIED
  • ANACONDA
  • HBASE
  • RANGER
  • SOLR
  • HIVE_WEBHCAT
  • JUPYTER
  • ZEPPELIN
initialization_actions
Specifies an executable to run on a fully configured node and a timeout period for executable completion.
executable_file
Cloud Storage URI of the executable file
execution_timeout
Amount of time executable has to complete
encryption_config
Encryption settings for the cluster.
gce_pd_kms_key_name
The Cloud KMS key name to use for PD disk encyption for all instances in the cluster.
security_config
Kerberos config holder.
kerberos_config
Kerberos related configuration.
enable_kerberos
Flag to indicate whether to Kerberize the cluster.
rootprincipal_password_uri
The cloud Storage URI of a KMS encrypted file containing the root principal password.
kms_key_uri
The uri of the KMS key used to encrypt various sensitive files.
keystore_uri
The Cloud Storage URI of the keystore file used for SSL encryption.
truststore_uri
The Cloud Storage URI of a KMS encrypted file containing the password to the user provided keystore.
key_password_uri
The Cloud Storage URI of a KMS encrypted file containing the password to the user provided key.
truststore_password_uri
The Cloud Storage URI of a KMS encrypted file containing the password to the user provided truststore.
cross_realm_trust_realm
The remote realm the Dataproc on-cluster KDC will trust, should the user enable cross realm trust.
cross_realm_trust_admin_server
The admin server (IP or hostname) for the remote trusted realm in a cross realm trust relationship.
cross_realm_trust_shared_password_uri
The Cloud Storage URI of a KMS encrypted file containing the shared password between the on-cluster Kerberos realm and the remote trusted realm, in a cross realm trust relationship.
kdc_db_key_uri
The Cloud Storage URI of a KMS encrypted file containing the master key of the KDC database.
tgt_lifetime_hours
The lifetime of the ticket granting ticket, in hours.
realm
The name of the on-cluster Kerberos realm.
region
The region in which the cluster and associated nodes will be created in.

GCP Permissions

Ensure the Cloud Dataproc API is enabled for the current project.

Edit this page on GitHub

Thank you for your feedback!

×