Storage

Cloud Storage

Cloud Storage provides globally unified, scalable, and highly durable object storage. You can store files in Cloud Storage without having to worry about running out of space, or managing your own filesystems.

Cloud Storage buckets can be configured to store files in a single region, dual-region, or multi-region.

Location

Use Case

Region

Use a region to help optimize latency and network bandwidth for data consumers, such as analytics pipelines, that are grouped in the same region.

Dual-Region

Use a dual-region when you want similar performance advantages as regions, but also want the higher availability that comes with being geo-redundant.

Multi-Region

Use a multi-region when you want to serve content to data consumers that are outside of the Google network and distributed across large geographic areas, or when you want the higher availability that comes with being geo-redundant.

See Cloud Storage Bucket locations for more information.

Cloud Storage buckets can also be configured with a different Storage Class, for different access patterns to optimize cost.

Storage Class

Use Case

Standard

Standard Storage is best for data that is frequently accessed ("hot" data) and/or stored for only brief periods of time.

Nearline

Nearline Storage is a low-cost, highly durable storage service for storing infrequently accessed data. Nearline Storage is a better choice than Standard Storage in scenarios where slightly lower availability, a 30-day minimum storage duration.

Coldline

Coldline Storage is a very-low-cost, highly durable storage service for storing infrequently accessed data. Coldline Storage is a better choice than Standard Storage or Nearline Storage in scenarios where slightly lower availability, a 90-day minimum storage duration.

Archive

Archive Storage is the lowest-cost, highly durable storage service for data archiving, online backup, and disaster recovery. Unlike the "coldest" storage services offered by other Cloud providers, your data is available within milliseconds, not hours or days.

See Cloud Storage Classes for more information.

Enable API

gcloud services enable storage-component.googleapis.com

Create a Bucket

A bucket is the top-level directory that you can add additional files and sub-directories into. A bucket name must be globally unique. For example, create a bucket with the same name as your Google Cloud project.

PROJECT_ID=$(gcloud config get-value project)
gsutil mb gs://$PROJECT_ID

Copy a File

echo "Hello World" > hello.txt
gsutil cp hello.txt gs://$PROJECT_ID

Delete a File

gsutil rm gs://$PROJECT_ID/hello.txt

Spring Resource

With Spring Cloud GCP, you can use Spring Resource to perform store and retrieve file from Cloud Storage.

Dependency

Add the Spring Cloud GCP Storage starter:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-gcp-starter-storage</artifactId>
</dependency>

Configuration

There is no explicit configuration required if you use the automatic authentication and project ID detection. I.e., if you already logged in locally with gcloud command line, then it'll automatically access buckets that you have access to.

Notice that there is no explicit configuration for username/password. Cloud Storage authentication uses the GCP credential (either your user credential, or Service Account credential), and authorization is configured via Identity Access Management (IAM).

Storage Client

The starter automatically creates a pre-configured Storage bean that provides raw-access to Google Cloud Storage.

@Bean
ApplicationRunner storageRunner(Storage storage, GcpProjectIdProvider projectIdProvider) {
  return (args) -> {
    Page<Blob> list = storage.list(projectIdProvider.getProjectId());
    list.iterateAll().forEach(blob -> System.out.println(blob.getName()));
  };
}

Resource URI

You can address a Cloud Storage file by using the resource URI prefixed with gs://. The fully qualified URI is of the form: gs://project-id/path/to/file.

Read a file

You can open the Resource using ApplicationContext. Then read the content from InputStream. You must close the stream when you are done (or wrap with try-with-resource since it's auto-closeable).

@Bean
ApplicationRunner runner(ApplicationContext ctx, GcpProjectIdProvider projectIdProvider) {
  return (args) -> {
    WritableResource resource = (WritableResource) ctx
        .getResource(String.format("gs://%s/hello.txt", projectIdProvider.getProjectId()));
    try (PrintWriter writer = new PrintWriter(resource.getOutputStream())) {
      writer.println("Hello World!");
    }
  };
}

Write a file

You can open the Resource using ApplicationContext. Then cast the Resource to a WritableResource, and then use the OutputStream to write the content. Lastly, you must close the stream in order for the file to write (or wrap with try-with-resource since it's auto-closeable).

@Bean
ApplicationRunner readRunner(ApplicationContext ctx, GcpProjectIdProvider projectIdProvider) {
  return (args) -> {
    Resource resource = ctx
        .getResource(String.format("gs://%s/hello.txt", projectIdProvider.getProjectId()));
    try (InputStreamReader reader = new InputStreamReader(resource.getInputStream())) {
      BufferedReader bufferedReader = new BufferedReader(reader);
      bufferedReader.lines().forEach(System.out::println);
    }
  };
}

Samples

Spring Integration

You can channel adapters for Google Cloud Storage to read and write files to Google Cloud Storage through MessageChannels.

Dependency

Add both the Spring Cloud GCP Storage starter, and Spring Integration File component.

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-gcp-starter-storage</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.integration</groupId>
    <artifactId>spring-integration-file</artifactId>
</dependency>

Inbound Channel Adapter

Inbound File Synchronizer

File-based integration with files typically requires polling a directory that contains the new files. In Spring Integration, this is configured through an InboundFileSynchronizer. Use GcsInboundFileSyncronizer to create a MessageSource and adapt it to a MessageChannel.

The files are temporarily stored in a directory in the local file system.

@Bean
public MessageChannel gcsInputChannel() {
  return MessageChannels.direct().get();
}

@Bean
@InboundChannelAdapter(channel = "gcsInputChannel", poller = @Poller(fixedDelay = "5000"))
public MessageSource<File> (Storage gcs, GcpProjectIdProvider projectIdProvider)
    throws IOException {
  GcsInboundFileSynchronizer synchronizer = new GcsInboundFileSynchronizer(gcs);
  synchronizer.setRemoteDirectory(projectIdProvider.getProjectId());

  GcsInboundFileSynchronizingMessageSource messageSource =
          new GcsInboundFileSynchronizingMessageSource(synchronizer);
  File localDirectory = Files.createTempDirectory("gcs");
  messageSource.setLocalDirectory(localDirectory);

  return messageSource;
}

Streaming Message Source

For most use cases, you should use the streaming message source, which does not require files to be stored in the file system.

@Bean
public MessageChannel gcsInputChannel() {
  return MessageChannels.direct().get();
}

@Bean
@InboundChannelAdapter(channel = "gcsInputChannel", poller = @Poller(fixedDelay = "5000"))
public MessageSource<InputStream> streamingAdapter(Storage gcs, GcpProjectIdProvider projectIdProvider) {
  GcsStreamingMessageSource adapter =
          new GcsStreamingMessageSource(new GcsRemoteFileTemplate(new GcsSessionFactory(gcs)));
  adapter.setRemoteDirectory(projectIdProvider.getProjectId());
  return adapter;
}

Outbound Channel Adapter

The outbound channel adapter allows files to be written to Google Cloud Storage. When it receives a Message containing a payload of type File, it writes that file to the Google Cloud Storage bucket specified in the adapter.

@Bean
@ServiceActivator(inputChannel = "gcsOutputChannel")
public MessageHandler outboundChannelAdapter(Storage gcs, GcpProjectIdProvider projectIdProvider) {
  GcsMessageHandler outboundChannelAdapter = new GcsMessageHandler(new GcsSessionFactory(gcs));
  outboundChannelAdapter.setRemoteDirectoryExpression(new ValueExpression<>(projectIdProvider.getProjectId()));

  return outboundChannelAdapter;
}

Samples

Last updated