How to Create a GCP Cloud Storage Bucket and Service Account

Approximate time to complete: 5-7 minutes, excluding prerequisites

This how-to article gives step-by-step instructions for creating a GCS bucket and a service account with credentials in Google Cloud Platform. It also shows how to configure a source connector from Vectorize to your GCS bucket so you can use the bucket in a RAG pipeline.

Before you begin

Before starting, ensure you have access to the credentials, connection parameters, and API keys as appropriate for the following:

Step 1: Make sure GCS is enabled

  1. Open Google Cloud Console

    • Navigate to the Google Cloud Console.

    • Make sure you are working in the correct project.

  2. Go to APIs & Services

    • From the dashboard, locate the "APIs & Services" option in the quick access section.

    • Click on APIs & Services.

  3. Enable Cloud Storage API

    • Click on ENABLE APIS AND SERVICES.

    • In the API library, use the search box and type Cloud Storage to locate the API.

    • From the search results, click on the Cloud Storage option from Google Enterprise API.

    • On the Cloud Storage API details page, click ENABLE.

    • If the API is already enabled, you can go to the next step.
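For readers who prefer to script this step, the console clicks above can be approximated from Python by shelling out to the gcloud CLI. This is a minimal sketch, not part of the original walkthrough: it assumes gcloud is installed and authenticated, that `storage.googleapis.com` is the service name of the Cloud Storage API, and the helper names are hypothetical.

```python
import subprocess

# Assumed service name of the Cloud Storage API as used by `gcloud services`.
STORAGE_API = "storage.googleapis.com"


def build_enable_command(project_id: str) -> list[str]:
    """Return the gcloud command that enables the Cloud Storage API."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return ["gcloud", "services", "enable", STORAGE_API, "--project", project_id]


def enable_storage_api(project_id: str) -> None:
    """Run the command; raises CalledProcessError if gcloud reports a failure."""
    subprocess.run(build_enable_command(project_id), check=True)
```

Calling `enable_storage_api("my-project-id")` with your own project ID should be safe to repeat if the API is already enabled.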

Step 2: Create the GCS Bucket in the GCP Cloud Console

  1. Open GCP Menu

    • Open the Google Cloud Platform (GCP) menu and navigate to Cloud Storage.

    • Under Cloud Storage, select Buckets.

  2. Create a New Bucket

    • Click on the CREATE button at the top to start creating a new bucket.

  3. Configure Your Bucket

    • Bucket Name: Enter a globally unique name for your bucket (e.g., vectorize-gcs-aef220-dee21).

    • Storage Location: Choose a region for your data (you can accept the default or select a specific region).

    • Storage Class: Choose the default storage class for your data (e.g., Standard).

    • Access Control: Set Public Access Prevention to on and use Uniform access control.

    • Data Protection: Configure options such as soft delete, versioning, retention policies, etc.

    • Once done, click CREATE to proceed.

  4. Confirm Public Access Settings

    • A prompt will appear notifying you that public access will be prevented.

    • Click CONFIRM to proceed unless you have a specific use case that requires public access.

  5. Save the Bucket Name

    • After the bucket is created, you'll be taken to the bucket details page.

    • Copy and save your bucket name, as you'll need it later (e.g., vectorize-gcs-aef220-dee21).
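The bucket settings chosen above can also be expressed in code. The sketch below is an illustration under assumptions, not the article's method: it checks a simplified subset of the bucket naming rules (dot-free names only) and then creates a bucket with the third-party `google-cloud-storage` library, using Application Default Credentials. The function names and the default location `"US"` are hypothetical choices.

```python
import re

# Simplified rule for dot-free names: 3-63 characters, lowercase letters,
# digits, dashes, and underscores, starting and ending with a letter or digit.
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]$")


def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate name against the simplified naming rules above."""
    if not _NAME_RE.match(name):
        return False
    # Names may not start with "goog" or contain "google".
    if name.startswith("goog") or "google" in name:
        return False
    return True


def create_private_bucket(name: str, location: str = "US"):
    """Create a Standard-class bucket with uniform access and no public access."""
    # Imported lazily so the validator works without the library installed.
    from google.cloud import storage  # pip install google-cloud-storage

    if not is_valid_bucket_name(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    client = storage.Client()  # assumes Application Default Credentials
    bucket = client.bucket(name)
    bucket.storage_class = "STANDARD"
    bucket.iam_configuration.uniform_bucket_level_access_enabled = True
    bucket.iam_configuration.public_access_prevention = "enforced"
    return client.create_bucket(bucket, location=location)
```

Because bucket names are globally unique, creation can still fail for a name that passes the local check.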

Step 3: Set Up the Service Account and IAM Permissions

  1. Navigate to Service Accounts in GCP IAM

  2. Click the button labeled CREATE SERVICE ACCOUNT

  3. Fill in your service account details

    • Provide a Service account name. This automatically populates the Service account ID field as well.

    • Click CREATE AND CONTINUE, making sure you click this button and NOT the more prominent DONE button.

  4. Assign roles to your service account

    • In the role selection section, grant the Project > Viewer role to the service account.

    • Click the ADD ANOTHER ROLE button.

  5. Add Roles and IAM Conditions

    • After adding the Viewer role, select Storage Admin under the Cloud Storage section.

    • Now limit the service account to just this one bucket. To do so, add an IAM condition by clicking ADD IAM CONDITION.

  6. Create IAM Condition

    • Name your IAM condition and paste the bucket name into the value field. Set the condition as follows:

      • Condition Type: Name

      • Operator: is

    • Click SAVE to apply the condition.

  7. Fixing Errors and Editing IAM Conditions (skip if you don't get this error)

    • If you receive an error such as "Failed to add project roles", you can edit the IAM condition by clicking the pencil icon next to the condition name.

    • In the Edit Condition screen, you can modify the condition in the Condition Editor if needed.

    • Once you're done, run the linter by clicking RUN LINTER and then click SAVE.

    • After saving, click CONTINUE again to proceed.

  8. Finalizing the Service Account

    • To finalize the setup, click DONE to complete the creation of the service account.
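The Name condition from this step ends up as a CEL expression on `resource.name`, which is what the Condition Editor shows. The sketch below builds one such expression. It rests on an assumption not stated in this article: that Cloud Storage identifies a bucket by the full resource name `projects/_/buckets/BUCKET_NAME`, with objects nested under it. The helper names and the condition title are hypothetical, and the exact expression the condition builder generates may differ.

```python
def bucket_resource_name(bucket_name: str) -> str:
    """Full resource name of a bucket (assumed format, see lead-in)."""
    return f"projects/_/buckets/{bucket_name}"


def bucket_condition_expression(bucket_name: str) -> str:
    """CEL expression matching the bucket itself and anything nested under it."""
    bucket = bucket_resource_name(bucket_name)
    return f'resource.name == "{bucket}" || resource.name.startsWith("{bucket}/")'


def bucket_condition(bucket_name: str) -> dict:
    """Condition in the title/description/expression shape used by IAM bindings."""
    return {
        "title": f"only-{bucket_name}",  # hypothetical title
        "description": f"Limit access to bucket {bucket_name}",
        "expression": bucket_condition_expression(bucket_name),
    }
```

For example, `bucket_condition_expression("vectorize-gcs-aef220-dee21")` produces a string that could be pasted into the Condition Editor and checked with RUN LINTER.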

Step 4: Create Keys for the Service Account

  1. Select the Service Account

    • After creating the service account, you should see the service accounts list with the new service account in it.

    • If you left that page, you can always return to it by going to IAM & Admin > Service Accounts using the upper-left "hamburger" menu.

    • Click on the service account email to manage it.

  2. Go to the Keys Tab

    • Once in the service account details, navigate to the KEYS tab.

  3. Add a New Key

    • Open the ADD KEY dropdown.

    • Select Create new key to generate a new key for this service account.

  4. Create a JSON Key

    • When prompted, select the JSON key type and click CREATE.

  5. Save the Key

    • A JSON file will be downloaded. Save this file securely as it contains the private key for your service account.

    • Choose a location on your hard drive to store the key file.

    • Make sure to keep this key file secure, as it cannot be recovered if lost.
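Once the key file is saved, a quick local check can confirm that it is an intact service account key before it is used anywhere. The following is a minimal sketch using only the Python standard library. The list of required fields reflects the usual layout of a downloaded key file and is an assumption here, and the function names are hypothetical.

```python
import json
import os
from pathlib import Path

# Fields assumed to be present in a downloaded service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def load_service_account_key(path: str) -> dict:
    """Parse the key file and fail early if it does not look like a key."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("key file must contain a JSON object")
    missing = [field for field in REQUIRED_FIELDS if not data.get(field)]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if data["type"] != "service_account":
        raise ValueError(f"unexpected key type: {data['type']!r}")
    return data


def restrict_permissions(path: str) -> None:
    """Limit the file to owner read/write (effective on POSIX systems)."""
    os.chmod(path, 0o600)
```

The `client_email` value in the returned dictionary should match the service account email shown in the console.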

Step 5: Configure a Google Cloud Storage (GCS) Connector in Vectorize

  1. Open the Vectorize Dashboard

    • Navigate to the Vectorize dashboard after logging in.

  2. Select "Source Connectors"

    • From the left-hand menu, find the Source Connectors option under Integrations.

  3. Click "New Source Connector"

    • On the Source Connectors page, click the New Source Connector button in the top-right corner.

  4. Select "Google Cloud Storage"

    • In the list of connector options, find and select Google Cloud Storage.

  5. Configure the Google Cloud Storage Source

    • Name: Enter a name for the GCS integration (this can be your bucket name, but it's not mandatory).

    • Bucket: Enter the exact name of the GCS bucket you created in Google Cloud.

    • Service Account JSON: Copy and paste the service account JSON key you downloaded during the GCP setup process.

  6. Create the GCS Integration

    • Once the details are filled in, click Create Google Cloud Storage Integration to finalize the connector setup.

    • After the integration is successfully created, your GCS connector can be used as part of a RAG pipeline.
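If the connector cannot read from the bucket, one way to narrow down the cause is to test the same key and bucket name outside Vectorize. The sketch below assumes the third-party `google-cloud-storage` library is installed. The function names are hypothetical, and the check only confirms that objects can be listed, not that Vectorize itself is configured correctly.

```python
from typing import Any


def make_client(key_path: str) -> Any:
    """Build a storage client from the downloaded JSON key file."""
    # Imported lazily so can_list_bucket can be exercised with any client object.
    from google.cloud import storage  # pip install google-cloud-storage

    return storage.Client.from_service_account_json(key_path)


def can_list_bucket(client: Any, bucket_name: str, sample_size: int = 5) -> bool:
    """Return True if the client can list up to sample_size objects in the bucket."""
    try:
        for _ in client.list_blobs(bucket_name, max_results=sample_size):
            pass  # iterating forces the request to be sent
        return True
    except Exception as exc:  # broad on purpose: this is only a diagnostic
        print(f"Could not list {bucket_name}: {exc}")
        return False
```

A typical call would be `can_list_bucket(make_client("key.json"), "vectorize-gcs-aef220-dee21")`, where `key.json` stands in for the path to your own key file. A `False` result points at the key, the IAM roles, or the bucket name rather than at the connector.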
