How to Create a GCP Cloud Storage Bucket and Service Account
Last updated
Approximate time to complete: 5-7 minutes, excluding prerequisites
This how-to article gives you step-by-step instructions for creating a GCS bucket and a service account with credentials in Google Cloud Platform. It also shows you how to configure a source connector from Vectorize to your GCS bucket so you can use the bucket in a RAG pipeline.
Before starting, ensure you have access to the credentials, connection parameters, and API keys as appropriate for the following:
A Vectorize account (Create one free here ↗)
A Google Cloud account (Create one free here ↗)
Open Google Cloud Console
Navigate to the Google Cloud Console.
Make sure you are working in the correct project.
Go to APIs & Services
From the dashboard, locate the "APIs & Services" option in the quick access section.
Click on APIs & Services.
Enable Cloud Storage API
Click on ENABLE APIS AND SERVICES.
In the API library, use the search box and type Cloud Storage to locate the API.
From the search results, click on the Cloud Storage option from Google Enterprise API.
On the Cloud Storage API details page, click ENABLE.
If the API is already enabled, you can go to the next step.
Open GCP Menu
Open the Google Cloud Platform (GCP) menu and navigate to Cloud Storage.
Under Cloud Storage, select Buckets.
Create a New Bucket
Click on the CREATE button at the top to start creating a new bucket.
Configure Your Bucket
Bucket Name: Enter a globally unique name for your bucket (e.g., vectorize-gcs-aef220-dee21).
Storage Location: Choose a region for your data (you can accept the default or select a specific region).
Storage Class: Choose the default storage class for your data (e.g., Standard).
Access Control: Set Public Access Prevention to on and use Uniform access control.
Data Protection: Configure options such as soft delete, versioning, retention policies, etc.
Once done, click CREATE to proceed.
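Because bucket names must be globally unique, it can help to check a candidate name locally before typing it into the console. The following is a minimal Python sketch, not an official validator: it applies a simplified subset of Google's published naming rules (lowercase letters, digits, dashes and underscores, 3 to 63 characters, starting and ending with a letter or digit, no "goog" prefix, no "google" substring) and assumes names without dots. The function names and the default prefix are illustrative only.

```python
import re
import secrets

# Simplified subset of the bucket naming rules (assumption: no dots in the
# name, which keeps the overall length limit at 3-63 characters).
_BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]$")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if `name` passes the simplified local checks."""
    if not _BUCKET_NAME_RE.fullmatch(name):
        return False
    # Names may not begin with "goog" or contain "google".
    if name.startswith("goog") or "google" in name:
        return False
    return True


def suggest_bucket_name(prefix: str = "vectorize-gcs") -> str:
    """Append two random hex groups so the name is unlikely to be taken."""
    return f"{prefix}-{secrets.token_hex(3)}-{secrets.token_hex(3)}"


if __name__ == "__main__":
    candidate = suggest_bucket_name()
    print(candidate, is_plausible_bucket_name(candidate))
```

A name that passes these checks can still be rejected by the console if another project already uses it, so treat the result as a pre-check only.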
Confirm Public Access Settings
A prompt will appear notifying you that public access will be prevented.
Click CONFIRM to proceed unless you have a specific use case that requires public access.
Save the Bucket Name
After the bucket is created, you'll be taken to the bucket details page.
Copy and save your bucket name, as you'll need it later (e.g., vectorize-gcs-aef220-dee21).
Navigate to Service Accounts in GCP IAM
Click the button labeled CREATE SERVICE ACCOUNT
Fill in your service details
Provide a Service account name. This automatically populates the Service account ID field as well.
Click on CREATE AND CONTINUE, making sure you're clicking the button shown and NOT the more prominent DONE button.
Assign roles to your service account
In the role selection section, grant the Project > Viewer role for the service account.
Click on the ADD ANOTHER ROLE button
Add Roles and IAM Conditions
After adding the Viewer role, select Storage Admin under the Cloud Storage section.
Now let's limit the service account to just this one bucket. To do so, add an IAM condition by clicking on ADD IAM CONDITION.
Create IAM Condition
Name your IAM condition and paste the bucket name into the value field. Set the condition as shown:
Condition Type: Name
Operator: is
Click SAVE to apply the condition.
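For readers who want to see what such a condition looks like as text, here is a hedged Python sketch of the role binding it corresponds to. It assumes that a Name / is condition is written as a `resource.name == "..."` expression and that bucket resource names follow the `projects/_/buckets/BUCKET_NAME` pattern used in Google's IAM documentation. Verify both against what the Condition Editor shows for your own condition. The condition title and the function names are hypothetical.

```python
import json


def bucket_condition_expression(bucket_name: str) -> str:
    """Build the condition text for 'Name is <bucket resource name>'."""
    # Assumption: bucket resource names use the projects/_/buckets/NAME form.
    resource_name = f"projects/_/buckets/{bucket_name}"
    # json.dumps adds the surrounding double quotes and escapes if needed.
    return f"resource.name == {json.dumps(resource_name)}"


def conditional_binding(service_account_email: str, bucket_name: str) -> dict:
    """Return an IAM policy binding that grants Storage Admin under the condition."""
    return {
        "role": "roles/storage.admin",
        "members": [f"serviceAccount:{service_account_email}"],
        "condition": {
            "title": "limit-to-one-bucket",  # hypothetical title
            "expression": bucket_condition_expression(bucket_name),
        },
    }


if __name__ == "__main__":
    binding = conditional_binding(
        "example-sa@example-project.iam.gserviceaccount.com",  # placeholder
        "vectorize-gcs-aef220-dee21",
    )
    print(json.dumps(binding, indent=2))
```

The sketch only builds the data structure for inspection. It does not call any Google Cloud API or change your project's policy.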
Fixing Errors and Editing IAM Conditions (Skip if you don't get this error)
If you receive an error such as "Failed to add project roles", you can edit the IAM condition by clicking the pencil icon next to the condition name.
In the Edit Condition screen, you can modify the condition in the Condition Editor if needed.
Once you're done, run the linter by clicking RUN LINTER and then click SAVE.
After saving, click CONTINUE again to proceed.
Finalizing the Service Account
To finalize the setup, click DONE to complete the creation of the service account.
Select the Service Account
After creating the service account, you should see the service accounts list with the new service account included.
If you left that page, you can always return to it by going to IAM & Admin > Service Accounts using the upper-left "hamburger" menu.
Click on the service account email to manage it.
Go to the Keys Tab
Once in the service account details, navigate to the KEYS tab.
Add a New Key
Open the ADD KEY dropdown
Select Create new key to generate a new key for this service account.
Create a JSON Key
When prompted, select the JSON key type and click CREATE.
Save the Key
A JSON file will be downloaded. Save this file securely as it contains the private key for your service account.
Choose a location on your hard drive to store the key file.
Keep this key file safe. The private key cannot be downloaded again if the file is lost.
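Before moving on, you may want to confirm that the downloaded file really is a service account key and tighten its file permissions. The sketch below uses only the Python standard library. The field names it checks (`type`, `project_id`, `private_key_id`, `private_key`, `client_email`) are ones that appear in service account key files, while the function names and the choice of which fields to require are assumptions made for illustration.

```python
import json
import os
import stat
from pathlib import Path

# Fields this sketch expects in a service account key file.
REQUIRED_KEY_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
)


def check_service_account_key(path: str) -> dict:
    """Load a key file and raise ValueError if it does not look like a key."""
    key = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(key, dict):
        raise ValueError("key file does not contain a JSON object")
    missing = [field for field in REQUIRED_KEY_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    return key


def restrict_to_owner(path: str) -> None:
    """Make the file readable and writable by its owner only (POSIX systems)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Calling `check_service_account_key` on the downloaded file and printing the returned `client_email` is a quick way to confirm the key belongs to the service account you just created.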
Open the Vectorize Dashboard
Navigate to the Vectorize dashboard after logging in.
Select "Source Connectors"
From the left-hand menu, find the Source Connectors option under Integrations.
Click "New Source Connector"
On the Source Connectors page, click the New Source Connector button in the top-right corner.
Select "Google Cloud Storage"
In the list of connector options, find and select Google Cloud Storage.
Configure the Google Cloud Storage Source
Name: Enter a name for the GCS integration (this can be your bucket name, but it's not mandatory).
Bucket: Enter the exact name of the GCS bucket you created in Google Cloud.
Service Account JSON: Copy and paste the service account JSON key you downloaded during the GCP setup process.
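The Bucket field expects the bare bucket name, and the Service Account JSON field expects the full contents of the key file. The Python sketch below prepares the three values before you paste them into the form. It is a local helper only and does not call Vectorize: the `GcsSourceForm` class and both function names are hypothetical, and the form is still filled in by hand.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GcsSourceForm:
    """Hypothetical container for the three form fields (not a Vectorize API)."""

    name: str
    bucket: str
    service_account_json: str


def normalize_bucket_field(value: str) -> str:
    """Reduce input such as 'gs://my-bucket/' to the bare bucket name."""
    value = value.strip()
    if value.startswith("gs://"):
        value = value[len("gs://"):]
    value = value.rstrip("/")
    if not value or "/" in value:
        raise ValueError("expected a bucket name, not an object path")
    return value


def build_form_values(name: str, bucket: str, key_file_text: str) -> GcsSourceForm:
    """Validate the inputs and return the values to paste into the form."""
    key = json.loads(key_file_text)  # raises ValueError on malformed JSON
    if not isinstance(key, dict) or key.get("type") != "service_account":
        raise ValueError("text does not look like a service account key")
    return GcsSourceForm(
        name=name.strip(),
        bucket=normalize_bucket_field(bucket),
        service_account_json=key_file_text.strip(),
    )
```

Running the inputs through a check like this catches common copy-and-paste slips, such as a `gs://` prefix on the bucket name or a truncated JSON key, before the connector is created.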
Create the GCS Integration
Once the details are filled in, click Create Google Cloud Storage Integration to finalize the connector setup.
Once the integration is created, your GCS connector can be used as part of a RAG pipeline.