About Google Cloud OCR

The Google Cloud Vision API provides two OCR modes: TEXT_DETECTION for short text in natural scenes (signs, labels, product packaging) and DOCUMENT_TEXT_DETECTION for dense printed or handwritten documents with paragraph-level structure and bounding boxes. Usage is billed per image, or per page for PDFs. Beyond OCR, the same API also covers label detection, face detection, logo detection, object localization, and web detection. New customers receive $300 in free credits.
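As a rough illustration of how the two modes differ at the request level, the sketch below builds a JSON body for the Vision REST `images:annotate` method. The helper name `build_ocr_request` and the sample bytes are hypothetical; under this sketch, only the feature type changes between the two modes.

```python
import base64
import json

# The two OCR feature types named in the text above.
OCR_MODES = {"TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION"}


def build_ocr_request(image_bytes: bytes, mode: str = "TEXT_DETECTION") -> dict:
    """Build a request body for images:annotate (hypothetical helper)."""
    if mode not in OCR_MODES:
        raise ValueError(f"unsupported OCR mode: {mode}")
    return {
        "requests": [
            {
                # Inline image data is sent as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": mode}],
            }
        ]
    }


# Placeholder bytes stand in for a real image file.
body = build_ocr_request(b"fake-image-bytes", "DOCUMENT_TEXT_DETECTION")
print(json.dumps(body, indent=2))
```

Sending the body still requires authentication and an HTTP client, which are left out here.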

AI Agent Use Cases

- TEXT_DETECTION mode for scene text (signs, menus, labels)

- DOCUMENT_TEXT_DETECTION for dense documents with paragraph and bounding box structure

- Handwriting recognition support

- PDF and TIFF multi-page document processing (each page billed as one image)

- Returns structured output including word positions and confidence scores

- 1,000 free units per month on most features
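To make the "word positions and confidence scores" item concrete, here is a minimal sketch that walks a `fullTextAnnotation` (pages, blocks, paragraphs, words, symbols) and collects each word with its bounding-box vertices and confidence. The `extract_words` helper is hypothetical, and the `sample` response is hand-written for illustration, not real API output.

```python
def extract_words(response: dict) -> list:
    """Flatten a fullTextAnnotation into a list of word records."""
    words = []
    annotation = response.get("fullTextAnnotation", {})
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    # A word's text is the concatenation of its symbols.
                    text = "".join(s.get("text", "") for s in word.get("symbols", []))
                    box = word.get("boundingBox", {}).get("vertices", [])
                    # Coordinates that are absent from a vertex are treated as 0.
                    vertices = [(v.get("x", 0), v.get("y", 0)) for v in box]
                    words.append(
                        {
                            "text": text,
                            "confidence": word.get("confidence"),
                            "vertices": vertices,
                        }
                    )
    return words


# Hand-written sample for illustration only.
sample = {
    "fullTextAnnotation": {
        "pages": [{"blocks": [{"paragraphs": [{"words": [
            {
                "symbols": [{"text": "H"}, {"text": "i"}],
                "confidence": 0.98,
                "boundingBox": {"vertices": [
                    {"x": 10, "y": 5}, {"x": 30, "y": 5},
                    {"x": 30, "y": 20}, {"x": 10, "y": 20},
                ]},
            }
        ]}]}]}]
    }
}

words = extract_words(sample)
print(words)
```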

Available Actions

These are the specific actions that AI agents can perform with this tool.

Async File Annotation

2 inputs

Submit asynchronous requests to analyze images within files like PDFs.

Inputs

requests

A list of individual async file annotation requests for this batch.

parent

Target project and location for the call. If not specified, a region will be chosen automatically. Example format: 'projects/project-A/locations/eu'.
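A minimal sketch of what the two inputs might look like for a PDF stored in Cloud Storage, assuming the JSON shape of the Vision REST `files:asyncBatchAnnotate` method. The helper name and the `gs://example-bucket/...` URIs are hypothetical; `parent` is omitted from the body when not given, matching the automatic region selection described above.

```python
import json


def build_async_file_request(gcs_input_uri, gcs_output_prefix,
                             parent=None, mime_type="application/pdf"):
    """Build an async file annotation body (hypothetical helper)."""
    request = {
        # Async file annotation reads its input from Cloud Storage.
        "inputConfig": {"gcsSource": {"uri": gcs_input_uri}, "mimeType": mime_type},
        "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
        # Results are written as JSON files under this prefix.
        "outputConfig": {"gcsDestination": {"uri": gcs_output_prefix}},
    }
    body = {"requests": [request]}
    if parent is not None:
        body["parent"] = parent
    return body


async_file_body = build_async_file_request(
    "gs://example-bucket/input/report.pdf",   # hypothetical input URI
    "gs://example-bucket/output/",            # hypothetical output prefix
    parent="projects/project-A/locations/eu",
)
print(json.dumps(async_file_body, indent=2))
```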

Async Image Annotation

3 inputs

Run asynchronous image detection and annotation for a list of images.

Inputs

requests

Individual image annotation requests for this batch.

outputConfig

The desired output location and metadata (e.g., format).

parent

Target project and location for the call. If no parent is specified, a region will be chosen automatically.
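The three inputs could be assembled roughly as follows, assuming the JSON shape of the Vision REST `images:asyncBatchAnnotate` method. Unlike the file variant, `outputConfig` here sits at the top level of the body rather than inside each request. The helper name, bucket URIs, and the `batch_size` default are illustrative assumptions.

```python
import json


def build_async_image_body(image_uris, output_prefix, parent=None, batch_size=None):
    """Build an async image annotation body (hypothetical helper)."""
    requests = [
        {
            "image": {"source": {"imageUri": uri}},
            "features": [{"type": "TEXT_DETECTION"}],
        }
        for uri in image_uris
    ]
    output_config = {"gcsDestination": {"uri": output_prefix}}
    if batch_size is not None:
        # Number of responses written per output file.
        output_config["batchSize"] = batch_size
    body = {"requests": requests, "outputConfig": output_config}
    if parent is not None:
        body["parent"] = parent
    return body


async_image_body = build_async_image_body(
    ["gs://example-bucket/sign.jpg", "gs://example-bucket/menu.jpg"],  # hypothetical
    "gs://example-bucket/ocr-results/",                                # hypothetical
    batch_size=2,
)
print(json.dumps(async_image_body, indent=2))
```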

File Annotation

2 inputs

Annotate a batch of files using the image detection service.

Inputs

requests

The list of file annotation requests. Currently only one AnnotateFileRequest is supported in BatchAnnotateFilesRequest.

parent

Target project and location in the format 'projects/{project-id}/locations/{location-id}'. Determines the region where the API call is made.
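Because only one AnnotateFileRequest is accepted per call, a client may want to check this before sending. The sketch below assumes the JSON shape of the Vision REST `files:annotate` method with inline file content; the helper names, the placeholder PDF bytes, and the project ID are hypothetical.

```python
import base64


def make_file_request(file_bytes: bytes, mime_type: str = "application/pdf") -> dict:
    """Build one AnnotateFileRequest with inline content (hypothetical helper)."""
    return {
        "inputConfig": {
            "content": base64.b64encode(file_bytes).decode("ascii"),
            "mimeType": mime_type,
        },
        "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
    }


def build_file_annotate_body(requests: list, parent: str) -> dict:
    """Wrap the requests in a batch body, enforcing the one-request limit."""
    if len(requests) != 1:
        raise ValueError("exactly one AnnotateFileRequest is supported per call")
    return {"requests": requests, "parent": parent}


file_body = build_file_annotate_body(
    [make_file_request(b"%PDF-placeholder")],   # placeholder bytes, not a real PDF
    "projects/my-project/locations/us",         # hypothetical project ID
)
print(file_body["parent"])
```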

Image Annotation Batch Processing

2 inputs

Perform image detection and annotation for a batch of images using Google's Vision API.

Inputs

requests

Individual image annotation requests for this batch.

parent

Target project and location for the call. If no parent is specified, a region will be chosen automatically. Supported locations: 'us' for USA, 'eu' for European Union.
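The `parent` format and the two supported locations can be captured in a small helper. This is a sketch under the assumptions stated in the descriptions above; `make_parent` and the project ID are hypothetical names.

```python
# Locations listed as supported in the description above.
SUPPORTED_LOCATIONS = {"us", "eu"}


def make_parent(project_id: str, location=None):
    """Return a 'projects/{project-id}/locations/{location-id}' string.

    Returns None when no location is given, so the caller can omit
    `parent` and let the region be chosen automatically.
    """
    if location is None:
        return None
    if location not in SUPPORTED_LOCATIONS:
        raise ValueError(f"unsupported location: {location!r}")
    return f"projects/{project_id}/locations/{location}"


print(make_parent("my-project", "eu"))
```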
