Microsoft Azure Computer Vision | AI Agent Tools

microsoft.com

About Microsoft Azure Computer Vision

Microsoft Azure Computer Vision provides businesses with advanced AI-powered tools to analyze and interpret visual content such as images and videos. It offers capabilities including optical character recognition (OCR) to extract printed and handwritten text from various documents, object detection to identify and classify thousands of items, and facial analysis for identity verification and customer insights. The service supports multiple languages and works across diverse business scenarios like retail inventory management, logistics package verification, and security operations. By automating visual data extraction and analysis, it enables companies to streamline workflows, improve operational efficiency, and gain deeper understanding of their visual assets. Integration with Azure Cognitive Services allows seamless embedding of these capabilities into business applications without requiring deep AI expertise.

Available Actions

These are the specific actions that AI agents can perform with this tool.

Analyze Image

3 inputs

Analyze an image to extract visual features including categories, tags, and description.

Inputs

url

Publicly reachable URL of the image to analyze.

features

Comma-separated list of visual features to analyze, for example categories, tags, and description.

language

The language of the returned metadata.
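
As a rough illustration of how the three inputs above might map onto an HTTP request, the sketch below builds (but does not send) an Analyze Image call. Several things here are assumptions rather than details from this listing: the `/vision/v3.2/analyze` path, the `visualFeatures` query name that the `features` input is mapped to, the `endpoint` and `key` arguments (borrowed from the other actions' inputs), and the hypothetical function name `build_analyze_request`.

```python
import json
from urllib.parse import urlencode


def build_analyze_request(endpoint, key, image_url,
                          features="Categories,Tags,Description",
                          language="en"):
    """Return (url, headers, body) for an Analyze Image call.

    Assumption: the v3.2 REST path and the `visualFeatures` query name.
    Nothing is sent over the network here.
    """
    # `features` is passed through as one comma-separated string.
    query = urlencode({"visualFeatures": features, "language": language})
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    # The image is referenced by URL in a small JSON body.
    body = json.dumps({"url": image_url}).encode("utf-8")
    return url, headers, body
```

The returned tuple can be handed to any HTTP client; keeping construction separate from sending makes the parameter mapping easy to inspect.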

Generate Thumbnail

7 inputs

Generate a thumbnail from a given image URL with specified dimensions using Azure AI.

Inputs

Endpoint

Supported Cognitive Services endpoint.

height

Height of the thumbnail in pixels.

width

Width of the thumbnail in pixels.

model-version

Version of the AI model. Defaults to 'latest'.

smartCropping

Whether to enable smart cropping.

Ocp-Apim-Subscription-Key

API key used for authenticating requests.

url

Publicly reachable URL of an image.
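
The sizing and cropping inputs above would typically travel as query parameters. The following is a minimal sketch of that mapping, assuming a `/vision/v3.2/generateThumbnail` path and lowercase `true`/`false` for the boolean; the function name `build_thumbnail_url` and the positive-dimension check are illustrative choices, not behavior documented in this listing.

```python
from urllib.parse import urlencode


def build_thumbnail_url(endpoint, width, height,
                        smart_cropping=True, model_version="latest"):
    """Build the request URL for a Generate Thumbnail call (assumed path)."""
    # Illustrative guard: reject dimensions that cannot describe an image.
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive pixel counts")
    query = urlencode({
        "width": width,
        "height": height,
        # Booleans are rendered as lowercase strings in the query.
        "smartCropping": "true" if smart_cropping else "false",
        "model-version": model_version,
    })
    return f"{endpoint.rstrip('/')}/vision/v3.2/generateThumbnail?{query}"
```

The image `url` and the `Ocp-Apim-Subscription-Key` header would accompany this URL in the actual request.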

List Models

2 inputs

Retrieve the list of domain-specific models supported by the Computer Vision API.

Inputs

Endpoint

Supported Cognitive Services endpoint.

Ocp-Apim-Subscription-Key

The API key used for authenticating requests.
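
With only an endpoint and a key, List Models is the simplest of the actions. A sketch of the corresponding request object follows, using Python's standard `urllib.request.Request`; the `/vision/v3.2/models` path and the function name `build_list_models_request` are assumptions made for illustration.

```python
from urllib.request import Request


def build_list_models_request(endpoint, key):
    """Prepare (without sending) a GET request for the model list."""
    # Assumed path; the endpoint is the resource's base URL.
    url = f"{endpoint.rstrip('/')}/vision/v3.2/models"
    return Request(
        url,
        headers={"Ocp-Apim-Subscription-Key": key},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call.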

Recognize Printed Text

6 inputs

Use the Azure OCR API to recognize and extract text from an image.

Inputs

Endpoint

Supported Cognitive Services endpoint.

detectOrientation

Whether to detect the text orientation in the image.

language

Language code of the text in the image. Defaults to 'unk'.

model-version

Version of the AI model. Defaults to 'latest'.

Ocp-Apim-Subscription-Key

Subscription key for accessing the service.

url

Publicly reachable URL of the image.
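
To show how the stated defaults ('unk' for `language`, 'latest' for `model-version`) could be applied, here is a hedged sketch that assembles the full request description for Recognize Printed Text. The `/vision/v3.2/ocr` path, the default of `True` for `detectOrientation`, and the function name `build_ocr_request` are assumptions, not details confirmed by this listing.

```python
import json
from urllib.parse import urlencode


def build_ocr_request(endpoint, key, image_url, language="unk",
                      detect_orientation=True, model_version="latest"):
    """Describe an OCR call as a plain dict; nothing is sent."""
    query = urlencode({
        "language": language,  # 'unk' leaves the language unspecified
        "detectOrientation": "true" if detect_orientation else "false",
        "model-version": model_version,
    })
    return {
        "method": "POST",
        "url": f"{endpoint.rstrip('/')}/vision/v3.2/ocr?{query}",
        "headers": {
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        # The image is referenced by URL in the JSON body.
        "body": json.dumps({"url": image_url}),
    }
```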

Tag Image

4 inputs

Tag an image with words relevant to its content.

Inputs

Endpoint

Supported Cognitive Services endpoint.

url

Publicly reachable URL of an image.

language

The desired language for output generation. Defaults to 'en'.

modelVersion

Optional parameter to specify the version of the AI model. Defaults to 'latest'.
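
Because `language` and `modelVersion` are optional here, a caller can omit them and rely on the service defaults. The sketch below illustrates that pattern; the `/vision/v3.2/tag` path, the mapping of `modelVersion` to a `model-version` query name, and the function name `build_tag_url` are all assumptions.

```python
from urllib.parse import urlencode


def build_tag_url(endpoint, language=None, model_version=None):
    """Build a Tag Image URL, adding only the options the caller set."""
    params = {}
    if language is not None:
        params["language"] = language
    if model_version is not None:
        # Assumed query name for the tool's `modelVersion` input.
        params["model-version"] = model_version
    base = f"{endpoint.rstrip('/')}/vision/v3.2/tag"
    # Leave the query string off entirely when nothing was overridden.
    return f"{base}?{urlencode(params)}" if params else base
```

Omitting unset options keeps the request minimal and lets the service apply 'en' and 'latest' on its own.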
