Landing AI
Florence2 Phrase Grounding

Florence2 Phrase Grounding detects multiple objects in an image from a text prompt, which can be either a list of object names or a caption. Object names in the prompt can optionally be separated with commas. The tool returns a list of bounding boxes with normalized coordinates, label names, and associated probability scores; the score is always 1.0.
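Because the boxes come back in normalized coordinates, they usually need to be scaled to the image size before drawing or cropping. The following is a minimal sketch of that step, not the tool's documented behavior: it assumes each detection is a dictionary with hypothetical keys `label`, `score`, and `bbox`, and that `bbox` is ordered `[x_min, y_min, x_max, y_max]` with values in the range 0 to 1. The actual response layout should be checked against a real API response.

```python
from typing import Any, Dict, List, Tuple


def to_pixel_box(bbox: List[float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a normalized box to integer pixel coordinates.

    Assumption: bbox is [x_min, y_min, x_max, y_max] with values in [0, 1].
    """
    x_min, y_min, x_max, y_max = bbox
    return (
        round(x_min * width),
        round(y_min * height),
        round(x_max * width),
        round(y_max * height),
    )


def to_pixel_detections(
    detections: List[Dict[str, Any]], width: int, height: int
) -> List[Dict[str, Any]]:
    """Return copies of the detections with 'bbox' converted to pixels.

    The 'bbox' key name is hypothetical and used here only for illustration.
    """
    return [
        {**det, "bbox": to_pixel_box(det["bbox"], width, height)}
        for det in detections
    ]


# Hand-written sample in the assumed shape (not real API output).
sample = [{"label": "dog", "score": 1.0, "bbox": [0.1, 0.25, 0.5, 0.75]}]
print(to_pixel_detections(sample, width=640, height=480))
```

The helpers return new dictionaries instead of modifying the input, so the original normalized values remain available if the image is later resized.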

python
import requests
import base64

url = "https://api.landing.ai/v1/tools/florence2"

with open("{{path_to_image}}", "rb") as image_file:
    base64_string = base64.b64encode(image_file.read()).decode('utf-8')

payload = {
    "image": base64_string,
    "prompt": "{{prompt}}",
    "task": "<CAPTION_TO_PHRASE_GROUNDING>"
}

headers = {
    "Content-Type": "application/json",
    "Accept": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
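When the prompt is a set of object names rather than a caption, it can be convenient to build the payload from a Python list. The sketch below is illustrative only: `build_payload` is a hypothetical helper, and joining names with a comma followed by a space is an assumption based on the description, which says only that names may be separated with commas. The field names and the task string are taken from the request above.

```python
import base64
from typing import Dict, List

# Task string as used in the request example for this tool.
TASK = "<CAPTION_TO_PHRASE_GROUNDING>"


def build_payload(image_bytes: bytes, object_names: List[str]) -> Dict[str, str]:
    """Build the JSON payload for a phrase-grounding request.

    Hypothetical helper: blank names are skipped and the rest are joined
    with ", " (the exact separator format is an assumption).
    """
    cleaned = [name.strip() for name in object_names if name.strip()]
    return {
        "image": base64.b64encode(image_bytes).decode("utf-8"),
        "prompt": ", ".join(cleaned),
        "task": TASK,
    }


# Example with placeholder bytes instead of a real image file.
payload = build_payload(b"abc", ["dog", " cat ", ""])
print(payload["prompt"])
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, as in the request above. In production code it is usually worth also passing a `timeout=` value to `requests.post` and calling `response.raise_for_status()` before reading the JSON body.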