VisionAgent

Florence-2 Sam2 Video Tracking

A tool that segments and tracks multiple entities in a video given a text prompt, such as category names or referring expressions. You can optionally separate multiple categories in the prompt with commas. It tracks only entities that are present in the first frame, and it returns only segmentation masks. It is useful for tracking entities and for counting them without counting the same entity twice.

python
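import requests

url = "https://api.landing.ai/v1/tools/florence2-sam2"

# Form fields: the text prompts to track and the tool function to run.
data = {
  "prompts": ["{{prompt1}}", "{{prompt2}}"],
  "function_name": "florence2_sam2_video_tracking"
}

# Open the video in a context manager so the file handle is closed after the upload.
with open("{{path_to_video}}", "rb") as video:
  response = requests.post(url, files={"video": video}, data=data)

print(response.json())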
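
The description says tracking helps count entities without duplicates. The sketch below illustrates that idea under a loudly stated assumption: the response schema is not shown on this page, so the structure used here (a list of frames, each a list of detections with hypothetical `id` and `label` keys, where `id` stays the same for one tracked entity across frames) is invented for illustration, as is the helper name `count_tracked_entities`. Adapt the key names to whatever the API actually returns.

```python
from collections import defaultdict


def count_tracked_entities(frames):
    """Count distinct tracked entities per label across all frames.

    ASSUMPTION: `frames` is a list with one item per video frame, and each
    item is a list of dicts with the hypothetical keys "id" and "label".
    This is not the documented response schema.
    """
    ids_by_label = defaultdict(set)
    for detections in frames:
        for det in detections:
            # A set keeps each track id once, however many frames it appears in.
            ids_by_label[det["label"]].add(det["id"])
    return {label: len(ids) for label, ids in ids_by_label.items()}


# Made-up data: two cars and one person, each visible in several frames.
sample_frames = [
    [{"id": 0, "label": "car"}, {"id": 1, "label": "car"}, {"id": 2, "label": "person"}],
    [{"id": 0, "label": "car"}, {"id": 1, "label": "car"}],
    [{"id": 0, "label": "car"}, {"id": 2, "label": "person"}],
]

counts = count_tracked_entities(sample_frames)
print(counts)  # {'car': 2, 'person': 1}
```

Counting per-frame detections instead would report seven entities for this sample; grouping by track id is what avoids the duplicate counts.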