Object Detection
prompt = """Detect objects. Return JSON:
[{"label": "cat", "box_2d": [y0, x0, y1, x1]}]"""
# `client`, `MODEL`, and `image` are assumed to be defined earlier.
response = client.models.generate_content(
    model=MODEL, contents=[prompt, image]
)
Example response:
[
  {"label": "cat", "box_2d": [116, 85, 1000, 885]},
  {"label": "left eye", "box_2d": [519, 625, 615, 713]}
]
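Coordinates are normalized to the range 0-1000 and ordered [y0, x0, y1, x1]. To get pixel coordinates, scale each y value by the image height and each x value by the image width, then divide by 1000. See the notebook for visualization.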
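The rescaling can be sketched as follows. This is a minimal illustration, not the notebook's code: the helper name `to_pixel_boxes` and the 2000x1000 image size are hypothetical, and it assumes the response text is bare JSON in the format shown above (with no surrounding markdown fence).

```python
import json


def to_pixel_boxes(response_text, width, height):
    """Convert box_2d values from the 0-1000 scale to pixel coordinates."""
    detections = json.loads(response_text)
    results = []
    for det in detections:
        # box_2d is ordered [y0, x0, y1, x1] on a 0-1000 scale.
        y0, x0, y1, x1 = det["box_2d"]
        results.append({
            "label": det["label"],
            # Reordered to (x0, y0, x1, y1) in pixels, a common drawing order.
            "box_px": (
                round(x0 * width / 1000),
                round(y0 * height / 1000),
                round(x1 * width / 1000),
                round(y1 * height / 1000),
            ),
        })
    return results


# Sample response taken from the example above; the image size is made up.
sample = (
    '[{"label": "cat", "box_2d": [116, 85, 1000, 885]},'
    ' {"label": "left eye", "box_2d": [519, 625, 615, 713]}]'
)
boxes = to_pixel_boxes(sample, width=2000, height=1000)
for box in boxes:
    print(box["label"], box["box_px"])
```

The resulting tuples put x before y, so check the ordering expected by whatever drawing routine you pass them to.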