How This Helps

Semantic Search lets you find images using plain language. A query like “car on highway at night” returns visually and contextually relevant results—even if none of the images are explicitly labeled that way.

Prerequisites

  • A dataset ID (visible in the browser URL when viewing a dataset: https://app.visual-layer.com/dataset/<dataset_id>/data).
  • A valid JWT token. See Authentication.

Semantic search also requires that the right enrichment has been applied to your dataset:
  • Caption search (caption parameter) requires the CAPTION_IMAGES enrichment.
  • Semantic vector search (VQL text filter) requires the MULTIMODEL_IMAGE_ENCODING or CAPTION_IMAGES enrichment.
  • Full-text search (VQL fts) works on any dataset with captions.

To check which enrichment models are applied to your dataset, call GET /api/v1/enrichment/{dataset_id}/context.
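Given the list of applied models from the enrichment context endpoint, you can decide client-side which search modes are usable. A small sketch; the helper name and return values are my own, and the exact response shape of the context endpoint is not shown here, so this takes a plain list of model names:

```python
# Enrichment models that enable each search mode (names from the
# prerequisites above; the helper itself is a hypothetical convenience).
CAPTION_SEARCH_MODELS = {"CAPTION_IMAGES"}
SEMANTIC_SEARCH_MODELS = {"MULTIMODEL_IMAGE_ENCODING", "CAPTION_IMAGES"}

def supported_search_modes(applied_models):
    """Given the enrichment models applied to a dataset (as reported by
    GET /api/v1/enrichment/{dataset_id}/context), return the search
    modes that should work."""
    applied = set(applied_models)
    modes = set()
    if applied & CAPTION_SEARCH_MODELS:
        modes.add("caption")
    if applied & SEMANTIC_SEARCH_MODELS:
        modes.add("semantic")
    return modes
```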

Caption Search

The caption query parameter searches across AI-generated captions using semantic similarity.

GET /api/v1/explore/{dataset_id}
Authorization: Bearer <jwt>

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| caption | string | Yes | Natural language search query. |
| entity_type | string | Yes | IMAGES or OBJECTS. |
| threshold | integer | No | Clustering granularity (0–4). Use 0 for the finest granularity. |
| textual_similarity_threshold | float | No | Minimum similarity score to include in results (0.0–1.0). |
| page_number | integer | No | Page index for pagination (0-based, 100 clusters per page). |
| labels | string | No | Filter by label. Format: ["label1","label2"]. Labels are generated by the IMAGE_TAGGING enrichment model. |
| tags | string | No | Filter by tag UUID. Format: ["uuid1","uuid2"]. Tag UUIDs are returned in search responses and visible in the Visual Layer UI. |

Example

curl -H "Authorization: Bearer <jwt>" \
  "https://app.visual-layer.com/api/v1/explore/<dataset_id>?caption=cars+on+highway&entity_type=IMAGES&threshold=0&textual_similarity_threshold=0.5"
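The same request can be issued from Python. A minimal sketch; build_caption_params is a hypothetical helper, and an HTTP client such as requests will URL-encode the values when the dict is passed as params:

```python
def build_caption_params(query, entity_type="IMAGES", threshold=0,
                         similarity=0.5, page=0):
    """Query parameters for the caption-search request above. Pass the
    result as `params=` to an HTTP client (e.g. requests), which handles
    URL encoding of spaces and special characters."""
    return {
        "caption": query,
        "entity_type": entity_type,
        "threshold": threshold,
        "textual_similarity_threshold": similarity,
        "page_number": page,
    }
```

For example: `requests.get(f"{VL_BASE_URL}/api/v1/explore/{DATASET_ID}", headers=headers, params=build_caption_params("cars on highway"))`.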

Response

{
  "clusters": [
    {
      "cluster_id": "d0470097-0c77-4a9c-9edf-289680df7f71",
      "type": "IMAGES",
      "n_images": 14,
      "similarity_threshold": "0",
      "relevance_score": 0.63,
      "relevance_score_type": "cosine_distance",
      "captions": [
        "An aerial view of a highway with multiple vehicles traveling at night."
      ],
      "previews": [
        {
          "type": "IMAGE",
          "media_id": "300dad2c-1234-11f1-8483-5a879df30de4",
          "media_uri": "https://cdn.example.com/.../image.jpg",
          "media_thumb_uri": "https://cdn.example.com/.../thumb.webp",
          "caption": "An aerial view of a highway with multiple vehicles traveling at night.",
          "file_name": "highway_night.jpg",
          "relevance_score": 0.63,
          "relevance_score_type": "cosine_distance",
          "width": 1920,
          "height": 1080
        }
      ],
      "labels": null,
      "user_tags": null
    }
  ],
  "metadata": {
    "used_duckdb": true
  }
}

Understanding relevance_score

When relevance_score_type is cosine_distance, a lower score means a stronger match.
  • 0.0–0.4 — strong match
  • 0.4–0.6 — partial match
  • 0.6+ — loosely related
Use textual_similarity_threshold to filter out weak matches. A value of 0.5 returns only results with a score below 0.5 (higher similarity).
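The bands above can also be applied client-side after the fact. A small sketch; classify_match and its bucket names are illustrative, and placing exactly 0.4 and 0.6 in the next band up is my own boundary choice:

```python
def classify_match(score, score_type="cosine_distance"):
    """Bucket a relevance_score into the bands described above.
    Only cosine_distance is handled (lower distance = stronger match)."""
    if score_type != "cosine_distance":
        raise ValueError(f"unhandled relevance_score_type: {score_type}")
    if score < 0.4:
        return "strong"
    if score < 0.6:
        return "partial"
    return "loose"
```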
Note: the image_caption parameter is deprecated; use caption instead.

Searching with VQL

Visual Query Language (VQL) is the preferred approach for advanced text queries. It supports both full-text search and semantic vector search, and it composes cleanly with other filters. Pass VQL as a JSON array in the vql query parameter.

GET /api/v1/explore/{dataset_id}?vql=[...]&entity_type=IMAGES&threshold=0
Authorization: Bearer <jwt>

Full-text search (fts) matches captions using stemming and keyword ranking.
curl -H "Authorization: Bearer <jwt>" \
  "https://app.visual-layer.com/api/v1/explore/<dataset_id>?vql=%5B%7B%22text%22%3A%7B%22op%22%3A%22fts%22%2C%22value%22%3A%22highway+night%22%7D%7D%5D&entity_type=IMAGES&threshold=0"
Decoded VQL:

[{"text": {"op": "fts", "value": "highway night"}}]

Semantic search (semantic) uses vector embeddings to find conceptually similar content, even when the exact words don't appear in the captions.
curl -H "Authorization: Bearer <jwt>" \
  "https://app.visual-layer.com/api/v1/explore/<dataset_id>?vql=%5B%7B%22text%22%3A%7B%22op%22%3A%22semantic%22%2C%22value%22%3A%22cars+on+highway%22%2C%22threshold%22%3A0.7%7D%7D%5D&entity_type=IMAGES&threshold=0"
Decoded VQL:

[{"text": {"op": "semantic", "value": "cars on highway", "threshold": 0.7}}]

The optional threshold in the VQL filter (0.0–1.0) sets the minimum similarity score for results.
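If you build the URL by hand, as in the curl examples, the VQL JSON must be percent-encoded. A sketch using only the standard library; encode_vql is a hypothetical helper, and clients like requests do this encoding for you when the JSON string is passed via params:

```python
import json
from urllib.parse import quote_plus

def encode_vql(filters):
    """Percent-encode a VQL filter list for use in the vql query
    parameter. quote_plus encodes spaces as '+', matching the curl
    examples above."""
    return quote_plus(json.dumps(filters, separators=(",", ":")))

fts = [{"text": {"op": "fts", "value": "highway night"}}]
print(encode_vql(fts))
# %5B%7B%22text%22%3A%7B%22op%22%3A%22fts%22%2C%22value%22%3A%22highway+night%22%7D%7D%5D
```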

Combining Filters with VQL

VQL filters combine with AND logic. The following example finds semantically similar images that are also labeled as vehicles.
[
  {"text": {"op": "semantic", "value": "car on highway"}},
  {"labels": {"op": "one_of", "value": ["vehicle", "car"]}}
]
curl -H "Authorization: Bearer <jwt>" \
  "https://app.visual-layer.com/api/v1/explore/<dataset_id>?vql=%5B%7B%22text%22%3A%7B%22op%22%3A%22semantic%22%2C%22value%22%3A%22car+on+highway%22%7D%7D%2C%7B%22labels%22%3A%7B%22op%22%3A%22one_of%22%2C%22value%22%3A%5B%22vehicle%22%2C%22car%22%5D%7D%7D%5D&entity_type=IMAGES&threshold=0"

Available VQL Filter Types

| Filter | Description | Example |
|---|---|---|
| text | Search by caption text (fts or semantic) | {"text": {"op": "semantic", "value": "dog"}} |
| labels | Filter by classification labels | {"labels": {"op": "one_of", "value": ["cat"]}} |
| tags | Filter by user tag UUIDs | {"tags": {"op": "is", "value": ["uuid"]}} |
| issues | Filter by quality issues | {"issues": {"op": "issue", "value": "blur", "mode": "in"}} |
| duplicates | Show duplicate clusters | {"duplicates": {"op": "duplicates", "value": 0.95}} |
| uniqueness | Filter by uniqueness score | {"uniqueness": {"op": "uniqueness", "value": 0.8}} |
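Since an unrecognized filter type surfaces as a 422 from the server, a client-side sanity check can catch typos earlier. This validator is a hypothetical convenience, not part of the API:

```python
# Known VQL filter types from the table above; a client-side check
# to catch typos before the server rejects the query with a 422.
KNOWN_VQL_FILTERS = {"text", "labels", "tags", "issues", "duplicates", "uniqueness"}

def validate_vql(filters):
    """Raise ValueError if any filter dict uses an unknown filter type."""
    for f in filters:
        for key in f:
            if key not in KNOWN_VQL_FILTERS:
                raise ValueError(f"unknown VQL filter type: {key!r}")
    return filters
```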

Python Example

import requests
from urllib.parse import quote
import json

VL_BASE_URL = "https://app.visual-layer.com"
JWT_TOKEN = "<your-jwt-token>"
DATASET_ID = "<your-dataset-id>"

headers = {"Authorization": f"Bearer {JWT_TOKEN}"}

def semantic_search(query: str, threshold: float = 0.6, page: int = 0):
    vql = json.dumps([{"text": {"op": "semantic", "value": query, "threshold": threshold}}])
    resp = requests.get(
        f"{VL_BASE_URL}/api/v1/explore/{DATASET_ID}",
        headers=headers,
        params={
            "vql": vql,
            "entity_type": "IMAGES",
            "threshold": 0,
            "page_number": page,
        },
    )
    resp.raise_for_status()
    return resp.json()

results = semantic_search("cars driving on a highway at night")
clusters = results.get("clusters", [])
print(f"Found {len(clusters)} clusters")

for cluster in clusters:
    score = cluster.get("relevance_score")
    n = cluster.get("n_images")
    captions = cluster.get("captions", [])
    caption_preview = captions[0][:80] if captions else "no caption"
    print(f"  {n} images | score: {score:.3f} | {caption_preview}...")
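Results come back 100 clusters per page (see page_number above), so collecting everything means looping over pages. A sketch; iter_all_clusters is my own helper and assumes a short or empty page marks the last page:

```python
def iter_all_clusters(fetch_page, page_size=100):
    """Yield clusters across every result page. `fetch_page(page)` should
    return the parsed JSON for that page number, e.g.:
        lambda p: semantic_search("cars on highway", page=p)
    Stops when a page returns fewer than page_size clusters."""
    page = 0
    while True:
        clusters = fetch_page(page).get("clusters", [])
        yield from clusters
        if len(clusters) < page_size:
            return
        page += 1
```

Paired with the semantic_search helper above: `all_clusters = list(iter_all_clusters(lambda p: semantic_search("cars on highway", page=p)))`.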

Response Codes

See Error Handling for the error response format and Python handling patterns.
| HTTP Code | Meaning |
|---|---|
| 200 | Results returned successfully. |
| 401 | Unauthorized: check your JWT token. |
| 404 | Dataset not found. |
| 422 | Invalid query parameters: check VQL syntax. |