How This Helps

Export your dataset’s metadata, labels, and media files for use in training pipelines, annotation tools, or downstream analysis. Both full and selective exports are supported.

Prerequisites

  • A dataset in READY status.
  • A dataset ID (visible in the browser URL when viewing a dataset: https://app.visual-layer.com/dataset/<dataset_id>/data).
  • A valid JWT token. See Authentication.

Full Dataset Export

Export all media and metadata for an entire dataset. This is an asynchronous operation — initiate the export, poll for completion, then download the result.

Step 1: Initiate Export

GET /api/v1/dataset/{dataset_id}/export_context_async
Authorization: Bearer <jwt>

Parameters

Parameter       Type     Required  Description
file_name       string   Yes       Name for the output ZIP file.
export_format   string   Yes       json or parquet.
include_images  boolean  No        Set to true to include image files in the export. Default: false.

Example

curl -G \
  -H "Authorization: Bearer <jwt>" \
  -H "Accept: application/json" \
  --data-urlencode "file_name=export.zip" \
  --data-urlencode "export_format=json" \
  --data-urlencode "include_images=false" \
  "https://app.visual-layer.com/api/v1/dataset/<dataset_id>/export_context_async"

Response

{
  "id": "fdb84834-d19b-4797-861a-d48b7a16f908"
}
Save the id — you need it to poll export status.
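
The same request can be issued from Python. The following is a minimal sketch using only the standard library. The helper names build_export_url and start_export are illustrative and not part of the API, and the response is assumed to be the JSON object with an id field shown above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://app.visual-layer.com/api/v1"


def build_export_url(dataset_id, file_name, export_format="json", include_images=False):
    """Build the export_context_async URL with URL-encoded query parameters."""
    if export_format not in ("json", "parquet"):
        raise ValueError("export_format must be 'json' or 'parquet'")
    query = urllib.parse.urlencode({
        "file_name": file_name,
        "export_format": export_format,
        "include_images": "true" if include_images else "false",
    })
    return f"{BASE_URL}/dataset/{dataset_id}/export_context_async?{query}"


def start_export(dataset_id, jwt, file_name="export.zip", **options):
    """Send the GET request and return the export task id (hypothetical helper)."""
    request = urllib.request.Request(
        build_export_url(dataset_id, file_name, **options),
        headers={"Authorization": f"Bearer {jwt}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        # Assumes the body is {"id": "..."} as in the example response.
        return json.load(response)["id"]
```

Calling start_export("<dataset_id>", "<jwt>", export_format="parquet") would return the task ID to pass to the status endpoint.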

Step 2: Poll Export Status

curl -H "Authorization: Bearer <jwt>" \
  "https://app.visual-layer.com/api/v1/dataset/<dataset_id>/export_status?export_task_id=<task_id>"

Status Response

{
  "status": "COMPLETED",
  "download_uri": "https://s3.amazonaws.com/.../export.zip?..."
}
Poll until status is COMPLETED or FAILED.
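
A polling loop can be sketched in Python as follows. The source lists only COMPLETED and FAILED, so this sketch treats any other status as still in progress. The 5-second interval, the 30-minute timeout, and the function names are assumptions for illustration.

```python
import json
import time
import urllib.request

BASE_URL = "https://app.visual-layer.com/api/v1"


def fetch_status(dataset_id, task_id, jwt):
    """Request the export status once and return the decoded JSON body."""
    url = f"{BASE_URL}/dataset/{dataset_id}/export_status?export_task_id={task_id}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {jwt}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_export(get_status, interval=5.0, timeout=1800.0, sleep=time.sleep):
    """Call get_status() until the task is COMPLETED or FAILED, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        body = get_status()
        status = body.get("status")
        if status == "COMPLETED":
            return body["download_uri"]
        if status == "FAILED":
            raise RuntimeError("export failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"export still {status!r} after {timeout} seconds")
        # Any other status is treated as in progress (assumption).
        sleep(interval)
```

A typical call would be wait_for_export(lambda: fetch_status(dataset_id, task_id, jwt)). Passing the status function as an argument keeps the loop independent of the HTTP client, and the timeout prevents an endless loop if the task never reaches a terminal status.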

Step 3: Download

curl -L "<download_uri>" --output export.zip
Use -L to follow S3 redirects.
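
For pipelines written in Python, the download step might look like the sketch below. It uses only the standard library, and the function name download_export is illustrative.

```python
import shutil
import urllib.request


def download_export(download_uri, destination="export.zip"):
    """Stream the archive to disk without loading it fully into memory."""
    # urlopen follows HTTP redirects by default, similar to curl -L.
    with urllib.request.urlopen(download_uri) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination
```

Streaming with shutil.copyfileobj keeps memory use flat, which matters when the export includes image files.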

Selective Export

Export specific media items or entire clusters using POST /api/v1/dataset/{dataset_id}/export_entities_async. This is useful for exporting only the results of a search or filter operation.
POST /api/v1/dataset/{dataset_id}/export_entities_async
Authorization: Bearer <jwt>
Content-Type: application/json

Query Parameters

Parameter       Type     Description
export_format   string   json or parquet.
include_images  boolean  Include image files in the export.

Export by Media IDs

media_ids are returned in the media_id field of any Explore endpoint response (visual search, semantic search, or duplicate retrieval results).
curl -X POST \
  -H "Authorization: Bearer <jwt>" \
  -H "Content-Type: application/json" \
  -d '{
    "media_selection": [
      {
        "type": "media",
        "payload": {
          "media_ids": ["9e8da312-d954-4844-afc7-357c458c5b03"]
        }
      }
    ]
  }' \
  "https://app.visual-layer.com/api/v1/dataset/<dataset_id>/export_entities_async?include_images=true&export_format=json"

Export by Cluster

cluster_id values are returned in the cluster_id field of Explore endpoint responses and are visible in the Visual Layer UI when browsing clusters.
curl -X POST \
  -H "Authorization: Bearer <jwt>" \
  -H "Content-Type: application/json" \
  -d '{
    "media_selection": [
      {
        "type": "cluster",
        "payload": {
          "cluster_id": "4e0e4d51-0fef-4fe1-a8ec-1a82b6f4880b",
          "cluster_filters": {
            "entity_type": "IMAGES"
          },
          "exclude_media_ids": []
        }
      }
    ]
  }' \
  "https://app.visual-layer.com/api/v1/dataset/<dataset_id>/export_entities_async?include_images=true&export_format=json"
Both return the same response format as the full export — an id to use for polling status.
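
The two request bodies above share one shape, which the following Python sketch builds programmatically. The helper names are hypothetical. Whether the API accepts several entries, or a mix of media and cluster entries, in a single media_selection list is an assumption based on the field being a list.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://app.visual-layer.com/api/v1"


def media_selection(media_ids):
    """Selection entry for explicit media items."""
    return {"type": "media", "payload": {"media_ids": list(media_ids)}}


def cluster_selection(cluster_id, entity_type="IMAGES", exclude_media_ids=()):
    """Selection entry for a whole cluster, minus any excluded media items."""
    return {
        "type": "cluster",
        "payload": {
            "cluster_id": cluster_id,
            "cluster_filters": {"entity_type": entity_type},
            "exclude_media_ids": list(exclude_media_ids),
        },
    }


def build_selective_export(dataset_id, jwt, selections, export_format="json", include_images=True):
    """Prepare (but do not send) the POST request for export_entities_async."""
    query = urllib.parse.urlencode({
        "include_images": "true" if include_images else "false",
        "export_format": export_format,
    })
    return urllib.request.Request(
        f"{BASE_URL}/dataset/{dataset_id}/export_entities_async?{query}",
        data=json.dumps({"media_selection": list(selections)}).encode("utf-8"),
        headers={"Authorization": f"Bearer {jwt}", "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen sends it. The response should then carry the id used for polling, as described above.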

Full Automation Script

The following script handles the complete export workflow: initiate, poll, download, and extract.
#!/bin/bash
# Full export workflow: initiate, poll, download, extract.

DATASET_ID="your-dataset-id"
JWT_TOKEN="your-jwt-token"
FILENAME="export.zip"
BASE_URL="https://app.visual-layer.com/api/v1/dataset/$DATASET_ID"

# Print the string value of a JSON key read from stdin.
# Tolerates optional whitespace around the colon.
json_value() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

echo "Initiating export..."
EXPORT_TASK_RESPONSE=$(curl -s -G \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Accept: application/json" \
  --data-urlencode "file_name=$FILENAME" \
  --data-urlencode "export_format=json" \
  --data-urlencode "include_images=false" \
  "$BASE_URL/export_context_async")

EXPORT_TASK_ID=$(echo "$EXPORT_TASK_RESPONSE" | json_value id)
if [ -z "$EXPORT_TASK_ID" ]; then
  echo "No export task ID in response: $EXPORT_TASK_RESPONSE" >&2
  exit 1
fi
echo "Export Task ID: $EXPORT_TASK_ID"

echo "Polling for completion..."
STATUS=""
while [ "$STATUS" != "COMPLETED" ]; do
  STATUS_RESPONSE=$(curl -s \
    -H "Authorization: Bearer $JWT_TOKEN" \
    "$BASE_URL/export_status?export_task_id=$EXPORT_TASK_ID")
  STATUS=$(echo "$STATUS_RESPONSE" | json_value status)
  echo "  Status: $STATUS"
  # Stop on failure, or when the response carries no status (for example, an auth error).
  if [ "$STATUS" = "FAILED" ] || [ -z "$STATUS" ]; then
    echo "Export failed: $STATUS_RESPONSE" >&2
    exit 1
  fi
  [ "$STATUS" != "COMPLETED" ] && sleep 5
done

# Extract the download URL and drop JSON backslash escapes (\/).
DOWNLOAD_URI=$(echo "$STATUS_RESPONSE" | json_value download_uri | sed 's/\\//g')
echo "Downloading..."
curl -L "$DOWNLOAD_URI" --output "$FILENAME"

echo "Extracting..."
unzip "$FILENAME" -d exported_dataset

echo "Done."

Working with Exported Data

After extraction, the archive contains a metadata file (Parquet or JSON) and optionally an images/ folder.
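
Before extracting, it can help to check what the archive actually contains. The sketch below uses the standard zipfile module. The function name is illustrative, and the layout it looks for (JSON or Parquet metadata files plus an images/ folder) is taken from the description above and not from a published archive specification.

```python
import zipfile
from pathlib import PurePosixPath


def summarize_export(zip_path):
    """Return (metadata_files, image_count) for an export archive."""
    metadata_files, image_count = [], 0
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # directory entry, not a file
            path = PurePosixPath(name)
            if "images" in path.parts[:-1]:
                image_count += 1  # file stored under an images/ folder
            elif path.suffix.lower() in (".json", ".parquet"):
                metadata_files.append(name)
    return metadata_files, image_count
```

An image count of zero usually means the export was created without include_images=true.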

Filter by Uniqueness Score

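import pandas as pd

# Assumes the export was created with export_format=parquet.
df = pd.read_parquet("exported_dataset/metadata.parquet")
# Keep the 100 rows with the highest uniqueness score.
top_unique = df.sort_values(by="uniqueness_score", ascending=False).head(100)
top_unique.to_csv("top_unique_images.csv", index=False)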

Copy Filtered Images

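import os
import shutil

# top_unique is the DataFrame built in the previous snippet.
# Image files are present only when the export was created with include_images=true.
os.makedirs("top_unique_images", exist_ok=True)
for fname in top_unique["image_filename"]:
    src = os.path.join("exported_dataset", "images", fname)
    if os.path.exists(src):  # skip rows whose image is not in the archive
        shutil.copy(src, os.path.join("top_unique_images", fname))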

Response Codes

See Error Handling for the error response format and Python handling patterns.
HTTP Code  Meaning
200        Export task ID returned.
401        Unauthorized: check your JWT token.
404        Dataset not found.
409        Dataset is not in READY status.
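
As a small illustration of acting on these codes in Python, the sketch below maps the documented statuses to messages when urllib raises an HTTPError. The function names are hypothetical, and the fallback message for undocumented codes is an assumption. The Error Handling page remains the reference for the actual error response format.

```python
import urllib.error

# Meanings copied from the response code table above.
EXPORT_ERRORS = {
    401: "Unauthorized: check your JWT token.",
    404: "Dataset not found.",
    409: "Dataset is not in READY status.",
}


def describe_export_error(code):
    """Return the documented meaning of an HTTP error code, or a generic fallback."""
    return EXPORT_ERRORS.get(code, f"Unexpected HTTP status {code}.")


def explain(error):
    """Turn an HTTPError raised by urllib.request.urlopen into a readable message."""
    return f"HTTP {error.code}: {describe_export_error(error.code)}"
```

In practice this would wrap the export call: catch urllib.error.HTTPError around urlopen and print explain(err).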