Inspect and maintain collections

Use this page to check collection state, flush pending writes to disk, optimize storage after deletions, and rebuild indexes.

Before you begin, make sure you have a running VectorAI DB instance and the Python client library installed (pip install actian-vectorai-client).

Get collection state

Use get_state() to retrieve the current VDE state for a collection.

import asyncio
from actian_vectorai import AsyncVectorAIClient

async def main():
    async with AsyncVectorAIClient("localhost:6574") as client:
        state = await client.vde.get_state("my_collection")
        print(f"Collection state: {state}")

asyncio.run(main())

Get collection statistics

Use get_stats() to inspect vector counts and storage usage.

import asyncio
from actian_vectorai import AsyncVectorAIClient

async def main():
    async with AsyncVectorAIClient("localhost:6574") as client:
        stats = await client.vde.get_stats("my_collection")
        print(f"Total vectors: {stats.total_vectors}")
        print(f"Indexed vectors: {stats.indexed_vectors}")
        print(f"Deleted vectors: {stats.deleted_vectors}")
        print(f"Storage bytes: {stats.storage_bytes}")
        print(f"Index memory bytes: {stats.index_memory_bytes}")

asyncio.run(main())

get_stats() returns these fields.

total_vectors: Total number of vectors in the collection.
indexed_vectors: Number of vectors currently indexed.
deleted_vectors: Number of deleted vectors not yet reclaimed.
storage_bytes: Collection storage size in bytes.
index_memory_bytes: Index memory use in bytes.
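
The raw counters are easier to act on once they are turned into a few derived numbers. The following is a minimal sketch, not part of the client library: `StatsSnapshot` and `summarize_stats` are hypothetical names, and the stand-in class only mirrors the field names listed above so the example runs without a server.

```python
from dataclasses import dataclass


@dataclass
class StatsSnapshot:
    """Hypothetical stand-in that mirrors the fields listed above."""
    total_vectors: int
    indexed_vectors: int
    deleted_vectors: int
    storage_bytes: int
    index_memory_bytes: int


MIB = 1024 * 1024


def summarize_stats(stats) -> dict:
    """Derive indexing lag and MiB sizes from the raw counters."""
    total = stats.total_vectors
    return {
        # Vectors that are stored but not yet covered by the index.
        "unindexed_vectors": max(total - stats.indexed_vectors, 0),
        # Report 0.0 for an empty collection to avoid dividing by zero.
        "indexed_fraction": stats.indexed_vectors / total if total else 0.0,
        "storage_mib": stats.storage_bytes / MIB,
        "index_memory_mib": stats.index_memory_bytes / MIB,
    }
```

Because the helper reads attributes only, the object returned by get_stats() can be passed in place of the stand-in, assuming it exposes the same field names.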

Flush a collection

VectorAI DB writes data changes to disk asynchronously for performance. Flushing forces pending writes to be persisted immediately.

import asyncio
from actian_vectorai import AsyncVectorAIClient

async def main():
    async with AsyncVectorAIClient("localhost:6574") as client:
        flushed = await client.vde.flush("my_collection")
        print(f"Collection flushed: {flushed}")

asyncio.run(main())
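
When several collections need flushing, for example before a backup, the calls can be issued together. This is a sketch under one assumption that this page does not confirm: that a single client tolerates concurrent calls. `flush_all` is a hypothetical helper; the only client call it uses is flush(), as shown above.

```python
import asyncio


async def flush_all(client, collection_names):
    """Flush each collection concurrently and map name -> flush result."""
    names = list(collection_names)
    # Assumption: the client allows overlapping requests on one connection.
    results = await asyncio.gather(
        *(client.vde.flush(name) for name in names)
    )
    return dict(zip(names, results))
```

If concurrent calls turn out not to be supported, replacing the gather() call with a plain loop that awaits each flush() in turn keeps the same return shape.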

Optimize a collection

Optimization compacts storage and reclaims space from deleted vectors.

import asyncio
from actian_vectorai import AsyncVectorAIClient

async def main():
    async with AsyncVectorAIClient("localhost:6574") as client:
        optimized = await client.vde.optimize("my_collection")
        print(f"Optimization complete: {optimized}")

        stats = await client.vde.get_stats("my_collection")
        print(f"Deleted vectors remaining: {stats.deleted_vectors}")

asyncio.run(main())
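
Optimization is only worthwhile when there is space to reclaim, so it can help to check the deleted-vector share first. The sketch below assumes a threshold of 10%, which is an illustrative value rather than a documented recommendation. `optimize_if_needed` and `max_deleted_fraction` are hypothetical names; get_stats() and optimize() are used as shown on this page.

```python
async def optimize_if_needed(client, collection_name, max_deleted_fraction=0.1):
    """Optimize only when deleted vectors exceed the given share of the total.

    Returns True if optimize() was called, False otherwise.
    """
    stats = await client.vde.get_stats(collection_name)
    if stats.total_vectors <= 0:
        # Nothing stored, so there is nothing to compact.
        return False

    deleted_fraction = stats.deleted_vectors / stats.total_vectors
    if deleted_fraction <= max_deleted_fraction:
        return False

    await client.vde.optimize(collection_name)
    return True
```

Whether total_vectors already includes deleted vectors is not stated here, so the ratio is best treated as a rough indicator.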

Rebuild an index

Use rebuild_index() for a simple rebuild. It returns True when the server accepts or completes the rebuild request.

import asyncio
from actian_vectorai import AsyncVectorAIClient

async def main():
    async with AsyncVectorAIClient("localhost:6574") as client:
        rebuilt = await client.vde.rebuild_index("my_collection")
        print(f"Rebuild accepted: {rebuilt}")

asyncio.run(main())

Monitor rebuild progress

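Use trigger_rebuild(wait=False) when you need a task ID and progress reporting. The call returns without waiting for the rebuild to finish, and you then poll get_rebuild_task() with the task ID until the task reaches a terminal state.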

import asyncio
from actian_vectorai import AsyncVectorAIClient

COLLECTION = "large_dataset"

async def main():
    async with AsyncVectorAIClient("localhost:6574") as client:
        task_id, _ = await client.vde.trigger_rebuild(COLLECTION, wait=False)
        print(f"Rebuild task started: {task_id}")

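        # Poll until the task reaches a terminal state.
        while True:
            task = await client.vde.get_rebuild_task(task_id)
            print(
                f"State: {task.state} | "
                f"Progress: {task.progress:.1f}% | "
                f"Phase: {task.current_phase}"
            )

            # Stop polling once the rebuild has completed.
            if str(task.state).endswith("TASK_COMPLETED"):
                break
            # Surface a failed or cancelled rebuild as an exception.
            if str(task.state).endswith(("TASK_FAILED", "TASK_CANCELLED")):
                raise RuntimeError(task.error_message or f"Rebuild ended in {task.state}")

            # Wait one second before polling again.
            await asyncio.sleep(1)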

asyncio.run(main())
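
The loop above keeps polling for as long as the task stays in a non-terminal state. One way to bound the wait is to add a deadline, sketched below. `wait_for_rebuild`, `timeout`, and `poll_interval` are hypothetical names chosen for illustration; only get_rebuild_task() and the state suffixes come from the example above.

```python
import asyncio
import time


async def wait_for_rebuild(client, task_id, timeout=600.0, poll_interval=1.0):
    """Poll a rebuild task until it completes, fails, or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        task = await client.vde.get_rebuild_task(task_id)
        state = str(task.state)

        if state.endswith("TASK_COMPLETED"):
            return task
        if state.endswith(("TASK_FAILED", "TASK_CANCELLED")):
            raise RuntimeError(task.error_message or f"Rebuild ended in {task.state}")

        # Give up locally once the deadline has passed; the server-side
        # task is not cancelled by this helper.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Rebuild {task_id} not finished after {timeout}s")

        await asyncio.sleep(poll_interval)
```

A caller would typically pass the task ID returned by trigger_rebuild(wait=False) and handle TimeoutError separately from RuntimeError, since a timeout says nothing about whether the rebuild eventually succeeds.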

List rebuild tasks

list_rebuild_tasks() returns a tuple containing the task list and the total count.

import asyncio
from actian_vectorai import AsyncVectorAIClient

async def main():
    async with AsyncVectorAIClient("localhost:6574") as client:
        tasks, total = await client.vde.list_rebuild_tasks()

        print(f"Rebuild tasks: {total}")
        for task in tasks:
            print(f"Task ID: {task.task_id}")
            print(f"  Collection: {task.collection_name}")
            print(f"  State: {task.state}")
            print(f"  Progress: {task.progress:.1f}%")
            print(f"  Started: {task.started_at}")

asyncio.run(main())

Each rebuild task includes these fields.

task_id: Unique identifier for the rebuild task.
collection_name: Name of the collection being rebuilt.
state: Current task state.
progress: Completion percentage from 0 to 100.
current_phase: Current rebuild phase.
started_at: Timestamp when the task started.
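
With these fields, a task list can be reduced to a quick overview. The sketch below groups tasks by state and picks out the ones still in progress. `count_by_state` and `active_tasks` are hypothetical helpers, and the terminal state suffixes are taken from the monitoring example on this page rather than from a documented list of states.

```python
from collections import Counter

# Suffixes of states treated as finished, as used in the monitoring example.
TERMINAL_SUFFIXES = ("TASK_COMPLETED", "TASK_FAILED", "TASK_CANCELLED")


def count_by_state(tasks) -> Counter:
    """Count tasks per state, using the string form of each state."""
    return Counter(str(task.state) for task in tasks)


def active_tasks(tasks, collection_name=None) -> list:
    """Return tasks that have not reached a terminal state.

    When collection_name is given, only tasks for that collection are kept.
    """
    return [
        task
        for task in tasks
        if not str(task.state).endswith(TERMINAL_SUFFIXES)
        and (collection_name is None or task.collection_name == collection_name)
    ]
```

Both helpers accept the task list returned by list_rebuild_tasks(), assuming the task objects expose the fields listed above.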

Complete maintenance workflow

The following example combines common maintenance operations into one workflow.

import asyncio
from actian_vectorai import AsyncVectorAIClient

async def maintenance_workflow(client, collection_name):
    print(f"=== Maintenance for '{collection_name}' ===")

    # Capture baseline statistics before changing anything.
    stats = await client.vde.get_stats(collection_name)
    print(f"Total vectors: {stats.total_vectors:,}")
    print(f"Indexed vectors: {stats.indexed_vectors:,}")
    print(f"Deleted vectors: {stats.deleted_vectors:,}")

    # Persist pending writes before compacting.
    await client.vde.flush(collection_name)

    # Compact only when enough deletions have accumulated (1000 is an example threshold).
    if stats.deleted_vectors > 1000:
        await client.vde.optimize(collection_name)

    # Rebuild the index after any compaction.
    rebuilt = await client.vde.rebuild_index(collection_name)
    print(f"Rebuild accepted: {rebuilt}")

    # Re-read statistics to see the effect of the maintenance run.
    final_stats = await client.vde.get_stats(collection_name)
    print(f"Final total vectors: {final_stats.total_vectors:,}")
    print(f"Final index memory: {final_stats.index_memory_bytes / 1024 / 1024:.2f} MB")

async def main():
    async with AsyncVectorAIClient("localhost:6574") as client:
        await maintenance_workflow(client, "products")

asyncio.run(main())
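
To apply a workflow like the one above to several collections, one option is to run it for each collection in turn and record failures instead of stopping at the first one. The sketch below is illustrative only: `maintain_collections` is a hypothetical helper, and `workflow` stands for any coroutine function with the same `(client, collection_name)` signature as maintenance_workflow().

```python
async def maintain_collections(client, collection_names, workflow):
    """Run workflow(client, name) for each collection, one at a time.

    Returns a dict mapping each failed collection name to its exception.
    """
    failures = {}
    for name in collection_names:
        try:
            # Sequential on purpose, so rebuilds do not overlap.
            await workflow(client, name)
        except Exception as exc:
            # Record the error and continue with the next collection.
            failures[name] = exc
    return failures
```

An empty result means every collection was processed without an exception. Running the collections sequentially is a conservative choice here, since this page does not say how the server handles overlapping rebuilds.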
