aiR
aiR is aiXplain's search model for data indexing and retrieval. It is built on Haystack, an established and well-maintained IR engine, for maintainability, flexibility, and community support.
Currently, you can only use aiR via FineTune, which uses endpoints 2, 4, and 6 below. Learn more in our guide on How to index and retrieve your data.
In future releases, we will revamp how you interact with aiR to make it easier and more flexible, and to enable more of the endpoints.
Endpoints
The model contains eight main components (endpoints).
- List Collections - Get a list of the existing collections.
- Create Empty Collection - Create an empty collection.
- Delete Collection - Delete a collection.
- Indexing - Ingest raw text documents and convert them into vector representations (embeddings). The process involves two steps:
  - Embedding Generation: A neural network, such as a transformer-based model (e.g., BERT, GPT), converts the preprocessed text into dense vector embeddings. These embeddings capture the semantic meaning of the text.
  - Vector Storage: The generated vectors are then stored in a vector database (Qdrant) alongside metadata about the original documents for efficient retrieval.
- Count Documents - Return the number of documents in a specific collection.
- Retrieval - Return the documents most relevant to a user query from the vector database. The process involves three steps:
  - Query Embedding: The same embedding model used for indexing generates a vector representation of the query.
  - Similarity Search: The query vector is compared to the stored document vectors using a similarity metric (e.g., cosine similarity, Euclidean distance). The vector database quickly identifies the nearest neighbors (most similar documents).
  - Result Retrieval: The documents corresponding to the closest vectors are retrieved and returned as the search results.
- Delete Document - Delete a specific document given its document_id.
- Update Document - Add a document or overwrite an existing one.
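The indexing and retrieval flow described above can be sketched in a few lines of Python. This is purely illustrative: the `embed()` function below is a toy stand-in for a real embedding model (aiR uses OpenAI's text-embedding-ada-002), and the in-memory list stands in for the Qdrant vector store.

```python
import math

def embed(text: str) -> list[float]:
    """Toy stand-in for a neural embedding model: character frequencies."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, one of the metrics mentioned above."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Indexing: store (vector, document) pairs, as a vector database would.
store = [(embed(doc), doc) for doc in [
    "the market rallied on strong earnings",
    "heavy rain flooded the coastal towns",
]]

# Retrieval: embed the query, then rank stored documents by similarity score.
def retrieve(query: str, k: int = 1) -> list[tuple[float, str]]:
    qv = embed(query)
    ranked = sorted(((cosine(qv, v), d) for v, d in store), reverse=True)
    return ranked[:k]
```

Each result carries its similarity score, mirroring the ranked, scored output of the Retrieval endpoint.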
Frequently Asked Questions (FAQs)
How large can a single document be?
Documents can be of any size. A splitter breaks each document into smaller documents of fewer than 100 sentences. In upcoming versions of aiR, the splitting will be configurable via a parameter that specifies the delimiter type and the length of the generated documents.
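The splitting behaviour can be sketched as follows. The regex-based sentence boundary detection here is a simplification for illustration, not aiR's actual splitter.

```python
import re

def split_document(text: str, max_sentences: int = 100) -> list[str]:
    """Break a long document into chunks of at most max_sentences sentences.

    Sentence boundaries are approximated by splitting on whitespace that
    follows ., !, or ? -- a simplification of a real sentence splitter.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]
```

For example, a 250-sentence document would be split into three chunks of 100, 100, and 50 sentences.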
What is the document format/modality/file extension acceptable by the model?
The indexing endpoint expects the payload to be in the following format:
```json
{
  "payloads": [
    {
      "value_type": "text",
      "value": "some_text",
      "uri": "https://videos2.ascender.ai/v1/BloombergNews-2023-02-10_18-31-33.mp4",
      "attributes": {
        "source_id": "BloombergNews-2023-02-10_18-31-33",
        "lang": "en",
        "start_ts": "1.879",
        "end_ts": "2.322",
        "publisher": "Bloomberg News"
      }
    }
  ]
}
```
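The same payload can be assembled and serialized in Python before being sent to the indexing endpoint. The field names follow the example above; the URI and attribute values are illustrative, not required values.

```python
import json

# Assemble the indexing payload shown above as a Python dict.
payload = {
    "payloads": [
        {
            "value_type": "text",
            "value": "some_text",
            "uri": "https://videos2.ascender.ai/v1/BloombergNews-2023-02-10_18-31-33.mp4",
            "attributes": {
                "source_id": "BloombergNews-2023-02-10_18-31-33",
                "lang": "en",
                "start_ts": "1.879",
                "end_ts": "2.322",
                "publisher": "Bloomberg News",
            },
        }
    ]
}

# Serialize to JSON for the request body.
body = json.dumps(payload)
```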
How many documents (dataset size) can it handle?
It has been tested on datasets of up to 10,000 samples, and it can handle larger datasets.
What is the latency of indexing?
Measured indexing latency:
- 100 samples: ~3 seconds
- 5,000 samples: ~90 seconds
- 10,000 samples: ~190 seconds
What is the latency of retrieval?
Less than a second.
Does the retrieval step include ranking?
Yes, the retrieved items are ranked and given scores.
What embeddings does aiR use? Can they be switched?
For now, it uses OpenAI's text-embedding-ada-002. We intend to make the choice of embedding model a parameter in future.
What vector database does aiR use?
Qdrant
Future work
- Allow the user to choose the embedding model.
- Allow the user to control the document splitter.
- Allow the user to choose the ranking mechanism (cosine similarity, dot product, L2, etc.).
- Allow the user to choose the vector database.
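The candidate ranking mechanisms listed above can be sketched side by side on the same pair of vectors. This is purely illustrative of how the metrics differ; which ones aiR will expose is future work.

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: direction only, ignores magnitude."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def l2(a: list[float], b: list[float]) -> float:
    """L2 (Euclidean) distance: lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# b is a scaled copy of a: cosine treats them as identical (score 1.0),
# while dot product and L2 distance are sensitive to the magnitudes.
a, b = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
```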