
Overview

Artemis Search is built on several interconnected components that work together to provide powerful, reasoning-based searches. The following diagram illustrates how these components relate to each other:
[Diagram: Artemis Search Components]
The key relationships are:
  1. Organizations consist of projects.
  2. Each project serves a particular search task. Each project may have multiple datasets but only one can be active at a time. Further, each project has dedicated machines to process search requests.
  3. Datasets are pandas DataFrames saved as parquet files with two columns (embedding and tag). Only one dataset per project can be active at a time.
  4. Machines are cloud servers which process the search requests on the active dataset. These are automatically load balanced.
Let’s explore each of these components in more detail.
Artemis Search goes beyond traditional keyword matching or semantic similarity. Our technology uses task-specialized ML ranking models to build “reasoning” and “context” into searches. For example, when searching for “companies that require HIPAA compliance”, our system doesn’t just find companies related to medicine or HIPAA compliance. It reasons about which companies would be subject to HIPAA regulations based on their descriptions and activities.

Projects

Projects are the main working units in Artemis Search, and each one belongs to an organization. Each project is dedicated to a particular search task.
A project consists of:
  • A unique name and description
  • One or more datasets (each a pandas DataFrame saved as a parquet file with embedding and tag columns)
  • One or more machines
  • Configuration settings (e.g., model type)
Think of a project as a self-contained environment for a specific search use case.
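To make these relationships concrete, here is a minimal sketch of a project as a plain data model. It is illustrative only and is not the Artemis Search API: the class names, field names, and the can_serve check are assumptions introduced for this example. It encodes two rules from this page: only one dataset is active at a time, and a project needs at least one machine to be active.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dataset:
    name: str          # hypothetical identifier for the dataset
    parquet_path: str  # parquet file holding the embedding and tag columns


@dataclass
class Machine:
    machine_id: str    # hypothetical identifier for a cloud server


@dataclass
class Project:
    name: str
    description: str
    settings: dict = field(default_factory=dict)   # e.g. model type
    datasets: dict = field(default_factory=dict)   # dataset name -> Dataset
    machines: list = field(default_factory=list)
    active_dataset: Optional[str] = None

    def add_dataset(self, dataset: Dataset) -> None:
        self.datasets[dataset.name] = dataset

    def activate(self, dataset_name: str) -> None:
        # Activating a dataset replaces the previous one, so at most
        # one dataset is ever active.
        if dataset_name not in self.datasets:
            raise KeyError(f"unknown dataset: {dataset_name}")
        self.active_dataset = dataset_name

    def can_serve(self) -> bool:
        # Assumption: serving searches needs at least one machine
        # and an active dataset.
        return bool(self.machines) and self.active_dataset is not None


project = Project(name="hipaa-search", description="Find HIPAA-regulated companies")
project.add_dataset(Dataset("v1", "companies_v1.parquet"))
project.add_dataset(Dataset("v2", "companies_v2.parquet"))
project.activate("v1")
project.activate("v2")  # v1 is no longer active
project.machines.append(Machine("machine-1"))
```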

Datasets

Datasets are the foundation of your searches. They contain the information that Artemis Search will process and query.
Each dataset is a pandas DataFrame saved as a parquet file with two essential columns:
  1. embedding: Contains OpenAI text-embedding-3-large embeddings of the text you want to search through.
  2. tag: Contains string values associated with each embedding, which will be returned as the content associated with each search result.
You can have multiple datasets in a project, but only one can be active at a time.
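As a rough sketch of the dataset format described above, the following builds a two-column DataFrame and checks its shape before it is written to parquet. The helper names build_dataset and check_dataset are hypothetical, the column order is an assumption, and the short vectors are placeholders standing in for real embeddings.

```python
import pandas as pd

EMBEDDING_COLUMN = "embedding"
TAG_COLUMN = "tag"


def build_dataset(embeddings, tags):
    """Pair each embedding vector with the tag returned for that result."""
    if len(embeddings) != len(tags):
        raise ValueError("embeddings and tags must have the same length")
    return pd.DataFrame({EMBEDDING_COLUMN: list(embeddings), TAG_COLUMN: list(tags)})


def check_dataset(df):
    """Raise ValueError if the frame does not match the two-column layout."""
    if list(df.columns) != [EMBEDDING_COLUMN, TAG_COLUMN]:
        raise ValueError("expected exactly the columns: embedding, tag")
    if not df[TAG_COLUMN].map(lambda tag: isinstance(tag, str)).all():
        raise ValueError("every tag must be a string")
    if df[EMBEDDING_COLUMN].map(len).nunique() > 1:
        raise ValueError("all embeddings must have the same length")


# Placeholder vectors; real rows would hold the embeddings of your text.
df = build_dataset(
    embeddings=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    tags=["Acme Health: hospital billing software", "Bolt Freight: trucking logistics"],
)
check_dataset(df)

# Writing the file needs a parquet engine such as pyarrow to be installed:
# df.to_parquet("dataset.parquet")
```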

Machines

Machines are the computational resources that power your searches.
Each machine runs our ML model on the active dataset when responding to search requests.
  • You can have multiple machines per project for load balancing.
  • At least one machine must be assigned to a project for it to be active.
  • Requests are automatically load-balanced among available machines.

Playground

The playground is where you can experiment with and fine-tune your searches. Under the hood, it uses the API to perform searches.
In the playground, you can:
  • Select which project to experiment with
  • Adjust search parameters like synthetic dataset size, probability threshold, and top-K threshold
  • Enter search queries
  • View and analyze search results in real time

Search Parameters

  • Synthetic Dataset Size: Controls how much synthetic data is generated for each search request. This parameter must be between 10 and 70. Tuning this parameter allows you to balance between search accuracy and performance.
  • Probability Threshold: Filters results to keep only those above a certain probability of matching the search query. This parameter must be between 0 and 1. Tuning this parameter truncates results but does not affect search time.
  • Top-K Threshold: Limits the number of top results returned. This parameter must be at least 1 and has no upper limit. Tuning this parameter truncates results but does not affect search time.
Together, these parameters let you trade search accuracy against performance (synthetic dataset size) and control how many results are returned (probability threshold and top-K threshold).
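The ranges and truncation behavior above can be sketched as follows. This is an illustration, not the service's implementation: the function names are hypothetical, the bounds are assumed to be inclusive, a result is assumed to be kept when its probability is at or above the threshold, and None is used here to mean no top-K limit.

```python
def validate_params(synthetic_dataset_size, probability_threshold, top_k=None):
    """Check the documented ranges; bounds are assumed to be inclusive."""
    if not 10 <= synthetic_dataset_size <= 70:
        raise ValueError("synthetic dataset size must be between 10 and 70")
    if not 0.0 <= probability_threshold <= 1.0:
        raise ValueError("probability threshold must be between 0 and 1")
    if top_k is not None and top_k < 1:
        raise ValueError("top-K threshold must be at least 1")


def truncate_results(scored, probability_threshold, top_k=None):
    """Filter (tag, probability) pairs by threshold, then keep the top K."""
    kept = [(tag, prob) for tag, prob in scored if prob >= probability_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept if top_k is None else kept[:top_k]


validate_params(synthetic_dataset_size=40, probability_threshold=0.5, top_k=2)
scored = [("Acme Health", 0.92), ("Bolt Freight", 0.12), ("CareLink", 0.71)]
top = truncate_results(scored, probability_threshold=0.5, top_k=2)
```

Both thresholds only trim a list that has already been scored, which is consistent with the note that they do not affect search time.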

API Integration

Artemis Search provides a RESTful API for seamless integration with your applications.
  • Each organization has its own API key for authentication.
  • The main endpoint is /search, which accepts parameters like search_query, num_batches, top_k, filter_query, and project_id.
  • API requests must include your API token in the Authorization header.
For detailed API documentation, check out our API Reference section.
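As a starting point, here is a hedged sketch of assembling a /search request with the Python standard library. Several details are assumptions that this page does not specify: the base URL is a placeholder, the request is assumed to be a POST with a JSON body, and the token is placed in the Authorization header as-is (the API Reference may require a scheme prefix). The function name build_search_request is hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, not the real host


def build_search_request(api_token, project_id, search_query, *,
                         top_k=None, num_batches=None, filter_query=None):
    """Build (but do not send) a request for the /search endpoint."""
    payload = {"project_id": project_id, "search_query": search_query}
    optional = {"top_k": top_k, "num_batches": num_batches, "filter_query": filter_query}
    # Only include optional parameters that the caller actually set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return urllib.request.Request(
        url=f"{BASE_URL}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_token,  # assumed format; check the API Reference
            "Content-Type": "application/json",
        },
        method="POST",  # assumed HTTP method
    )


request = build_search_request(
    api_token="YOUR_API_TOKEN",
    project_id="my-project",
    search_query="companies that require HIPAA compliance",
    top_k=10,
)
# To send it: urllib.request.urlopen(request)
```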
Understanding these key concepts will help you leverage the full power of Artemis Search in your projects. If you’re ready to start building, head over to our Quickstart Guide to set up your first project.