Learn about datasets in Artemis Search and how they power intelligent searches
company_description | company_name | id |
---|---|---|
'Acme is a startup that makes widgets' | 'Acme' | 1 |
'Wayne Enterprises is a startup that makes widgets' | 'Wayne Enterprises' | 2 |
'Parker Industries is a startup that makes widgets' | 'Parker Industries' | 3 |
Choosing Source Data
We use the `company_description` column as the text we are embedding and the `id` column as the tags. This choice makes sense because we want to search over the company descriptions, and the ids uniquely identify each company.

Preparing the Dataset
Next, we convert the `company_description` column into embeddings and the `id` column into string tags, and then store the result as a parquet file.

Additional columns must not be named `embedding` or `tag` since these are reserved column names.

embedding | tag | size |
---|---|---|
[…] | 'Acme' | 1 |
[…] | 'Parker Industries' | 3 |
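
The preparation step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Artemis Search's own tooling: `toy_embed`, `build_rows`, and `write_parquet` are hypothetical helper names, and `toy_embed` is a deterministic stand-in for whatever real embedding model you use. The reserved-name check reflects one reading of the rule that `embedding` and `tag` are reserved column names.

```python
import hashlib

# Column names the dataset format reserves, per the text above.
RESERVED_COLUMNS = {"embedding", "tag"}


def toy_embed(text, dim=8):
    """Hypothetical stand-in for a real embedding model.

    Hashes the text and scales the first `dim` bytes into floats in [0, 1].
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dim]]


def build_rows(records, text_column, tag_column, filter_columns=(), embed=toy_embed):
    """Turn source records into rows with `embedding` and `tag` columns."""
    clashes = RESERVED_COLUMNS.intersection(filter_columns)
    if clashes:
        raise ValueError(f"reserved column names used as filter columns: {sorted(clashes)}")
    rows = []
    for record in records:
        row = {
            "embedding": embed(record[text_column]),
            "tag": str(record[tag_column]),  # tags are stored as strings
        }
        for name in filter_columns:
            row[name] = record[name]  # extra columns are carried over unchanged
        rows.append(row)
    return rows


def write_parquet(rows, path):
    """Store the rows as a parquet file.

    Assumes pandas and a parquet engine such as pyarrow are installed.
    """
    import pandas as pd

    pd.DataFrame(rows).to_parquet(path, index=False)


companies = [
    {"company_description": "Acme is a startup that makes widgets", "company_name": "Acme", "id": 1},
    {"company_description": "Wayne Enterprises is a startup that makes widgets", "company_name": "Wayne Enterprises", "id": 2},
    {"company_description": "Parker Industries is a startup that makes widgets", "company_name": "Parker Industries", "id": 3},
]

rows = build_rows(companies, text_column="company_description", tag_column="id")
```

Calling `write_parquet(rows, "companies.parquet")` would then write the file; the file name is only an example.
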
The table above has the `embedding` and `tag` columns that we need. However, we also have a `size` column. We would call this column a `filter_column` since it is not used for searching directly but can be used to filter the search results.
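
To make the role of a filter column concrete, here is a small pure-Python sketch. It does not use the Artemis Search API: the `search` function, the two-dimensional vectors, and the tags `A`, `B`, and `C` are all invented for illustration, and applying the filter before ranking is an assumption about how such filtering might work.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(rows, query_embedding, top_k=2, row_filter=None):
    """Rank rows by embedding similarity, optionally restricted by a filter.

    The filter only narrows the candidate set; ranking uses `embedding` alone.
    """
    candidates = [r for r in rows if row_filter is None or row_filter(r)]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(r["embedding"], query_embedding),
        reverse=True,
    )
    return [r["tag"] for r in ranked[:top_k]]


# Toy rows: the vectors and tags are made up, `size` plays the filter column.
toy_rows = [
    {"embedding": [1.0, 0.0], "tag": "A", "size": 1},
    {"embedding": [0.8, 0.6], "tag": "B", "size": 2},
    {"embedding": [0.0, 1.0], "tag": "C", "size": 3},
]

query = [1.0, 0.0]
unfiltered = search(toy_rows, query)  # ["A", "B"]
filtered = search(toy_rows, query, row_filter=lambda r: r["size"] >= 2)  # ["B", "C"]
```

The `size` values never influence the similarity scores; they only decide which rows are eligible to be ranked.
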