In the last post I explained how to run Elasticsearch on Docker with custom synonyms, configs and plugins.
Now, let’s see which types we can use to define a data model in Elasticsearch.
We don’t need to define our data model in detail if we don’t need custom or advanced search, or if our data is simple. We can simply send a request to insert a document into an index that does not exist yet, and Elasticsearch will automatically create the index with a default definition for each field.
But if we want something more custom and want to improve our search, we should disable automatic index creation in the Elasticsearch cluster settings and make a PUT request to create the index with a custom definition.
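As a minimal sketch of that PUT request, the Python snippet below builds (but does not send) a create-index call using only the standard library. The host http://localhost:9200, the index name products and the field name category are assumptions for illustration, and the body uses the typeless mapping format of Elasticsearch 7 and later.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust to your own cluster.
ES_URL = "http://localhost:9200"
INDEX_NAME = "products"


def build_create_index_request(base_url, index, mappings):
    """Build a PUT /<index> request whose body carries explicit mappings."""
    body = json.dumps({"mappings": mappings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_create_index_request(
    ES_URL,
    INDEX_NAME,
    {"properties": {"category": {"type": "keyword"}}},
)
print(request.get_method(), request.full_url)

# To actually send it to a running cluster:
# urllib.request.urlopen(request)
```

Keeping the request construction separate from the call that sends it makes it easy to inspect the body before touching a real cluster.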
Disable Automatic Index creation
To disable automatic index creation, we add the following setting to the Elasticsearch cluster configuration, in our elasticsearch.yml:
action.auto_create_index: "+.*"
This line allows Elasticsearch to create the internal indexes it needs to run, whose names start with “.” (such as “.monitoring”), and blocks automatic creation of all other indexes.
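To make the effect of the pattern easier to see, here is a simplified Python model of how such a setting could be evaluated. It is not Elasticsearch's implementation: it assumes that patterns are checked in order, that the first match decides, and that an index matching no pattern is blocked. It also uses fnmatch for wildcard matching, which is only an approximation, and it ignores the plain true/false values the setting accepts. The function name is_auto_create_allowed is hypothetical.

```python
from fnmatch import fnmatchcase


def is_auto_create_allowed(setting, index_name):
    """Simplified model of action.auto_create_index pattern evaluation.

    Assumption: patterns are checked in order, the first match decides,
    and an index that matches no pattern is blocked.
    """
    for raw in setting.split(","):
        pattern = raw.strip()
        if not pattern:
            continue
        # A leading "-" blocks; a leading "+" or no prefix allows.
        allow = not pattern.startswith("-")
        # Drop the optional "+" or "-" prefix before matching.
        if pattern[0] in "+-":
            pattern = pattern[1:]
        if fnmatchcase(index_name, pattern):
            return allow
    return False


for name in (".monitoring-es-7", ".kibana", "products"):
    print(name, is_auto_create_allowed("+.*", name))
```

With the setting "+.*", the two names starting with a dot are allowed and products is blocked, which matches the behaviour described above.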
Types of fields
Each field has its own type. Next, we will give a short summary of which types we can use, what each field accepts and what we can do with it:
- Binary → Accepts a Base64 encoded string. It is not searchable, but it can be used for sorting, aggregations, or scripting if the doc_values property is set to true (it is false by default). It can hold binary content such as images or videos, for example when we want to aggregate on images.
- Boolean → Accepts JSON true and false values, as well as the strings “true” and “false”, which are parsed as booleans. It is searchable if the index property is set to true (the default). More details about other configurations are available in the Elasticsearch documentation for the boolean field type.
- Keyword → Accepts all types of values and indexes each value as a whole, without splitting it into words. It is used to store and index structured, non-full-text data such as product categories, product names, and similar values. It is a better choice than a numeric type when we want to run term queries, because term searches are often faster on keyword fields than on numeric fields, but it should not be used for full-text search.
- Text → Accepts all types of values. It should be used to store and index full-text values such as product descriptions or blog posts. Text fields are not used for sorting and are almost never used for aggregations.
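The four types above can be combined in a single mapping. The sketch below builds one as a plain Python dictionary, together with a sample document that fits it. The field names (thumbnail, in_stock, category, description) and the sample values are hypothetical; only the type names come from the list above.

```python
import base64
import json

# Hypothetical field names; only the "type" values come from the list above.
mappings = {
    "properties": {
        "thumbnail": {"type": "binary"},
        "in_stock": {"type": "boolean"},
        "category": {"type": "keyword"},
        "description": {"type": "text"},
    }
}

# A sample document shaped to match the mapping.
document = {
    # Binary fields take a Base64 encoded string, not raw bytes.
    "thumbnail": base64.b64encode(b"raw image bytes").decode("ascii"),
    "in_stock": True,  # serialized as JSON true
    "category": "garden-tools",  # exact value, suited to term queries
    "description": "A sturdy steel spade for digging in heavy soil.",
}

# Body for the create-index request, followed by the document to index.
print(json.dumps({"mappings": mappings}, indent=2))
print(json.dumps(document, indent=2))
```

The first printed JSON object is the kind of body we would send in the PUT request that creates the index, and the second is a document we could insert afterwards.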