Elasticsearch – Fast Employee Search and Aggregation
A couple of years back, I was reading about Elasticsearch and
found it very useful. I built a proof of concept (PoC) that presented some of
the data used by our clients in a more useful way. I cannot discuss that PoC
here, because the data is PII. The output was intuitive and produced a lot of
actionable information. It had a significant impact on the clients, resulting
in efficiency savings and an improved bottom line. The PoC was very well
received by the clients and was added to the product as a feature, built on
Elasticsearch and Kibana.
Here I demonstrate features similar to those used in the PoC,
but with generic data. A randomly generated employee database serves as the
sample dataset for the demo.
Elasticsearch has an interesting feature called aggregations.
You can bucket your employee data into logical units for further
aggregation and search. For example, a typical employee record consists of
Employee Number, First Name, Last Name, Address, City, State and Zip. We can
aggregate on City, State and Zip, so that later searches can be narrowed to
one of these buckets. Aggregations also tell us how many employees there are
in a particular city, state or zip code.
You can look at the attached code for the implementation. Here I
illustrate some key concepts associated with Elasticsearch.
Indexing: While indexing (seeding) the data, we configure edge n-grams,
analyzers and filters. These make text searches fast, whether the query is
partial or complete. The key requirement is that partial text must be
searchable while the complete text stays available for aggregation. Partial
matching is achieved with an edge n-gram token filter and custom analyzers;
the complete value of City, State and Zip is kept in an additional keyword
sub-field.
Code Snippet:
public void CreateEmployeeIndex3()
{
    client.CreateIndex(IndexName, i => i
        .Settings(s => s
            .Analysis(a => a
                // Emit every prefix (1 to 50 characters) of each token.
                .TokenFilters(tf => tf
                    .EdgeNGram("edge_ngrams", e => e
                        .MinGram(1)
                        .MaxGram(50)
                        .Side(EdgeNGramSide.Front)))
                .Analyzers(analyzer => analyzer
                    // Index-time analyzer: lowercase, then edge n-grams for partial matching.
                    .Custom("partial_text", ca => ca
                        .Filters(new string[] { "lowercase", "edge_ngrams" })
                        .Tokenizer("standard"))
                    // Search-time analyzer: lowercase only, so the query text is not n-grammed.
                    .Custom("full_text", ca => ca
                        .Filters(new string[] { "standard", "lowercase" })
                        .Tokenizer("standard")))))
        .Mappings(m => m
            .Map<EmployeeInfo>(mm => mm
                .AutoMap()
                .Properties(p => p
                    // Fields that only need partial-text search.
                    .Text(t => t
                        .Name(n => n.Employee_Num)
                        .Analyzer("partial_text")
                        .SearchAnalyzer("full_text"))
                    .Text(t => t
                        .Name(n => n.First_Name)
                        .Analyzer("partial_text")
                        .SearchAnalyzer("full_text"))
                    .Text(t => t
                        .Name(n => n.Last_Name)
                        .Analyzer("partial_text")
                        .SearchAnalyzer("full_text"))
                    .Text(t => t
                        .Name(n => n.Address)
                        .Analyzer("partial_text")
                        .SearchAnalyzer("full_text"))
                    // City, State and Zip each get two sub-fields: an n-grammed text
                    // field for partial search and a keyword field holding the whole
                    // value for aggregation.
                    .Text(t => t
                        .Name(n => n.City)
                        .Fields(f => f
                            .Text(tt => tt
                                .Name("mycity")
                                .Analyzer("partial_text")
                                .SearchAnalyzer("full_text"))
                            .Keyword(k => k
                                .Name("keyword_city")
                                .IgnoreAbove(256)
                            )))
                    .Text(t => t
                        .Name(n => n.State)
                        .Fields(f => f
                            .Text(tt => tt
                                .Name("mystate")
                                .Analyzer("partial_text")
                                .SearchAnalyzer("full_text"))
                            .Keyword(k => k
                                .Name("keyword_state")
                                .IgnoreAbove(256)
                            )))
                    .Text(t => t
                        .Name(n => n.Zip)
                        .Fields(f => f
                            .Text(tt => tt
                                .Name("myzip")
                                .Analyzer("partial_text")
                                .SearchAnalyzer("full_text"))
                            .Keyword(k => k
                                .Name("keyword_zip")
                                .IgnoreAbove(256)
                            )))
                )))
    );
}
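To make the effect of the edge_ngrams filter and the two analyzers more concrete, here is a small standalone sketch that imitates them in plain C#, without Elasticsearch. It is only an approximation: the class and method names are made up for illustration, and splitting on spaces stands in for the standard tokenizer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of NEST or of the attached code.
public static class EdgeNGramSketch
{
    // Rough imitation of "partial_text" at index time: split into words,
    // lowercase, then emit every front-anchored prefix of each word.
    public static List<string> IndexTokens(string text, int minGram = 1, int maxGram = 50)
    {
        var tokens = new List<string>();
        var words = text.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            int longest = Math.Min(maxGram, word.Length);
            for (int len = minGram; len <= longest; len++)
                tokens.Add(word.Substring(0, len));
        }
        return tokens;
    }

    // Rough imitation of "full_text" at search time: lowercase whole words, no n-grams.
    public static List<string> SearchTokens(string query)
    {
        return query.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .ToList();
    }

    // A value matches when any query token equals one of its indexed tokens.
    public static bool Matches(string indexedText, string query)
    {
        var indexed = IndexTokens(indexedText);
        return SearchTokens(query).Any(t => indexed.Contains(t));
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", IndexTokens("John"))); // j, jo, joh, john
        Console.WriteLine(Matches("John Smith", "jo"));            // True
        Console.WriteLine(Matches("John Smith", "SMI"));           // True
        Console.WriteLine(Matches("John Smith", "ohn"));           // False
    }
}
```

The last call returns False because the filter is configured with Side(EdgeNGramSide.Front): only prefixes are indexed, so text from the middle of a word does not match. Using a different search analyzer matters for the same reason: if the query were n-grammed as well, "jo" would also be reduced to "j" and would match every name starting with that letter.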
Aggregation:
The demo UI illustrates aggregation.
City, State and Zip are the aggregation buckets.
When we select any or all of them and click Update, the UI lists all the
cities, states and zip codes that the employees belong to.
We can then select any of these values to refine the search
further.
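Behind a UI like this, the buckets can be requested with a terms aggregation on the keyword sub-fields. The following is a minimal sketch against NEST 5.x, not the code from the download. It assumes a local node at http://localhost:9200, an index named "employees", NEST's default camel-cased field names (so the sub-field is "state.keyword_state"), and "TX" as a sample state value; the EmployeeInfo class is a stand-in reconstructed from the mapping.

```csharp
using System;
using Nest;

// Stand-in for the post's EmployeeInfo class (assumed shape).
public class EmployeeInfo
{
    public string Employee_Num { get; set; }
    public string First_Name { get; set; }
    public string Last_Name { get; set; }
    public string Address { get; set; }
    public string City { get; set; }
    public string State { get; set; }
    public string Zip { get; set; }
}

public static class AggregationSketch
{
    public static void Main()
    {
        // Assumed node address and index name; adjust to your setup.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("employees");
        var client = new ElasticClient(settings);

        var response = client.Search<EmployeeInfo>(s => s
            .Size(0) // only the buckets are needed, not the documents
            // Refinement step: restrict to one selected state ("TX" is a sample value).
            .Query(q => q
                .Term(t => t
                    .Field("state.keyword_state")
                    .Value("TX")))
            // Bucket the remaining employees by city.
            .Aggregations(a => a
                .Terms("by_city", t => t
                    .Field("city.keyword_city")
                    .Size(20))));

        if (!response.IsValid)
        {
            Console.WriteLine(response.DebugInformation);
            return;
        }

        // Each bucket holds one city and the number of employees in it.
        foreach (var bucket in response.Aggs.Terms("by_city").Buckets)
            Console.WriteLine($"{bucket.Key}: {bucket.DocCount}");
    }
}
```

The aggregation targets the keyword sub-field rather than the n-grammed text field, so each bucket key is a whole city name instead of one of its prefixes. Dropping the Query part returns the city counts across all employees.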
Search:
Search in ES is very fast.
Let's say I am searching for an employee. I can search
by employee number, first name or last name. The complete text is not
required; partial text is enough, and ES will find the matching documents.
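A partial-text search of this kind could be written with a multi_match query over the three fields. Again this is a hedged sketch for NEST 5.x rather than the downloadable code: the node address, the index name "employees", the default search text "jo" and the shape of EmployeeInfo are all assumptions.

```csharp
using System;
using Nest;

// Stand-in for the post's EmployeeInfo class (assumed shape).
public class EmployeeInfo
{
    public string Employee_Num { get; set; }
    public string First_Name { get; set; }
    public string Last_Name { get; set; }
}

public static class SearchSketch
{
    public static void Main(string[] args)
    {
        // Partial text typed by the user; "jo" is just a sample.
        var text = args.Length > 0 ? args[0] : "jo";

        // Assumed node address and index name; adjust to your setup.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("employees");
        var client = new ElasticClient(settings);

        var response = client.Search<EmployeeInfo>(s => s
            .Size(10)
            .Query(q => q
                .MultiMatch(m => m
                    // All three fields were indexed with "partial_text",
                    // so a prefix of any of them is enough to match.
                    .Fields(f => f
                        .Field(e => e.Employee_Num)
                        .Field(e => e.First_Name)
                        .Field(e => e.Last_Name))
                    .Query(text))));

        if (!response.IsValid)
        {
            Console.WriteLine(response.DebugInformation);
            return;
        }

        foreach (var e in response.Documents)
            Console.WriteLine($"{e.Employee_Num} {e.First_Name} {e.Last_Name}");
    }
}
```

Because the n-grams are built at index time, the query itself stays a plain term lookup, which is a large part of why partial matches come back quickly.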
NOTE:
· I have used Visual Studio 2017 Community Edition.
· The Elasticsearch version is 5.5.0.
The complete code can be downloaded here. I have not included the packages; restore them from NuGet.
https://sites.google.com/site/letuscodecrazy/home-1
Elasticsearch itself can be downloaded from its website.