Over the years, natural language processing in the world of search went from an interesting detail to a must-have, especially in areas such as e-commerce. Engineers started incorporating classification, synonym generation, named entity recognition and much more into their search systems, giving users better search results and, in some cases, leading to more revenue.
Lucene has a lot of options for configuring similarity. By extension, Solr and Elasticsearch have the same options. Similarity forms the base of your relevancy score: how similar is this document (actually, this field in this document) to the query? I say “the base” because, on top of this score, you can apply per-field boosts, function scoring (e.g. boosting more recent documents) and re-ranking (e.g. Learning to Rank).
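To make the layering concrete, here is a minimal Python sketch of a base similarity score with a boost and a recency function applied on top. It uses the textbook BM25 formula rather than Lucene's exact implementation, and the function names, the half-life decay and all the sample numbers are assumptions made up for illustration.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, docs_with_term,
                    k1=1.2, b=0.75):
    """Textbook BM25 score of one query term against one field of one document.

    This is an illustrative formula, not a copy of Lucene's code.
    """
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Fields longer than average are penalized, shorter ones are favored.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)


def final_score(base_score, field_boost=1.0, age_days=0.0, half_life_days=30.0):
    """Layer a per-field boost and a recency function on top of the base score.

    The half-life decay is a hypothetical choice of function score.
    """
    recency = 0.5 ** (age_days / half_life_days)
    return base_score * field_boost * recency


# Hypothetical statistics for one term in one field.
base = bm25_term_score(tf=3, doc_len=100, avg_doc_len=120,
                       doc_count=1000, docs_with_term=50)
print(final_score(base, field_boost=2.0, age_days=30))
```

With these assumed settings, a boost of 2.0 and an age of exactly one half-life cancel out, so the printed value equals the base score. Re-ranking would be a further step applied to the top results and is not shown here.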
During the Entity Extraction For Product Searches talk that Radu Gheorghe and I gave at the Activate conference in Montreal last year, we talked about various natural language processing and machine learning algorithms. We showed entity extraction both on top of Solr and using external libraries. In this post, we dig into Learning to Rank with Solr Streaming Expressions.
The search-first problem-solving approach, meaning “open up the log search tool” (Splunk, ELK, Loggly, SumoLogic, Scalyr, etc.), is a costly and time-consuming operation during which the true source of a problem is rarely pinpointed in short order. Log search tools require work by the user to transform text strings into fields that are ready for statistical analysis.
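As a rough illustration of that manual work, the Python sketch below turns raw log lines into named fields and then counts errors per service. The log layout, the field names and the sample lines are all hypothetical, and a real log format would need its own pattern.

```python
import re
from collections import Counter

# Hypothetical layout: "<timestamp> <LEVEL> <service>: <message>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<service>[\w-]+): (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Made-up sample lines, including one that does not fit the pattern.
lines = [
    "2018-03-01T10:15:00Z ERROR checkout: payment gateway timeout",
    "2018-03-01T10:15:02Z INFO search: query served in 12 ms",
    "2018-03-01T10:15:05Z ERROR checkout: payment gateway timeout",
    "not a structured line",
]

parsed = [fields for fields in map(parse_line, lines) if fields is not None]

# Once the text is split into fields, simple statistics become possible.
errors_per_service = Counter(f["service"] for f in parsed if f["level"] == "ERROR")
print(errors_per_service)
```

Every new log format means writing and maintaining another pattern like this one, which is part of the cost described above.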
We’ve been working with Elasticsearch since its inception, either with clients, through Elasticsearch consulting and Elasticsearch production support, or by building our own hosted log management solution. For the last 4 years, we’ve also been sharing our knowledge through Elasticsearch training classes. In 2018, we had remote public training classes on a fixed quarterly schedule, so that you could more easily plan your learning time and budget.
If you rely on Elasticsearch for centralized logging, you cannot afford to experience performance issues. Slow queries or, worse, cluster downtime are not an option. Your Elasticsearch cluster needs to be optimized to deliver fast results.
This tutorial walks through using Rancher to deploy Elasticsearch into a Kubernetes cluster. At the end of this article, you will have a fully functional 2-node Elasticsearch cluster, complete with sample data and examples of successful queries.
We’re excited to announce the release of the Field Stats API plugin for Elasticsearch. The Field Stats API was present in Elasticsearch from 1.6 to 5.6 and provided efficient statistics for the fields of each index, for example the minimum and maximum values of a date field.
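For comparison, on versions without the Field Stats API the same two values can be requested with the standard `min` and `max` aggregations in a regular search request. The Python sketch below only builds such a request body; it does not use the plugin's API, and the field name `timestamp` and the aggregation names `min_value` and `max_value` are hypothetical.

```python
import json


def date_range_request(field):
    """Build a search body asking for the min and max values of one field."""
    return {
        "size": 0,  # no hits are needed, only the aggregation results
        "aggs": {
            "min_value": {"min": {"field": field}},
            "max_value": {"max": {"field": field}},
        },
    }


# "timestamp" is an assumed field name for illustration.
body = date_range_request("timestamp")
print(json.dumps(body, indent=2))
```

A body like this would be sent to an index's `_search` endpoint. Unlike dedicated field statistics, it runs as an ordinary aggregation query.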