Search

Full text search in Starmind is powered by Elasticsearch. For question search, there is an additional similarity search algorithm which is based on reflective random indexing.

Question search

Full text search

To search for specific questions, use the GET /questions endpoint with a query parameter.

The search will return questions where one of the search terms (excluding stop words) occurs in the question title, question description, tags, in a solution, a comment, or in an attachment. However, certain low-quality matches might still be excluded from the search results, especially if other questions are significantly more relevant with respect to the given query.

Prefixes of a word with at least three characters are indexed as well. Consequently, a question containing the word Starmind will be returned as a search result when using the search query Star. However, such prefix matches will be ranked much lower than exact matches (questions containing the word Star).
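The prefix rule above can be illustrated with a small sketch. It is not Starmind's actual indexing code: the function name index_terms and the lowercasing step are assumptions made for illustration. It only shows which terms a word would contribute if every prefix of at least three characters is indexed.

```python
def index_terms(word: str, min_prefix: int = 3) -> list[str]:
    """Return the prefixes of `word` with at least `min_prefix` characters,
    followed by the full word itself.

    Lowercasing is an assumption of this sketch, not documented behaviour.
    """
    token = word.lower()
    prefixes = [token[:n] for n in range(min_prefix, len(token))]
    return prefixes + [token]


terms = index_terms("Starmind")
# "star" is among the indexed terms, so the query Star can match Starmind;
# the two-character prefix "st" is not indexed.
```

Under these assumptions, a query shorter than three characters would only match words that are exactly that short.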

The search results combine a full text search (inverted-index based) and a similarity search (based on reflective random indexing; see below). When ranking the results, the (tf-idf based) relevance from the full text search receives the most weight, while the similarity search is only weakly weighted. As a result, results from the similar question search that don't contain exact matches for the given query will only appear lower down in the search results.
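To make the effect of the weighting concrete, here is a minimal sketch that assumes the two scores are combined linearly. Both the linear formula and the weights 0.9 and 0.1 are hypothetical; the actual combination used by Starmind is not documented here.

```python
def combined_score(full_text: float, similarity: float,
                   w_text: float = 0.9, w_sim: float = 0.1) -> float:
    """Blend the two relevance signals; the weights are illustrative assumptions."""
    return w_text * full_text + w_sim * similarity


# Hypothetical (full text relevance, similarity) pairs for two questions.
candidates = {
    "exact keyword match": (0.8, 0.3),
    "topically similar only": (0.0, 0.9),
}

ranked = sorted(
    candidates,
    key=lambda name: combined_score(*candidates[name]),
    reverse=True,
)
```

With a dominant full text weight, the question that is only topically similar still appears in the results, but below the one containing an exact match.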

Attention: The order of question-search results is influenced by the sort parameter. If you intend to receive questions by their relevance, make sure to remove any additional sorting criteria.
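A relevance-ordered search request might be built as in the sketch below. The base URL is a placeholder, and the sketch assumes the search parameter is literally named query; authentication and other request details are left out. No sort parameter is added, so the results would be ordered by relevance.

```python
from urllib.parse import urlencode

# Placeholder host; replace with the API base URL of your own network.
BASE_URL = "https://starmind.example/api"


def question_search_url(query: str) -> str:
    """Build a GET /questions URL with only the search parameter set.

    Assumes the parameter is named `query`; no sort parameter is included.
    """
    return f"{BASE_URL}/questions?{urlencode({'query': query})}"


url = question_search_url("Star")
```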

Similar question search

The similar question search can be directly queried using the GET /questions/:ID/similar endpoint (to obtain questions that are similar to a given published question) or the GET /search/questions-similar endpoint (to obtain questions that are similar to a given text query). For example, the first endpoint can be used to link to related questions when a user is viewing a published question and the second endpoint can be used to show potentially relevant published questions while a user is writing a new question.

The similar question search focusses on the overall topic of each question and does not depend as much on the choice of words in the text of the question and its solutions. As such, the similar question search does not work as well for keyword searches. For keyword searches, the query parameter on the GET /questions endpoint should be used.

The similar question search is based on reflective random indexing. This algorithm maps each question (including any tags and solutions) to a vector in a high-dimensional space by analysing word co-occurrences. Vectors that lie close to each other (measured by cosine similarity) are good matches for similar content. When executing a similar question search with a text query, the query is converted to a vector as well, and those questions are returned whose vectors lie closest to the query vector.
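The lookup step described above can be sketched as follows. This covers only the final nearest-vector comparison, not reflective random indexing itself; the two-dimensional vectors and question identifiers are made up for illustration, and real vectors would have far more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query_vec: list[float],
                 question_vecs: dict[str, list[float]],
                 k: int = 2) -> list[str]:
    """Return the ids of the k questions whose vectors lie closest to the query."""
    return sorted(
        question_vecs,
        key=lambda qid: cosine_similarity(query_vec, question_vecs[qid]),
        reverse=True,
    )[:k]


# Hypothetical question vectors.
vectors = {"q1": [1.0, 0.1], "q2": [0.0, 1.0], "q3": [0.5, 0.5]}
closest = most_similar([1.0, 0.0], vectors)
```

Because cosine similarity compares direction rather than length, a long question and a short query can still match well if they point the same way in the vector space.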

User search

To search for specific users, use the GET /users endpoint with a query parameter.

Users can be found by full name, email address or custom fields.

Incremental search is supported. For example, when searching for John, users with the surname Johnson will also be returned. However, exact matches are given a higher relevance, so if there is a user with the first name John, that user will appear before any user named Johnson in the result of this search (unless a sort parameter overrides this behaviour).
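The ordering described above can be modelled with a simple two-tier ranking. This is a sketch under assumptions: the function rank_users, the tiers, and the tokenisation on whitespace are illustrative and do not reflect the actual relevance scoring.

```python
def rank_users(query: str, names: list[str]) -> list[str]:
    """Return matching names, exact word matches first, then prefix matches.

    Tier 0: some word of the name equals the query.
    Tier 1: some word of the name starts with the query.
    """
    q = query.lower()
    scored = []
    for name in names:
        tokens = name.lower().split()
        if q in tokens:
            scored.append((0, name))
        elif any(token.startswith(q) for token in tokens):
            scored.append((1, name))
    # sorted() is stable, so names keep their input order within a tier.
    return [name for _, name in sorted(scored, key=lambda pair: pair[0])]


result = rank_users("John", ["Anna Johnson", "John Smith", "Mary Jones"])
```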

The search handles German names flexibly. A user named Müller will be found not only when searching for Müller, but also when searching for Muller or Mueller.
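One way to obtain this behaviour is to index several spellings of each name. The sketch below is an assumption about how such matching could work, not a description of the actual implementation; the helper names are hypothetical, and only the umlauts ä, ö and ü are handled.

```python
# Two common ways of writing umlauts without the diacritic.
DIGRAPHS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})
PLAIN = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})


def name_variants(name: str) -> set[str]:
    """Return the lowercased name plus its umlaut-free spellings."""
    lowered = name.lower()
    return {lowered, lowered.translate(DIGRAPHS), lowered.translate(PLAIN)}


def matches(query: str, name: str) -> bool:
    """True if the query equals any spelling variant of the name."""
    return query.lower() in name_variants(name)
```

With these variants, Müller, Muller and Mueller all match a user named Müller, while an unrelated spelling such as Miller does not.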