Stemming Search Terms in Sitecore Solr Indexes

Anton Tishchenko
I wrote about stemming in Sitecore Lucene content search in my previous article. As a reminder: stemming is the process of reducing inflected (or sometimes derived) words to their word stem or root form. For example, the Porter stemmer reduces both "running" and "runs" to "run", so a search for one form also matches documents that contain the other. That makes stemming a good and easy way to return more relevant results.
Configuring stemming in Solr is even easier than configuring it in Sitecore Lucene content search. You don’t need to write a single line of code. All you need is configuration.
Each Solr core has a schema.xml file in its configuration. When you open it, you will see a field type named text_en:
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.EnglishPossessiveFilterFactory" />
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
    <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    -->
    <filter class="solr.PorterStemFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.EnglishPossessiveFilterFactory" />
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
    <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory:
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    -->
    <filter class="solr.PorterStemFilterFactory" />
  </analyzer>
</fieldType>
This field type contains the solr.PorterStemFilterFactory filter, which stems both your indexed documents and your queries. Compared to Lucene.Net, you get three options for an English stemmer: Porter, Lovins or Porter2. You can also stem documents in other languages: Armenian, Basque, Catalan, Danish, Dutch, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish.
To use stemming on a field, change its type from text_general to text_en:
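Solr exposes the Snowball family of stemmers through solr.SnowballPorterFilterFactory, where the language attribute selects the algorithm. A minimal sketch of how Porter2 could be wired in, assuming a hypothetical field type named text_en_porter2 and a deliberately shortened filter chain (neither comes from the Sitecore schema):

```xml
<!-- Hypothetical field type: the name "text_en_porter2" is illustrative only. -->
<fieldType name="text_en_porter2" class="solr.TextField" positionIncrementGap="100">
  <!-- An analyzer without a type attribute is used for both indexing and querying. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- language picks the Snowball algorithm: "English" is Porter2;
         other values include "Lovins", "German", "French" and "Russian". -->
    <filter class="solr.SnowballPorterFilterFactory" language="English" />
  </analyzer>
</fieldType>
```

Swapping the language value is all it takes to target another of the languages listed above, although a production field type would normally also carry language-specific stop words.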
<field name="_content" type="text_en" indexed="true" stored="false" />
Then restart Solr and rebuild your indexes. This one small configuration change should improve search quality on your website.
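If the Porter stemmer turns out to be too aggressive for your content, the comment inside text_en points to solr.EnglishMinimalStemFilterFactory as a gentler alternative. A minimal sketch, assuming a hypothetical field type named text_en_minimal with a shortened filter chain, and the same _content field as above:

```xml
<!-- Hypothetical field type: the name "text_en_minimal" is illustrative only. -->
<fieldType name="text_en_minimal" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.EnglishPossessiveFilterFactory" />
    <!-- Less aggressive than Porter: mainly folds plural forms. -->
    <filter class="solr.EnglishMinimalStemFilterFactory" />
  </analyzer>
</fieldType>

<!-- Point the field at the hypothetical type instead of text_en. -->
<field name="_content" type="text_en_minimal" indexed="true" stored="false" />
```

As with any change to a field type, this only takes effect after Solr is restarted and the indexes are rebuilt.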
