Solr: Get the offset of a highlighted field

Getting the offset of a highlighted field has been requested for both Elasticsearch and Solr, but neither of them implements it.

I tested the solution from the Solr issue in Solr 6.6.0, and it works.

Here are the changes to the Solr configuration files. (I used the techproducts example bundled with the Solr 6.6.0 binary distribution, started with ./bin/solr start -e techproducts.)

1. Switch from the managed schema to a manually edited schema.xml: copy the managed-schema file to schema.xml and set the schema factory to ClassicIndexSchemaFactory in solrconfig.xml.

2. Update the target field (e.g. the name field) in schema.xml so that it sets termVectors="true" termPositions="true" termOffsets="true".

3. Create a lib directory at the top of the collection (e.g. /opt/solr-6.6.0/example/techproducts/solr/techproducts/lib), and copy the solr-positionshighlighter.jar from the issue into it.

4. Add the jar file above to the Solr classpath (in solrconfig.xml, add the line <lib dir="./lib" />).

5. Configure the highlight component to use the custom highlighter by adding the following lines to solrconfig.xml, and rename the original highlight search component to something else (e.g. highlight2).

  <searchComponent class="solr.HighlightComponent" name="highlight">
    <highlighting class="org.apache.solr.highlight.PositionsSolrHighlighter"/>
  </searchComponent>

6. Start Solr (./bin/solr start -e techproducts) and reindex by posting the documents again.

7. Search for a keyword (e.g. canon) and check the result.

Example query: http://localhost:8983/solr/techproducts/select?hl.fl=name&hl=on&indent=on&q=name:canon&wt=json
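If you want to issue the same query from a script, the URL can be assembled from its parameters. The following is only a sketch: the helper name build_select_url is made up for illustration, and the base URL and core name are simply the ones from the techproducts example above.

```python
from urllib.parse import urlencode


def build_select_url(base, core, query, hl_field):
    """Build a /select URL that turns highlighting on for one field.

    Hypothetical helper, not part of Solr or any client library.
    """
    params = {
        "hl.fl": hl_field,  # field to highlight
        "hl": "on",         # enable the highlight component
        "indent": "on",
        "q": query,
        "wt": "json",
    }
    # urlencode percent-encodes reserved characters such as ':' in the query.
    return f"{base}/{core}/select?{urlencode(params)}"


url = build_select_url("http://localhost:8983/solr", "techproducts", "name:canon", "name")
print(url)
```

The only difference from the hand-written URL is that the colon in name:canon is percent-encoded as %3A, which Solr decodes back to the same query.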

Example response:

{
  "highlighting":{
    "9885A004":{
      "name":{
        "terms":{
          "canon":{
            "position":0,
            "offsets":[0,
              5]}}}},
    "0579B002":{
      "name":{
        "terms":{}}}}
}
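To show how the offsets might be used on the client side, here is a minimal Python sketch that slices the matched text out of the original field value. It rests on a few assumptions that the response above suggests but does not confirm: that "offsets" is a flat list of start/end character offsets with an exclusive end, and that the stored name of document 9885A004 is "Canon PowerShot SD500". The function name highlighted_spans is made up for this example.

```python
import json


def highlighted_spans(highlighting, doc_id, field, text):
    """Return (term, start, end, substring) tuples for one document field."""
    terms = highlighting.get(doc_id, {}).get(field, {}).get("terms", {})
    spans = []
    for term, info in terms.items():
        offsets = info.get("offsets", [])
        # Assumption: offsets is a flat list of [start, end) pairs.
        for start, end in zip(offsets[0::2], offsets[1::2]):
            spans.append((term, start, end, text[start:end]))
    return spans


# The highlighting section of the example response above.
response = json.loads("""
{
  "highlighting": {
    "9885A004": {"name": {"terms": {"canon": {"position": 0, "offsets": [0, 5]}}}},
    "0579B002": {"name": {"terms": {}}}
  }
}
""")

# Assumption: this is the stored value of the name field for 9885A004.
name = "Canon PowerShot SD500"
spans = highlighted_spans(response["highlighting"], "9885A004", "name", name)
print(spans)  # [('canon', 0, 5, 'Canon')]
```

A document whose "terms" object is empty, like 0579B002 in the response, simply yields an empty list.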