One of the things I like most about the latest enhancements (possibly introduced with FEP6, though I am not sure) is the ability to create an extension of the SOLR index. An extension index is, in a nutshell, an index that complements the data provided by the base index. An enlightening explanation is given in the following IC page: Extending the WebSphere Commerce search base index schema.
According to the Info Center: "... dynamic information such as product inventory or ratings change more frequently. Therefore, it is put into a separate index, where it can be refreshed at different intervals than the base product index". That's awesome, and it looks like exactly the idea behind the Inventory Index (introduced with FEP6).
The image above is an excerpt from the one shown in the IC page mentioned earlier.
With this feature we can finally think about implementing high-frequency re-index strategies, which is pretty useful for dynamic data. According to my investigation, the basic building blocks used by the WCS team for this new feature are a custom Search Component (SolrSearchMultipleQueryComponent) and the ability to join different indexes.
That combination gives the WCS implementation of SOLR an added value: it makes the native SOLR Join feature transparent to developers. Let's review the main things ...
Create an extension index
The creation of an extension index is a pretty straightforward task; it could be outlined in the following 4 steps:
- Copy the core templates from the folder generic;
- Update schema.xml according to the data you need to keep in the extension index;
- Register the extension index in solr.xml;
- Define the new core (subcore) as an extension of the CatalogEntry core by updating the table SRCHCONFEXT.
The above procedure has been extracted from the documentation provided by the Info Center. In particular, the IC reports a step-by-step guide to store Ranking data in an extension index.
If you create the Inventory Index using the script setupSearchIndex with the parameter indexsubtype set to Inventory, you will automatically get an extension index called Inventory, containing the stock data and status of each SKU for each fulfillment center defined.
Access the cores
Once the extension index is up and running, you can access its data by querying the core directly, or you can use it to filter the base index results. I'll give a few examples, again using the Ranking data mentioned in the Info Center:
#1 Access the Ranking data, querying the core of the extension index
#2 Access the Ranking data, querying the core of the extension index and filter the results with specific ranking range
#3 Access the Catalog Entry data, querying the core of the base index and filter the results with specific ranking range
We can do the same for the Inventory core, created by following the IC page "Indexing external inventory data in WebSphere Commerce search":
#1 Access the Inventory data, querying the core of the extension index
#2 Access the Inventory data, querying the core of the extension index and filter the results with specific stock range
#3 Access the Catalog Entry data, querying the core of the base index and filter the results with specific stock range
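The three access patterns listed above can be sketched as plain SOLR select requests. This is only an illustration: the server URL, the core names, and the field name quantity are hypothetical placeholders, not the names WCS actually generates, so check your own solr.xml and schema.xml for the real ones.

```python
from urllib.parse import urlencode

# All names below are assumptions for illustration only.
SOLR_BASE = "http://localhost:8983/solr"   # hypothetical search server URL
BASE_CORE = "catalog_entry_core"           # hypothetical base (CatalogEntry) core
EXT_CORE = "inventory_subcore"             # hypothetical extension core


def select_url(core, q="*:*", fq=None, rows=10):
    """Build a SOLR /select URL for the given core, with optional filter queries."""
    params = [("q", q), ("rows", rows), ("wt", "json")]
    for clause in fq or []:
        params.append(("fq", clause))
    return f"{SOLR_BASE}/{core}/select?{urlencode(params)}"


# 1: read the extension index directly
url1 = select_url(EXT_CORE)

# 2: read the extension index, filtered by a range on one of its fields
url2 = select_url(EXT_CORE, fq=["quantity:[10 TO *]"])

# 3: read the base index, filtered by a field that lives in the extension index
url3 = select_url(BASE_CORE, fq=["quantity:[10 TO *]"])

for url in (url1, url2, url3):
    print(url)
```

The third request is the interesting one: the filter names a field that is not part of the base schema, and it relies on the WCS search component to resolve it against the subcore.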
Cross data between extension and base index: core and subcores
When I created the Inventory Index for the first time, I was a bit confused about how the Catalog Entry index could interact with this extension index, so I tried to understand more. Investigating the SOLR documentation, I found a native way to cross data between different cores: the Join. It works much like a SQL subquery that matches documents on a shared key field. However, WCS adds a new concept not really defined in SOLR: the subcore. The data in the core and its subcores is crossed, but the join between them is not visible at all to the developer. It seems to be implemented in the custom Search Component: SolrSearchMultipleQueryComponent.
Looking at solrconfig.xml, the configuration file of the CatalogEntry index (base index), you will find the following component list in the RequestHandler definition:
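For comparison, this is roughly what the native SOLR cross-core join looks like when you write it by hand, using the join query parser ({!join from=... to=... fromIndex=...}). It is a minimal sketch: the subcore name and the quantity field are hypothetical, and catentry_id is used as the join key only because it is the reference field of the base index.

```python
from urllib.parse import urlencode


def join_filter(from_field, to_field, from_index, inner_query):
    """Build a SOLR join filter: run inner_query on from_index, then keep the
    documents of the queried core whose to_field matches a from_field value."""
    return (
        f"{{!join from={from_field} to={to_field} fromIndex={from_index}}}"
        f"{inner_query}"
    )


# Hypothetical subcore and field names; catentry_id is the shared key.
fq = join_filter("catentry_id", "catentry_id", "inventory_subcore",
                 "quantity:[10 TO *]")

# Query-string parameters for a /select request against the base core.
query_string = urlencode({"q": "*:*", "fq": fq})

print(fq)
print(query_string)
```

With plain SOLR the developer has to spell out the join in every request; the point of the subcore concept is that this step is hidden.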
<arr name="components">
  <str>wc_query</str>
  <str>wc_facet</str>
  <str>mlt</str>
  <str>stats</str>
  <str>debug</str>
  <str>wc_spellcheck</str>
</arr>
and the wc_query Search Component:
<searchComponent name="wc_query" class="com.ibm.commerce.foundation.internal.server.services.search.component.solr.SolrSearchMultipleQueryComponent">
  <int name="cacheSize">1320000</int>
  <str name="referenceField">catentry_id</str>
  <arr name="subCores"></arr>
</searchComponent>
So, searches on the base index are handled by the component SolrSearchMultipleQueryComponent. Presumably, this component performs the join between the cores. At runtime, the relationships between the base core and its subcores are defined in the table SRCHCONFEXT. In this way, we can query the base index using filters on data held in the extension indexes. Cool!
The extension index looks like a very powerful feature: it avoids re-indexing the whole data set by keeping the dynamic information in a subcore. However, it could impact performance, since data is crossed between cores at query time to retrieve the information needed. Keep using this feature, but watch for possible performance degradation.
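If you want to check how wc_query is configured on a given installation, a small script can pull the relevant settings out of solrconfig.xml. The sketch below parses an inline copy of the snippet shown earlier, wrapped in a config root element; the helper name read_component is my own, and on a real system you would read the file from the core's conf folder instead.

```python
import xml.etree.ElementTree as ET

# Inline copy of the wc_query definition, wrapped in a root element for parsing.
SNIPPET = """
<config>
  <searchComponent name="wc_query"
      class="com.ibm.commerce.foundation.internal.server.services.search.component.solr.SolrSearchMultipleQueryComponent">
    <int name="cacheSize">1320000</int>
    <str name="referenceField">catentry_id</str>
    <arr name="subCores"></arr>
  </searchComponent>
</config>
"""


def read_component(xml_text, name):
    """Return the class, reference field and subcore list of a search component,
    or None if no component with that name is defined."""
    root = ET.fromstring(xml_text)
    for comp in root.iter("searchComponent"):
        if comp.get("name") != name:
            continue
        ref = comp.find("str[@name='referenceField']")
        sub = comp.find("arr[@name='subCores']")
        return {
            "class": comp.get("class"),
            "referenceField": ref.text if ref is not None else None,
            "subCores": [child.text for child in sub] if sub is not None else [],
        }
    return None


config = read_component(SNIPPET, "wc_query")
print(config)
```

Note that the subCores list is empty in the configuration file, which fits the observation that the relationships are resolved at runtime from SRCHCONFEXT.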