This topic's title may seem a bit alarming, but it is just a pinch of irony: I struggled with this feature for so many weeks that it felt like fighting against the "dark". Still, I just want to focus on a few things we should think about before planning to introduce the delta index when the environment is updated through the data load.
In particular, whether we have to introduce the delta index in place of the full one with a data load already set up, or we have to plan the data load process from scratch, we should keep in mind that the data load operations, the order in which they are executed, and the delta updates recorded in the temporary delta tables (TI_DELTA_CATENTRY and TI_DELTA_CATGROUP) are strictly connected. When we need to update the system in delta mode, we should not treat the data load as a black box with respect to the SOLR index. Unfortunately, the SOLR index and the Data Load are tied together in such a way that we cannot build a loading process without knowing how they interact.
OK, enough fancy words! I will report two specific cases:
To borrow the Info Center definition, "A bundle is a collection of catalog entries that allow customers to buy multiple items with one click". In other words, a bundle is a simple catentry with one or more components, where each component is another catentry.
The Management Center lets us create a bundle through the "Catalog tool" option "Create a bundle": fill in all the required fields and finally add the components.
Of course, it also lets us delete a bundle, simply by selecting "Delete" from the item's menu.
OK, nothing really new so far: these are basic operations in CMC. However, it is useful to understand what happens under the covers when a delete operation is performed. As soon as we delete a bundle, the temporary delta table TI_DELTA_CATENTRY is populated accordingly. It will contain:
So, for example, let's suppose I use CMC to delete the bundle 221701, which has one component: 141047.
TI_DELTA_CATENTRY will then contain the following data:
After running UpdateSearchIndex (in mode = 1, delta mode), the bundle properly disappears from the front end, meaning the delta update is working fine.
So far so good, as long as we are using the Management Center. Trouble can start if we use the data load to delete bundles, which is a really common scenario!
If we need to remove bundles, we can follow the Info Center instructions (pay attention to the "Delete" paragraph):
Keeping in mind that we also need to update the delta tables, to have a working data load we need:
You can find the sample files IBM provides in the Data Load sample folder. In particular, we should focus on the loader file; I attach an example (IBM's own): wc-loader-catalog-entry.xml.
You can also find it in your environment under the folder WC/xml/config/com.ibm.commerce.catalog/dataload/.
Once again, our aim is to delete the bundle. Running the data load, specifically the Info Center example mentioned above, we will see that TI_DELTA_CATENTRY gets populated, but differently from the way the Management Center does it. In fact, this time it contains just the following data:
10001; 219201;D;2013-11-25 17:28:14.996485
You see? The "Update" action for the components is missing.
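To make this kind of gap easier to spot, here is a minimal Python sketch that parses rows in the semicolon-separated form shown above and reports deleted bundles whose components have no "Update" row. The field names, the "U" code for the update action, and the bundle-to-component mapping (including the component id used for bundle 219201) are assumptions made for illustration; they are not taken from the product documentation.

```python
from typing import NamedTuple


class DeltaRow(NamedTuple):
    # Field meanings are inferred from the sample row; treat them as assumptions.
    master_catalog_id: int
    catentry_id: int
    action: str       # "D" = delete, "U" = update (assumed code)
    last_update: str


def parse_delta_row(line: str) -> DeltaRow:
    """Parse one 'catalog; catentry;action;timestamp' row."""
    catalog, catentry, action, timestamp = (p.strip() for p in line.split(";"))
    return DeltaRow(int(catalog), int(catentry), action, timestamp)


def missing_component_updates(rows, bundle_components):
    """Map each deleted bundle to the components that have no 'U' row."""
    updated = {r.catentry_id for r in rows if r.action == "U"}
    missing = {}
    for r in rows:
        if r.action == "D" and r.catentry_id in bundle_components:
            absent = [c for c in bundle_components[r.catentry_id]
                      if c not in updated]
            if absent:
                missing[r.catentry_id] = absent
    return missing


rows = [parse_delta_row("10001; 219201;D;2013-11-25 17:28:14.996485")]
# Hypothetical mapping: assume bundle 219201 has a single component, 141047.
print(missing_component_updates(rows, {219201: [141047]}))
```

With only the "D" row present, the check reports the component as missing, which is the situation described here.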
Although the data load followed our instructions and deleted the bundle (in the DB the requested bundle is flagged as "mark for delete"), and the UpdateSearchIndex scheduled job ran in delta mode without any apparent issue, the bundle is still visible in the front end because it is still in the SOLR index, as if nothing had happened!
Basically, this happens because the mediator CatalogEntrySearchIndexMediator does not add the "update" actions for the components. It should work exactly like the Management Center, but it does not.
Once we understand the way the search mediators work, we know how we have to set up the data load. In this specific case, the data load consisted of just a single step:
To overcome this problem, we need to add a further step to the loading process. In particular:
This means we will need:
In this way the data load will process two steps: the deletion of the bundles' components and then the deletion of the bundles. The temporary delta table will be populated exactly as the Management Center does it, and the SOLR index will be updated properly.
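As a rough illustration of why the extra step matters, the toy model below compares the delta rows recorded by a single-step load with those recorded by the two-step load. It is only a sketch of the behaviour described in this post: the step names, the `simulate_load` function and the "U"/"D" codes are hypothetical and do not correspond to real Data Load configuration.

```python
def simulate_load(steps, bundle_id, component_ids):
    """Return the (catentry_id, action) rows a toy loader would record."""
    rows = []
    for step in steps:
        if step == "delete_components":
            # Removing the components is assumed to record an update per component.
            rows.extend((c, "U") for c in component_ids)
        elif step == "delete_bundle":
            # Deleting the bundle records a single delete row.
            rows.append((bundle_id, "D"))
        else:
            raise ValueError(f"unknown step: {step}")
    return rows


# Ids taken from the CMC example: bundle 221701 with component 141047.
single_step = simulate_load(["delete_bundle"], 221701, [141047])
two_steps = simulate_load(["delete_components", "delete_bundle"], 221701, [141047])

print(single_step)  # [(221701, 'D')]
print(two_steps)    # [(141047, 'U'), (221701, 'D')]
```

Only the two-step run yields both the component update and the bundle delete, matching what the Management Center is described as recording.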
Soon I'll provide further info about these interactions between the SOLR delta index and the Data Load, in particular discussing the deletion of relationships between products and categories.