
Pipelines


Registering an attribute with Klevu for indexing does not mean that entity data for that attribute is automatically sent to Klevu. It only means that the data can be sent to Klevu.

Entity data indexed to Klevu is defined in the indexing pipelines YAML files. These files can either be:

The auto-generated pipeline YAML files can be found in var/klevu/indexing/pipeline/.



PHP-SDK, Pipelines & Magento Overview

Klevu indexing uses Klevu’s Data Indexing JSON API. Klevu has developed a PHP-SDK to help developers integrate Klevu indexing into any PHP platform. These Magento modules use the PHP-SDK to index data to Klevu.

Klevu has also developed PHP-Pipelines. These are generic PHP pipelines that can be used for any purpose and are independent of both Magento and the PHP-SDK.

These two modules are brought together by php-sdk-pipelines, which adds PHP-SDK functionality to the pipelines.

A fourth module, module-m2-pipelines, integrates these components into Magento. It handles dependency injection and adds some Magento context to the pipelines.



Types of Pipeline

Create Record Pipeline: Pipeline\CreateRecord

Create and return a new object.

Example: Create an object with keys id and type.

YAML


Output:

JSON
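
The behaviour described above can be modelled roughly in Python. This is only a conceptual sketch, not the PHP-Pipelines API: the helper names `create_record`, `extract` and `static_value`, the input key `entity_id` and the type value `"product"` are all assumptions made for illustration.

```python
def static_value(value):
    """Stage that ignores its input and returns a fixed value."""
    return lambda payload: value


def extract(key):
    """Stage that reads one key from the incoming payload."""
    return lambda payload: payload[key]


def create_record(**field_stages):
    """Stage that builds a new object, one key per named child stage."""
    def stage(payload):
        return {name: child(payload) for name, child in field_stages.items()}
    return stage


# Build an object with the keys "id" and "type" (values are illustrative).
build_record = create_record(id=extract("entity_id"), type=static_value("product"))
record = build_record({"entity_id": 42, "name": "Example product"})
```

Each key of the new object is produced by its own child stage, so the record contains only the keys that were explicitly configured.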


Fallback Pipeline: Pipeline\Fallback

If a stage throws a validation exception (ValidationExceptionInterface), catch it and move on to the next stage. Otherwise, return the output from that stage.

Example:

  • Get the product type.
  • Validate that the type is “bundle”.
    • If valid, include the file product/bundle.yml.
    • If not valid, move on to the next stage, which includes the file product/default.yml.
YAML


Note: the stage names have been removed from this example to make it easier to read.
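
The fallback behaviour for the bundle example can be sketched in Python as follows. This is a conceptual model under stated assumptions, not the PHP implementation: `ValidationError` stands in for ValidationExceptionInterface, `include` is a hypothetical placeholder for running the stages of another YAML file, and re-raising the last error when every stage fails is an assumption.

```python
class ValidationError(Exception):
    # Stand-in for Klevu\Pipelines\Exception\ValidationExceptionInterface.
    pass


def validate_type(expected):
    """Stage that passes the payload through only if its type matches."""
    def stage(payload):
        if payload.get("type") != expected:
            raise ValidationError(f"type is not {expected!r}")
        return payload
    return stage


def include(name):
    """Hypothetical placeholder for running the stages defined in another file."""
    return lambda payload: {**payload, "processed_by": name}


def chain(*stages):
    """Run stages in order, feeding each output into the next stage."""
    def stage(payload):
        for child in stages:
            payload = child(payload)
        return payload
    return stage


def fallback(*stages):
    """Return the output of the first stage that does not fail validation."""
    def stage(payload):
        last_error = None
        for child in stages:
            try:
                return child(payload)
            except ValidationError as err:
                last_error = err  # swallow and try the next stage
        if last_error is None:
            return payload  # no stages configured
        raise last_error  # assumption: propagate if every stage fails
    return stage


product_pipeline = fallback(
    chain(validate_type("bundle"), include("product/bundle.yml")),
    include("product/default.yml"),
)
bundle_result = product_pipeline({"id": 1, "type": "bundle"})
simple_result = product_pipeline({"id": 2, "type": "simple"})
```

Only validation errors are caught. Any other exception raised by a stage still propagates, which matches the description of the fallback pipeline catching validation exceptions specifically.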

Iterate Pipeline: Pipeline\Iterate

Loop over the provided records.

Example: Iterate over all records provided into the pipeline

YAML


Or get all customer groups and iterate over them, setting the current customer group at the start of each loop:

YAML
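
The looping behaviour, including setting the current customer group at the start of each loop, can be modelled roughly in Python. This is a conceptual sketch only: the `iterate` helper, the `before_each` hook, the shared `context` dictionary and the customer group data are assumptions for illustration and are not part of the PHP-Pipelines API.

```python
def iterate(*stages, before_each=None):
    """Return a stage that runs the child stages over every record in a list."""
    def stage(records):
        results = []
        for record in records:
            if before_each is not None:
                before_each(record)  # e.g. set the "current" item for this loop
            output = record
            for child in stages:
                output = child(output)
            results.append(output)
        return results
    return stage


# Hypothetical shared state, standing in for the pipeline context.
context = {}


def set_current_group(group):
    context["currentCustomerGroup"] = group


def price_label(group):
    # Reads the value that was set at the start of this loop iteration.
    return f"price for group {context['currentCustomerGroup']['id']}"


customer_groups = [{"id": 0, "code": "NOT LOGGED IN"}, {"id": 1, "code": "General"}]
labels = iterate(price_label, before_each=set_current_group)(customer_groups)
```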



Pipeline Stages

Static Value

Add a value to the output.

Example:

YAML


Output:

JSON


Extract

Get the value from the extraction and pass it to the next stage.

Example:

YAML


Output:

JSON


Transform

Transform the provided value using one or more transformers.

Example:

YAML


Output:

JSON


Alternatively, transformations can be passed into extractions, skipping the need to add an extra stage.

Example:

YAML


Output:

JSON


Validate

Example:

YAML


If validation fails, an exception of type Klevu\Pipelines\Exception\ValidationExceptionInterface is thrown. Fallback pipelines catch this exception and proceed to the next stage. For an example, see module-m2-indexing-products/etc/pipeline/attributes/images.yml.

Register Context

Register a variable to be used elsewhere in the pipeline.

Example: Extract the current product and then register the context as currentProduct

YAML


Then later in the pipeline (possibly in another file, even another module) the context can be used.

YAML


Output:

JSON
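
The register-then-reuse pattern can be sketched in Python. This is a conceptual model, not the PHP-Pipelines API: the `Context` class and the `register_context` and `extract_from_context` helpers are hypothetical, and passing the payload through unchanged after registration is an assumption.

```python
class Context:
    """Hypothetical store for values registered during a pipeline run."""

    def __init__(self):
        self._values = {}

    def register(self, name, value):
        self._values[name] = value

    def get(self, name):
        return self._values[name]


def register_context(context, name):
    """Stage that stores the current payload under the given name."""
    def stage(payload):
        context.register(name, payload)
        return payload  # assumption: the payload passes through unchanged
    return stage


def extract_from_context(context, name, field):
    """Stage that ignores its input and reads a field from a registered value."""
    def stage(_payload):
        return context.get(name)[field]
    return stage


ctx = Context()

# Early in the pipeline: register the current product.
passed_through = register_context(ctx, "currentProduct")({"id": 7, "sku": "ABC-123"})

# Later, possibly in a stage defined elsewhere: read from the registered value.
sku = extract_from_context(ctx, "currentProduct", "sku")(None)
```

Because the later stage reads from the shared context rather than from its own input, it does not matter what the preceding stage produced.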


Log

Log the current context.

Example: module-m2-indexing/etc/pipeline/process-batch-payload.yml

Take whatever the current context is (the output from the previous operation) and log it along with the message “Chunked payload” at log level “debug”.

YAML


Output (formatted to make it more readable):

Text



Magento Implementation of Pipelines

When a sync is triggered from Magento, whether via cron or CLI, Klevu\Indexing\Service\EntitySyncOrchestratorService::execute is executed. This loops through all entityIndexerServices, which are injected via di.xml from each entity indexing module (module-m2-indexing-categories, module-m2-indexing-cms, module-m2-indexing-products).

For example, from module-m2-indexing-products/etc/di.xml:

XML


The injected EntityIndexerService is a virtual type, into which we pass an EntityIndexingRecordProvider and a pipeline YAML file.

XML


The EntityIndexingRecordProvider provides the records to index. The pipelineConfigurationFilepath points to the default YAML file, which contains the instructions for processing those records.
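
The relationship between the orchestrator, the indexer services, the record providers and the pipeline files can be modelled roughly in Python. This is a simplified sketch rather than the Magento implementation: the class shapes, the returned summary and the file paths are assumptions, and no YAML is actually parsed.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class EntityIndexerService:
    """One service per entity type: a record provider plus a pipeline file."""
    record_provider: Callable[[], Iterable[dict]]
    pipeline_configuration_filepath: str

    def execute(self):
        # A real implementation would build a pipeline from the YAML file and
        # run the records through it. Here we only report what would be processed.
        records = list(self.record_provider())
        return {
            "pipeline": self.pipeline_configuration_filepath,
            "records": len(records),
        }


class EntitySyncOrchestrator:
    """Loops through every injected indexer service and executes it."""

    def __init__(self, entity_indexer_services: dict):
        self.entity_indexer_services = entity_indexer_services

    def execute(self):
        return {
            name: service.execute()
            for name, service in self.entity_indexer_services.items()
        }


# Hypothetical wiring, standing in for what di.xml injects in Magento.
orchestrator = EntitySyncOrchestrator({
    "products": EntityIndexerService(lambda: [{"id": 1}, {"id": 2}], "products/example.yml"),
    "categories": EntityIndexerService(lambda: [{"id": 10}], "categories/example.yml"),
})
summary = orchestrator.execute()
```

Keeping the record provider and the pipeline file as separate inputs means either can be swapped without touching the other, which mirrors the virtual type arrangement described above.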



Updated 19 Nov 2024