How to read data from an Endeca CAS record store instance and run ATG CMS and an Endeca baseline update through Dynamo Admin - ATG Endeca Integration Issues

If you are not sure whether the ATG records have been pushed to Endeca CAS, read the record store from CAS by running the script below.

/opt/app/endeca/CAS/version/bin>./recordstore-cmd.sh read-baseline -a <appname_en_data> -f <new-file-path>.xml

Ex. - /opt/app/endeca/CAS/version/bin>./recordstore-cmd.sh read-baseline -a APPNAME_en_data -f ../../APPNAME_en_data.xml

This helps to cross-verify when a record is merchandised in BCC and deployed but does not show up in Endeca. Sometimes CMS (Catalog Maintenance Service) fails and does not pass the latest records to Endeca. In that case, try running CMS and Endeca indexing manually.
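To cross-check the exported file for one specific record, a minimal sketch along these lines may help. It treats the export as plain text and makes no assumption about the XML layout; the function name, the file name and the product ID in the usage comment are hypothetical.

```python
from pathlib import Path


def count_matching_lines(export_path: str, needle: str) -> int:
    """Count lines in the exported record store file that contain `needle`."""
    matches = 0
    # Stream the file line by line, since a baseline export can be large.
    with Path(export_path).open("r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if needle in line:
                matches += 1
    return matches


# Hypothetical usage (the product ID is made up for illustration):
# count_matching_lines("APPNAME_en_data.xml", "prod12345")
```

A result of 0 means the ID does not appear anywhere in the export, which suggests the record never reached CAS.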

Running ATG CMS (Catalog Maintenance Service):

http://localhost:port/dyn/admin/atg/commerce/admin/en/maintenance/startService.jhtml?process=BasicMaintProcess

Running Endeca Baseline update:
http://localhost:port/dyn/admin/nucleus//atg/commerce/endeca/index/ProductCatalogSimpleIndexingAdmin/

Endeca Baseline - Index process fails immediately on RepositoryExport(ATG-Endeca) Process

Possible symptoms in log:

/atg/commerce/endeca/index/ProductCatalogSimpleIndexingAdmin ---java.lang.RuntimeException: org.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 30000 ms

There might be a problem with the connectivity between ATG and Endeca CAS, possibly a configuration issue or a networking issue.

Check:
1. Check that the CASHostName and CASPort properties are set correctly in the following components:


/atg/endeca/index/SchemaDocumentSubmitter

/atg/endeca/index/DataDocumentSubmitter

/atg/endeca/index/DimensionDocumentSubmitter

Also review and, if needed, modify the /atg/endeca/index/IndexingApplicationConfiguration component (in ATG).

2. Ping the target Endeca CAS server from the command line on the ATG box to verify that it is accessible on the network.

3. Make sure DNS resolution isn’t misconfigured, so that the CAS hostname does not resolve to the IP address of some server other than the actual CAS server.
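Checks 2 and 3 can also be scripted from the ATG box. The sketch below uses only the Python standard library. The host name cas.example.com and port 8500 in the usage comments are assumptions for illustration, so substitute your own CASHostName and CASPort values. A plain TCP connect is used instead of ICMP ping because it also confirms that the CAS port itself is reachable.

```python
import socket


def resolve_host(host: str) -> list[str]:
    """Return the distinct IP addresses that `host` resolves to."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# Hypothetical usage (host and port are placeholders, not values from this setup):
# print(resolve_host("cas.example.com"))      # compare with the real CAS server IP
# print(can_connect("cas.example.com", 8500))
```

If resolve_host returns an address that is not the real CAS server, the problem is DNS. If the address is right but can_connect returns False, look at firewalls or at whether the CAS service is running.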

Endeca data model design - Pros/Cons

Problem Statement:
Consider ~30,000 SKUs, each represented uniquely across up to 8,000 stores. Each store can have up to 20 unique fields for each product. These include various prices, sale prices, coupon codes and on-hand inventory, all of which are specific to every store.

This totals up to:

30,000 SKUs * 8,000 stores * 20 unique fields = 4.8 billion cells of data that need to be stored in Endeca.

Solution:

There are two viable data models available:

1) “Wide” model that consists of adding store-specific attributes to each base product record. This equates to 30,000 rows of data (one for each product) where each row has 160,000 columns of data (8,000 stores * 20 fields).

2) “Multi-Record” model that consists of a full copy of each product record with one store’s data attached. This equates to 240 million rows of data (one for each product-store combination) where each row has 20 columns of data.
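The sizing figures for both models follow directly from the three numbers in the problem statement. The short sketch below reproduces them. The variable names are illustrative, and the store-independent base attributes of each product are ignored, as they are in the figures above.

```python
SKUS = 30_000
STORES = 8_000
FIELDS_PER_STORE = 20

# Total store-specific values that have to live in the index.
total_cells = SKUS * STORES * FIELDS_PER_STORE

# Wide model: one row per product, one column per (store, field) pair.
wide_rows, wide_columns = SKUS, STORES * FIELDS_PER_STORE

# Multi-record model: one row per (product, store) pair, one column per field.
multi_rows, multi_columns = SKUS * STORES, FIELDS_PER_STORE

print(f"total cells:  {total_cells:,}")                     # 4,800,000,000
print(f"wide model:   {wide_rows:,} x {wide_columns:,}")    # 30,000 x 160,000
print(f"multi-record: {multi_rows:,} x {multi_columns:,}")  # 240,000,000 x 20
```

Both shapes hold the same 4.8 billion cells. The trade-off is between very wide records and a very large number of records.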

Wide record pros and cons:

Pros:
1) Simple, performant queries

Cons:
1) High indexing time
2) Complex process to dynamically create attributes
3) Complex dimension mapping for precise values like price
4) Indexing scales poorly for >100k properties
5) Complex display logic

Multi-Record pros and cons:

Pros:
1) Simple, Performant Queries
2) Simple Indexing Logic
3) Simple Dimension Mapping for Precise Values
4) Simple Attribute Display Logic

Cons:
1) Large Index Size:  Memory and Disk Footprint
2) Possible Run-time Performance Issues From Inadequate Memory

How to check the Endeca Dgraph health?

A quick way to check the health of a Dgraph or an Agraph is to access the URL:

http://DgraphServerNameOrIP:DgraphPort/admin?op=ping


The Dgraph quickly returns a lightweight HTML response page with the following content:

dgraph host:port responding at date/time

The Dgraph ping is the recommended mechanism for performing automated health checks with a load balancer.
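For a scripted health check, the ping can be issued with the Python standard library. This is a minimal sketch: the function names are made up, and the only thing checked in the body is the "responding at" phrase taken from the response format described above.

```python
from urllib.request import urlopen


def ping_url(host: str, port: int) -> str:
    """Build the admin ping URL for a Dgraph."""
    return f"http://{host}:{port}/admin?op=ping"


def looks_healthy(body: str) -> bool:
    """Check for the phrase from 'dgraph host:port responding at date/time'."""
    return "responding at" in body


def ping_dgraph(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if the Dgraph answers the ping with the expected text."""
    try:
        with urlopen(ping_url(host, port), timeout=timeout_s) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # Connection refused, timeout, DNS failure or HTTP error status.
        return False
    return status == 200 and looks_healthy(body)


# Hypothetical usage (host and port are placeholders):
# ping_dgraph("DgraphServerNameOrIP", 15000)
```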

Endeca Forge State: how to restore forge state?

Restoring state is a simple matter of copying the appropriate files from a backup directory to the state directory prior to the next baseline/delta update.

To revert to a previous version of state:


1. Locate the appropriate backup of the data/state directory that you wish to roll back to.

2. For autogen state, copy the autogen* files into data/state

3. For delta update state files, copy the appropriate delta_state*.bin file into data/state
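The copy steps above can be sketched as a small helper. The function name and the directory arguments are hypothetical. Note that the default patterns copy every matching file, so pass a narrower pattern when only one specific delta_state*.bin file is the appropriate one.

```python
import shutil
from pathlib import Path


def restore_state(backup_dir: str, state_dir: str,
                  patterns=("autogen*", "delta_state*.bin")) -> list[str]:
    """Copy matching state files from backup_dir into state_dir.

    Returns the names of the files that were copied.
    """
    target = Path(state_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for source in sorted(Path(backup_dir).glob(pattern)):
            if source.is_file():
                # copy2 keeps timestamps, which helps when comparing with backups.
                shutil.copy2(source, target / source.name)
                copied.append(source.name)
    return copied


# Hypothetical usage (paths are placeholders for your application layout):
# restore_state("/path/to/backup/state", "/path/to/app/data/state")
```

Run it before the next baseline or delta update starts, so the update picks up the restored state.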

Tips to Improve performance of ATG-Endeca integration environment (Version 3.1.1 and 3.1.2)

1) Make sure the patch below is applied

Patch 17342677 - It reduces the number of supplemental objects that are returned with the queries and fixes an XML parser locking problem.

2) Check the properties being returned by Endeca (apply the Endeca setSelection feature)
In Assembler, you can select which properties are returned with the search results. Include only the properties that the application requires. Here is the ATG component path:
/atg/endeca/assembler/cartridge/handler/config/ResultsListConfig.properties
Refer - http://ravihonakamble.blogspot.com/2015/06/endeca-select-feature-aka-set-selection.html

3) Disable Endeca preview on Production
Use the /dyn/admin/nucleus/atg/endeca/assembler/cartridge/manager/AssemblerSettings/ component and set previewEnabled = false

4) Set the number of sub-records per aggregate record to one
/atg/endeca/assembler/cartridge/handler/config/ResultsListConfig
# For aggregate records, sets the number of sub records that should be included in the results
subRecordsPerAggregateRecord=ONE


5) Ensure non-Endeca URLs don’t hit Assembler
Set the ignoreRequestURIPattern property of the /atg/endeca/assembler/AssemblerPipelineServlet component to the URL patterns that should not be passed to the Assembler.