Endeca CAS | Multiple Record Store Merge - Oracle Commerce 11.1

Multiple record stores can be used to merge data from different sources into one feed. The merge works as a switch join, i.e. a union of records rather than a field-level join. Follow the steps below to use multiple record stores:

 1. Open the <Endeca App Path>\APPNAME\config\cas\last-mile-crawl.xml file and add all the record stores. A sample is shown below; the record stores being merged are the <value> entries under the dataRecordStores key, and any number of record stores can be added to the merge.
    <sourceConfig>
      <moduleId>
        <id>com.endeca.cas.source.RecordStoreMerger</id>
      </moduleId>
      <moduleProperties>
        <moduleProperty>
          <key>dataRecordStores</key>
          <value>APPNAME-data</value>
          <value>APPNAME-web-crawl</value>
        </moduleProperty>
        <moduleProperty>
          <key>dimensionValueRecordStores</key>
          <value>APPNAME-dimvals</value>
        </moduleProperty>
      </moduleProperties>
      <excludeFilters />
      <includeFilters />
    </sourceConfig>
  2. Run the following command to update the crawl configuration in CAS:
     <Installpath>\CAS\11.1.0\bin>cas-cmd.bat updateCrawls -f <Endeca App Path>\APPNAME\config\cas\last-mile-crawl.xml
  3. Run a baseline update.

Note :- Endeca CAS uses record.id as the unique record identifier; we can also define our own record spec. If two records with the same record ID appear in record stores A and B, CAS will discard one of them.
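To make the note concrete, here is a minimal Python sketch of the switch-join semantics (illustrative only — this is not CAS code, and which duplicate record CAS actually discards should not be relied upon; record stores are modeled as plain lists of dicts):

```python
# Illustrative model of a switch join over record stores (NOT CAS internals).
# Each record store is a list of dicts; "record.id" is the unique identifier.
def switch_join(*record_stores):
    """Union the stores; on a duplicate record.id, keep only the first record."""
    merged = {}
    for store in record_stores:
        for record in store:
            rid = record["record.id"]
            if rid in merged:
                continue  # same record.id seen already: this record is discarded
            merged[rid] = record
    return list(merged.values())

data_store = [{"record.id": "P100", "name": "Phone"}]
web_crawl_store = [
    {"record.id": "P100", "title": "Phone landing page"},  # duplicate id: dropped
    {"record.id": "D200", "title": "User guide"},
]

merged = switch_join(data_store, web_crawl_store)
# Two records survive: P100 (from the first store only) and D200.
```

Note that the surviving P100 record carries only the fields from one store; a switch join never combines fields across stores.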

37 comments:

  1. Hi Ravi,

    In the context of the thread above, I have a few queries.

    I succeeded in adding both record stores (using the CAS-based app deployment model), and I used the Workbench data source approach to achieve my configuration, with the help of manipulators (modifying scripts) and filter scripts on the data source.

    Where do I specify the other properties from the records? For example:

    1. I have to map my "record.id" property to the Endeca property name "Product.RepositoryId" (as I am using Product.RepositoryId as a rollup key).
    2. I need to define some properties as searchable, as dimensions, etc. Where can I configure them?
    3. Suppose I want to rename a source input property to a new Endeca property name.

    In short, in the older Developer Studio pipeline approach we used to do this in the Dimensions, Properties, and Property Mapper sections. Where should we do all of that now?

    Could you please provide your valuable inputs.

    Thanks,
    Swapnil

  2. Hi Swapnil,

    You can find details about how to create dimensions and properties in the Forge-less approach here:

    http://ravihonakamble.blogspot.com/2015/07/oracle-commerce-11x-how-to-define.html

    Let me know if you have any doubts during setup.

    Regards,
    Ravi

  3. Hi Ravi,

    Thanks for your reply. I have defined the below mappings:

    "document.text" : {
      "propertyDataType" : "ALPHA",
      "mergeAction" : "ADD",
      "sourcePropertyNames" : ["Endeca.Document.Text"],
      "jcr:primaryType" : "endeca:property",
      "isRecordSearchEnabled" : true,
      "isRecordFilterable" : true
    },
    "document.name" : {
      "propertyDataType" : "ALPHA",
      "mergeAction" : "ADD",
      "sourcePropertyNames" : ["Endeca.FileSystem.Name"],
      "jcr:primaryType" : "endeca:property",
      "isRecordSearchEnabled" : true,
      "isRecordFilterable" : true
    }


    I tried the above mappings and was able to index the data. But I am a bit confused when trying to search for contents in the JSP reference application on the "document.text" property (this holds the full crawled content of the PDF file).
    I am facing the following issues:

    1. Even though my record contains the rollup key, when I search for content with (Property=All, match_mode=All, term=Contents) in the JSP reference application, no results are returned.
    But when I use (Property=document.text, match_mode=All, term=Contents), it gives me the proper result.

    So do you know why this misbehaves at the MDEX level? Any solution to overcome this situation?

    2. The PDF files I crawl are larger than 2 MB each. Is this happening because the search data is large?

    3. Do I need to specify extra parameters when defining the property (document.text), apart from the above?

    Thanks,
    Swapnil

  5. Hi Swapnil,

    You need to add the above two properties to the "All" search interface in order to search across those fields.

    Here are the steps for the Forge-less approach:

    1) Go to the MDEX folder:
    /opt/app/endeca/apps/CRS/config/mdex

    2) Add the newly created properties to the below two files:
    CRS.recsearch_config.xml
    CRS.recsearch_indexes.xml

    3) Run indexing.

    I suggest creating a separate search interface for the crawled content and using it rather than the All search interface.

    Let me know if you are still facing issues.

    Regards,
    Ravi

  6. Hi Ravi,

    Thank you very much! I really appreciate your help; it worked very well for me. I just wanted to know, for the CAS-based approach, which Oracle Guided Search (11.1) document specifies all these configuration details (which we previously configured through the Developer Studio pipeline).

    Thanks,
    Swapnil

  7. Hi Swapnil,

    Glad that it worked for you :).

    You can find more details in the Endeca Developer Guide. Here is a documentation-related blog post where you will find version-specific links:

    http://ravihonakamble.blogspot.com/search/label/MDEX%20documentation

    Regards,
    Ravi

  8. Hi Ravi,

    Your blog is really helpful.

    I am working with the above approach to show content from WebCenter Sites in CRS, on a virtual machine (demo machine).

    I have configured everything properly.

    Records from WCS are in StoreContentRepository and indexing completes properly.

    My "endeca_jspref" application shows results for WCS content very well. I have removed filters from CRS.
    But on hitting search in the OOTB CRS, I am not getting a proper result list.

    It gives me 3 records for a single article (not consolidated); in short, the rollup key is not applied during search.

    Other records (like OOTB products and PDF records) are working fine.

    Do you have any suggestions or checks for me?

    Regards,

    Shailesh Mane

    Replies
    1. Hi,

      Found a solution for that.

      Adding a new rollup key named 'product.repositoryId' in store-content-article-config.xml solved my problem.

      We need to do this because CRS consolidates results according to product.repositoryId (correct me if I'm wrong).

      Thanks,

      Shailesh Mane

    2. Hi Shailesh,

      For article records CRS defines a separate rollup key, and you can use it to fetch results. CRS has a common key between all three records that you were seeing and uses it as the rollup key.

      Regards,
      Ravi

  9. Hi Ravi,

    Is there any way to apply one or more rollup keys when querying the MDEX? Is there any OOTB CRS-supported component available? I am asking this because I have seen around 4 rollup keys applied in the "Index-Config.json" file in CRS. But in Shailesh's case he is configuring one by tweaking the respective SCAC, SCIC, etc. files, as described above.

    Do we have any other workaround for achieving this?

    Regards,
    Swapnil

    Replies
    1. Hi Swapnil,

      It always depends on what requirements you have and how you are going to design it. You can definitely define multiple rollup keys. Here is one example of records from various sources:

      http://www.walgreens.com/search/results.jsp?recType=product&Ntt=soap

      All three tabs make separate calls to Endeca to get results based on the record type defined, and if needed the results can be rolled up on a common key.

      Regards,
      Ravi

  10. Hi Ravi,

    The situation is, I'm trying to add derived properties according to your post
    http://ravihonakamble.blogspot.com/2015/07/oracle-commerce-11x-how-to-define.html
    The .derived_props.xml file is added into /apps/appname/config/mdex, according to my last-mile-crawl configuration.
    And I've added a roll-up key in endeca_jspref.

    Next, I ran the baseline index, but the CAS output log shows:
    INFO [cas] [cas-appname-last-mile-crawl-worker-3] com.endeca.itl.executor.output.mdex.MdexConfigurationTransformer.[appname-last-mile-crawl]: skipping copying of non-MDEX config file: .derived_props.xml
    So in the folder /apps/appname/data/cas_output, appname.derived_props.xml isn't updated. (I changed another file in the mdex folder and it did get copied to cas_output.)
    The same thing also happens with:
    CRS.recsearch_config.xml
    CRS.recsearch_indexes.xml

    So now, sadly, when I click on a rollup key, the "DERIVED PROPERTIES:" label shows an empty result.
    Would you suggest anything that I can do?
    Thanks,
    Leung

    Replies
    1. I've also tried putting derived_props.xml in the pipeline folder only and then running the baseline index. It doesn't help display the derived prop either.

    2. Are you using the Forge-based approach or the Forge-less approach?

      The thread above is for the Forge-based approach; if you are doing Forge-less, then use the doc below:

      http://docs.oracle.com/cd/E55323_01/Common.111/pdf/CommerceAdminGuide.pdf#page=144

      Regards,
      Ravi

  11. Hi Ravi,

    I need some help.
    I am trying to combine the web crawler output with the ATG product catalog output in the Oracle Guided Search CAS-based indexing approach.
    But I have seen that the web crawler output doesn't contain a property like "record.id";
    instead it creates "Endeca.Id".
    In my case I need to map the "Endeca.Id" (from the web crawler output) to "record.id",
    OR
    add a new property at crawl time.

    Could you please suggest the best way to achieve this?

    Regards,
    Shailesh

    Replies
    1. Hi Shailesh,

      Were you able to resolve this issue? I am having the same issue now. Any help is appreciated!

  12. Hi Ravi,

    Very nice content; it helped me a lot in learning. I have a query regarding migration: how do I migrate dimension data from 11.0 to 11.1 (Forge-less)? Is there any utility to generate the input file for 11.1 CAS from an 11.0 dimension export?

  13. Hi Ravi,

    How are record manipulators used in the CAS-based approach? Please let me know the steps.

    Thanks
    Sandeep Dandin

  14. This comment has been removed by the author.

  15. Hi Ravi,

    It has been nice going through your blogs, although I am stuck on one basic question.

    In Forge we can create record manipulators where we write tags. Do we have a similar feature in the Forge-less approach? If not, what is the workaround? A quick response will be much appreciated.

    Sumit

    Replies
    1. Hi Sumit,

      There are many ways to handle it. Here are two different ones:

      1) On the ATG side, using components injected into /atg/commerce/endeca/index/ProductCatalogSimpleIndexingAdmin/

      2) For Forge-less, use CAS-based record manipulators

      Regards,
      Ravi

    2. Hi Ravi,

      I was wondering if we have something similar to the record assembler in the CAS-based approach. I came to know that the CAS record merger only uses a switch join, which means the two record stores should have the same property names. In the record assembler we use a left join. Can you help me with this?

      Sumit

      Note that the only thing missing compared to Forge is the ability to join multiple record store instances. The record store merger can only perform a switch join, which is a union of records; it cannot perform a left or right join that combines the source records into one record. If such a join is required between data from multiple sources, it should be accomplished externally, on the ATG source side, before loading the data into the record stores.
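      The distinction above can be sketched in a few lines of Python (illustrative only, not Endeca code; record stores are modeled as lists of dicts keyed on record.id):

```python
# Switch join (what the record store merger does): a union of whole records.
def switch_join(left, right):
    seen, out = set(), []
    for record in left + right:
        if record["record.id"] in seen:
            continue  # duplicate key: record discarded, fields are NOT combined
        seen.add(record["record.id"])
        out.append(record)
    return out

# Left join (what it does NOT do): enrich left records with right-store fields.
def left_join(left, right):
    by_id = {r["record.id"]: r for r in right}
    return [{**by_id.get(r["record.id"], {}), **r} for r in left]

catalog = [{"record.id": "P1", "price": "9.99"}]
crawl = [{"record.id": "P1", "document.text": "manual text"}]
# switch_join(catalog, crawl) keeps a single P1 record with only the catalog fields;
# left_join(catalog, crawl) would yield one P1 record carrying both price and document.text.
```

      This is why field-level enrichment across sources has to happen upstream, before the records reach the record stores.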


      Regards,
      Ravi

  16. Hi Ravi,

    I have seen this note:
    Endeca CAS uses record.id as unique identifier. We can define our own.

    Please tell me, how and where can I do this and where can I overwrite existing configurations?

    Kindest regards,
    Heiko

  17. Hi Ravi,

    I have a question. I am using the web crawler to crawl data from some internal sites. Records are generated with Endeca.Id as the unique key. I have added my crawler's record store instance to the last-mile crawl of my Endeca application, but these web crawler records are getting skipped during the baseline. The other record store instances in the last-mile crawl have record.id as the unique key, and those are getting indexed. Can you suggest what to do?

    Replies
    1. Adding a modifying script manipulator in the crawl and adding a new property as record.id resolved my issue:

      idPropertyValue = record.getPropertySingleValue("Endeca.Id");
      record.addPropertyValue(new PropertyValue("record.id", idPropertyValue.value));
      logger.info("Processed Record: " + idPropertyValue.value);

      Maybe it works for you as well.

    2. Hey, thank you so much. It worked well. I'm able to index both data sources now.

  18. This comment has been removed by the author.

  19. Hi Ravi,

    I am trying to migrate web crawling from CAS + Forge to CAS only. In the previous setup, I had a record manipulator to ensure it removes error pages etc. Can I achieve the same thing directly in CAS? Currently, I am writing the web crawl output directly to a record store. Please let me know if it can be done.

    Regards,
    Sumit

    Replies
    1. Problem resolved. Added manipulators in last-mile-crawl.

  20. Hi Ravi,
    Can you tell me how I can create a delta pipeline in conjunction with my existing baseline pipeline? I want to do an incremental crawl from the DB but am not sure how to proceed. Please help.

    thanks

  21. Hi Ravi

    Could you help me out here, please?
    Can you tell me which two queries can be fired together: record and aggregation, record and navigation, record and dimension, or navigation and dimension?

  22. Hi Ravi,

    Can you please help me in the below content?

    I am trying to merge the file system record stores (generated with the ID Endeca.Id) and the OOTB data record stores (generated with common.id).

    While indexing the data, I am getting an issue like "missing source key common.id".

    I tried multiple ways to resolve this. I modified data-recordstore.xml under APPNAME/config/cas/ by changing the record spec ID to 'Endeca.Id', but then it fails for the other data source.

    Could you please suggest? Should I write a manipulator to change the record ID to 'common.id' for both?

    Thanks,
    Prasanthi b

    Replies
    1. Resolved the issue using CAS's ModifyScriptManipulator.

  23. Hi Ravi,

    I have an issue regarding my indexing: my baseline indexing completes successfully, but I am not able to see any products in endeca_jspref.
    Can you please guide me on how to resolve this issue?

    thanks in advance

  24. Hi Ravi,

    I have an issue with Endeca indexing. I created a new InterestOutputConfig component and added it to the ProductCatalogSimpleIndexingAdmin component. After indexing, I can see the generated IDs correctly in dyn/admin on the InterestOutputConfig component.

    But the new record type is not shown in jspref. I also indexed a list of interests with products, which shows up on the ProductCatalogOutputConfig component, but not in jspref.

    Can you please guide me or suggest anything that is missing?

    Thanks in advance
