I'll be reading it further, and also discussing with my development colleague, Brian, but it looks rather relevant: -
WCM Seedlist is the output format which is being used by WebSphere Portal to crawl and index WCM content. WCM Seedlist is based on the seedlist framework for seedlist v1.0. This is an ATOM feed based format. It doesn't only list links, but also provides additional metadata (author, category, publish date, access rights etc.). It furthermore tells the crawler how to handle the currently processed link: add to index, remove from index, update in index etc.
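To make the description above a little more concrete, here is a minimal sketch of how a crawler-side script might read such a feed. It only assumes what the excerpt says: the seedlist is an ATOM feed whose entries carry a link, some metadata, and an instruction for the index. The sample feed, the `urn:example:seedlist` namespace, the `action` element with its `do` attribute, and the `read_entries` helper are all hypothetical placeholders for illustration, not the real seedlist schema.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace.
ATOM = "{http://www.w3.org/2005/Atom}"
# Hypothetical extension namespace, used here only for illustration;
# the real seedlist extension elements may be named differently.
EXT = "{urn:example:seedlist}"

# Made-up sample feed with two entries.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ex="urn:example:seedlist">
  <title>Sample seedlist</title>
  <entry>
    <title>Press release</title>
    <link href="http://example.com/wcm/press-release"/>
    <author><name>Jane Doe</name></author>
    <category term="news"/>
    <updated>2010-01-15T10:00:00Z</updated>
    <ex:action do="update"/>
  </entry>
  <entry>
    <title>Old page</title>
    <link href="http://example.com/wcm/old-page"/>
    <updated>2010-01-16T08:30:00Z</updated>
    <ex:action do="delete"/>
  </entry>
</feed>"""


def read_entries(feed_xml):
    """Return one dict per Atom entry with its link, metadata and action."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        action = entry.find(EXT + "action")  # hypothetical element
        entries.append({
            "title": entry.findtext(ATOM + "title"),
            "link": link.get("href") if link is not None else None,
            "author": entry.findtext(ATOM + "author/" + ATOM + "name"),
            "categories": [c.get("term")
                           for c in entry.findall(ATOM + "category")],
            "updated": entry.findtext(ATOM + "updated"),
            "action": action.get("do") if action is not None else None,
        })
    return entries


if __name__ == "__main__":
    for item in read_entries(SAMPLE_FEED):
        print(item["action"], item["link"], item["categories"])
```

Against a real WCM seedlist, the extension namespace and element names would have to be taken from the seedlist documentation rather than from this sketch.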
This article describes how to extend the WCM Seedlist, which is consumed by search crawlers such as PSE, Omnifind, and Google. You learn how to enrich the provided metadata with your own custom metadata by updating, deleting, or adding values. The idea is to use the custom metadata in extended search queries to provide better or more specific search results.
For more information, go check out the article - Extend WebSphere Portal WCM Seedlist with Custom Metadata
2 comments:
hi,
I would like to add your blog to our WebSphere library [http://websphere.gizapage.com]. Let us know if that is okay with you.
Thanks
Hi Joseph, thanks for your comments. Please feel free to go ahead and list the blog. If it helps, I'm also listed on http://planetlotus.org, regards, Dave