Wednesday, 26 August 2015

Content Enrichment

Content enrichment is the manipulation of crawled content before it is added to the search index. For example, a sentiment analysis score can be added to indexed social activity.
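As a rough illustration of the sentiment example, here is a minimal Python sketch of an enrichment step that runs between crawl and index. The word lists, the `sentiment_score` and `enrich` functions, and the `body` and `sentiment_score` field names are all assumptions made up for this sketch; a real pipeline would call a proper sentiment model rather than count words.

```python
# Toy word lists (hypothetical); stand-ins for a real sentiment model.
POSITIVE = {"great", "love", "good"}
NEGATIVE = {"bad", "hate", "poor"}


def sentiment_score(text):
    """Return (positive - negative) / word count, or 0.0 for empty text."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def enrich(document):
    """Return a copy of the crawled document with a sentiment field added."""
    enriched = dict(document)  # leave the crawled original untouched
    enriched["sentiment_score"] = sentiment_score(document.get("body", ""))
    return enriched


crawled = {"id": "post-1", "body": "Love the new release, great work!"}
indexed = enrich(crawled)  # this is what would be handed to the indexer
```

Returning a copy rather than mutating the crawled document keeps the raw content available, which matters once you start asking which copy is the master.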

Some components that support content enrichment are:

  • Version control
  • Technical metadata (formats, format versions, validation rules, etc.)
  • Provenance data (processing history)
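To make the version and provenance components a little more concrete, the sketch below appends a processing-history entry and bumps a version counter each time an enrichment step runs. The `record_step` function and the `provenance`, `version`, `step`, `tool_version` and `at` field names are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone


def record_step(document, step, tool_version, when=None):
    """Return a copy of the document with one more provenance entry."""
    enriched = dict(document)
    # Copy the history so the original document's list is not modified.
    history = list(document.get("provenance", []))
    timestamp = when or datetime.now(timezone.utc)
    history.append({
        "step": step,                  # which enrichment was applied
        "tool_version": tool_version,  # technical metadata about the tool
        "at": timestamp.isoformat(),   # when it was applied
    })
    enriched["provenance"] = history
    enriched["version"] = document.get("version", 0) + 1
    return enriched


doc = {"id": "post-1", "body": "Love the new release, great work!"}
doc_v1 = record_step(doc, "sentiment", "0.1")
doc_v2 = record_step(doc_v1, "language-detect", "2.3")
```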

Questions:


  • How standardised is the enrichment information?
  • How volatile is the enriched information?
  • When is the content enriched (by the author, during submission, during editorial review, etc.)?
  • Where does the enriched information live (embedded in the content, or stored externally)?

Key challenges:


  • What is the master source/copy of the information?
  • Is the information normalised or de-normalised (repeating parent metadata across child elements)?
  • How is the information synchronised across multiple systems?
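The normalised versus de-normalised question can be sketched in a few lines of Python. Here the parent record holds the shared metadata once (normalised), and a hypothetical `denormalise` helper copies the chosen parent fields onto every child so each child can be indexed on its own. The record shapes and field names are assumptions for the example.

```python
def denormalise(parent, children, fields):
    """Copy the named parent fields onto each child record.

    A child's own value wins if it already defines the same field.
    """
    inherited = {f: parent[f] for f in fields if f in parent}
    return [{**inherited, **child} for child in children]


# Normalised form: shared metadata lives only on the parent.
parent = {"id": "report-7", "author": "J. Smith", "language": "en"}
children = [
    {"id": "report-7/ch1", "title": "Introduction"},
    {"id": "report-7/ch2", "title": "Findings", "language": "fr"},
]

# De-normalised form: every child repeats the parent's metadata.
flat = denormalise(parent, children, ["author", "language"])
```

The de-normalised form is simpler to search but has to be regenerated whenever the parent changes, which is where the synchronisation challenge comes from.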



