Mar 14, 2010

Readings - XML Retrieval

The IIR chapter on XML retrieval opened with some basics of XML and XML parsing tools, then launched into an extensive discussion of XML retrieval methods.

One point of confusion for me was the Structured Document Retrieval Principle. The principle itself is not at all confusing, but the example given in the text seems to contradict it. If the idea is to return the most specific element that answers the query, why would a query for Macbeth return the title Macbeth rather than the scene Macbeth's castle, which is a more specific element (i.e., further down the element tree)?
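To make the question concrete, here is a minimal Python sketch that lists the elements whose own text contains the query term, along with their depth in the tree. The XML fragment is a hypothetical one, only loosely modeled on the play example in the chapter, and `matching_elements` is a name made up for this illustration. It finds candidates only. It does not implement the principle's ranking.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, loosely modeled on the chapter's play example.
PLAY = """
<play>
  <title>Macbeth</title>
  <act number="I">
    <scene number="vii">
      <title>Macbeth's castle</title>
      <verse>Will I with wine and wassail ...</verse>
    </scene>
  </act>
</play>
"""

def matching_elements(xml_text, term):
    """Return (depth, path, text) for each element whose own text contains term."""
    root = ET.fromstring(xml_text)
    hits = []

    def walk(node, depth, path):
        here = f"{path}/{node.tag}"
        text = (node.text or "").strip()
        if term.lower() in text.lower():
            hits.append((depth, here, text))
        for child in node:
            walk(child, depth + 1, here)

    walk(root, 0, "")
    return hits

hits = matching_elements(PLAY, "Macbeth")
for depth, path, text in hits:
    print(depth, path, text)
```

Both titles come back as candidates, one near the root and one three levels down, which is exactly the pair the question above is about: the retrieval system still has to decide which of the two to prefer.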

The discussion of the vector space model of XML retrieval is a little confusing, but I suspect this will be clarified in the lecture this week.

It would also be nice to have a little more discussion of data-centric XML retrieval. The chapter basically dismisses this as something that is best not handled by XML retrieval, but maybe we could talk about it a little. I am even curious what a data-centric XML file would look like, given that most data is tabular and linked across fields.
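As a guess at what such a file might look like, here is a small Python sketch. The XML, its element names, and its attribute names are all invented for illustration. The assumption is that data-centric XML is mostly uniform records with short field values, linked by ID attributes much like foreign keys in a relational table.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-centric XML: uniform records, short values,
# and a customer attribute that acts like a foreign key.
DATA = """
<orders>
  <customer id="c1"><name>Ada</name></customer>
  <order id="o1" customer="c1"><item>widget</item><quantity>3</quantity></order>
  <order id="o2" customer="c1"><item>gadget</item><quantity>1</quantity></order>
</orders>
"""

root = ET.fromstring(DATA)

# Build a lookup from customer id to name, then "join" each order to it.
customers = {c.get("id"): c.findtext("name") for c in root.findall("customer")}
rows = [
    (o.get("id"), customers[o.get("customer")], o.findtext("item"), int(o.findtext("quantity")))
    for o in root.findall("order")
]

for row in rows:
    print(row)
```

The file flattens straight into table rows, with no long text fields to rank, which may be why this kind of data is usually treated as a database problem rather than a retrieval one.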
