Apr 18, 2010

Muddiest Point - Final Week

Once again, I don't have any real muddy points from this last week.

Thanks!

Apr 4, 2010

Muddiest Point - Week ?

Now, this isn't a muddiest point, but I was filling out my taxes using Turbo Tax and became fascinated by the IR system they use for Q&A. The system appears to tailor search results to the portion of the tax form you are working on, and after you select a link for an answer there is a relevance feedback element for the search result. I feel a little embarrassed by how fascinated I am by the Turbo Tax website!

Mar 21, 2010

Readings - Web search and link analysis

The web search basics chapter (19 in IIR) was pretty much review at this point, so I don't think there is really much to say about it. Not that review is not useful; it certainly helps refresh the memory.

As for the other readings, they are rather heady and I suspect that Dr. He's lecture will do much to clarify the details.

Muddiest Point - Unit 8

Apparently I read the right chapter from the wrong book last week. Whoops! Anyway, no muddiest points for me this week.

Mar 14, 2010

Readings - XML Retrieval

The IIR chapter on XML retrieval covered a number of basic elements of XML and XML parsing tools as an introduction, then launched into an extensive discussion of XML retrieval methods.

One element of confusion I had was with the Structured Document Retrieval Principle. The principle itself is not at all confusing, but the example given in the text seems to be the opposite of the ideals of the principle. If the idea is to provide the most specific element vis a vis the query, why would a query for Macbeth return the Title Macbeth rather than the Scene Macbeth's Castle, which is a more specific element (i.e. further down the element tree)?

The discussion of the vector space model of XML retrieval is a little confusing, but I suspect this will be clarified in the lecture this week.

It would also be nice to have a little more discussion of data-centric XML retrieval. The chapter basically blows this off as something that is best not handled in XML retrieval, but maybe we could talk about that a little bit. I am even curious what a data-centric XML file would look like, given that most data is tabular and linked across fields.
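
For what it's worth, a data-centric XML file might look something like the sketch below. Everything in it (the element names, the ids, the copy counts) is made up for illustration and does not come from the chapter. The point is only that each element holds a single field value and records point at each other through ids, much like rows in linked tables, so the natural query is an exact-match join rather than ranked retrieval.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-centric XML: each element holds one field value and
# records are linked by id attributes, like rows in related tables.
DATA_CENTRIC_XML = """
<library>
  <author id="a1"><name>William Shakespeare</name></author>
  <book id="b1" author="a1"><title>Macbeth</title><copies>3</copies></book>
  <book id="b2" author="a1"><title>Hamlet</title><copies>5</copies></book>
</library>
"""


def titles_by_author(xml_text, author_name):
    """Join books to authors by id, the way a relational query would."""
    root = ET.fromstring(xml_text)
    # Collect the ids of every author record with an exactly matching name.
    author_ids = {
        author.get("id")
        for author in root.findall("author")
        if author.findtext("name") == author_name
    }
    # Keep the books whose author attribute points at one of those ids.
    return sorted(
        book.findtext("title")
        for book in root.findall("book")
        if book.get("author") in author_ids
    )


print(titles_by_author(DATA_CENTRIC_XML, "William Shakespeare"))
```

Nothing here is ranked or scored: a book either links to the author or it does not, which may be part of why the chapter treats this kind of data as a database problem.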

Muddiest Point - Unit 7

The only issue I have at this point is with the homework. Given that all of the assignments are, to a degree, dependent upon successful completion of the previous assignment, I think it would be helpful to get feedback (grades, comments) relatively soon.

Thanks!

Feb 21, 2010

Readings - Relevance Feedback and Query Expansion

I found the readings this week to be informative, as I had never considered the details of how user feedback is or could be incorporated into IR. I specifically found the discussion of pseudo and implicit relevance feedback to be interesting. I wonder about the tradeoff between query efficiency and retrieval success in pseudo relevance feedback, given that one must, presumably, run two queries to get one result. Is this efficiency not really an issue?
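
To make the two-query cost concrete, here is a minimal sketch of pseudo relevance feedback, assuming raw term counts as weights, cosine similarity for ranking, and a Rocchio-style update with no negative feedback term. The toy collection, the function names, and the values of k, alpha, and beta are all my own assumptions, not anything from the readings.

```python
import math
from collections import Counter

# Toy collection, made up for illustration.
DOCS = {
    "d1": "tax form deduction income",
    "d2": "tax deduction mortgage interest",
    "d3": "castle scene macbeth",
}


def vec(text):
    """Bag-of-words vector with raw term counts as weights."""
    return Counter(text.split())


def cosine(a, b):
    dot = sum(weight * b[term] for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, docs):
    """Return ids of documents with a nonzero score, best first."""
    scored = [(cosine(query_vec, vec(text)), doc_id) for doc_id, text in docs.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


def pseudo_relevance_search(query, docs, k=1, alpha=1.0, beta=0.75):
    query_vec = vec(query)
    first_pass = search(query_vec, docs)  # retrieval pass 1
    top = first_pass[:k]  # assume the top k results are relevant
    # Rocchio-style update: scaled query plus the centroid of the top documents.
    expanded = Counter({term: alpha * w for term, w in query_vec.items()})
    for doc_id in top:
        for term, w in vec(docs[doc_id]).items():
            expanded[term] += beta * w / len(top)
    return first_pass, search(expanded, docs)  # retrieval pass 2


first, second = pseudo_relevance_search("mortgage", DOCS)
print(first, second)
```

On this toy data the first pass for "mortgage" finds only d2, while the second pass also pulls in d1 through the terms it shares with d2. The index really is searched twice, and the second query is longer than the first, so the efficiency question seems like a fair one.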

I also found the discussion of thesaurus-based query expansion to be interesting. I have seen some of this in my work with bibliographic databases, but might look into it a little more now that I understand how it works.
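
As I understand it, the basic mechanism can be sketched in a few lines, assuming a hand-built thesaurus stored as a plain dictionary. The entries and the function name below are hypothetical, and a real system would likely weight the added terms lower than the original ones rather than treat them all equally.

```python
# Hypothetical hand-built thesaurus: each term maps to related terms.
THESAURUS = {
    "cancer": ["neoplasm", "tumor"],
    "treatment": ["therapy"],
}


def expand_query(query, thesaurus):
    """Add thesaurus terms after each query term, skipping duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query("cancer treatment", THESAURUS))
```

Terms with no thesaurus entry pass through unchanged, so the expanded query always contains the original one.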