brontobot: September 2009

Sep 21, 2009

Muddiest Point - 2.5

Hopefully we can discuss, in class or in a special session with Dr. He or the TA (whose name I can't remember at this moment), troubleshooting the installation of DSpace, Fedora, or Greenstone. I made an attempt at installing all three of these this weekend, to no avail. Granted, I could have dug into the help files and online message boards a little more, but I thought that if there were some local knowledge about these installations, we might be able to start using the software a little sooner.

Sep 17, 2009

Unit 3 Reading Notes

Well, now we're getting down in the weeds with some of this material and I kind of like it. The discussion in the Arms chapter answered a lot of questions that have come up in my mind over the last, say, 10 years: the difference between HTML and XML, what 'markup' is, and how electronic texts are generated and rendered. While we have certainly only seen the tip of the iceberg when it comes to SGML, XML, etc., I wonder how difficult these languages are to learn and to use effectively. I also wonder about OCR and how much it may have improved since Arms wrote this text (it's been 10 years, after all). Are there OCR programs that are able to read handwritten text, non-standard characters (e.g. the long s seen often before the end of the 18th century), etc.?

I was also fascinated by the discussions in both the Paskin and Lynch documents. While reading Paskin it occurred to me that the proprietary nature of the DOI system might be problematic, or at least viewed as problematic by a community that tends (as far as I can tell) to prefer open systems for software development and programming. The other identifiers described in Lynch each seem to have their own specific niche, and I am curious about interoperability among these identifiers. The URN appears to be more universal in approach, but it is not ready for use at this point (or at least web browsers are not able to make use of it).

As with the DOI system, the questions raised by Lynch about the uncertainties surrounding potentially proprietary third-party databases are unsettling. One would hope that these systems could be implemented in an open (and free) way.

Sep 15, 2009

Muddiest Point - Unit 2

I am curious about how, for instance, the OAI-PMH system works. I have seen a few examples and I know we are going to talk about this later in the class, but it just seems like such a useful protocol. Are there readily available software systems that allow the use of the OAI-PMH (i.e. can you use DSpace or Greenstone)? What are the operational boundaries to setting up a distributed digital library with this type of functionality? I imagine that in some communities getting everyone to format data or metadata in the required way (Dublin Core?) could be a major sticking point. If the metadata schema required to use the OAI-PMH does not work well with an established institutional metadata schema, are there workarounds? For instance, Federal projects are required to store geospatial metadata in FGDC form; can that be augmented in order to set up an OAI-PMH based digital library?

Anyway, maybe I am getting ahead of things, but this is very interesting to me and what I hope to do in the future.
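
To make the request side of the OAI-PMH a little more concrete, here is a minimal sketch of what a harvester might send and read back, as far as I understand the protocol. The endpoint URL, the helper names build_request and extract_titles, and the sample response are all made up for illustration; only the verb and metadataPrefix parameters, the oai_dc prefix, and the XML namespaces come from the OAI-PMH and Dublin Core specifications. Nothing here touches the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# Namespace for the Dublin Core elements carried inside oai_dc records.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def build_request(base_url, verb, **params):
    """Return the GET URL for an OAI-PMH request (a verb plus its arguments)."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


def extract_titles(xml_text):
    """Collect every dc:title found in an OAI-PMH response document."""
    root = ET.fromstring(xml_text)
    return [title.text for title in root.iterfind(".//dc:title", NS)]


# Hand-written stand-in for a ListRecords response; not real repository data.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Stream gauge readings</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Ask the (hypothetical) repository for all records in simple Dublin Core.
    print(build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also have to fetch the URL, follow resumption tokens for long result sets, and handle protocol errors, none of which is shown here. As for my schema question, a repository can advertise formats beyond oai_dc through the ListMetadataFormats verb, so a richer institutional schema could presumably be offered alongside the required Dublin Core one.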

Sep 10, 2009

Reading Response - Week 2

The issues of interoperability raised by Payette et al. are of special interest to me - especially as they relate to digital libraries in the scientific domain. I have seen a need for work in this area while working with scientists on interdisciplinary research projects. Often investigators from different scientific backgrounds have different standards for data formatting which can result in a great deal of efficiency loss when data sharing occurs - due primarily to time spent reformatting data. The architecture proposed by Payette et al. (and by Arms et al. for that matter) not only could allow investigators to search for and share data more easily, but potentially could contain technology, in the form of disseminators, for data conversion (e.g. from one data format to another) in the data sharing process.

The extensibility and flexibility of the systems described in Payette et al. and Arms et al. would also be useful in the types of research work I have been involved with in the past. Historically, data sharing systems I have used have been static, which causes many a problem when switching from one project to another or when integrating new data into a project.

I also appreciate the basic principles of the architecture described by Arms et al.: the need for flexibility in the user interface, the need for straightforward collections management, and the need to keep the social, economic, and legal (and I would add technological to this list) frameworks in mind when developing the library. Quite important, in my experience, is the need for flexibility in the user interface: it is crucial that the interface take into account as many of the potential needs of the user as possible. On the one hand, an overly focused interface may restrict the usability of the library. On the other hand, there is a limit to this flexibility; one should not generalize the system so far that it becomes too generic to be useful. The question is how to balance flexibility with applicability vis a vis the expected user base.

A few questions come to mind when reading these articles. Payette et al. and Suleman and Fox are both about a decade old, and both articles describe either prototypes or early systems as examples of the architecture they describe. How have digital library architectures changed in the intervening years? Are the basic concepts the same as they were when these papers were written? Do we have any examples of large interoperable digital libraries that we can look at?

References:

Payette, S, Blanchi, C, Lagoze, C, and Overly, EA. Interoperability for Digital Objects and Repositories: The Cornell/CNRI Experiments. D-Lib Magazine, May 1999, Volume 5, Issue 5.

Suleman, H, and Fox, EA. A Framework for Building Open Digital Libraries. D-Lib Magazine, December 2001, Volume 7, Number 12.

Arms, WY, Blanchi, C, and Overly, EA. An Architecture for Information in Digital Libraries. D-Lib Magazine, February 1997.

Sep 4, 2009

Muddiest Point - Week 1

I am confused about the course of assignments over the next two weeks.

We do not have class next week (Sept. 7). Are we expected to post week 2 reading comments this week even though week 2 doesn't actually happen until the following week? Are week 1 reading comments required?

If you could clarify I'm sure it would be appreciated by more folks than me.

Thanks!

Sep 3, 2009

Blog is go!

Here starts the blog I have created for various MLIS program activities. Enjoy!