4/17/00 summary
Introduction to metadata: pathways to digital information
Available on the web.
Metadata identifies and describes an information object. Metadata can provide the context and description that would be available in a physical setting. The web was not designed with metadata in mind. Manually created web directories cannot keep up with the web's rate of growth. Automatically created search engines index items of variable quality and take up increasing amounts of bandwidth. The technical capabilities for distributed large-scale indexing are available, but there are no generally accepted indexing standards. Cataloging skills are increasingly important as the web grows. AltaVista supports ``description'' and ``keywords'' meta tags. The Dublin Core is more complex, and has international support, but can be difficult to use. The Resource Description Framework (RDF) is a metadata application of XML. Web metadata is susceptible to abuse. Crosswalks are links between metadata systems. Z39.50 provides one type of crosswalk. The Research Libraries Group/Getty Information Institute crosswalk connects several metadata sets for cultural heritage information.
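To make the meta tag and crosswalk ideas concrete, here is a minimal sketch in Python. It reads the ``description'' and ``keywords'' meta tags from a page and maps them onto Dublin Core style element names. The sample page, the function names, and the mapping table are assumptions made up for illustration; the mapping of ``keywords'' to a subject element is one plausible choice, not an official crosswalk.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name")
        content = attr.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical crosswalk: meta tag name -> Dublin Core style element name.
CROSSWALK = {
    "description": "DC.Description",
    "keywords": "DC.Subject",
}


def extract_meta(html):
    """Return a dict of meta tag names to their content strings."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


def crosswalk(meta):
    """Translate the meta tags we know about into the target element names."""
    return {CROSSWALK[k]: v for k, v in meta.items() if k in CROSSWALK}


# Made-up sample page, used only to exercise the functions above.
SAMPLE_PAGE = """<html><head>
<title>Sample</title>
<meta name="description" content="An introduction to metadata.">
<meta name="keywords" content="metadata, cataloging, indexing">
</head><body></body></html>"""

meta = extract_meta(SAMPLE_PAGE)
dc = crosswalk(meta)
print(dc)
```

A real crosswalk would also have to handle elements with no counterpart in the target scheme; this sketch simply drops them.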
4/18/00 abstract
Cover, R. (1998). Managing names and ontologies: an XML registry and
repository. Sun Microsystems. Available on the web.
Argues for a central registry and repository for XML standards. Currently, XML specifications are difficult to find, difficult to name, and cannot always be accessed due to web failures. XML tags can be ambiguous. A central repository would alleviate these problems.
4/19/00 abstract and discussion
Denenberg, R. (1996). Structuring and indexing the Internet.
Keynote address at the Workshop on Earth Observation Catalogue
Interoperability. Available on the web.
Discusses various methods for indexing and searching the web. Global indices employ spiders to index individual pages, and collect the entire index in one location. Distributed searching sends a query to multiple indices, and merges the results from each of them. Z39.50 provides a standard for distributed searching: the Profile for Simple Distributed Search and Ranked Retrieval (ZDSR). Navigational systems provide links between documents that a user may traverse. Z39.50 also provides a navigational standard, the Collective Description Record.
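The distributed searching idea (send one query to several indices, then merge the results) can be sketched in a few lines of Python. This is only an illustration and has nothing to do with the actual Z39.50 protocol: the two indices, the term-count scoring, and the function names are all assumptions. It also assumes the scores from different indices are comparable, which is rarely true in practice.

```python
def search_index(index, query):
    """Search one index; score = how many times the query terms appear."""
    terms = query.lower().split()
    hits = []
    for doc_id, text in index.items():
        words = text.lower().split()
        score = sum(words.count(t) for t in terms)
        if score > 0:
            hits.append((doc_id, score))
    return hits


def distributed_search(indices, query):
    """Send the query to every index and merge the hits into one ranking."""
    merged = {}
    for index in indices:
        for doc_id, score in search_index(index, query):
            # Keep the best score if the same document shows up twice.
            merged[doc_id] = max(score, merged.get(doc_id, 0))
    # Highest score first; ties broken by document id for a stable order.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


# Two made-up indices standing in for separate information sources.
LIBRARY_INDEX = {
    "lib/1": "metadata standards for library catalogs",
    "lib/2": "history of printing",
}
WEB_INDEX = {
    "web/1": "metadata metadata everywhere on the web",
    "web/2": "cooking recipes",
}

results = distributed_search([LIBRARY_INDEX, WEB_INDEX], "metadata")
print(results)
```

A global index would instead copy every document into one place before searching, which avoids the merge step but requires crawling everything.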
``...many users actually have come to expect advertising, and some feel cheated if advertisements are not present.'' This is hilarious!
The Z39.50 standard is an interesting concept. Ultimately, it should provide a simple way for information sources to communicate and share information, creating a very large distributed information resource. For some reason, Z39.50 seems to have only been adopted by libraries, and I get the impression that they are mostly giving up on it by now. There is no sign of the search engine community moving to accept the standard. In fact, a majority of the links I have found to Z39.50 resources on the web are broken. It appears that these resources no longer exist, having outlived their usefulness.
4/19/00 summary
Berners-Lee, T., & Fischetti, M. (1999). Chapter 13: Machines and the Web. Chapter 14: Weaving the Web. In Weaving the Web: the original design and ultimate destiny of the World Wide Web by its inventor (pp. 177-209). San Francisco: HarperSanFrancisco.
Most of the information on the web is in human-readable formats, not machine-readable formats. If more information is machine-readable, then computers will be able to process the information further and provide better services. RDF is an attempt to make more metadata machine-readable. In order to effectively use this ``Semantic Web'', we will need general-purpose inference engines. The presence of the web breaks down geographic boundaries between people. Because it is decentralized, people can make the web into anything they want. Ultimately, this process will change the ways our cultures are structured.
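As a rough picture of what machine-readable metadata plus an inference engine might look like, here is a toy sketch in Python. Facts are stored as (subject, predicate, object) triples in the spirit of RDF, and two simple rules are applied repeatedly until nothing new can be derived. The triples, the predicate names, and the rules are assumptions for illustration; real RDF uses URIs and a much richer vocabulary, and this is not a general-purpose engine.

```python
def infer(triples):
    """Apply two rules until no new facts appear (forward chaining).

    Rule 1: A subClassOf B and B subClassOf C  =>  A subClassOf C
    Rule 2: x type A       and A subClassOf B  =>  x type B
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                if o != s2 or p2 != "subClassOf":
                    continue
                if p == "subClassOf":
                    new.add((s, "subClassOf", o2))
                elif p == "type":
                    new.add((s, "type", o2))
        if not new <= facts:
            facts |= new
            changed = True
    return facts


# Made-up facts, stated explicitly by a (hypothetical) metadata author.
TRIPLES = {
    ("WeavingTheWeb", "type", "Book"),
    ("Book", "subClassOf", "Document"),
    ("Document", "subClassOf", "InformationObject"),
}

facts = infer(TRIPLES)
for fact in sorted(facts - TRIPLES):
    print(fact)  # only the facts that were derived, not stated
```

The point of the sketch is that none of the derived facts were written down by a person; the program worked them out from the stated ones.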
4/17/00 abstract
Dourish, P., Edwards, K., LaMarca, A., and Salisbury,
M. (1999). Presto: An Experimental Architecture for Fluid
Interactive Document Spaces. ACM Transactions on Computer-Human
Interaction, 6(2), 133-161.
Presto is an alternative to the traditional hierarchical system for storing and organizing electronic files. Presto attaches an arbitrary number of arbitrary attributes to files, and the files may be searched or grouped by any combination of attributes. Interfaces are provided so that Presto can be accessed as a hierarchical file system and can import files from a hierarchical system, allowing attributes to be added to arbitrary files.
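The attribute-based model can be sketched as a small Python class. This is not Presto's actual design or API, just a minimal illustration of the idea described above: documents carry arbitrary attributes and can be found or grouped by any combination of them. The class name, method names, and sample files are all assumptions.

```python
import posixpath


class DocumentSpace:
    """Toy attribute-based document store (not the real Presto API)."""

    def __init__(self):
        self._attrs = {}  # document id -> dict of attributes

    def add(self, doc_id, **attributes):
        """Attach any number of attributes to a document."""
        self._attrs.setdefault(doc_id, {}).update(attributes)

    def import_path(self, path):
        """Import a file from a hierarchy, keeping its folder as an attribute."""
        folder, name = posixpath.split(path)
        self.add(path, folder=folder, name=name)

    def find(self, **criteria):
        """Return documents matching every given attribute value."""
        return sorted(
            doc_id
            for doc_id, attrs in self._attrs.items()
            if all(attrs.get(k) == v for k, v in criteria.items())
        )

    def group_by(self, key):
        """Group documents by the value of one attribute."""
        groups = {}
        for doc_id, attrs in self._attrs.items():
            if key in attrs:
                groups.setdefault(attrs[key], []).append(doc_id)
        return {value: sorted(ids) for value, ids in groups.items()}


# Made-up documents for illustration.
space = DocumentSpace()
space.import_path("courses/lis/notes.txt")
space.add("courses/lis/notes.txt", topic="metadata", status="draft")
space.add("paper.pdf", topic="metadata", status="final")
space.add("budget.xls", topic="finance", status="final")

print(space.find(topic="metadata", status="final"))
print(space.group_by("status"))
```

Because the folder is just another attribute, the same document can appear in a folder-style view and in any number of attribute-based groupings at once.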
4/26/00 class notes
For the portfolio, only index the framework and contents of the portfolio, not the contents of items (such as abstracts) included as examples of a point.
Classification = the process/concept
Classifications = actual schemes for organization
TEI is a complex SGML language for representing documents.
OCLC is advocating the use of PURLs (Persistent URLs).
5/1/00 debate notes
The AltaVista vs. Open Directory Project debate was first. The Open Directory Project was a clear winner in this debate. They enumerated several reasons why their system was better, including that it is 1) organized by humans, and 2) not commercially based. While I personally have some arguments against these points, the AltaVista group did not adequately counter them. The AltaVista group also failed to provide any clear arguments as to why their system was better.
The debate I participated in was based on image search engines. I felt as if we won, but I cannot be certain, since the debate was fairly close. Both sides presented good points and good counterarguments.
Due to a prior engagement, I was unable to attend the third debate.