Next: Week 9 - Indexing Up: L505 Journal Previous: Week 7 - Systematic

Week 8 - Systematic Organization: Faceted Schemes

2/28/00 summary

Vickery, B. C. (1966). Introduction to faceted classification (p. 9-18 only). Faceted classification schemes. New Brunswick, NJ: Rutgers School of Library Service. [On reserve in the SLIS Library: Z696 .A1 R97 v.5]

Describes the origins, importance, and advantages of faceted classification. Faceted classification grew out of the Universal Decimal Classification and the work of Ranganathan. This type of scheme is no less efficient in retrieval than other systems. It offers the advantage of bringing all aspects of a special field of knowledge together. The vocabulary can be used consistently. Faceted schemes are less costly to construct than more structured systems. However, they do take some effort to construct, and they are not as flexible for general search as more structured systems.

3/1/00 abstract

Soergel, D. (1985). Chapter 14: Index language structure I: conceptual. In Organizing information (p. 251-287), San Diego, CA: Academic Press. [On reserve in the SLIS Library: Z699 .S539 1985]

Discusses methods for structuring indices. Strict hierarchies can cause problems for users. Terms can be broken down into several primitive concepts, called facets. Some terms have more than one Broader Term, resulting in a polyhierarchy. Document retrieval can benefit from a polyhierarchy structure.
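To make the polyhierarchy idea concrete, here is a minimal sketch, assuming a thesaurus stored as a map from each term to its Broader Terms. The terms and the function names (`narrower_index`, `expand`) are invented for illustration and are not taken from Soergel. It shows one way retrieval can benefit: a term with two Broader Terms is found when searching under either one.

```python
from collections import defaultdict

# Hypothetical thesaurus: each term maps to its Broader Terms.
# "piano" and "harpsichord" have two Broader Terms, so this is a polyhierarchy.
BROADER = {
    "piano": ["keyboard instruments", "string instruments"],
    "harpsichord": ["keyboard instruments", "string instruments"],
    "violin": ["string instruments"],
    "keyboard instruments": ["musical instruments"],
    "string instruments": ["musical instruments"],
    "musical instruments": [],
}


def narrower_index(broader):
    """Invert the Broader Term map into a Narrower Term map."""
    narrower = defaultdict(set)
    for term, parents in broader.items():
        for parent in parents:
            narrower[parent].add(term)
    return narrower


def expand(term, narrower):
    """Return the term plus every term below it, following all paths."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(narrower.get(current, ()))
    return seen


NARROWER = narrower_index(BROADER)
keyboard_hits = expand("keyboard instruments", NARROWER)
string_hits = expand("string instruments", NARROWER)
```

In a strict hierarchy, "piano" would have to live under only one of its two parents, and a search expanded from the other parent would miss it.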

3/1/00 abstract and critical summary

Priss, U., and Jacob, E.K. (1999). Utilizing faceted structures for information systems design. In L. Woods (Ed.), Knowledge, Creation, Organization and Use: Proceedings of the 62nd ASIS Annual Meeting (p. 203-212). Medford, NJ: Information Today.

Three Library and Information Science websites were analyzed to determine the effect of the hyperlink structure on access to information. Redefines the concept of faceted thesaurus in terms of mathematical relations. Outlines a software system for dealing with thesauri of this type.

Hmmm. Where to start? This article was both incredibly informative and utterly confusing. Let's start at the beginning:

The article starts with an informal evaluation of web sites. Fine. The results seem to be that there are some problems, but the sites are roughly equal in usability. From this, we are supposed to conclude that "application of a faceted approach to knowledge organization can ensure that the process of organization is less random and more manageable." [p. 205] In other words, using faceted classification will make the websites much better. This is quite a leap....

Jump to theory-land. (This was the incredibly informative part.) The bulk of the paper consists of a formal definition of a faceted thesaurus. This took a while to digest, but it's good material.

Then, back to application-land. (This is the utterly confusing part. I'm having trouble even making sense of my confusion.) The faceted thesaurus is used to organize our website. How? Use the metadata? This means we have to standardize the vocabulary used in the metadata. This would be a pain, but possible as long as we keep the core of the website under control of people who understand and believe in the thesaurus.

But how does this actually make the website better? Will there be automatically generated links from each page to its related pages? If so, what happens when there are too many related pages? There will still be information that the designers left out, or user needs that weren't anticipated. How does the thesaurus deal with missing information?

3/1/00 abstract

Sanders, G. L. (1995). Introduction to data modeling concepts. In Data modeling (p. 16-38). Danvers, Mass.: Boyd & Fraser.

Describes the Entity Relationship (ER) representation system. ER is a graphical method for organizing the relationships between entities (classes) and their attributes.

3/1/00 assignment

Compare and contrast faceted and enumerative classification schemes, outlining the respective advantages and disadvantages of each on the basis of the fundamental characteristics of a classification scheme.

I hate to admit this, but I have no idea what the fundamental characteristics of a classification scheme actually are. Here's my guess:

General Discussion

An enumerative scheme is a traditional hierarchy, with all relevant classes listed. A faceted scheme is something broken down into slots and fillers, though the fillers may be hierarchical. It is unclear whether all slots must be filled to represent an item, and whether objects may have very different types of slots. The Priss and Jacob article seems to stretch this idea to its limit, making me wonder if they're using a faceted scheme, or just a more loosely defined enumerative scheme.
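The slots-and-fillers picture can be sketched roughly as follows. This is only an illustration under stated assumptions: the facets and fillers are invented, the enumerative side is shown in the worst case where every combination counts as a relevant class, and the faceted side assumes that slots may be left unfilled (the open question raised above).

```python
from itertools import product

# Hypothetical facets (slots) and their permitted fillers.
FACETS = {
    "material": ["wood", "metal", "plastic"],
    "shape": ["round", "square"],
    "color": ["red", "green", "blue", "black"],
}


def enumerate_classes(facets):
    """List every compound class an enumerative scheme would spell out
    in advance, assuming all combinations are relevant."""
    names = list(facets)
    combos = product(*(facets[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def classify(facets, **fillers):
    """Build a faceted description of one item.

    Assumption: slots may be left unfilled; only the fillers given are checked.
    """
    for slot, filler in fillers.items():
        if slot not in facets:
            raise KeyError(f"unknown facet: {slot}")
        if filler not in facets[slot]:
            raise ValueError(f"{filler!r} is not a filler for {slot}")
    return dict(fillers)


# The faceted scheme lists only the fillers: 3 + 2 + 4 terms.
listed_terms = sum(len(fillers) for fillers in FACETS.values())
# The enumerative scheme lists every combination: 3 * 2 * 4 classes.
enumerated = enumerate_classes(FACETS)
# A faceted description built on demand, with the "shape" slot left empty.
item = classify(FACETS, material="wood", color="red")
```

The faceted scheme stores the sum of the facet sizes and composes a class when an item needs one, while the fully enumerated list grows as their product.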

Enumerative Schemes

Advantages:

Disadvantages:

Faceted Schemes

Advantages:

Disadvantages:

3/8/00 class notes

Official Lecture Notes

Jacob Says:

"I think the solution [to organizing the web] is smaller indexing structures."

I agree with that. It doesn't make much sense for any one company to try to index everything. Smaller companies or groups of people should index the subjects on which they have expertise. Of course, there are already many small, topic-specific search sites. I'm currently working on a project to provide easier access to these sites. One of the problems I'm finding is that the more specific a site becomes, the less effort has gone into it. Presumably this is because topic-specific sites get much less traffic than general sites, and people simply can't afford to put effort into the more specific sites. So how do we improve the quality of the smaller sites?

Shera's properties of classification schemes

Take a look at ISKO (International Society for Knowledge Organization).

"Dolly" is a good example of a less than common word with very distinct meanings:

After the class discussion on faceted schemes, and my discussions with Prof. Jacob, I have a much better understanding of this type of classification. Faceted schemes are extremely useful when you are working with relatively small domains that have several orthogonal dimensions. This seems to occur most often when working with physical objects, which naturally have several dimensions (shape, size, color, etc.). I still think it is quite a stretch to apply a faceted scheme to general knowledge, but I'm interested in the possibility. I may even make an attempt at using faceted classification in my research, since the items I'm trying to index (search engines) do have some orthogonal features (topic, speed, reliability, expected audience, expected query length, etc.).
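As a rough sketch of what indexing search engines by orthogonal facets might look like, the example below stores one value per facet for each engine and filters on any combination of facets. The engine names, the facet values, and the `select` helper are all hypothetical, and only a few of the facets mentioned above are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchEngine:
    """One indexed item; every field except name is a facet."""
    name: str
    topic: str
    speed: str
    reliability: str
    audience: str


# Invented records, for illustration only.
ENGINES = [
    SearchEngine("ExampleMed", "medicine", "fast", "high", "experts"),
    SearchEngine("ExampleLaw", "law", "slow", "high", "experts"),
    SearchEngine("ExampleKids", "medicine", "fast", "medium", "general"),
]


def select(engines, **wanted):
    """Keep the engines whose facet values match every requested facet."""
    return [
        engine
        for engine in engines
        if all(getattr(engine, facet) == value for facet, value in wanted.items())
    ]


fast_medical = select(ENGINES, topic="medicine", speed="fast")
expert_sites = select(ENGINES, audience="experts")
```

Because the facets are independent of one another, a query can constrain any subset of them without the scheme having anticipated that particular combination.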

3/25/00 abstract and discussion

Ranganathan, S. R. (1962). Canons of classification. In Elements of library classification (p. 45-70). Bombay: Asia Publishing House.

Presents several guidelines (canons) for classification schemes. Compares Dewey Decimal Classification (DDC) and Colon classification with respect to these guidelines. DDC fails to obey some of the guidelines.

This was almost as much fun to read as the Zerubavel papers. It went something like this: 1) Here's a principle that I used to design the Colon classification. 2) Here's a section of DDC that violates the principle. 3) Look (surprise!) how well the Colon classification adheres to the principle. 4) By the way, the Colon classification also orders these topics in my favorite way, which DDC doesn't. 5) Repeat ad nauseam.

Most of the examples make Colon classification seem like a Good Idea. Colon classification may be much nicer than Library of Congress Classification (I'd have to see it in practice to be sure). Sometimes it gets a little strange, though: it mixes upper and lower case, and commas and apostrophes are meaningful. These conventions would frustrate someone labeling a book, or trying to find a book on the shelves.

I think Dewey is much better for small collections, since it isn't as complex. Colon classification seems to be complex enough to label every bit of knowledge in the world. How many books are there about the specific heat of table salt?

Perhaps we should ignore the details of Colon classification and simply apply the principles to organizing electronic information. With electronic information, we aren't (as much) restricted by space considerations, and the names of classes can be spelled out rather than being represented by cryptic codes.


Ryan Scherle
2000-06-15