3/25/00 Personal Library
I really need to organize my personal library. At about 500 books, it's getting just large enough that I have trouble finding certain things. So I've been trying to decide what system to use.
My first thought was to come up with my own (enumerative) scheme. But I'm now fairly convinced that it would be better to use an existing scheme. This would be less work for me, and in the process, I would learn more about the existing scheme, so I could apply that knowledge elsewhere.
I've always hated the Library of Congress classification, with its ugly mixture of letters and numbers. So, the obvious choice is the Dewey Decimal Classification, which is widely used and fairly simple.
See this web page for an interesting essay on why you should use DDC to organize your computer. I haven't gotten to this point yet, but it's an interesting thought.
(At this point, I'm asking the class for any suggestions to compare my train of thought with theirs.)
Colon classification is interesting, but the codes can get just as long and ugly as in LCC. Here is an article on the connections between Colon classification and Yahoo! It somewhat misses the point, since there is a big difference between Colon classification's (systematic, orderly) faceted scheme and Yahoo's (unfocused, amorphous) polyhierarchy, but it's an interesting read.
My first response came back from the class. It was basically ``come up with your own scheme''. After looking over some of the more confusing items again, I still think creating my own scheme would be more trouble than it's worth. I can imagine myself looking for a particular book, and then wondering where I put it, much as I do now. I've already tried that approach for electronic documents. I have a fairly good scheme for electronic things now, but if I didn't have search facilities, I would lose a lot of things. I may end up using Dewey, and then rearranging the things that I think are poorly located.
I finally read the Ranganathan article in the readings. It was very interesting (my full thoughts are in the Week 8 section), but I won't use it for my personal library, since it's far too complex for my purposes.
3/25/00 thoughts
There are a lot of tradeoffs in selection of a classification scheme. One of the big problems is that eventually, there must be a shelving order for the actual documents. No matter how documents are arranged, there will always be situations in which a different arrangement would be better. The best we can do (with physical documents) is find a shelving order that is reasonably useful, and provide multiple access points through other means (catalogs, OPACs, etc.). For a small collection, like my own, it isn't necessary to create the additional access points as long as the shelving order is good enough that we can quickly search for an item in two or three areas.
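The tradeoff above can be sketched in a few lines of Python: there is exactly one shelving order, but any number of additional access points can be built on top of it. This is only an illustration; the records, call numbers, and the names `shelf_order` and `access_points` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records; the call numbers are illustrative only.
BOOKS = [
    {"call": "510", "title": "Book C", "author": "Author Y",
     "subjects": ["mathematics"]},
    {"call": "006.3", "title": "Book A", "author": "Author X",
     "subjects": ["artificial intelligence", "logic"]},
    {"call": "025.4", "title": "Book B", "author": "Author Y",
     "subjects": ["classification"]},
]


def shelf_order(books):
    """The single physical arrangement: sort by call number.

    Plain string comparison works here only because every call number
    has three digits before the decimal point.
    """
    return sorted(books, key=lambda book: book["call"])


def access_points(books, field):
    """Build one extra access point: field value -> list of call numbers."""
    index = defaultdict(list)
    for book in shelf_order(books):
        values = book[field]
        if isinstance(values, str):
            values = [values]  # treat a single value like a one-item list
        for value in values:
            index[value].append(book["call"])
    return dict(index)


shelf = [book["title"] for book in shelf_order(BOOKS)]
by_author = access_points(BOOKS, "author")
by_subject = access_points(BOOKS, "subjects")
```

For a collection this small, `shelf` alone may be good enough; the `by_author` and `by_subject` indexes stand in for the catalog that a larger collection would need.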
3/27/00 abstract
Aitchison, J., and Gilchrist, A. (1987). Planning and design of thesauri (p. 3-10). Vocabulary control (p. 12-22). Specificity and compound terms (p. 23-33). Structure: basic relationships and classification (p. 34-60). In Thesaurus construction: a practical manual, 2nd ed. London: Aslib.
Describes issues and heuristics for creating a thesaurus. The thesaurus should be considered in relation to its associated system and users. If possible, an existing thesaurus should be used or adapted. Indexing terms should generally be factored into preferred terms. Relationships between terms should be represented. An abbreviated notation may be used to represent the terms.
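As a rough sketch of what ``preferred terms'' and ``relationships between terms'' might look like as a data structure, the Python below stores broader, narrower, and related terms, plus lead-in (non-preferred) terms that point to a preferred term. The class names, method names, and sample terms are assumptions made for illustration; they are not taken from the manual.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term and its basic thesaurus relationships."""
    label: str
    broader: set = field(default_factory=set)    # BT
    narrower: set = field(default_factory=set)   # NT
    related: set = field(default_factory=set)    # RT
    used_for: set = field(default_factory=set)   # UF (non-preferred terms)


class Thesaurus:
    def __init__(self):
        self.terms = {}     # preferred label -> Term
        self.lead_in = {}   # non-preferred label -> preferred label (USE)

    def add_term(self, label):
        """Return the Term for a preferred label, creating it if needed."""
        return self.terms.setdefault(label, Term(label))

    def add_hierarchy(self, broader, narrower):
        """Record a reciprocal BT/NT pair."""
        self.add_term(broader).narrower.add(narrower)
        self.add_term(narrower).broader.add(broader)

    def add_related(self, a, b):
        """Record a symmetric RT link."""
        self.add_term(a).related.add(b)
        self.add_term(b).related.add(a)

    def add_lead_in(self, non_preferred, preferred):
        """Record a USE/UF pair so searchers are led to the preferred term."""
        self.add_term(preferred).used_for.add(non_preferred)
        self.lead_in[non_preferred] = preferred

    def resolve(self, label):
        """Return the preferred term for a label, or None if it is unknown."""
        if label in self.terms:
            return label
        return self.lead_in.get(label)


# Hypothetical sample vocabulary.
thesaurus = Thesaurus()
thesaurus.add_hierarchy("Vehicles", "Automobiles")
thesaurus.add_related("Automobiles", "Roads")
thesaurus.add_lead_in("Cars", "Automobiles")
```

Keeping the reciprocal links (BT/NT, RT/RT, USE/UF) inside the `add_*` methods is one way to make sure a relationship is never recorded in only one direction.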
3/28/00 abstract
Eddison, B., and Batty, D. (1988). Database design: words, words, words -- descriptors, subject headings, index terms. Database 11 (6), 109-113.
Presents background material about thesauri. A thesaurus is a controlled set of terms used to index information in a database. Free-text systems save on time and cost for indexing, but pass this expenditure on to the users. In the 1950's, the United States opted for simpler, computer-based indexing systems, while Europe opted for more complex, faceted indexing systems. The United States is now returning to a more structured approach.
3/28/00 abstract
Batty, D. (1989). Thesaurus construction and maintenance: a survival kit. Database 12 (1), 13-20.
Describes a process for creating a thesaurus. The range and depth of the thesaurus must be defined, based on anticipated users. Raw vocabulary must be collected from source documents. The raw vocabulary should be clustered and refined. A notation system may be used to impose an order on terms. A system for thesaurus maintenance should be developed.
3/28/00 abstract
Johnson, E. H. (1995). A hypertext interface for a searcher's thesaurus. Available on the web.
Describes a graphical interface to the INSPEC thesaurus. Users can enter keywords to search the thesaurus. The thesaurus will present a hierarchical display showing the word in relation to its broader and narrower terms, as well as a ``cloud'' displaying related terms. Users may click on any term shown in the display to navigate to that term. Preliminary user tests indicate that users prefer the related terms to the term hierarchy.
3/29/00 assignment
| | LCSH | ERIC Thesaurus | Readers' Guide |
| Audience | Librarians | Education | General |
| Specific Content | Subject Headings | Indexing Terms & relationships | terms & citations |
| Coverage | General | General (Education-oriented?) | |
| Frequency of Cumulation | Varies (yearly?) | Quarterly | Yearly/Quarterly |
| Distance from Citations | 2 | 2 | 1 |
| Coordination of Categories | Post-coordinate | Post-coordinate | Pre-coordinate |
| Type of Vocabulary | semi-controlled | natural language? | natural language, proper names |
| Composition of Vocab. | 1-3 word terms, mix of -- and NT | hyphenated phrases | 1-3 word terms |
| Currency of Vocab. | reasonable | reasonable | very |
| Consistency of Vocab. (time) | many additions | ?? | steady additions, some focus changes |
| Consistency of Indexing | ?? | reasonable | difficult to tell |
| Specificity of Descriptors | very | newspaper/conversational | proper names, newspaper/magazine |
| Structure of Organization | polyhierarchical | polyhierarchical | terms with see also refs. |
| Levels in Hierarchy | 4+ | 4 | 1 |
| Presentation | alphabetical | alphabetical | alphabetical |
| Lead-in Vocabulary | yes | yes | yes, but not much |
| Syndetic Structure | yes | yes | yes |
| Definitions (Scope notes) | yes | yes | no |
| Strengths | Very large & comprehensive | Available online | Combined with citations |
| Weaknesses | Related terms not always noted, terms get lost | many floating terms outside the hierarchy | changes based on published content |
class notes 3/29/00
Official Lecture Notes (continued)
When indexing, make sure the document is about each of the terms chosen.
Faceted and enumerative schemes both create hierarchies. The major difference is that enumerative schemes must specify all categories, while a faceted scheme allows them to be systematically created.
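A small Python sketch may make the difference concrete. The enumerative list has to name every class in advance, while the faceted scheme lists only the facets and their values and builds compound classes on demand. The facets, their values, the ` : ` separator, and the function name `synthesize` are all made up for this illustration.

```python
from itertools import product

# Enumerative: every class must be listed in advance.
ENUMERATED = ["Cookery", "Cookery : France", "Cookery : Italy"]

# Faceted: only the facets and their values are listed. The order of
# the keys doubles as the order in which facets are combined.
FACETS = {
    "subject": ["Cookery", "Gardening"],
    "place": ["France", "Italy", "Japan"],
    "form": ["Handbook", "History"],
}


def synthesize(**foci):
    """Build a compound class from at most one value per facet."""
    unknown = set(foci) - set(FACETS)
    if unknown:
        raise ValueError(f"unknown facets: {sorted(unknown)}")
    parts = []
    for facet, allowed in FACETS.items():   # fixed combination order
        if facet in foci:
            if foci[facet] not in allowed:
                raise ValueError(f"{foci[facet]!r} is not listed under {facet!r}")
            parts.append(foci[facet])
    return " : ".join(parts)


# Seven listed values are enough to generate twelve full combinations.
listed_values = sum(len(values) for values in FACETS.values())
full_combinations = len(list(product(*FACETS.values())))

example = synthesize(form="History", subject="Gardening", place="Japan")
```

Here `example` is a class that nobody had to anticipate: it is created systematically from the facets, whereas the enumerative list would need a new entry for it.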
Using a faceted thesaurus, you can create
We need to build thesauri that can standardize our representations of concepts over time. This way, the language can change, but the concepts will stay the same, so we won't lose any documents. This must be done in limited domains, since the concepts become ambiguous if we try to use too many domains at once.
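One way to picture this is an index keyed on stable concept identifiers rather than on the words themselves. The sketch below is an assumption-laden illustration: the class `ConceptIndex`, the identifier `C001`, the labels, and the document names are all hypothetical, and a real system would keep one such index per domain.

```python
class ConceptIndex:
    """Index documents under stable concept IDs instead of raw labels."""

    def __init__(self):
        self.label_to_concept = {}   # label (lowercased) -> concept ID
        self.postings = {}           # concept ID -> set of document names

    def add_label(self, concept_id, label):
        """Attach another label (old or new) to an existing concept."""
        self.label_to_concept[label.lower()] = concept_id

    def index(self, document, label):
        """File a document under the concept that the label stands for."""
        concept_id = self.label_to_concept[label.lower()]
        self.postings.setdefault(concept_id, set()).add(document)

    def search(self, label):
        """Return every document filed under the label's concept."""
        concept_id = self.label_to_concept.get(label.lower())
        return sorted(self.postings.get(concept_id, set()))


# Hypothetical example: the language changes, the concept does not.
idx = ConceptIndex()
idx.add_label("C001", "wireless")   # older label
idx.add_label("C001", "radio")      # newer label for the same concept
idx.index("doc-1925", "wireless")
idx.index("doc-1995", "radio")
```

Searching for either label returns both documents, so the older document is not lost when the vocabulary shifts.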
Extra note: There is an article entitled ``Artificial Intelligence Meets Natural Stupidity'' (can't remember the author right now) which addresses the subject of choosing natural language names for the concepts we develop and the problems this can cause.
Check out ``Introduction to Metadata''; the full book is online.
After class discussions, I made some changes to the comparison table:
| | LCSH | ERIC Thesaurus | Readers' Guide |
| Audience | Librarians | Education | General |
| Specific Content | Subject Headings | Indexing Terms & relationships | terms & citations |
| Coverage | Books & Journals | Books & Journals | Periodical articles |
| Frequency of Cumulation | Yearly/Quarterly/Weekly | Varies | Yearly/Quarterly/Monthly |
| Distance from Citations | 2 | 2 | 1 |
| Coordination of Categories | Pre-coordinate | Pre & Post-coordinate | Pre-coordinate |
| Type of Vocabulary | semi-controlled | controlled | natural language, proper names |
| Composition of Vocab. | 1-3 word terms, mix of -- and NT | hyphenated phrases | 1-3 word terms |
| Currency of Vocab. | reasonable | reasonable | very |
| Consistency of Vocab. (time) | many additions | ?? | steady additions, some focus changes |
| Consistency of Indexing | ?? | reasonable | difficult to tell |
| Specificity of Descriptors | very | newspaper/conversational, specific for education | proper names, newspaper/magazine |
| Structure of Organization | polyhierarchical | polyhierarchical | terms with see also refs. |
| Levels in Hierarchy | 4+ | 4 | 1 |
| Presentation | alphabetical | alphabetical | alphabetical |
| Lead-in Vocabulary | yes | yes | yes, but not much |
| Syndetic Structure | yes | yes | yes |
| Definitions (Scope notes) | yes | yes | no |
| Strengths | Very large, widely used & comprehensive | Available online | Combined with citations, easy to use |
| Weaknesses | Related terms not always noted, terms get lost, indirect access to citations | many floating terms outside the hierarchy, indirect access to citations | changes based on published content, limited coverage |