
MemeStreams Discussion



This page contains all of the posts and discussion on MemeStreams referencing the following web page: The Need for Creating Tag Standards. You can find discussions on MemeStreams as you surf the web, even if you aren't a MemeStreams member, using the Threads Bookmarklet.

The Need for Creating Tag Standards
by Acidus at 10:31 am EST, Jan 15, 2007

Obviously the need for spaces in tags is an important one. Whether it’s “Semantic Web” or “Ford Interceptor” that you need to tag, it’s rather different from “Windows AND Vista” and “Ford AND Interceptor” - and it gets worse if you have a search engine that places OR in there instead of AND. Much worse. The big question is, why doesn’t such a standard already exist? It’s obvious that Web 2.0 is all about connecting ideas and bringing articles, content, and readers together. But anyone looking at the tagging process would immediately assume it’s about the exact opposite: splitting up content, making things difficult to find, and purposely making bloggers’ lives miserable.
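To make the difference concrete, here is a minimal sketch of the three ways a search could treat a two-word tag: as an exact phrase, as word AND word, or as word OR word. The function names and the sample post titles are made up for illustration; nothing here reflects how any particular blog engine actually searches.

```python
def matches_phrase(text, phrase):
    """True if the words of `phrase` appear adjacent and in order in `text`."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def matches_all(text, phrase):
    """AND semantics: every word of `phrase` appears somewhere in `text`."""
    words = set(text.lower().split())
    return all(w in words for w in phrase.lower().split())


def matches_any(text, phrase):
    """OR semantics: at least one word of `phrase` appears in `text`."""
    words = set(text.lower().split())
    return any(w in words for w in phrase.lower().split())


# Hypothetical post titles, invented for this example.
posts = [
    "ford unveils the interceptor concept",
    "the ford interceptor at the auto show",
    "interceptor missile test",
]

as_phrase = [p for p in posts if matches_phrase(p, "Ford Interceptor")]
as_and = [p for p in posts if matches_all(p, "Ford Interceptor")]
as_or = [p for p in posts if matches_any(p, "Ford Interceptor")]
```

With these samples the phrase match returns one post, AND returns two, and OR returns all three, including the unrelated missile story.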

With Habari, so far we’ve gone through all the forms, and at the moment we’re at number 3 for compatibility and familiarity’s sake. But that may change - hence the need for a visible, tangible tagging standard. The only problem is, tagging isn’t some new concept. A tagging standard isn’t something that we can just whip up and serve on a platter.

What about the noun/verb argument? Look at the tags for this post: “Blogs, Blogging … Tags, Tagging” We just don’t know what people will search for - and we try to cover all the bases. But then you have so many possibilities! Code, Coding; Design, Designing; Research, Researching. For every pair there is one word more likely than the other. But people like to have all the bases covered, hence all the clutter. Tagging is fun, but only if done the right way.

This article touches on a few of the more obvious issues with implementing a tagging system properly. Tom, Rattle and I have already scoped all the places in Memestreams that use the topic system and are discussing ways to replace it with a tagging system. Believe me, it is not an easy problem!

Tagging by its very nature is more chaotic than a hierarchical topic system. Having a good implementation is only half the battle: people must tag items well. An item that carries odd tags, or tags that don't best describe the article, is in danger of fading away. No one knows exactly what terms it could be filed under. This is where topics do very well. Because topics impose a controlled vocabulary, a searcher can presumably read the entire vocabulary to see all possible topic words they might be interested in.

In a nutshell, here are some big problems with tags:

-How to handle multiple words
-If/how to allow tag delimiter inside a tag
-Does letter case matter
-Punctuation and symbols
-Handling plural or singular words
-Date formatting
-Multiple language support
-Colloquialisms/slang
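A few of the items in that list can be sketched in code. The following is only one possible set of answers, not what MemeStreams does or plans to do: commas delimit tags so spaces are allowed, a backslash escapes a literal comma inside a tag, case is ignored, most punctuation is dropped, and a deliberately crude rule strips a trailing "s". All function names are hypothetical. Dates, multiple languages, and slang are left out because they need far more than a few lines.

```python
import re


def split_tags(raw, delimiter=","):
    """Split on the delimiter; a backslash before it keeps it inside the tag."""
    parts = re.split(r"(?<!\\)" + re.escape(delimiter), raw)
    return [p.replace("\\" + delimiter, delimiter) for p in parts]


def singularize(tag):
    """Crude assumption: drop one trailing 's'. Real plurals need a dictionary."""
    if len(tag) > 3 and tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]
    return tag


def normalize_tag(tag):
    tag = tag.strip().lower()             # letter case does not matter here
    tag = re.sub(r"[^\w\s,-]", "", tag)    # keep word chars, spaces, commas, hyphens
    tag = re.sub(r"\s+", " ", tag)        # collapse runs of whitespace
    return singularize(tag)


def parse_tags(raw):
    """Return normalized tags in input order, with duplicates and blanks removed."""
    seen = []
    for part in split_tags(raw):
        tag = normalize_tag(part)
        if tag and tag not in seen:
            seen.append(tag)
    return seen


tags = parse_tags("Semantic Web, Blogs, blogs , Tags, Portland\\, Oregon")
```

Here "Blogs" and "blogs" collapse into one tag, and the escaped comma stays inside "portland, oregon". Every one of these choices is debatable, which is rather the point of the list.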


 
RE: The Need for Creating Tag Standards
by Shannon at 1:34 pm EST, Jan 15, 2007

Acidus wrote:
This article touches on a few of the more obvious issues with implementing a tagging system properly. Tom, Rattle and I have already scoped all the places in Memestreams that use the topic system and are discussing ways to replace it with a tagging system. Believe me, it is not an easy problem!

Tagging by its very nature is more chaotic than a hierarchical topic system. Having a good implementation is only half the battle: people must tag items well. An item that carries odd tags, or tags that don't best describe the article, is in danger of fading away. No one knows exactly what terms it could be filed under. This is where topics do very well. Because topics impose a controlled vocabulary, a searcher can presumably read the entire vocabulary to see all possible topic words they might be interested in.

In a nutshell, here are some big problems with tags:

-How to handle multiple words
-If/how to allow tag delimiter inside a tag
-Does letter case matter
-Punctuation and symbols
-Handling plural or singular words
-Date formatting
-Multiple language support
-Colloquialisms/slang

It seems like what's missing is a tag dictionary that can group and graph relationships between words based on their actual definitions, such as placing "Fruit" in a hierarchy alongside "plant", "apple", and "leaf" in a way that would be relevant. It would be difficult to create such a dictionary, and keeping it current might be even more difficult. If you took something like Wikipedia (which has hyperlink-style references to related topics and words, which are in turn connected to other words and topics), stripped out all of the back story, kept just the main topic and the hyperlinked words, organized those words in a tree-style hierarchy, and built a comparative reference based on common typos and variations, you might get close to the tool you're looking for. Such a task might seem nightmarish at first, but since far fewer new words and phrases appear each year than already exist, eventually the tool would become useful.
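As a rough illustration of that kind of dictionary, here is a tiny sketch with a tree of broader terms plus a table of typos and variations. The entries are invented by hand for the example (nothing is actually extracted from Wikipedia), and the names PARENT, VARIANTS, canonical, broader_terms, and related are all hypothetical.

```python
# Hypothetical hand-built entries: each word points at its broader term.
PARENT = {
    "apple": "fruit",
    "fruit": "plant",
    "leaf": "plant",
}

# Hypothetical table of common typos and variations, mapped to one spelling.
VARIANTS = {
    "appel": "apple",
    "apples": "apple",
    "fruits": "fruit",
}


def canonical(word):
    """Lower-case the word and fold known typos or variants into one form."""
    word = word.strip().lower()
    return VARIANTS.get(word, word)


def broader_terms(word):
    """Walk up the tree and return every broader term, nearest first."""
    chain = []
    current = canonical(word)
    while current in PARENT:          # assumes the table contains no cycles
        current = PARENT[current]
        chain.append(current)
    return chain


def related(a, b):
    """True if the two words share themselves or any broader term."""
    a_line = [canonical(a)] + broader_terms(a)
    b_line = [canonical(b)] + broader_terms(b)
    return bool(set(a_line) & set(b_line))


apple_chain = broader_terms("Appel")
```

A search for "plant" could then also surface items tagged "apple" or "leaf". The sketch gives each word exactly one parent, which is the simplification that breaks down as soon as a word has more than one meaning.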


 
RE: The Need for Creating Tag Standards
by dmv at 8:21 am EST, Jan 16, 2007

Acidus wrote:

Obviously the need for spaces in tags is an important one. Whether it’s “Semantic Web” or “Ford Interceptor” that you need to tag, it’s rather different from “Windows AND Vista” and “Ford AND Interceptor” - and it gets worse if you have a search engine that places OR in there instead of AND. Much worse. The big question is, why doesn’t such a standard already exist? It’s obvious that Web 2.0 is all about connecting ideas and bringing articles, content, and readers together. But anyone looking at the tagging process would immediately assume it’s about the exact opposite: splitting up content, making things difficult to find, and purposely making bloggers’ lives miserable.

Tagging by its very nature is more chaotic than a hierarchical topic system. Having a good implementation is only half the battle: people must tag items well. An item that carries odd tags, or tags that don't best describe the article, is in danger of fading away. No one knows exactly what terms it could be filed under. This is where topics do very well. Because topics impose a controlled vocabulary, a searcher can presumably read the entire vocabulary to see all possible topic words they might be interested in.

I think a number of these problems are caused by an unclear goal for the tagging system. They are more chaotic, but I think their implementation lends itself to the site's objectives.

If the goal of tagging is to facilitate the cataloging and searching of the whole site's content, you must impose structure on the tagging input -- emphasize the user's use of the sitewide vocabulary, with guidelines or inputs specific to each input type (if picking dates, you click the calendar). Consider Amazon's tagging system.

If the goal of tagging is to enable the user to personalize and catalog their own information stream, then being more open-ended is appropriate. Consider flickr's tagging system, which is completely freeform. Or livejournal's system, where it is a user-defined vocabulary but with vocab tools to help a user stay internally consistent.
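One way a "vocab tool" for a user-defined vocabulary could work is to compare each new tag against the tags that user has already used and offer the closest one. This is a guess at the general idea using Python's standard difflib module, not a description of livejournal's actual feature; the function name, the 0.8 cutoff, and the sample tags are assumptions.

```python
import difflib


def suggest_existing(new_tag, user_tags, cutoff=0.8):
    """Return the user's closest existing tag, or None if nothing is similar.

    `cutoff` is an arbitrary similarity threshold chosen for this sketch.
    """
    candidate = new_tag.strip().lower()
    if candidate in user_tags:
        return candidate
    close = difflib.get_close_matches(candidate, user_tags, n=1, cutoff=cutoff)
    return close[0] if close else None


# Hypothetical tags this user has already applied to earlier posts.
my_tags = ["blogging", "tagging", "design"]

typo_hint = suggest_existing("Bloging", my_tags)
new_hint = suggest_existing("photography", my_tags)
```

A misspelled "Bloging" gets steered back to the existing "blogging", while "photography" has no close neighbor and would simply become a new tag. A sitewide controlled vocabulary would instead reject anything not already on the list.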

I prefer the latter, with tagging clouds. But I'm not implementing it. :)


 
RE: The Need for Creating Tag Standards
by k at 9:48 am EST, Jan 16, 2007

It seems like what's missing is a tag dictionary that can group and graph relationships between words based on their actual definitions, such as placing "Fruit" in a hierarchy alongside "plant", "apple", and "leaf" in a way that would be relevant. It would be difficult to create such a dictionary, and keeping it current might be even more difficult. If you took something like Wikipedia (which has hyperlink-style references to related topics and words, which are in turn connected to other words and topics), stripped out all of the back story, kept just the main topic and the hyperlinked words, organized those words in a tree-style hierarchy, and built a comparative reference based on common typos and variations, you might get close to the tool you're looking for. Such a task might seem nightmarish at first, but since far fewer new words and phrases appear each year than already exist, eventually the tool would become useful.

I felt the same way when I was doing research into this, about 4 years ago now. I think that basically describes a universal ontology, and the fact is that such a thing has massive hurdles, not least because of multiple word meanings.

Sadly, you'd have to put "Fruit" not only into the context of the words you noted (and others), but also in the context of the slang usage for "effeminate" or "gay". "apple" would have to be linked to the fruit-plant sense and the computer-hardware-software-corporation sense and the Beatles-music sense, etc. etc.

Wikipedia handles that with those disambiguation pages, as would (presently at least) any sufficiently complex word-to-word or phrase-to-phrase ontology. Or, that's my meagre understanding of the situation at least.

-k


  
RE: The Need for Creating Tag Standards
by Shannon at 10:46 am EST, Jan 16, 2007

k wrote:

I felt the same way when I was doing research into this, about 4 years ago now. I think that basically describes a universal ontology, and the fact is that such a thing has massive hurdles, not least because of multiple word meanings.

Sadly, you'd have to put "Fruit" not only into the context of the words you noted (and others), but also in the context of the slang usage for "effeminate" or "gay". "apple" would have to be linked to the fruit-plant sense and the computer-hardware-software-corporation sense and the Beatles-music sense, etc. etc.

Wikipedia handles that with those disambiguation pages, as would (presently at least) any sufficiently complex word-to-word or phrase-to-phrase ontology. Or, that's my meagre understanding of the situation at least.

-k

It's a shame that no one uses threaded tags so that relationships might be easier to figure out by context.
Such as:
"computers|apple|macintosh" or "plant|apple|macintosh|red"
If a dictionary recognized such connections, a user could be prompted to specify which sense of an ambiguous word was intended (or even to add a new one). So someone could tag, and then relate context.
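Here is a small sketch of how threaded tags like those could be resolved against a dictionary of word senses. The SENSES table is invented for the example, and the scoring (count how many other words in the thread belong to each sense's context) is just one simple assumption about how "figuring it out by context" might work.

```python
# Hypothetical dictionary: ambiguous word -> {sense label: context words}.
SENSES = {
    "apple": {
        "fruit": {"plant", "fruit", "red", "tree"},
        "company": {"computers", "macintosh", "software", "hardware"},
        "record label": {"beatles", "music"},
    },
}


def parse_thread(tag):
    """Split a threaded tag such as 'computers|apple|macintosh' into words."""
    return [part.strip().lower() for part in tag.split("|") if part.strip()]


def resolve(word, thread):
    """Return the sense(s) of `word` best supported by the rest of the thread."""
    senses = SENSES.get(word)
    if not senses:
        return []                     # word is not known to be ambiguous
    context = set(thread) - {word}
    scores = {label: len(context & words) for label, words in senses.items()}
    best = max(scores.values())
    if best == 0:
        return sorted(senses)         # no evidence: every sense is a candidate
    return sorted(label for label, score in scores.items() if score == best)


computer_sense = resolve("apple", parse_thread("computers|apple|macintosh"))
plant_sense = resolve("apple", parse_thread("plant|apple|macintosh|red"))
bare_sense = resolve("apple", parse_thread("apple"))
```

When resolve returns more than one sense, as it does for a bare "apple", that is the moment to prompt the user to pick one or describe a new use.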


 
 