My brain hurts from reading around the subjects of cognitive linguistics, power laws and semantic distance, and from trying to work out clearly in my mind how they are connected. In doing so I can better explain what I mean by improving the quality of tags for video through VideoTag, and make clear what I perceive to be a higher quality tag.
So here goes:
There are three cognitive levels of tags: superordinate, basic and subordinate. Basic level tags have the least cognitive cost to the user; that is, they come to mind most quickly. They are also more likely to have a high frequency, because users are more likely to agree on basic level tags. Superordinate and subordinate tags have a higher cognitive cost.
In collaborative tagging, the superordinate level is difficult to assess. Most likely the superordinate tag for videos is video, and all basic and subordinate level tags then go on to categorise the video further. When tagging a music video, for instance, basic level tags may refer to musical genre, e.g. rock, indie, dance. The tag music would also be a basic level tag, rather than a superordinate tag that defines the overall category for tags, because it defines the genre of the video.
Subordinate level tags, on the other hand, may reference the band name or more specific musical genres, e.g. techno, trance, emo, grunge, britpop. They may also name band members, cameo roles by celebrities, or characters in the video, and describe the narrative of the video and any specific actions. Keywords taken from the song lyrics would also be classed as subordinate level tags.
The tag cloud below shows the Last.fm tags for My Chemical Romance, chosen because they have two of the most watched videos on YouTube. YouTube tags: my chemical romance famous last words. (Whilst I would categorise these tags as subordinate level based on the above definition, they also highlight how inadequate YouTube tags are at describing the videos.)
This helps to explain the power law of tags. Tags in the larger font (e.g. emo, rock, alternative) are basic level tags; tags in smaller font are subordinate level. In this instance the superordinate tag would be music, but as Last.fm is a music site, all the tags it contains fall under the umbrella of the music superordinate tag. On a power law graph, the high frequency basic level tags would have high rank, while the low frequency subordinate level tags would have low rank and appear in the long tail.
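To make the rank and frequency idea concrete, here is a minimal Python sketch. The tag counts are invented purely for illustration (they are not real Last.fm data), and the function name rank_by_frequency is my own. Rank 1 is the most frequent tag, i.e. "high rank" in the sense used above.

```python
from collections import Counter

# Hypothetical tag assignments, invented for illustration only
# (these are NOT real Last.fm counts).
tag_assignments = (
    ["rock"] * 50 + ["emo"] * 40 + ["alternative"] * 25
    + ["punk"] * 6 + ["gerard way"] * 3 + ["famous last words"] * 2
    + ["the black parade"] * 1
)

def rank_by_frequency(tags):
    """Return (rank, tag, count) tuples; rank 1 is the most frequent tag."""
    counts = Counter(tags)
    ordered = counts.most_common()  # sorted by count, highest first
    return [(rank, tag, count)
            for rank, (tag, count) in enumerate(ordered, start=1)]

ranked = rank_by_frequency(tag_assignments)

# Crude text plot: a few tall bars at the head, then a long flat tail.
for rank, tag, count in ranked:
    print(f"{rank:>2}  {tag:<20} {count:>3}  {'#' * count}")
```

With data shaped like this, the few tags at the top of the list play the role of the large-font tags in the cloud, and everything further down is the long tail.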
The 80/20 rule can be applied here: agreement on terms is concentrated in roughly 20% of tags, the high frequency basic level tags. This leaves roughly 80% of tags at subordinate level, which describe the resource but may only be of relevance to a few users. In terms of building rich descriptions of video, these subordinate level tags are imperative, as they go into more descriptive detail, have greater specificity and can provide a fuller picture of what the video is about.
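As a rough way of checking the 80/20 idea against a set of tags, the sketch below treats the top 20% of distinct tags (by frequency) as the "head" and the rest as the "tail", then reports what share of all taggings the head accounts for. The counts are again invented for illustration, and head_tail_split and the 0.2 cut-off are my own assumptions rather than an established measure.

```python
import math

# Hypothetical tag counts, invented for illustration only.
tag_counts = {
    "rock": 60, "emo": 45, "alternative": 8, "punk": 6,
    "gerard way": 4, "famous last words": 3, "the black parade": 2,
    "eyeliner": 1, "marching band": 1, "fire": 1,
}

def head_tail_split(counts, head_fraction=0.2):
    """Split tags into a high frequency head and a long tail.

    Returns (head_tags, tail_tags, head_share), where head_share is the
    fraction of all taggings that the head tags account for.
    """
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    head_size = max(1, math.ceil(len(ordered) * head_fraction))
    head, tail = ordered[:head_size], ordered[head_size:]
    total = sum(counts.values())
    head_share = sum(count for _, count in head) / total
    return [t for t, _ in head], [t for t, _ in tail], head_share

head_tags, tail_tags, head_share = head_tail_split(tag_counts)
print("head (candidate basic level tags):", head_tags)
print("tail (candidate subordinate level tags):", tail_tags)
print(f"share of taggings in the head: {head_share:.0%}")
```

Frequency alone does not decide cognitive level, so the head and tail here are only candidates for basic and subordinate level tags, not a classification.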
As for semantic distance, I have only recently started to read up on this, so I am not 100% clear in my mind about the connection. I think that subordinate level tags are more likely to be semantically narrow because they are related through the basic level tag they elaborate, which makes the basic level tags semantically broad.
So what is a high quality tag? In terms of improving descriptions of videos, it is a tag of subordinate cognitive level, with low rank and low frequency, that is semantically narrow. It is worth mentioning, though, that subordinate level tags are only useful when placed in context with the basic level tag they are adding extra description to. So VideoTag needs to encourage both sets of tags if it is to be useful as a tool to improve accessibility and search of video.