Cognitive Level, Semantic Distance and Power Laws.

My brain hurts from reading around Cognitive Linguistics, power laws and Semantic Distance and trying to work out clearly in my mind how they are connected. In doing so I can better explain what I mean by improving the quality of tags for video through VideoTag, and make clear what I perceive to be a higher quality tag.

So here goes:

There are three cognitive levels of tags: Superordinate, Basic and Subordinate. Basic level tags have the least cognitive cost to the user – that is, they are thought of more quickly. They are more likely to have a high frequency, as there is more likely to be agreement on Basic Level tags. Superordinate and subordinate tags have a higher cognitive cost. In relation to collaborative tagging, the superordinate level is difficult to assess. It is most likely that the superordinate tag for videos is video, and all basic and subordinate level tags then continue to categorise the video. When tagging a music video, for instance, basic level tags may refer to musical genre, e.g. rock, indie, dance. The tag music would also be a basic level tag, rather than a superordinate tag that defines the overall category for tags, because it defines the genre of the video. Subordinate level tags, on the other hand, may reference the band name or more specific musical genres, e.g. techno, trance, emo, grunge, britpop etc. They may also name band members, cameo roles by celebrities in the video and characters in the video, or define the narrative of the video and any specific actions. Keywords taken from the song lyrics would also be classed as subordinate level tags.

The tag cloud below is of My Chemical Romance’s tags on Last.fm, chosen because they have two of the most watched videos on YouTube. The YouTube tags for one of those videos are: my chemical romance famous last words. (Whilst I would categorise these tags as subordinate level based on the above definition, they also highlight how inadequate YouTube tags are at describing the videos.)

[Image: My Chemical Romance Last.fm tag cloud]

This helps to explain the power law of tags. Tags in the larger font (e.g. emo, rock, alternative) are basic level tags. Tags in the smaller font are of subordinate level. In this instance the superordinate tag would be music, but as it is a music site all tags contained within fall under the umbrella of the music superordinate tag. On a power law graph, the high frequency basic level tags would have high rank, while the low frequency subordinate level tags would have low rank and appear in the long tail.
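To make the rank and frequency idea a bit more concrete, here is a minimal Python sketch. The tag counts are invented for illustration (they are not the real Last.fm figures), and `rank_tags` is just a hypothetical helper name. It counts how often each tag was applied and lists the tags from rank 1 (most frequent) down into the long tail.

```python
from collections import Counter

# Hypothetical tag assignments for one artist. The counts are invented
# for illustration and are NOT real Last.fm data.
tag_assignments = (
    ["emo"] * 50
    + ["rock"] * 40
    + ["alternative"] * 30
    + ["punk"] * 8
    + ["my chemical romance"] * 5
    + ["gerard way"] * 2
    + ["black parade"] * 1
)


def rank_tags(assignments):
    """Return (rank, tag, frequency) tuples, most frequent tag first."""
    ordered = Counter(assignments).most_common()
    return [(rank, tag, freq) for rank, (tag, freq) in enumerate(ordered, start=1)]


ranked = rank_tags(tag_assignments)
for rank, tag, freq in ranked:
    print(rank, tag, freq)
```

Plotting frequency against rank for real tag data would be one way to see whether the curve has the steep head and long tail described above.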

The 80/20 rule can be applied here: agreement on terms can be put at 20% of tags, based on the frequency of basic level tags. This leaves 80% of tags at subordinate level, which describe the resource but may only be of relevance to a few users. In terms of building rich descriptions of video, these subordinate level tags are imperative as they go into more descriptive detail, have greater specificity and can provide a fuller picture of what the video is about.
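One way to check an 80/20 split on a set of tags is sketched below. This is only an illustration under assumptions: the data is made up, the 20% cut-off is applied to the number of distinct tags, and `split_head_tail` is a hypothetical helper, not something taken from VideoTag.

```python
import math
from collections import Counter

# Hypothetical tag assignments for one video (invented counts).
assignments = (
    ["rock"] * 40
    + ["emo"] * 30
    + ["gerard way"] * 3
    + ["black parade"] * 3
    + ["teenagers"] * 2
    + ["piano"] * 2
    + ["hospital"] * 2
    + ["marching band"]
    + ["skeleton"]
    + ["parade float"]
)


def split_head_tail(tag_list, head_share=0.2):
    """Split distinct tags into the most frequent `head_share` and the rest."""
    ordered = Counter(tag_list).most_common()
    head_size = max(1, math.ceil(len(ordered) * head_share))
    return ordered[:head_size], ordered[head_size:]


head, tail = split_head_tail(assignments)

# Share of all tag assignments that the head accounts for.
head_usage = sum(freq for _, freq in head) / len(assignments)

print("head (agreed, high frequency):", [tag for tag, _ in head])
print("tail (descriptive, low frequency):", [tag for tag, _ in tail])
print(f"head covers {head_usage:.0%} of all assignments")
```

In this made-up data the two head tags are the ones users agree on, while the eight tail tags are the ones that actually describe what happens in the video.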

As for semantic distance – I have only recently started to read up on this, so I am not 100% clear in my mind about the connection. I think that subordinate level tags are more likely to be semantically narrow because they are related by the basic level tag they are elaborating, making the basic level tags semantically broad.

So what is a high quality tag? In terms of improving descriptions of videos, it is a tag of subordinate cognitive level, low rank and low frequency that is semantically narrow. It is worth mentioning, though, that subordinate level tags are only useful when placed in context with the basic level tag they are adding extra description to. So VideoTag needs to encourage both sets of tags to be useful as a tool to improve accessibility and search of video.

Why selecting videos for VideoTag is not as easy as you’d think.

I am warming up to working today – actually I’ve already written some notes and it’s only 10.00, so I thought I deserved a break. I have realised I have been neglecting the blogosphere and so subscribed to feeds from some of the most respected web2.0 blogs. Anyway, looking at ReadWriteWeb I saw this article, top 10 YouTube videos of all time, which I found interesting as it was written at the time when I had spent a month searching out videos for VideoTag and knew pretty much every popular video on there.

It sums up why I gave up on RSS feeds to provide videos for VideoTag and why I cannot see a way that any version of the game would work synced directly to the YouTube API. The most watched, highly rated and most favourited videos are mostly music videos, which are not going to benefit from the extra descriptions that tagging can provide as much as amateur videos would.

6 months on and the top 10 most watched of all time hasn’t really changed:

  1. Evolution of Dance

  2. Avril Lavigne – Girlfriend

  3. Lo que tú Quieras Oír

  4. IMVU – http://www.IMVU.com

  5. Timbaland – Apologize – Official Music Video

  6. My Chemical Romance – Teenagers

  7. My Chemical Romance – Famous Last Words

  8. Timbaland – The Way I Are OFFICIAL MUSIC VIDEO

  9. Akon – “Don’t Matter”

  10. CANSEI DE SER SEXY – Music is My Hot Hot Sex

Laughing baby has slipped to number 14.

Trying to harness people power, and doing it successfully (maybe proving people really will watch anything), is some video about Britney Spears in a bikini. They want to be the most watched worst video of all time. Their intentions seem honourable, trying to rid the top 10 of music videos, but I couldn’t bring myself to watch anything about Britney once, never mind 100 times a day.

And for the record, maybe it’s a British thing, but I didn’t even think the most watched video was that funny.

MySpace Developer Platform

It had been an idea of mine to try and create a Facebook app, maybe a content tagging game of sorts, but Facebook is saturated with applications and, well, honestly, it’s one of the many things I just probably won’t get around to doing. But I was interested to see that MySpace have launched their developer platform using Google’s OpenSocial technology. MySpace is more open than Facebook: profiles can be viewed by anyone unless users actively make their profiles more private, so it opens up more scope for applications that use all the social data available.

As a tagging enthusiast I’ve been thinking it’s something missing from MySpace. In Web 2.0, social networks and tagging go hand in hand. I was thinking about what apps might be possible, maybe something to do with the music section, such as tagging bands. But then I came up against the same old problem: why would anyone bother to do it?

I started thinking about why users use MySpace, and I think that whereas Facebook is used for keeping in touch with friends, MySpace is much more about showing off and self promotion, as well as attracting new friends and meeting people. This ties in well with tagging motivations, the main ones being to refind information or to self promote. It would be good to build a tool that would allow users to tag their friends, tag new MySpace profiles/blogs they find and also tag themselves. Whilst this would hopefully be of use to users, it would also be useful to researchers. An app of this sort would generate a lot of useful data about the types of tags users use and also what they like to tag. Would more users tag themselves as self promotion, tag bands they like or use it to organise their friends?

I think I want to change my PhD application and do this instead!

Connecting a laptop to the TV with S-Video

How did anyone solve problems before Google was invented? Granted, most problems I’m trying to solve are based in and around computing, so there’s more info out there. I’m always pleased to stumble upon a useful forum and discover that lots of other people have exactly the same problem as me and some more knowledgeable folk have offered solutions. Sometimes they don’t work, and then there’s the occasion when you find a really useful fix… like this one…

I have bought myself a shiny new laptop and was looking forward to hooking it up to the TV to make the most of my broadband connection. All I need is an S-Video lead, I thought. But no, the picture was black and white. Apparently this is a very common problem, something to do with using a SCART adapter, as my TV doesn’t have an S-Video input: only the luminance signal was being carried through the SCART, and the chrominance signal, which carries the colour, wasn’t getting to the TV.

Then, thanks to Google I followed a link on this forum

to this website http://camp0s.altervista.org/sVideo/sVideo.htm

and I now have a colour picture on my TV – yey! I was put off by the idea of butchering my SCART adapter, then thought, well, it only cost a couple of quid, it’s worth a try. My husband volunteered his soldering skills as I don’t have a clue.

I just wanted to post a recommendation for this solution. It really does work – sometimes you try fixes from forums and still have the same problem, but this one did work. OK, you do need a toolbox and soldering kit, or to know someone who has them. Husbands come in useful sometimes!

I’ve got quite knowledgeable about cables since buying my laptop – I set up my first wireless network so my PC and laptop can communicate (I learned the benefits of Ethernet compared to USB – thanks Virgin Media for the dodgy installation!). I also worked out that by utilising the lead for my video camera I can get sound from my laptop on the TV too! I’m nearly ready for my Saturday job at Maplin or another good cable retailer.

So no tagging research recently, but I know a lot more about wireless networks and cables.

Contemplating the future

My first goal now that my MSc is finished is to write up the VideoTag experiment in a paper, in the hope that the research will get a bit more interest from academic circles. My main hope is to secure funding for a PhD so I can continue researching tagging and progress VideoTag.

The main areas I next want to focus on are motivating users to tag and game motivation. Whilst VideoTag has found that a game format can encourage users to tag, I want to find out what motivates users to play games, return to games and stay playing games for long periods. I also want to find out more about why users tag and whether tagging behaviour differs depending on the resource. Beyond that, I want to investigate further whether blind, suggested or guided tagging generate tags of differing cognitive level, and whether one method of tagging is more successful than another at generating more descriptive tags. However, the VideoTag experiment highlighted that a game environment is not a good tool for producing the data to analyse these methods of tagging. Therefore, a new experiment needs to be thought up, ideally analysing existing tag data, because if I create a new experiment I will face the same user motivation problems. However, it is rare for sites to use guided tagging, so it could be hard to find existing data. Things to think about.

As for VideoTag, I want to redesign it to give the homepage more impact, add in the golden tag idea and cut down the video length to a limit of 2 minutes. I might also offer options to quit after 15 seconds and get a new video, or to continue watching. It would also be good to add in a vote system at this point to rate the video boring or interesting. I like the idea of using VideoTag data to potentially develop an interestingness filter for videos. My long term aim with VideoTag is that, by making improvements to the gameplay, enough data will be generated to make it possible to experiment with using the data to improve accessibility and search of internet video.

All Finished

Well, I’m all finished. I’ve handed in my report and I’m pleased with all I’ve accomplished through this project.

Here is the Abstract for the report:

Through discussion and analysis of current research in collaborative tagging systems, an emerging area of research was discovered, improving accessibility and search of visual resources through tagging.  Of particular interest were two tagging projects ESP Game and Steve.Museum, where users were encouraged to tag images to improve accessibility and search of images.  VideoTag extends this research by harnessing the user motivations of Play and Competition to increase and improve the meta data of a selection of YouTube videos through tagging. 

The VideoTag tagging experiment consisted of a one player game where users were encouraged to tag a selection of sixty carefully chosen, funny or interesting YouTube Videos.  The videos were separated over five difficulty levels.  Gameplay was carefully planned in order to encourage users to tag the videos more descriptively, using tags of a subordinate rather than basic cognitive level.  The experiment was uncontrolled with random users being attracted to the game through promotion on various Web 2.0 sites.

Analysis of the results focused on whether a game environment is beneficial to encouraging users to tag videos. Quantitative methods of analysis found VideoTag to be successful at increasing the number of tags per video compared to YouTube. A long tail effect was found to be present in the tag data, which allowed for qualitative analysis of the quality of the tags entered based on their cognitive level.

As only a small selection of videos was used, tag data generated by the VideoTag experiment is not sufficient to test whether the data can improve search for those selected videos, or create descriptions to improve accessibility for visually impaired users. Analysis and evaluation does discuss how VideoTag proves, as a concept, that game based tagging could be used to improve accessibility and search, and that there is scope for future research.

Vander Wal, Leicester

Went to hear a talk by Thomas Vander Wal (the creator of the term folksonomy) last week at Leicester De Montfort University. It was interesting. It was good to hear his definition of what folksonomy is, and even better to realise I have been on the right track for the last 6 months! The talk focussed more on business uses of folksonomy in marketing products, such as using tags to gauge public opinion of products.

I particularly liked his visualisations of folksonomy, which I hadn’t seen anywhere else. They showed the relationships between the tag as meta data, the identity of the user and the object being tagged. Looking back whilst writing this, it is similar to the nodes in the tripartite network of a collaborative tagging system.

We ran out of time at the end so he never fully got to discuss what he saw as the future of tagging, which I would have liked to have got his opinion on. I gathered that clustering through co-occurrence relationships was the main development, which adds some order to the folksonomy without necessarily forcing a hierarchy. Examples of how this works on Flickr and Rawsugar showed how it is a very useful development.
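As a rough idea of what a co-occurrence relationship looks like in data, here is a small Python sketch. It only counts how often two tags appear on the same resource, using made-up videos and tags. It is not how Flickr or Rawsugar actually implement their clustering, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical resources and the tags users gave them (invented data).
tagged_resources = {
    "video_1": {"music", "rock", "emo"},
    "video_2": {"music", "rock", "guitar"},
    "video_3": {"baby", "laughing", "funny"},
    "video_4": {"music", "emo"},
}


def cooccurrence_counts(resources):
    """Count how many resources each unordered pair of tags shares."""
    pairs = Counter()
    for tags in resources.values():
        # Sorting gives each pair one canonical order, e.g. ("emo", "music").
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs


def related_tags(tag, pairs, min_count=2):
    """Tags that co-occur with `tag` on at least `min_count` resources."""
    related = {}
    for (a, b), count in pairs.items():
        if count >= min_count and tag in (a, b):
            other = b if a == tag else a
            related[other] = count
    return related


pairs = cooccurrence_counts(tagged_resources)
print(related_tags("music", pairs))
```

Tags that keep turning up together end up grouped around each other, which gives some order without anyone having to define a hierarchy up front.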

 

I would have liked to have asked him some questions, but we ran out of time. It is intimidating, though, asking a question in a group talk situation; I think I prefer informal workshops and one on one discussions. I’ll have to get over that.

View the slides to accompany the talk here

<6 weeks to go

I’ve got 6 weeks till I have to hand in my project. In real terms, as I don’t get all the time in the world to work, that gives me 12 days. 12 days to write up 2 chapters detailing what I’ve done, describing the experiment, explaining the results, analysing the results and critically evaluating the whole thing. On top of this I remain completely baffled by SPSS, and it really is the best way to produce my long tail graphs, although I am tempted to knock some out in Excel if time runs out. To top it all off I have writer’s block. The chapters are running round in my head all jumbled up but I can’t get them out onto paper for some reason.

I’m also regretting ever finding Facebook as it is a perfect distraction to stop me starting writing. And it turns out so is this blog, as I’m writing this instead of just mind dumping on paper to get myself started!

Almost Web2.0

I thought I’d check if VideoTag was web2.0 or not: http://web2.0validator.com

It checks your site against a list of rules that are created through people bookmarking the site in del.icio.us. VideoTag scored 8 out of 66, so I guess I’m not web2.0. However, according to the results I don’t use tags, mention a long tail or use Ajax, all of which I do, so I guess that makes me 11 out of 66, which is still pretty rubbish.

It did say I was web3.0 though, maybe VideoTag’s too ahead of the game to be web2.0!!

Maybe I’ll set up a rule – “Mentions VideoTag” score an extra point!

The results:

  • Uses python? No
  • Denies the existance of Rocky V ? No
  • Is in public beta? No
  • Rocks out to the dance noise sound of Chinese Forehead ? No
  • Uses inline AJAX ? No
  • Uses the prefix “meta” or “micro”? No
  • Mentions Tag Clouds? No
  • Is Shadows-aware ? No
  • Mentions Neowin.net ? No
  • Apperars to use moo.fx ? No
  • Appears to be non-empty ? No
  • Has a Blogline blogroll ? No
  • Uses tags ? No
  • Appears to be web 3.0 ? Yes!
  • Attempts to be XHTML Strict ? No
  • Uses Google Maps API? No
  • Has favicon ? Yes!
  • Refers to mash-ups ? No
  • Uses Cascading Style Sheets? Yes!
  • Mentions startup ? No
  • Mentions Less is More ? No
  • Received a cease-and-desist from CMP Media or Tim O’Reilly ? No
  • Uses the word meme? No
  • Appears to use AJAX ? No
  • Refers to the Web 2.0 Validator’s ruleset ? No
  • Mentions an “architecture of participation”? No
  • Appears to have a Google Sitemap ? No
  • Appears to use RSS ? No
  • Makes reference to Technorati ? No
  • Refers to Flickr ? No
  • JavaScript by Dreamweaver ? No
  • Faviconized ? Yes!
  • Refers to VCs ? No
  • Mentions The Long Tail ? No
  • Appears to be built using Django ? No
  • Links Slashdot and Digg ? No
  • Mentions Ruby? No
  • Appears to use moo.fx ? No
  • Mentions Cool Words ? No
  • Mentions Nitro ? No
  • Mentions Ruby ? No
  • Has prototype.js ? No
  • Refers to podcasting ? No
  • Mentions Wisdom Of Crowds ? No
  • Creative Commons license ? No
  • Appears to use visual effects? No
  • Appears to use MonoRail ? No
  • Refers to Rocketboom ? No
  • Uses Semantic Markup? Yes!
  • Links to validator? No
  • Refers to del.icio.us ? No
  • Mentions RDF and the Semantic Web? No
  • Actually mentions Web 2.0 ? Yes!
  • Use Catalyst ? No
  • Mentions Neurogami and Web 2.0 ? No
  • Refers to web2.0validator ? No
  • Uses microformats ? No
  • Mentions a blog ? Yes!
  • Does it use DWR Ajax Library? No
  • References Firefox? No
  • Appears to over-punctuate ? No
  • Validates as XHTML 1.1 ? No
  • References isometric.sixsided.org? No
  • Appears to have Adsense ? No
  • Uses the “blink” tag? Yes!
  • Mentions Stickbob? No
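The score can be tallied with a few lines of Python. This is just a sketch: the sample lines are a handful copied from the results above rather than all 66, and `tally`, `rule_name` and the `disputed` set are hypothetical names for illustration.

```python
def tally(results):
    """Count rules marked 'Yes!' in lines like '• Uses tags ? No'."""
    passed = sum(1 for line in results if line.rstrip().endswith("Yes!"))
    return passed, len(results)


def rule_name(line):
    """Strip the bullet and the trailing '? Yes!' / '? No' from a result line."""
    return line.lstrip("• ").rsplit("?", 1)[0].strip()


# A few lines copied from the results above (not the full list of 66).
sample = [
    "• Uses tags ? No",
    "• Has favicon ? Yes!",
    "• Mentions The Long Tail ? No",
    "• Appears to use AJAX ? No",
    "• Uses Cascading Style Sheets? Yes!",
]

passed, total = tally(sample)
print(f"validator says: {passed} out of {total}")

# Rules reported as "No" that the post says VideoTag does meet.
disputed = {"Uses tags", "Mentions The Long Tail", "Appears to use AJAX"}
missed = sum(
    1
    for line in sample
    if rule_name(line) in disputed and not line.rstrip().endswith("Yes!")
)
corrected = passed + missed
print(f"corrected: {corrected} out of {total}")
```

Run over the full list, the same tally would give the 8 out of 66 the validator reported, and 11 out of 66 once the three disputed rules are counted.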

YouTube RSS feeds

In order to analyse the quality of the tags VideoTag is generating, I want to compare them to the existing tags for the videos in YouTube. So I set about the arduous task of collecting all these tags and entering them into my database. I checked the YouTube RSS feeds and found no example for favourites, so I googled it and found sites that use the YouTube API to let you get an RSS feed of a user’s favourites, but you only get the 10 off the first page, which is useless for my 60 odd videos. So I resorted grudgingly to manually copying and pasting off each video. Then I had the bright idea of seeing if I could get an RSS feed of my playlist. After a quick Google search I found this site:

http://www.ubeek.com/youtube/
I was kicking myself for not thinking of this sooner. It saved me lots of time, especially when, after entering all the YouTube tags for my 63 videos, I got this error “possible deep recursion attack” and lost them all!!
Arghhh!!

So my annoyance at not having found it earlier changed to being thankful I’d found it at all, so re-entering the 63 sets of tags wouldn’t take quite so long. Grrr, I hate it when quick jobs to ease you into your working day take up half of it unexpectedly.
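For anyone wanting to avoid the copying and pasting altogether, pulling tags out of a playlist feed could look something like the Python sketch below. It rests on assumptions: the feed here is a made-up RSS 2.0 sample in which each item carries its tags as category elements, which may not match what YouTube or the site above actually return, so the element names would need checking against the real feed.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample. A real playlist feed may use different elements
# for tags, so treat the <category> lookup as an assumption.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example playlist</title>
    <item>
      <title>Example video one</title>
      <link>http://example.com/watch?v=one</link>
      <category>funny</category>
      <category>baby</category>
    </item>
    <item>
      <title>Example video two</title>
      <link>http://example.com/watch?v=two</link>
      <category>music</category>
      <category>rock</category>
      <category>live</category>
    </item>
  </channel>
</rss>"""


def tags_by_item(feed_xml):
    """Map each item's title to the list of its <category> values."""
    root = ET.fromstring(feed_xml)
    result = {}
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        result[title] = [
            cat.text.strip() for cat in item.findall("category") if cat.text
        ]
    return result


tags = tags_by_item(SAMPLE_FEED)
for title, video_tags in tags.items():
    print(title, "->", ", ".join(video_tags))
```

The resulting dictionary could then be written to the database in one go, rather than entering 63 sets of tags by hand.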