Category Archives: web2.0

Anything from my research that’s related to web 2.0 technology and current research.

Can Twitter be used as viral marketing?

I’ve been redesigning my website for the last couple of weeks; it was long overdue an update. Sak Creations has been my web design moniker for the last 10 years, but as web design has now become a sideline to my PhD, I’ve also put up a section to be the formal home of my research interests, leaving this blog to be the home of my infrequent ramblings.

I’ve been giving some thought to SEO whilst redesigning my website, sparked also by a discussion with my graphic designer friend KJ Creative about how to improve Google rankings. The main reason I’ve kept my research on my Sak Creations domain (as well as the fact I have hosting and everything set up already, and I don’t really want to have my identity pinned on Stacey Greenaway, in case I don’t always stay Stacey Greenaway) is that it is creeping up the page rank scale (slowly) and Google knows to return Sak Creations if someone types my name in. So, shameless backlinks aside, I have given thought to how you promote websites nowadays, when you can’t compete with just good meta information, textual content and the good old reciprocal link. It seems any designer creeping up the page ranks does so by promoting themselves and their work on various forums and designer community websites, commenting on blogs, writing blogs, and twittering about their work. This social web gives a whole new depth to viral marketing. I remember sitting in meetings discussing how to market our current projects in viral emails; is it as simple now as just twittering about it and posting a link on Facebook? Well, maybe if you have enough followers and friends. So here’s the next question: how do you get all the followers and friends in order to make your viral marketing/shameless self promotion a success? Do you have to spend more time twittering about your work than actually doing the work? Does a marketing assistant’s job spec now include running the company’s Twitter account?

There are more problems with relying on the social web to improve those rankings: first, to engage in the social web and make it work for you, you have to be a social person; and second, you need to invest time on the sites ‘networking’ instead of doing the work. Is social media really only useful for people with a lot of time on their hands, or for big companies who can afford to employ someone to be the ‘social’ face of the company/organisation?

So, as a person with not enough time on my hands, but a desire to find new ways to avoid doing the actual work of my PhD by finding tasks that can loosely be deemed work, I have decided to succumb to the power of Twitter and set up an account for VideoTag. I can start twittering about the game, and about ideas and struggles during development, and then hopefully have some followers who want to play the game when it launches. Then I can continue to use it as a vehicle to launch new games.

I’d like to add just one more problem with using social media to promote yourself/your project – usernames. They are always taken by someone!!! Mr jesse videotag isn’t even using his account!!! So you can follow videotag2 instead. Now comes the hard work of knowing what to twitter about!


Is part of Twitter’s appeal the cute little bird?

Is Twitter endorsed by the BBC?

When I started out researching Web 2.0 back in early 2007, I spent ages looking at and trying out different Web 2.0 sites and applications until I settled on my thesis idea. One such site I signed up to was Twitter. At the time, I remember the selling point being that it was mobile and everyone at SXSW was apparently twittering; it was the new way for friends to know your every waking moment, as long as you twittered it. It was an American phenomenon. I joined, but I knew no one on the site, so I had nobody to twitter with, and I have never used it. Even now I know 3 “real world” people with a Twitter account. By “real world” I mean people I actually talk to in the old fashioned face to face method (only occasionally, obviously).

But now, like Facebook and MySpace before it, Twitter is suddenly everywhere. I cannot turn on Radio 1 anymore without hearing DJs twittering on about Twitter. I succumbed to MySpace before I knew it was Web 2.0 and that I would one day be listening to talks about it. I gave it up in favour of Facebook, which I do use and I like, because it is good to find and keep in touch with old friends I would otherwise have lost forever. But I am trying not to succumb to Twitter. I find myself mostly put off by the number of celebrities who have embraced it. I’m sure at 16 I would have loved to know what my idols were doing; I’d have followed them and maybe even made contact with them. But now I couldn’t give a stuff. The days of searching for celebrities on MySpace, finding 150 of the same person and not knowing whether they’re the real person or not, must nearly be over, because we’re told all the time if a celebrity is on Twitter. Perhaps it is so they can get more followers and then gauge their level of fame and popularity compared to other celebrities.

But then you have to be more extrovert (egotistical?) to be a celebrity, and I think being extroverted helps you fully embrace social applications. I am an introvert; I approach social sites as I would a party: I stand by the wall with a small group of friends and wait for people to come to me. I have more time to think about what I’m saying on social sites too, and so inevitably end up not saying it. Do I really need another application in my life that allows me to waste all my time informing people of things about my life they have no interest in? And of course reading everybody else’s insights on life? Would my life have been better had I not known my supervisor had had a really good dump at some point last year? There really is such a thing as too much information.

All that said, I am finding myself more and more interested in the micro blog concept and how it could be applied so that all these opinions floating around could be utilised in some way. Despite my objections to the furore about Twitter, as a social application I can see its uses. Its appeal, like Facebook and MySpace, is to feel connected to people. If you can engage in communities and find a place to belong online, it’s possible to escape a lonely or troubled real world existence. These sites are a goldmine of social information, but what other information do they hold? Can a micro blogging environment be utilised to create collective opinions on web resources, in the same way that collaborative tagging can use people power to create collective descriptors for anything from websites, music, photos and movies to academic papers? The same old questions arise, however, as to what motivates users to interact with a micro blogging site, and how those motivations can be harnessed to make users participate in an activity that produces useful, usable data.

VideoTag experiment – Results

The big plan had been to write up my MSc project, VideoTag, into a paper and get it published. It has been over 12 months now since I finished this experiment and my ideas have moved on; I am all too aware of the shortcomings of the project in terms of an academic paper. And I guess, if I am going to publish something, I want to be proud of it. Not that I’m not proud of VideoTag, but it needs a lot of improving. I have so many ideas of where I can push this concept for my PhD that I want to spend my time doing that, rather than re-writing something I did 12 months ago.

However, I figured it would be a shame to never have the results of the experiment somewhere, as I did find out some stuff!!  I found out enough to warrant me being able to continue researching the concept of video tagging games for the next 3 years.

So here they are, mostly for my own benefit so I have somewhere to reference the experiment if necessary in future work.  I am sure my traffic will go through the roof with interest in these!!

VideoTag – A game to encourage tagging of videos to improve accessibility and search.


Data Collection/Tag Analysis
Data Collection
Usage was monitored over a month-long period, after which the dataset for analysis was downloaded. Data was analysed for the period July 30th 2007 – September 2nd 2007. Data from the user testing phase, July 15th 2007 – July 30th 2007, was also included.

Tag Analysis
Quantitative methods of evaluating the VideoTag data involved analysing sets of tags for a Zipf distribution on a graph. Furnas et al. (1987) discuss how a power law distribution demonstrates the 80-20 rule: while 20% of the tags have high frequency and therefore a higher probability of agreement on terms, 80% have low frequency and a correspondingly low probability. When tags are inspected for cognitive level, Cattuto et al. (2007) discovered that high rank, high frequency tags are of basic cognitive level, whereas low rank, low frequency tags are of subordinate cognitive level. In terms of analysing VideoTag data, this is a useful method for determining whether the game has been effective at improving the specificity of tags, so that a greater number of subordinate level tags can create user descriptions of video to improve video accessibility. Following the 80-20 rule, an overall increase in the number of tags would increase the number of basic level tags that have a high probability of agreement on terms, improving video search. Drawing comparisons between the quantity of tags per video generated through VideoTag and the tags per video assigned in YouTube will indicate whether VideoTag has been successful at increasing the number of tags, and could therefore be a useful tool for improving video search. A paired t-test was conducted to statistically qualify the comparison results.
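As a minimal sketch of this kind of frequency analysis (using a small invented tag sample rather than the real VideoTag dataset), the rank-frequency distribution can be computed directly from a raw tag list:

```python
from collections import Counter

# Invented sample of raw tags standing in for the VideoTag dataset.
tags = ["frog", "frog", "frog", "frog", "funny", "funny",
        "cartoon", "fly", "greedy", "taunting"]

counts = Counter(tags)
ranked = counts.most_common()  # (tag, frequency) pairs, highest frequency first

# Under a Zipf-like distribution, a few high-rank tags dominate
# while most tags sit in a long tail of low frequencies.
for rank, (tag, freq) in enumerate(ranked, start=1):
    print(rank, tag, freq)
```

Plotting log(frequency) against log(rank) for the real data should then approximate a straight line if the distribution is Zipfian.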

Tag type was evaluated using qualitative methods: both YouTube and VideoTag tags were analysed for evidence of the Golder and Huberman (2005) tag types, and comparisons drawn.

General usage analysis revealed that 243 games were played in total during the experimental period. 87 of those games were discounted because the game points score was 0, indicating that the players had not tagged any videos. 37 of those discounted games were guest users (i.e. who did not log in). The 156 valid games were played by 96 unique users, meaning that some users played more than one game. Of the 96 unique users, 73 were registered, 23 were guests.

Table 1 % of tags entered ordered by Tagging Support


Blind tags, as defined by Marlow et al. (2006), are free tags entered without prompting the user with suggestions (suggested tagging). Guided tagging, as introduced by Bar-Ilan et al. (2006), gives structure to tagging by offering the user guidelines. During the 156 valid games, a total of 4490 tags were entered: 4076 of these were Blind, 68 were Guided and 346 were Pitfalls. The substantial preference for blind over guided tagging means that the tag data generated by VideoTag in this experiment cannot be used to compare the cognitive levels of blind and guided tags. Tag analysis therefore compared blind tags and pitfalls and omitted the suggestion differential.

Fig 1 Frequency of blind tags per video

The long tail apparent when blind tag frequencies are plotted on a graph (fig 1) shows evidence of a Zipf distribution. The vast majority of tags, 52.9%, occur only once. In relation to the research findings of Cattuto et al. (2007) and Golder and Huberman (2005), fig 1 suggests that VideoTag generates an increased number of subordinate level, descriptive tags over basic level tags of high frequency.
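The 52.9% figure is simply the share of distinct tags with a frequency of one; as a sketch (with invented counts, not the experimental data):

```python
from collections import Counter

# Invented tag frequencies; in the real blind-tag data 52.9% of tags
# occurred only once.
tag_counts = Counter({"emo": 9, "rock": 7, "band": 3, "black parade": 1,
                      "gerard way": 1, "eyeliner": 1, "marching": 1})

singletons = sum(1 for freq in tag_counts.values() if freq == 1)
singleton_share = singletons / len(tag_counts)
print(f"{singleton_share:.1%} of distinct tags occur exactly once")
```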

Fig 2 Frequency of pitfalls per video


This finding is emphasised by fig 2, which plots pitfall frequency. Pitfalls were created at basic cognitive level; they were the tags expected to have the least cognitive cost. It was expected that these tags would have high frequency, as they would be the tags that came to a player’s mind first. Few low frequency tags were expected if the basic level tags for each video had been predicted successfully. Partial success is shown, with only 20.23% of pitfalls occurring once. This can be explained by the fact that the majority of users played level one. Inspection of the tag data (Table 2 provides an example of tag data for a video in Level 1) revealed that the majority of high frequency pitfall tags were assigned to videos in Level 1, the most played level. It could therefore be expected that, if the levels of VideoTag had been played more evenly, there would be a lower percentage of low frequency tags.

The low number of pitfall tags (346 out of 4490), coupled with the high proportion of low frequency blind tags, is an indication of the success of the gameplay element in encouraging users to avoid pitfalls and enter more subordinate cognitive level tags. It has contributed to VideoTag’s effectiveness as a tool for generating more descriptive tags.

Fig 3 Frequency of all tags entered during the VideoTag experiment, grouped by video


This is further implied by analysing the frequency of all tags entered, as shown in fig 3. This graph shows a clear long tail effect, with the majority of tags entered having low frequency. Whilst the high frequency tags are useful for video search, because agreement on terms will be reached more quickly, the low frequency tags are important because there is a likelihood that, out of the billions of internet users, at least one other user will agree on a term. The number of tags entered, and the high proportion of low frequency tags, implies that VideoTag has been successful at generating a large number of high quality tags that can be used to create descriptions of the videos for visually impaired users. If high agreement were present for all tags, there would not be enough variety in the tags to create sufficient descriptions.

This result is further evident in fig 4, which plots the frequency of tag frequencies and shows the appearance of a power law (i.e. a straight line on a log-log scale). There is a greater number of low frequency tags, which is a pleasing result, as the main aim of the project was to encourage users to tag videos with more descriptive tags. By generating more low frequency, subordinate level tags, more useful descriptions can be created to improve internet video accessibility.

Fig 4 Frequency of tag frequencies for all tags entered during the experiment.


Analysis suggests that VideoTag has been successful at increasing the number of tags entered for each video. Geisler and Burns (2007) found the average number of tags on YouTube to be 6. The average number of tags per video in VideoTag compares very favourably at 71.3. Fig 5 clearly shows this increase in tags generated by VideoTag compared to the tags entered for each video in YouTube. The tags are grouped by video id and ordered by levels 1-5 ascending. An interesting anomaly in the graph shows an increased number of tags for one video at each level, indicating that, out of a purely random selection, one video was selected more times. From the graph alone it can be said that VideoTag created more tags for videos than are entered on YouTube, which could be beneficial to both video search and accessibility. A paired t-test of the number of tags entered on VideoTag and YouTube returned a p value of less than 0.001, which shows that this difference is statistically significant, indicating that a game environment encourages more tags for videos.

Fig 5 Amount of YouTube tags per video compared to the amount of VideoTag tags, ordered by Level.

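For reference, a paired t-test of this kind can be sketched with only the standard library (the per-video counts below are invented for illustration, not the experimental data; a library routine such as scipy’s ttest_rel would also return the p value directly):

```python
import math
from statistics import mean, stdev

# Invented per-video tag counts; the real experiment compared the number
# of YouTube tags per video (average about 6) with the number generated
# through VideoTag (average 71.3).
youtube = [5, 7, 6, 4, 8, 6]
videotag = [70, 65, 82, 58, 90, 74]

# Paired t-test: is the mean per-video difference significantly non-zero?
diffs = [v - y for v, y in zip(videotag, youtube)]
n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
print(f"t = {t_stat:.2f} with {n - 1} degrees of freedom")
```

The p value then comes from comparing the t statistic against Student’s t distribution with n - 1 degrees of freedom.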

Tag Type Analysis

The majority of VideoTag tags are single word, which is interesting, as a conscious decision was made not to limit the format in which users could enter tags; it was believed that all tags, regardless of format, are useful for improving the meta data of a video. Some tagging systems only allow single word tags, while others allow multi word tags. It is interesting that the majority of users automatically tag with a single word and do not think to enter full phrase descriptions. It would be interesting to find out whether experience of tagging affects the types of tags entered, with more experienced taggers using single word tags, as pre-conditioned by single word systems, and novice users entering a more varied range of single and multi word tags.

Table 2 compares the VideoTag and YouTube tags. Using this example, the types of tag entered can be analysed in relation to the Golder and Huberman (2005) definitions of tag type. YouTube tags fall into the social tag functions of What or Who it is about (e.g. frog) and Qualities or Characteristics (e.g. animation and funny), with funny being an Opinion Expression tag. VideoTag tags also have social tag functions and, similarly to YouTube tags, are primarily What or Who it is about (e.g. frog, fly, two frogs eating flys) and Qualities or Characteristics (e.g. cartoon, comedy, taunting, greedy). Few of the Qualities or Characteristics tags were Opinion Expression tags. The majority of the tags describe the characters, objects or actions in the videos, with a few Opinion Expression tags (e.g. funny, humour, silly). It is surprising that more Opinion Expression tags were not entered; they are particularly useful for categorising videos as well as formulating descriptions. It would be interesting to find out, in future research, whether the gameplay of VideoTag deterred users from entering Opinion Expression tags, by comparing their frequency with that in other tagging systems. These results were general across all videos. This analysis implies that VideoTag successfully encouraged users to enter more descriptive tags for the videos.

A webpage has been created that shows thumbnails of each of the videos in VideoTag and lists the VideoTag tags generated as well as the original YouTube tags.

Table 2 Table comparing tags entered during the VideoTag experiment and YouTube tags for one example video from the VideoTag database.



BAR-ILAN, J., SHOHAM, S., IDAN, A., MILLER, Y. & SHACHAK, A. (2006) Structured vs. unstructured tagging – a case study. Proceedings of the Collaborative Web Tagging Workshop (WWW ’06), Edinburgh, Scotland.

CATTUTO, C., LORETO, V. & PIETRONERO, L. (2007) Semiotic dynamics and collaborative tagging. Proceedings of the National Academy of Sciences (PNAS), 104(5), pp. 1461-1464.

FURNAS, G.W., LANDAUER, T.K., GOMEZ, L.M., DUMAIS, S. T. (1987) The vocabulary problem in human-system communication. Communications of the Association for Computing Machinery, 30(11), pp. 964-971.

GEISLER, G. and BURNS, S. (2007) Tagging video: conventions and strategies of the YouTube community. JCDL ’07: Proceedings of the 2007 conference on Digital libraries, pp. 480-480

GOLDER, S. & HUBERMAN, B. (2005) The Structure of Collaborative Tagging Systems [Online]. Available from: [cited 09-03-2007]

MARLOW, C., NAAMAN, M., BOYD, D. & DAVIS, M. (2006) HT06, tagging paper, taxonomy, Flickr, academic article, to read. HYPERTEXT ’06: Proceedings of the seventeenth conference on Hypertext and hypermedia, pp. 31-40.

UK Brand Tags

I blogged a couple of weeks ago about a tagging project called brandtags; whilst I liked the concept, I really wanted to see more British brands in there. It seems lots of other people did too, as I received an email this morning saying that a UK version had been created, and a Hispanic version too.

So I gave the UK one a go. It is more enjoyable to use with UK brands, as I have more of an opinion on them. But not many people have used it as yet, so it is not as entertaining to go and look at the data at the moment. Let’s hope it catches on; it will be interesting to see what British opinion is of certain brands/companies.

ITV for example, there are a lot of good tags to describe them…

Tag Galaxy

I came across this tagging visualisation project, Tag Galaxy, on the flowingdata blog. It is the 2008 thesis project of Steven Wood. I love this idea and have really enjoyed playing about with it. You enter a tag; I started with beach (I’m not at all completely focussed on my upcoming holiday). Then you are presented with a solar system, with your tag being the sun and co-occurring tags represented by planets orbiting the main tag sun. Clicking on a planet (related tag) gives a new visualisation, this time for a refined search with the two tags. The more tags assigned to the central sun, the more specific the search results. This is an excellent feature, as it utilises the capability of filtering results to find a photo that accurately matches what you want to see. Here is a visualisation I created from the tag beach.

tag galaxy screengrab

Clicking on the sun takes you to a new visualisation showing images tagged with your tag combination.

tag galaxy screengrab

Then if you click on an image it enlarges the image and provides a link to the flickr profile.

tag galaxy screengrab

The planets appear to be clustered, but I’m not sure by what metric; at first look it seems to be by semantics, but either it isn’t accurate or the distribution is random. Another subtle aspect, more apparent when looking at a solar system for a sun of multiple tags, is that the planets’ sizes vary depending on the number of photos labelled with that tag. I think an improvement would be to exaggerate this feature slightly so it was more obvious on single tag visualisations.

It can go full screen too, a nice touch and a useful feature when zooming in and out of your solar system and making it spin around – which provides hours of fun on its own!

Brand Tags

Gene Smith blogged recently about a tagging game that has been around since about May. It’s not really a game; there’s not a lot to it: tag a brand with the first thing that comes to mind and then see what other people think about it. Not so surprisingly, the resulting tag cloud shows I think the same as the majority most of the time. What I like about the tag cloud is that the tag size is weighted completely by frequency, so the higher ranking tags are absolutely massive, and it works: for this site, at the end of the day, you only want to know what the majority think. The usability could be better, as I want to see the tag cloud of the brand I have just tagged before I go on to tag another brand, or I’d like to be given a list of the brands I’ve tagged so far, so I can click back and see other people’s opinions later on. But it’s a work in progress, as the creator Noah Brier states on his blog.

The most interesting thing about this is also the most useful thing about it: looking at people’s opinions of different brands. There are some companies who, I think, would rather they hadn’t looked. All in all, it’s a very cheap focus group.

Its sister site, which just doesn’t have enough celebrities on it to be mean about, is entertaining for about a second, just to see the most insults one person can collect.

It’s all a bit too American, understandably, but it would be nice to see a few more British brands in there. There are some obvious British insults in the celeb tag data, so there’s proof we’re looking at it.

In all well worth spending a few minutes on when you’re meant to be doing something more productive.

Cognitive Level, Semantic Distance and Power Laws.

My brain hurts from reading around the subjects of cognitive linguistics, power laws and semantic distance, and trying to work out clearly in my mind how they are connected. In doing so, I can better explain what I mean by improving the quality of tags for video through VideoTag, and make understood what I perceive to be a higher quality tag.

So here goes:

There are three cognitive levels of tags: Superordinate, Basic and Subordinate. Basic level tags have the least cognitive cost to the user – that is, they are thought of more quickly. They are more likely to have a high frequency, as there is more likely to be agreement on basic level tags. Superordinate and subordinate tags have a higher cognitive cost. In relation to collaborative tagging, the superordinate level is difficult to assess. It is most likely that the superordinate tag for videos is video, and all basic and subordinate level tags then continue to categorise the video. When tagging a music video, for instance, basic level tags may refer to musical genre, e.g. rock, indie, dance; the tag music would also be a basic level tag, rather than a superordinate tag that defines the overall category, because it defines the genre of the video. Subordinate level tags, on the other hand, may reference the band name or more specific musical genres, e.g. techno, trance, emo, grunge, britpop. They may also name band members, cameo roles by celebrities in the video or characters in the video, or define the narrative of the video and any specific actions. Keywords taken from the song lyrics would also be classed as subordinate level tags.

The tag cloud below is of My Chemical Romance’s tags, chosen because they have 2 of the most watched videos on YouTube. The YouTube tags for my chemical romance famous last words are shown below (whilst I would categorise these tags as subordinate level based on the above definition, they also highlight how inadequate YouTube tags are at describing the videos).

my chemical romance tag cloud

This helps to explain the power law of tags. Tags in the larger font (e.g. emo, rock, alternative) are basic level tags; tags in a smaller font are of subordinate level. In this instance the superordinate tag would be music, but as it is a music site, all tags contained within it fall under the umbrella of the music superordinate tag. On a power law graph, the high frequency basic level tags would have high rank, while the subordinate level, low frequency tags have low rank and appear in the long tail.

The 80/20 rule can be applied here: agreement on terms can be measured as being 20%, based on the frequency of basic level tags. This leaves 80% of tags at subordinate level, which describe the resource but may only be of relevance to a few users. In terms of building rich descriptions of video, these subordinate level tags are imperative, as they go into more descriptive detail, have greater specificity and can provide a fuller picture of what the video is about.
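The 80/20 reading above can be sketched numerically (the counts here are invented for illustration): rank the distinct tags by frequency and ask what share of all tag instances the top 20% account for.

```python
from collections import Counter

# Invented frequencies: a few basic level tags carry most of the
# agreement, while many subordinate level tags sit in the long tail.
counts = Counter({"music": 40, "rock": 30, "emo": 20,
                  "gerard way": 3, "black parade": 2, "marching band": 2,
                  "eyeliner": 1, "funeral": 1, "grief": 1, "cancer": 1})

ranked = [freq for _, freq in counts.most_common()]
top = max(1, round(0.2 * len(ranked)))        # top 20% of distinct tags
head_share = sum(ranked[:top]) / sum(ranked)  # their share of all instances
print(f"top {top} tags account for {head_share:.0%} of tag instances")
```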

As for semantic distance, I have only recently started to read up on this, so I am not 100% sure of the connection. I think that subordinate level tags are more likely to be semantically narrow, because they are related by the basic level tag they are elaborating, making the basic level tags semantically broad.

So what is a high quality tag? In terms of improving descriptions of videos, it is a tag of subordinate cognitive level, low rank and low frequency, and it is semantically narrow. It is worth mentioning, though, that subordinate level tags are only useful when placed in context with the basic level tag they are adding extra description to. So VideoTag needs to encourage both sets of tags if it is to be useful as a tool to improve the accessibility and search of video.