The Brass Neck of Ordnance Survey

One of my interests is GIS – Geographical Information Systems – and other aspects of computerised and online mapping.  Thanks to Google Maps, it’s been possible for developers to create map-driven applications for nothing – Google allows access to their mapping infrastructure free of charge for many applications, and it’s brilliant.  To anyone who hasn’t had a play, take a look at Google Maps; for you programming types out there, take a look at the Google Maps API.
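
To give a flavour of what the API makes easy, here’s a minimal sketch in Python of calling Google’s geocoding web service over HTTP – the endpoint and ‘key’ parameter follow the documented JSON geocoding API, but exact response fields vary between API versions and you’d need your own API key.

```python
# A minimal sketch of calling the Google Maps geocoding web service.
# The endpoint and 'key' parameter follow the documented JSON geocoding
# API; response fields may vary between API versions.
import json
import urllib.parse
import urllib.request

def geocode(address, api_key):
    """Return (lat, lng) for a street address, or None if nothing matched."""
    params = urllib.parse.urlencode({"address": address, "key": api_key})
    url = "https://maps.googleapis.com/maps/api/geocode/json?" + params
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    if data.get("status") != "OK":
        return None
    location = data["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]

print(geocode("Fargate, Sheffield, UK", "YOUR_API_KEY"))
```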

Now, what really peeves me as a UK citizen is that our own Ordnance Survey – the folks who make maps – has no facility for getting hold of mapping data free of charge.  I am aware of a rather scrappy ‘trial set’ of data that is available for use with GIS systems, but honestly – the OS was traditionally funded by the UK Government and it is only in recent years that it has been spun off.  It should not be beyond the capabilities of the current Government – who’ve always whined about innovation and creativity being a driving force of British business – and the OS to make available a system similar to Google Maps, using UK-centric OS data, at negligible cost to software developers and end users.  That would actually make it easier to develop geographically based applications on the Web, on the mobile Internet and on our desktops.

But it hasn’t happened yet.  And this morning I find out about the ‘Geovation’ project – a project to attempt to generate innovative ideas based on the use of geographical data and concepts.  Hey, it’s supported by the OS!  I can see nothing on the site that suggests that there’s any OS data available to play with – indeed I think the only data set mentioned is Google Maps!

To be honest, this is shaping up to be an astonishing lost opportunity for the Ordnance Survey – they could have leveraged this project by making data, or even some sort of API, available at a reasonable cost for small businesses and at zero cost for non-commercial development and research.  It doesn’t look like that’s going to happen – I get the impression they’re going to lurk around picking up good ideas from people, take them back and see what money they can make from them.

I may be wrong on all counts – I genuinely and sincerely hope I am, and that there is a nice, cheap, API and full UK dataset out there waiting to support companies and individuals looking at the Geovation Challenge.  Why do I think there isn’t, though?

Real Time Search – how important?

Well, both Microsoft and Google have stated that they’re adding the capability to search Twitter feeds in real time to their search engines.  What does this mean to us mere mortals who tweet and search?

The example that I’ve seen given for the usefulness of Real Time Search (RTS) is to do with skiing – not a topic close to my heart, or one which I know much about.  My knowledge stops at things strapped to your feet and the requirement for snow…  Anyway, the example given is that you Google your favourite ski resort and, alongside the normal search results returned by Google, there would also be a number of relevant, recent Tweets that could, for example, include information about current conditions on the slopes.  The Tweets will appear based on their content or, if the Tweeter has set their account up accordingly, the location from which the Tweet has been made (a geocoded Tweet).  On a purely technical basis, this is quite something.  The hamsters powering Google’s servers will be running around in their wheels like crazy…
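
To make that concrete, here’s roughly the sort of query being folded in – a sketch against the original Twitter search API (the old search.twitter.com endpoint, which took a keyword plus an optional ‘geocode’ of latitude, longitude and radius; it has long since been retired, so treat this as illustration only).  The resort coordinates are made up for the example.

```python
# A rough sketch of the kind of query the search engines will be folding in.
# This uses the original Twitter search API (long since retired), which
# accepted a keyword plus an optional geocode of "lat,lng,radius".
import json
import urllib.parse
import urllib.request

def recent_tweets(keyword, lat=None, lng=None, radius_km=25):
    params = {"q": keyword, "rpp": 10}          # rpp = results per page
    if lat is not None and lng is not None:
        params["geocode"] = f"{lat},{lng},{radius_km}km"
    url = "http://search.twitter.com/search.json?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return [t["text"] for t in json.load(response)["results"]]

# e.g. tweets mentioning snow conditions near a ski resort
for text in recent_tweets("snow conditions", lat=45.92, lng=6.87):
    print(text)
```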

There has been an add-in available for a while for Firefox, using Greasemonkey, that does something similar, and the effect is pretty cool, although I’m yet to be convinced about the value of most Tweets in terms of conveying information meaningful to a lot of people, except in a few sets of circumstances.

As for the importance of this combination of Tweets and Search Engine results, it’s pretty early in the game to tell, but I have my own concerns and thoughts on the issue that I’ll share here.  And then in a few months’ time I can come back and either pat myself on the back or quietly remove this post…

Privacy

A little while ago I published this item – ‘Google and The Dead Past’ – in which I commented on the convergence of search technologies – Search Engine, Twitter and Facebook being three data sources – and expressed a fear that we might be moving very slowly towards a form of voluntary surveillance society, where our regular use of Social Networks would result in much of our lives being available for review on search engines in near real time if we weren’t careful.  Well, we now have Tweets being folded into the Search mix; I assume it won’t be long before Twitpics get included, and then, if Facebook open up their API to facilitate searching, my comments in that article come even closer to reality!

Of course, just as with standard Search Engine management on a website, it is possible to exclude your tweets from this search.  Google have had a few gremlins with this, but they’re getting there, and it’s likely that, were they ever to join the party, Facebook would do the same thing.  Whether people would avail themselves of these tools is another matter.

Relevance

Just how the search engines’ ranking systems will be applied to Tweets is an interesting question.  For example, Google’s PageRank algorithm relies on many things, including links to a page, links from it, the nature of the links, etc., as well as content.  This is simply not going to work on Tweets, so it’s safe to assume that some other form of relevance rating will be used.  And Bing will have something totally different – as will any other Search Engine involved in searching Tweets.  I am forced to wonder how relevant the results of Real Time Search will be.  Obviously it will improve with time, but so will the ability of spammers to game the system.
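
To see why, here’s a toy power-iteration sketch of the idea behind PageRank – an illustration only, not Google’s actual implementation.  The whole calculation is driven by the link graph; an isolated 140-character message has no in-links to feed it any rank at all.

```python
# A toy power-iteration PageRank over a dict of page -> outgoing links.
# Dangling pages simply leak rank here, which is fine for illustration.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            if not outgoing:
                continue                    # no out-links: nothing to share
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share
        rank = new_rank
    return rank

print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))
```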

Perspective

Those of us old enough to remember the TV news reports of the Falklands War in 1982 will remember that events could happen in the South Atlantic a good few days before we saw them on the news.  By the time of the First Gulf War, CNN was reporting on events as they happened from its own reporters, and within hours from the wider military theatre of operations.  By the Second Gulf War, in 2003, there were journalists embedded with infantry units, carrying satellite phones and digital cameras and literally reporting on ongoing fire-fights.  It’s been said that the Falklands were reported from the point of view of the Government, the First Gulf War from the point of view of the generals, and the Second Gulf War from the perspective of an infantry platoon leader or tank commander.

The result is that whilst the platoon leader’s point of view gives us immediacy, it allows no time for contemplation of wider issues.  And the immediate perspective of one person in a large news event can give a very distorted view.  I very much expect that Tweets in search results could easily give rise to ‘firestorms’ of rumour that flare up and then get corrected within minutes.  I’m not sure what impact this will have on news gathering, or on the general emotional health of people doing searches on news stories – seeing a view of the world that is built from the bottom up and that changes every few minutes.  Whilst this sort of immediate citizen journalism is great in theory, I’m not sure that it’s good in practice; tweets available to all through Real Time Search might manipulate the news as much as report it.

So…Real Time Search important?  Conceivably yes – but perhaps in the wrong way.

A good time to upgrade WordPress!

I’ve just upgraded various blogs I look after – including my own – to WordPress 2.8.5.  This release is regarded as a ‘hardening’ release by WordPress themselves, and if you’re reasonably up to date the upgrade is a piece of cake – the automatic installer does it all for you.

It might also be a good time to take a look at your WordPress setup in general.  Good practice with any website installation these days says that the less you have on a website, the fewer places there are for malware to hide, so one thing to do immediately is to remove any unused themes or plugins – use your FTP client to back them up if you can’t lay your hands on your originals.  If you do decide to change theme or use the plugins again, just reinstall them.  Whilst there are some nasties that can lurk in the ‘Default’ theme, it’s probably best to leave that one installed, because it gives you a fallback position if a plugin breaks your custom theme.

If you have statistics running, take a good look at any ‘spikes’ in the page views.  I use the WordPress stats package and find it perfectly adequate for my needs – which are basically stroking my ego to see if people are reading what I write.  Looking at my page views, I noticed a spike over three days early last week – twice as many hits on the site as usual.  Unless you’ve recently done a push for readership, or have blogged on a matter of wide interest, this can indicate a compromise of your site – as I found.

The stats package provides a list of the search terms used to reach the site.  Looking at things in more detail, I noticed that whilst the pages accessed were familiar to me, the search terms that were used to get there were most certainly not.  ‘Girlfriends boobs’ is not something I tend to write about on this site!!  Given that those terms must have been on the site somewhere to generate the hit, I took a look at the logs provided by my hosting company, and also wandered around my site with FTP.  From the logs, I DID find evidence of some dodgy looking links being accessed, buried in a sub-directory inside the WordPress installation.  However, checking with FTP revealed nothing – I realised that my upgrade to 2.8.5 had wiped out the evidence.  I’ve not had any similar strange search terms showing up since then.
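
If you fancy automating that sort of rummaging, something along these lines does the job – a quick-and-dirty sketch that scans an Apache-style access log for Google referrers carrying search terms that have no business matching your site.  The log path and the ‘suspect’ patterns are examples only; adapt them to your own hosting setup.

```python
# Scan an Apache-style access log for referrers whose search query ('q='
# parameter) contains terms that shouldn't match anything on your site.
# The file path and the suspect patterns below are examples only.
import re

SUSPECT = re.compile(r"(boobs|viagra|casino|porn)", re.IGNORECASE)
REFERRER = re.compile(r'"https?://[^"]*[?&]q=([^&"]+)')

with open("access.log") as log:        # wherever your host puts the logs
    for line in log:
        match = REFERRER.search(line)
        if match and SUSPECT.search(match.group(1)):
            print(line.strip())
```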

So – summing up:

  1. Keep upgraded.
  2. Remove anything you don’t need.
  3. Install some simple stats and watch Page Views for unexpected spikes.  Get a ‘feel’ for the normal sort of readership levels of your site.
  4. Keep an eye on search terms used to get to your blog.  If ‘odd’ search expressions turn up then start ferreting around. If you have a Google account, register your site with Google and keep an eye on unfollowable links, etc.  Learn what logs are available from your hosting provider and use them.

That’s my lesson for today on WordPress!  As for the upgrade – 2.8.5 works like a charm and has no bad habits that I can find!

The end of an era…

Joe's old Nokia...an excellent phone!

Well, the end of a personal era for me.  A number of years ago – I think 2003 or thereabouts – I needed a new phone.  I honestly can’t remember what I had before the trusty Nokia shown on the left.  I think it was a Motorola of some sort – similar sort of functionality, though: phone, SMS, Breakout game.  I remember walking in to one of the phone shops on Fargate and ogling the more modern phones, gaining the attention of a salesman and then immediately asking him for the cheapest PAYG handset he could offer me.  A few minutes later I left with the Nokia 3410 that cost me £25.  I stayed with T-Mobile so I could keep my existing phone number, and started my relationship with the most reliable and bombproof phone I’ve had to date.

The 3410 supported Java and apparently could connect to the Internet – quite what I’d have been able to do with a small green screen I have no idea, and I never managed to get the Internet connectivity working.  I don’t think I really spent a lot of time on it – after all, all I wanted from the phone was the ability to talk to people when I couldn’t be with them – something that Alexander Graham Bell would definitely approve of.  Even texting for me was something done in extremis – there was a time when the only occasions I sent text messages were from noisy pubs where I didn’t want to go outside into the freezing cold and lose my seat to make a call!

The Nokia was a brilliant little phone.  Not long after I got it I dropped it and cracked the fascia.  Fortunately my wife (who changes phones with greater frequency than I do) had a 3310 in the cupboard, so I ended up with the front and back fascias not matching – a sort of two-tone case.  Soon the case was getting adorned with little stickers – charity stickers after I’d given money, little stick-on ticks from my niece, etc.  It was beginning to look like the guitar case of a particularly well travelled pub-rocker.  The only thing I couldn’t do with it was open the darn case – it seemed to be put together in such a way that precluded access to nail-biting forty-somethings.  However, it could be coerced into opening up by my sticking it in my breast pocket and running for a bus – the phone would fly out, hit the pavement and then explode into four pieces: two halves of the case, the battery and the gubbins of the phone.  Opportunities like this were excellent for cleaning out cat hairs and other crud that had accumulated inside the phone.

After a while I began to realise that people remembered my phone – it was pretty much old fashioned when I bought it new.  As the new, all-singing, all-dancing handsets with more brains than your average X-Factor contestant hit the market, my magnificent machine was recognised amongst my friends as a steam-powered anachronism that fitted the personality of its owner.  You see, despite my professional engagement with technology, I’ve always been something of a Luddite in some ways.  I don’t like being on the bleeding edge of things, or for that matter the leading edge.  God created ‘early adopters’ to experience the bugs and foul-ups that normal people shouldn’t have to.  I was writing web services and partaking in ‘meeja’ projects with a phone that was appropriate for an operative from Warehouse 13 or a character in a Steampunk novel.  I had one client who came in to the office one day holding an old 3410 and offering it to me for ‘spares’ – I think rumours started that I actually planned to hand-maintain the phone, and that in years to come, whilst the rest of the world revelled in smartphones with a higher IQ than their owners, I’d still be chattering away into my green-screen, steam-powered telephone.

Well, sure enough, all good things had to come to an end.  Earlier this year I noticed that somehow some cat fur had managed to get into the display, and a couple of the keys were starting to get rather ‘sticky’.  I also had increasing numbers of people wanting to text me (and expecting a text message back – a process which, given my inability to text through a numeric keypad, tended to drive me to the closer edges of insanity!).  And I was getting increasing amounts of spam.  Some time ago I rather stupidly used the number as a contact number on the WHOIS registry of web domains – it escaped from there and seems to now be doing the rounds of insurance companies and other financial services organisations, generating lots of people trying to ring me to sell me financial services…

In June of this year I received a phone call from BT – my Broadband supplier – asking me whether I had a mobile phone contract, and if not, would I be interested in a smartphone of some sort.  The price was not much more than I was paying on PAYG, and it included more than enough minutes and text messages… and whichever one I went for had a real (if small) QWERTY keyboard… and other gadgets like an MP3 player and Internet connectivity!  And so I’ve ended up with a Crackberry… OK… a Blackberry, just in case the manufacturers are reading!

And I’m delighted.  Actually, I think that some who know me may well think I’m besotted with the darn thing.  It’s a very capable and well equipped piece of kit, and is the only phone I’ve ever had that I ‘fondle’ when I’m not actually using it.  It’s got a nice little Facebook application and a Twitter application, I stuck a memory card in and put my music on, I can take photos with it, it has a great contacts book and diary, AND I can make phone calls and send text messages at a rate that satisfies my texting friends!

AND – I can receive photos of my dear God-daughter when she’s out on travels with mum and dad!

So… I wonder how long the new phone will last me?  The 3410 was with me for five or so years; I’d like to give the Crackberry at least that long – the last thing I want to do is get in the habit of regularly changing phones!!

Wolfram Alpha – how not to make friends and influence people!

Hmmm… this is becoming WA corner recently – take a look at my previous piece here.  I was less than impressed with the technology and considered it either over-hyped or released into the world too early.  However, I did hope that as time progressed there might be improvements in the result sets returned and, more importantly from a developer’s point of view, an API published that would allow developers to build new applications to stretch and maybe improve WA.

So, this week an API was announced for Wolfram Alpha on the company’s Blog, and I was pretty excited about the prospect of trying out a few things.  Despite my grumbles about the results returned, I was hopeful that with a suitable API encouraging third party developments, the underlying technology and data sets at WA might see an improvement.  My hopes survived for as long as it took me to start reading the small print – in particular this little document, the price list.  Now, I’m aware that WA has cost money to develop, but charging developers to make use of the API seems to be one of the dumbest and most counter-productive things they could do.  There are some ‘pioneer grants’ available for developers, but I get the impression that these are still likely to involve shelling out money.
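
For the record, a call against the announced API looks something like this – a minimal sketch using the v2/query endpoint and ‘appid’ parameter as documented at launch (the XML pods returned will vary by query).  The point to note is that every single call of this kind counts against the bundle you’ve paid for.

```python
# A minimal sketch against the Wolfram Alpha query API: the v2/query
# endpoint takes an 'appid' and an 'input' and returns XML 'pod' elements.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

def ask_wa(question, appid):
    params = urllib.parse.urlencode({"appid": appid, "input": question})
    url = "http://api.wolframalpha.com/v2/query?" + params
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)
    for pod in tree.iter("pod"):        # each pod is one block of results
        print(pod.get("title"))

ask_wa("compare rainfall london washington", "YOUR_APP_ID")
```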

Google do not charge developers for use of their APIs until you start using them in ‘closed’ systems and with a large number of calls.  They certainly don’t charge you during the development cycle – they have more sense.

Now, let’s assume I wanted to develop an API-based application for WA – what we in the trade call a ‘proof of concept’ model – i.e. something that proves whether or not the bright idea we sketched out on the back of a beer-mat in the pub will actually work.  How many requests might I get through to develop such an application?  Well, the other day I wrote some code to retrieve data from a Postcode / Geocode system’s API.  This was a VERY simple application – send a postcode, retrieve a list of addresses; send a code number, retrieve a full street address with map reference.  Let’s say two calls to the remote API for something very straightforward.  During code development and ‘in house’ testing I made about 30 or 40 API calls.  During more formal testing on the client site that’s going to increase somewhat – probably into the low hundreds.  And this is for a problem with a well defined structure and a finite returnable answer set – i.e. lists of addresses, a single address or nothing at all, all in a set, predictable format.
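
The code in question boiled down to something like the following – a sketch against an entirely hypothetical postcode service, with the endpoint, field names and postcode made up for illustration.  The point is how quickly even a trivial two-call integration racks up API requests.

```python
# The two-call flow described above, against a hypothetical postcode API.
# The endpoint, field names and postcode are invented for illustration.
import json
import urllib.parse
import urllib.request

BASE = "https://api.example-postcodes.co.uk"    # hypothetical endpoint

def addresses_for_postcode(postcode):
    """Call 1: postcode in, list of candidate addresses out."""
    url = BASE + "/lookup?" + urllib.parse.urlencode({"postcode": postcode})
    with urllib.request.urlopen(url) as r:
        return json.load(r)["addresses"]

def full_address(address_id):
    """Call 2: candidate id in, full street address plus map reference out."""
    url = BASE + "/address/" + str(address_id)
    with urllib.request.urlopen(url) as r:
        return json.load(r)

candidates = addresses_for_postcode("S1 2HE")
print(full_address(candidates[0]["id"]))
```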

By the very nature of the sort of problem that WA has been set up to deal with, the problems passed up via an API are unlikely to be as well defined, and the result sets returned are also unlikely to be as simple to deal with as my addresses.  When I did some API work with Google for a client, I found I was generating hundreds of API calls and responses during development, let alone testing.  For WA, I’m looking at $60 for 1,000 API requests, and $0.08 for each additional request beyond the thousand I initially pay for.  Obviously I can buy a bigger bundle, but the implication is clear – it ain’t gonna be cheap developing for the WA API.
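
To put numbers on that, here’s the arithmetic using WA’s published prices – $60 buys the first 1,000 requests, and each request beyond that costs $0.08:

```python
# Cost of n requests under WA's published pricing: $60 for the first
# 1,000 requests, then $0.08 for each additional request.
def wa_cost(requests, bundle=1000, bundle_price=60.0, extra_price=0.08):
    if requests <= bundle:
        return bundle_price
    return bundle_price + (requests - bundle) * extra_price

for n in (100, 1000, 2500, 10000):
    print(f"{n:>6} requests: ${wa_cost(n):,.2f}")
```

Even a modest development-and-testing cycle running into the low thousands of calls is real money before a single line of production code ships.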

API developments typically involve a learning curve for the API syntax and methods of use.  This is par for the course and to be expected.  However, when the API is interfacing to a curated data set like WA, we have an additional problem of whether the data set will actually contain the sort of data that we’re wanting to get back.  And whether it will be available in the sort of format we’re interested in.  And whether the curated data is timely compared to the data that is being made available through non-curated data sets like those available via Google – or other APIs, for that matter.  Clearly, if your problem space IS covered by WA and the data set WA has available contains what you want in the format in which you want it, then perhaps the API fee is worthwhile.  But for those developers wanting to try something new out, they’re most likely to look to free APIs to test their ideas, and spend time and energy working the wrinkles out in an environment that isn’t costing them pennies for the simplest query.

I’m afraid WA have dropped the ball big time here; by charging for ALL development use of the API they’ve alienated a large source of free development and critical expertise.  Look at how Google has benefited from the sheer number of developers doing ‘stuff’ with their various APIs.  Can you imagine that happening had they charged all the way?  Hardly likely. 

If WA were to make a limited ‘sandbox’ set of data available for developers via a free-of-charge API, that would at least allow developers to get the wrinkles out of their code.  The company could then charge for use of the ‘live’ WA datasets, and would have the additional advantage that the code being run against the live system would be reasonably bug free.  By charging from the first line of code written, they’re restricting the development of their own product and driving people into the arms of Google, Amazon, Bing and the like.  WA doesn’t appear to be offering a lot that is truly revolutionary: a so-so natural language query interface against a curated data set.  I doubt it will be long before third party developers start producing the same from Google.

Google and ‘The Dead Past’

Earlier this year we saw the launch of Google’s Street View system – here – and with it came a plethora of complaints about the invasion-of-privacy implications.  I was one of the complainants – Google had a view right through my house, showing people in the house.  To be honest, their reaction was swift and the imagery was removed, but it was an invasion of privacy and I’m still to be convinced that there is any long term gain to be had from the system.  Yes, I’m aware of all the ‘well, you can see what a neighbourhood’s like before buying a house there’ arguments, but if you do all your checking out of the largest investment you’ll ever make on the Internet, then you deserve to find yourself living between a crack den and a student house.

Enough…step back and breathe…the title of this piece is ‘Google and ‘The Dead Past’ – now what on Earth do I mean by that?

Science Fiction aficionados amongst you may recognise part of the title as coming from an old story by Isaac Asimov, in which a researcher develops a time viewer to look into the past, only to eventually realise that the past starts just a fraction of a second ago – for all practical, human purposes, the past, to his machine, is identical to the present.  He’s accidentally invented the world’s finest surveillance machine.  As a character says at the end of the story: ‘Happy goldfish bowl to you, to me, to everyone, and may each of you fry in hell forever.’

Now, there’s a looooong way to go between Google and eternal damnation through surveillance, but as is often pointed out, the road to Hell is paved with good intentions and always starts with a single step.  Let’s do some of that old-style extrapolation, though, and see what we’ve got coming up in our future.  Here are a few things that have been posited and talked about as being part of our online future – some of which are already here, some of which are extrapolations, all of which are technically feasible, if not yet politically acceptable.

  1. Decreased latency between changes in the online world and those changes turning up in Search Engines.  At the moment we might expect a day or so even on busy sites regularly trawled by search engines – a possible future is that items get folded into search space within hours.  We’re also already heading towards Tweets being searchable – perhaps future APIs will allow combined searches of Facebook, Twitter and general webspace all in one shot?
  2. Use of ‘mechanical turk’ approaches to encourage people to use their spare time to classify images, scan online video, etc., tagging media that are currently not searchable by search engines in their raw form.  Imagine that being done in near real time.  DARPA are already researching tools to extract context out of text and digitised speech; perhaps some degree of automated scanning of video will follow.  And it’s not outlandish to suggest that what might be useful for the military will sooner or later find its way into civilian online life.
  3. The possibilities inherent in IP Version 6, with its massively enlarged Internet Protocol addressing space, make it easier than ever to ensure that everything that can have a separate IP address will have a separate IP address.  Combine that with the geolocation capabilities that come with reduced-cost GPS chipsets – many phones now have GPS built in – and the tracking of devices (and their owners) in real time or near real time, sold to us as an extension of the social media experience, becomes a reality.
  4. The increasing usage of ‘Cloud’ computing, where everything about you is stored not on your computer or phone but on a ‘cloud’ storage system run by your phone company (T-Mobile?), software supplier (Microsoft?) or media seller (Amazon?), puts all your digital life into the network – where it can be scanned and examined in transit or in storage.

Add to these technical advances the willingness of people to share their activities via social media (and eventually the commoditisation of their activity patterns and media interests, as ISPs and phone companies realise that people will give up a lot of privacy for cheaper connectivity) and we are perhaps heading towards the science fiction scenario described above.

If people were concerned about the impact of Street View on their lives – a single snapshot taken as a one-off – imagine the possible impact of your real-life world being captured as a mosaic by different sources and then being rendered and made searchable by interconnected search tools.  A phone call positions you in one place; photographs taken on the same phone and geo-tagged by the software are sent to a searchable social media site, and so identify who you were with and when.  You show up in other photos, as a recipient of a call from another phone, and so on.  The other evening I was asked ‘Who doesn’t want to be tagged in these photos?’ – the new social nicety for people who are concerned about the privacy of their friends.  Sooner or later I’m certain that nicety will slip by the wayside, and it will be up to us to police our own image online.

A recent business enterprise in which people are being asked to monitor CCTV cameras in their spare time – Internet Eyes – may be regarded as distastefully intrusive, but I do wonder whether it’s the start of a whole range of ‘mechanical turk’ type activities where people are encouraged to act as high-tech lace-curtain twitchers.  The past is not looking so dead anymore.

Are you feeling spied on yet?  If not, I’m sure you soon will be.

Wolfram Alpha – released too early or over-hyped?

In case you’re saying, “Wolfram what?”, here’s a little reading:

http://www.wolframalpha.com/

http://www.bbc.co.uk/blogs/technology/2009/05/does_wolfram_work.html

http://news.bbc.co.uk/1/hi/technology/8052798.stm

http://www.guardian.co.uk/news/blog/2009/may/18/wolfram-review-test-google-search

http://www.theregister.co.uk/2009/05/19/dziuba_wolfram/

http://www.theregister.co.uk/2009/03/17/wolfram_alpha/

http://www.theregister.co.uk/2009/05/18/wolfram_alpha/

 

OK – I’ll start by announcing a vested interest here.  I occasionally write software that attempts to make sense out of straight English questions and phrases, and then by cunning trickery makes the response from the program appear ‘sensible’ as well.  So I know something about how to make software appear smarter than it actually is.  And I’m afraid that at first glance I regard Wolfram Alpha as over-hyped, under-delivering and pretty much unsure of its position in the world.

But, the folks at Wolfram Research score highly for getting the coverage they’ve managed!

WA is described as a Computational Knowledge Engine, rather than a search engine.  However, its raison d’être is to answer questions, and nowadays any piece of software on the Internet that does that is always going to be regarded by users as some sort of search engine – and the ‘Gold Standard’ against which all search engines tend to be judged is Google.  So, first question…

Is it fair to compare WA and Google?

Not really, and Wolfram himself acknowledges this.  WA is regarded by the company as a means of getting information out of the raw data to be found on the Web, and it does this by having what’s called ‘curated’ data – that is, Wolfram’s team manage the sources used for the data and also the presentation of the data.  This makes it very good at returning solid factual and mathematically oriented data in a human readable form.

Whereas Google will return you a list of pages that may be useful, WA will return data structured into a useful looking page of facts – no links, just the facts – and a list of sources used to derive the information.  The results displayed are said to be ‘computed’ by Wolfram Research, rather than just listed, as is the case with a search engine.

Is it a dead end?

WA relies on curated data – that is, a massaging and manipulation process to get the existing web data into a format that is searchable by the WA algorithms and that is then also presentable in a suitable format for review.  This is likely to be a relatively labour intensive process.  Let’s see why…

In a perfect world, all web data would carry ‘semantic tagging’ – basically additional information that makes the meaning of a web page explicit.  Google, for all its cleverness, doesn’t have any idea about the meaning of web page content – just how well or poorly a page is connected to other web pages and what words and phrases appear within it.  They do apply a bit of ‘secret sauce’ to attempt to get the results of your search closer to what you really want, assuming you want roughly the same as others who’ve searched the Google search space for the same thing.  Semantic tagging would allow a suitably written search engine to start building relationships between web pages based on real meaning.  Now, you might just see the start of a problem here…

If a machine can’t derive meaning from a web page, then the semantic tagging is going to have to be human driven.  So for such a tool to be useful, we need some way of ensuring that as much web data as possible gets tagged.  Or we start from tomorrow, say that every new page should be tagged, and write off the previous decade of web content.  You see the problem.
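
A toy illustration of the gap, with made-up pages and tags: a keyword index can only tell you that a word appears somewhere, whilst semantic tags – sketched here as crude subject/predicate/object triples – carry the meaning a machine could actually reason over.  And note that somebody had to write those triples by hand.

```python
# A keyword index can't tell a car maker from a big cat; semantic triples
# (made up here for illustration) can.
pages = {
    "page1": "Jaguar launches new model at motor show",
    "page2": "Jaguar population declines in Amazon basin",
}

# What a conventional engine sees: both pages 'contain jaguar'.
keyword_hits = [p for p, text in pages.items() if "jaguar" in text.lower()]
print(keyword_hits)                                  # ['page1', 'page2']

# What semantic tagging would have to add - by hand, page by page:
triples = [
    ("page1", "is_about", "Jaguar Cars Ltd"),
    ("page2", "is_about", "Panthera onca"),
]
print([s for s, _, o in triples if o == "Panthera onca"])   # ['page2']
```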

What the WA team have done is take a set of data from the web, massage and standardise it into a format that their software can handle, then front-end this system with a piece of software that makes a good stab at natural language processing to get the meaning of your question out of your phrase.  For example, typing in ‘Compare the weather in the UK and USA’ might cause the system to assume that you want comparative weather statistics for those two countries.  (BTW – it doesn’t; more on this later.)

The bottom line here is that the data set has had to be manually created – something that is clearly not possible on a regular basis.  And a similar process would have to be carried out to get things semantically tagged.  And if we COULD come up with a piece of software that could do the semantic analysis of any piece of text on the web, then neither of these approaches would be needed anyway.

In a way, WA is a clever sleight of hand; but ultimately it’s a dead end that could potentially swallow up a lot of valuable effort.

Is it any good?

The million dollar question.  Back to my ‘Compare the weather in the UK and USA’ question.  The reason I picked this was that WA is supposed to have a front end capable of some understanding of the question, and weather data is amongst the curated data set.  I got a ‘Wolfram|Alpha isn’t sure what to do with your input.’ response.  So, I simplified and gave WA ‘Compare rainfall london washington’ – same response.  I then went to Google and entered the same search, and at the bottom of Page 1 found a link – http://www.skyscrapercity.com/showthread.php?t=349393 – that had the figures of interest.  Now, before anyone starts on me, I appreciate that the data that would have been provided by WA would have been checked and so would be accurate.  But I deliberately put a question to WA that I expected it to be able to answer if it was living up to the hype.

I then gave WA ‘rainfall london’ as a search and got some general information (not a lot) about London.  Giving ‘rainfall london’ to Google found links to little graphs coming out of my ears.  A similar Google search on ‘rainfall washington’ gave me similar links to data on Washington rainfall.

WA failed the test, I’m afraid. 

Will it get better?

The smartness of any search tool depends upon the data and the algorithms.  As we’re relying on curated data here, improvements might come through modifications to the data, but that could require considerable effort.  If the algorithms are ‘adaptive’ – i.e. they can learn whether the answers they gave were good or bad – then there might be hope.  This would rely on a feedback mechanism from searchers to the software, basically saying ‘Yes’ or ‘No’.  If the algorithms have to be hand crafted, improvement is likely, BUT there is the risk of over-fitting the algorithms to suit the questions that people have asked – not the general searching of what MAY be asked.

And time passes…

As it turned out, this post never moved from ‘Draft’ to ‘Published’ because of that thing called ‘Life’.  So, a month or two have passed, and I’ve decided to return to Wolfram Alpha and see what’s changed….

Given the current interest in the band Boyzone, I did a quick search.  WA pointed me to a Wiki entry – good – but nothing else.  Google pointed me to stacks of stuff.  ‘Average rainfall in London’ got me some useful information about rainfall in the last week.  OK… back to one of my original questions, ‘Compare rainfall London Washington’ – this time I got the London data with the Washington equivalent on it as well – sort of what I wanted.  Google was less helpful this time than back when I wrote this piece.

So… am I more impressed?  Maybe a little.  Do I feel it’s a dead end?  Probably, yes, except in very specific areas that might already be served by things like Google and Wiki anyway.

Do I have an alternative solution for the problem?

If I did, do you think I’d blog it here and expose myself to all that criticism? 🙂

Am I a twit not to Twitter?

OK… I remember a year or so ago saying I’d never join Facebook, and then making myself look a pudding within a month or so when I started using Facebook to keep in touch with friends after I stopped using another online service.

Now, around the same time I also made a brief investigation of the Twitter service – some more information here.  Whilst I can’t argue with its popularity – it has attracted a vast amount of traffic and interest, including being used in the Australian bushfires and the Mumbai terrorist attacks – I’m yet to be convinced of the value of telling the world precisely what I’m doing in 140-character chunks.

Let’s face it, I’m too busy / idle to maintain my Facebook status more than once a day on average, so the chances of me managing to ‘tweet’ happily several times a day on the Twitter system are minimal.  And I’m not convinced of the overall value of most of the content that seems to be generated on Twitter; allow me to explain.

Too short!

To begin with, 140 characters is shorter than an SMS message, and unless you’re skilled at putting highly informative short messages together, the informational content is limited purely by the size of the message – unless you send a string of them.

Too distracting!

We then move on to whether Tweeting encourages the attention span of a boiled potato; it’s a disruptive technology in all the wrong ways – it simply disrupts your attention with a string of pointless inanities appearing on your phone, in your Twitter client or in your web browser.

What does it do that other media doesn’t?

In terms of brevity you have SMS messages or Facebook statuses.  In terms of information content you have Email, blogs or Forum posts.  Tweets are ephemeral – they’re not naturally persistent and are as short lived as real birdsong.

So, what the Hell is it all about?  I’m aware of the use of this sort of technology in crisis situations, but is this genuinely making appropriate use of the available technology?  I’m yet to be convinced that Twitter is anything but another toy for the technorati, and one whose lifespan in its current form is probably going to be limited by the emerging financial realism in the world.  I’ve heard of alternative uses – people using hardware to automatically place Twitter messages into the ‘twittersphere’ from such things as potted plants and that old standby of IT departments, the drinks machine.  These messages are then picked up by a piece of software listening on Twitter for ‘tweets’ from the appropriate account.  This is nothing different to using UDP packets, for example, but at least there’s a more easily accessible interface here.
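
For the curious, one of those tweeting drinks machines might have worked something like this – a sketch against Twitter’s status update call as it stood at the time, when plain HTTP basic authentication was still accepted (it no longer is, so this is strictly illustrative; the account name is invented).

```python
# A sketch of how a tweeting device might have posted a status update via
# Twitter's old statuses/update call, back when HTTP basic authentication
# was still accepted. Strictly illustrative; the account is invented.
import urllib.parse
import urllib.request

def tweet(status, username, password):
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, "http://twitter.com/", username, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    opener = urllib.request.build_opener(handler)
    data = urllib.parse.urlencode({"status": status[:140]}).encode()
    opener.open("http://twitter.com/statuses/update.json", data)  # POST

tweet("Drinks machine: water tank low again", "officedrinksbot", "secret")
```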

But I’m not convinced – someone, anyone, convince me of the value of this application, PLEASE!

You pays peanuts…..

And you get monkeys.

I assume most of us have heard this phrase.  It’s become almost a mantra with me in my professional life, because the last six months have exposed me to an aspect of the freelance world that I’d not been aware of until now: the fact that there are a Hell of a lot of people out there expecting a lot of work for next to nothing!

Allow me to elaborate… I get most of my work through ‘word of mouth’ – this has always been the way, and after 20-odd years in IT it seems to have worked well.  But I still like to chase the odd new client – after all, nothing wilts faster than laurels that have been sat on, as they say.  In many ways, the availability of Internet sites that allow people wishing work to be done to advertise their requirements for people like me to pick up should have made things easier, but it hasn’t.

In fact, I’m beginning to regard such sites as one of the worst things that has happened to ‘professional’ freelancers and contractors, because they have totally distorted the market. Don’t get me wrong; I’m a firm believer in market forces but these sites are actually pushing the markets for freelance development work to the brink of extinction. And this isn’t going to be a rant about out-sourcing…

My concern is that people are posting requests for work like the following:

“Develop a highly interactive and very aesthetic media review website. A good example is Yahoo! TV. The site is going to cater for commercial considerations i.e web ads. Want a site that would load fast as well.
Hence, beautiful but efficient. Must do the job. “

This is a real advert, tweaked for punctuation and spelling in two places.  Now – this isn’t a hobby site, and it’s not a charity.  The poster is open about the fact that there will be advertising and that the site will be catering for ‘commercial considerations’.  That’s the full ‘job brief’ against which people are expected to bid, by the way.  Now, let’s assume that we can put something together like the Yahoo TV site – here – and ignore the content and imagery side of things for now.  It’s got forums, photo galleries, all sorts of cute stuff.  I wouldn’t even want to try tackling it – a wise man knows his limitations, after all.  But I can guess the sort of development time – you’re looking at a minimum of 2-3 man-months here, I’d estimate.

And the suggested budget?  £250.  Yes, Two Hundred and Fifty Pounds.  No missing zeroes.

I cannot imagine the most desperate outsourcer being willing to work for that sort of money, let alone a programmer in the UK, US or Europe.

Oddly enough I came across this today:

http://technology.timesonline.co.uk/tol/news/tech_and_web/the_web/article5483244.ece?token=null&offset=0&page=1

An article in the Times dealing with Amazon’s ‘Mechanical Turk’ project, which harnesses the available time of people to do online jobs of various sorts – jobs where you might be expected to work for a couple of pence an hour, if that.

Digital exploitation?  You betcha.  Contrast that with projects that rely on the good nature of people to get things done – projects where the bottom line is a better, publicly and freely available service, rather than profits to corporations who can already dictate terms to much of the online world.

Some years ago I was involved in film making and there was a very rich culture of ‘No-budget’ filming, where productions were put together with no budget except for the essentials of film stock or tape – everything else was borrowed, begged or blagged.  But part of the contract was that anyone involved would get a copy of the material for their own portfolio and an on-screen credit – ‘Credit and VHS’ – as well as being fed and watered on set.  This model could, of course, be exploited but rarely was, because the world of film making was relatively insular and someone pulling a fast one would immediately find it difficult to crew-up next time around.

Perhaps we need to start being similarly watchful in the information marketplace?