Archive for the 'Meta' Category

A new year…

…a new blog.

I wasn’t neglecting this blog enough, so I decided I needed to make another one.

No, seriously, this all makes perfect sense in my head. Really.


On internet exposure

Last week I put three photos on Flickr.

[Photos: “Green-purple-red”, “Westpac's Batteries Are Low”, “Without a leg to stand on”]

All three were uploaded at the same time (within a minute or so). All three are tagged with “Sydney”, two to four descriptive words, the lens I used, and a location. I didn’t add any of them to any groups. All three were linked from my Facebook feed and were visible in the sidebar of this page, but as far as I can tell (from the referrer list on the Flickr stats page) that didn’t contribute more than a couple of hits.

In the three or four days since I put them up, the first and third have been viewed 6 and 3 times respectively, which is about average for me. But the one in the middle (the Westpac building) has been viewed 38 times.

38 isn’t a huge number by any standards, but it’s a very clear outlier.
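
To put a rough number on “outlier”, here’s a back-of-the-envelope sketch in Python using only the three view counts quoted above. The helper name `top_share` is made up for illustration.

```python
def top_share(view_counts):
    """Return the fraction of all views that went to the single most-viewed item."""
    total = sum(view_counts)
    if total == 0:
        return 0.0
    return max(view_counts) / total


# View counts for the three photos, in upload order (first, Westpac, third).
sydney_views = [6, 38, 3]
print(f"{top_share(sydney_views):.0%} of the views went to one photo")
# → 81% of the views went to one photo
```

In other words, one photo out of three accounts for roughly four fifths of the attention.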

No one has commented on it, or added it as a favourite, or, as far as I can tell, linked to it from anywhere else. It’s not obviously a better photo than the other two or any others that I’ve uploaded recently.

There are a few possible explanations… One is that one or both of the “westpac” and “low battery” tags are interesting enough that thirty-odd people have searched for them, but not so interesting that other people have used the same tags and pushed mine off the top of the search results. (The latter part of this theory seems to be true, at least.) Another is that I’ve struck a chord with the inscrutable whims of Flickr’s “interestingness” measure.

Whatever. The point is that somehow something I did got caught in a local eddy of the chaotic system of internet popularity, and attracted more attention than everything else I’ve done in the last month put together. That doesn’t count the dust storm photo, which in about two days was viewed more times (900-ish) than any of my other photos.

Of course, DMM had a photo of the dust storm that got 8605 views. But then, that made the Flickr Explore page. Of his six photos of the dust storm, it was the only one with more than 2000 views.

Where I’m going is this… Exposure on the internet seems to have a sort of exponential growth behaviour. Getting bumped from “complete obscurity” up one rung to “noticed for a brief moment” is at least an order of magnitude. And the same thing happens at every level above that – there’s no such thing as slightly more exposure, only lots more exposure. Regardless of how well-known any person is, a very small percentage of the work they’ve done will make up a very large percentage of what people have seen.

When I spell it out like this, it’s actually pretty obvious. And not at all restricted to the internet.


Unified communication

A familiar part of being an online-dweller in the Web 2.0 world is that moment when you have to make a decision along the lines of “should I express this thought as a blog post, a tweet, a Facebook status update, an email, a forum post, an edit to a wiki, a…”

I assume I’m not the first to have the follow-up thought: “why do I have to decide?” We seem to have ended up with a handful of semi-standard ways to communicate with a wide audience, no two of which serve exactly the same purpose, and the result is that you have to either choose between them (which restricts the audience) or redundantly post to all of them (which fragments the resulting discussion).

Got my Google Wave invite last week, and so far I haven’t done much with it, but it does seem to be an attempt to face the problem of unifying a bunch of different ways to communicate – albeit by adding another different way to communicate.

I was about to say that the drawback of Wave is that it’s another instance of Google Owns All The World’s Data (at least we can be grateful that it’s not Microsoft), but I just checked the Wikipedia page (naturally) and apparently the plan is to eventually let anyone run a Wave server. So that’ll be interesting to watch.

Anyway, the point I was getting to was that I recently started having random thoughts about what a Grand Unified Social Network would look like. That led me to the Wikipedia page (again, naturally) on distributed social networks, but so far the term seems to refer mainly to attempts to get Facebook contact lists and whatnot into the open. Okay, maybe that’s just because the term is “distributed social network” not “distributed communication network” or something, but still.

What I’m vaguely imagining (and I’m more or less making this up as I go) is something where I post some data (say, a tweet/status-update-like message, but whatever) to my local Grand Unified Social Network Server (uh… GUSNS?) with some kind of visibility settings. Depending on the visibility settings, it pushes and/or makes it visible to other users on the same local server, and/or pushes it upstream to some kind of GUSNS hub in a vaguely DNSey way, which distributes it to other users. It might be attached to an existing message/user/object, like a conversation thread. And, unless it’s explicitly public, it’s encrypted using a key known to the circle of people to whom it’s supposed to be visible.

Okay, that description makes even less sense than it did in my head. Obviously I haven’t been thinking about this long enough.
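
For what it’s worth, the routing half of that idea can be sketched in a few lines of Python. Everything here (the class names, the three visibility levels, the single hub) is an assumption made up for illustration rather than an existing protocol, and the encryption step is left out entirely.

```python
from enum import Enum


class Visibility(Enum):
    LOCAL = "local"    # only users on the author's own server
    CIRCLE = "circle"  # a named set of users, possibly on other servers
    PUBLIC = "public"  # anyone on any server


class Message:
    def __init__(self, author, body, visibility, circle=frozenset(), parent_id=None):
        self.author = author
        self.body = body
        self.visibility = visibility
        self.circle = frozenset(circle)  # only consulted for CIRCLE messages
        self.parent_id = parent_id       # optional thread/object it attaches to


class Hub:
    """Hypothetical upstream hub that relays messages between servers."""

    def __init__(self):
        self.servers = []

    def relay(self, message, origin):
        for server in self.servers:
            if server is not origin:
                server.deliver(message)


class Server:
    """Hypothetical local server holding one inbox per user."""

    def __init__(self, users, hub):
        self.users = set(users)
        self.hub = hub
        self.inboxes = {}
        hub.servers.append(self)

    def deliver(self, message):
        # Put the message in the inbox of every local user allowed to see it.
        for user in self.users:
            if user == message.author:
                continue
            if message.visibility is Visibility.CIRCLE and user not in message.circle:
                continue
            self.inboxes.setdefault(user, []).append(message)

    def post(self, message):
        self.deliver(message)
        # Anything not restricted to this server also goes upstream.
        if message.visibility is not Visibility.LOCAL:
            self.hub.relay(message, origin=self)


hub = Hub()
home = Server({"alice", "bob"}, hub)
away = Server({"carol", "dave"}, hub)
home.post(Message("alice", "hello circle", Visibility.CIRCLE, {"bob", "carol"}))
print(sorted(user for server in (home, away) for user in server.inboxes))
# → ['bob', 'carol']
```

A real version would need the “vaguely DNSey” part (more than one hub, and some way to find them) plus the per-circle keys, which is where most of the actual difficulty lives.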

Maybe I should go and read up on Wave and XMPP and whatnot and see if this is all just already done.


Androidy goodness

Test post from wpToGo on my new(ish) HTC Hero. I may well be sitting in front of a computer that would be much easier to type on, making this a pretty pointless exercise, but hey, gadgetlust is gadgetlust.


Comprehension test

The CAPTCHA war is starting to get old.

Unfortunately, it seems that the fundamental flaw in a CAPTCHA, that makes it different from the proper Turing test, is that it’s administered by a computer, not a human. If I had a chance to hold a brief conversation with the comment-leaving entity, I’d be able to tell the difference between a human and a spambot trivially. But the webserver is the one holding the conversation with commenters, not me.

Right there is an interesting point though: This is a blog. Someone leaving a comment is participating in a conversation that started at the top of the page. Shouldn’t I be able to lean on that somehow, to only allow comments that move the conversation in a human-like direction?

I suppose that’s achieved by turning moderation on. And I do delete spam comments afterwards. The problem isn’t quite that I’m not in a position to judge who’s human, it’s that I want the comment to appear on the site as soon as it’s entered, without my interaction. Hmm.

Now, suppose that, instead of a CAPTCHA, each post had some kind of comprehension test. To enter a comment, you need to provide a short answer to a question that will be obvious if you’ve read the post. Not only will this prove difficult for spambots, but it will also filter out human spammers and people who don’t read the post before commenting.

Gosh, where do I start on the problems with this plan. It means that I have to put a bit of extra thought into each post to come up with a question – it has to be unambiguous, but ideally not just a single word gleaned from the post, because that would be vulnerable to a spambot just trying every word. It can’t be so tricksy that it blocks legitimate commenters (a problem it shares with CAPTCHAs, but along a different axis).

Then there’s the problem of what to do with all my old posts that don’t already have a question – especially since they’re the ones attracting all the spam (presumably the ones that rank high on Google or something). In fact, I might need to regularly change old questions, because unlike the current CAPTCHA system which has a (semi-)unique challenge each time you load the page, the comprehension question always stays the same, so someone can throw an indefinite amount of spam at an abnormally popular post by just answering the question once. I don’t know enough about the mechanics and motivations of spamming to know whether that’s something anyone would want to do, but it seems like a big hole.

Encouragingly, though, there are some upsides. I already mentioned that it’s also an obstacle to human spammers of various kinds. It also doesn’t have the accessibility problems that CAPTCHAs have (for visually impaired readers and such). And it might actually be interesting to embed information in every post – it’d be like a whole series of mini-puzzle-making exercises. Not for everyone, but I might enjoy it. Maybe.

Some of you will presumably wonder why I don’t just turn Akismet on. That’s not the point. The point is… okay I’m not entirely sure what the point is, but at the moment this is more interesting to me as a problem-solving exercise than a spam-blocking exercise. So much so that I’m thinking I might actually try this. It’ll be a fun gimmick at any rate.

The comprehension test for this post, if and when I implement it, will be: What is the acrostic formed by this post? This is a bad question to use too often, because a spambot can just look for the word “acrostic” in the question and work it out easily, but the idea is that the space of possible questions is big enough that they can’t solve them in general. (And if they do, then maybe makers of spambots will contribute to the next major breakthrough in AI.)


The CAPTCHA war continues

What I learnt from logging CAPTCHA attempts was, essentially, nothing. Every post either entered nothing at all or the correct word, and all of the correct ones were obviously from the same source (not the same IP address, but structured the same way and advertising similar things). That is, it’s not being broken by a huge dictionary attack or anything.

So I think the remaining alternatives are that someone has actually broken my CAPTCHA with 100% accuracy, or this is getting back to a human at some point. Possibly I’m caught in one of those man-in-the-middle attacks where my CAPTCHA is relayed to some porn site and decoded by a horny teenager. Alternatively, people are getting paid to leave spam. I’ve heard of this sort of thing happening – Mr Shellshear caught some of it a while ago – but if that were the case I’d expect the comments to bear traces of humanity as well, and as it stands they’re fairly obviously auto-generated. I think. Or written by particularly obtuse and formulaic humans.

Now that I read over that last paragraph again, none of those options stands out as being clearly more likely than the others.

So in a further attempt to narrow down the options, I’ve changed my CAPTCHA again. This has two advantages. One, it’s different enough to the old one that I’m fairly confident that it won’t be immediately legible to any bot that could read it before. And two, the new one looks cool. Also illegible, so I probably won’t leave it this way forever, but come on, it’s sleek.


Best. Spam. Ever.

Good morning. Realize that true happiness lies within you. Waste no time and effort searching for peace and contentment and joy in the world outside. Remember that there is no happiness in having or in getting, but only in giving. Reach out. Share. Smile. Hug. Happiness is a perfume you cannot pour on others without getting a few drops on yourself.

I am from Samoa and learning to read in English, please tell me right I wrote the following sentence: “Infrastructure is a nordic holding member financing reflected by mastercard.”


Seems like the font change to the CAPTCHA wasn’t enough to hold off the spambots.

Now that I’m thinking about this clearly, the spam doesn’t actually annoy me much. What annoys me is not knowing how they’re doing it. So, instead of trying to stop them, I’ve started logging CAPTCHA attempts. Updates to come if and when I learn anything new.
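
A minimal sketch of what that kind of logging could look like, written in Python rather than whatever the actual plugin uses. The log format, field names and function names are all assumptions for illustration, not the real code behind this blog.

```python
import json
import time
from collections import Counter


def log_attempt(log_path, expected, submitted, remote_addr):
    """Append one CAPTCHA attempt to a JSON-lines log file."""
    record = {
        "time": time.time(),
        "remote_addr": remote_addr,
        "expected": expected,
        "submitted": submitted,
    }
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")


def classify(expected, submitted):
    """Label an attempt as 'empty', 'correct' or 'wrong' (case-insensitive)."""
    answer = submitted.strip()
    if not answer:
        return "empty"
    return "correct" if answer.lower() == expected.lower() else "wrong"


def summarise(log_path):
    """Count how many logged attempts fall into each category."""
    counts = Counter()
    with open(log_path, encoding="utf-8") as log_file:
        for line in log_file:
            record = json.loads(line)
            counts[classify(record["expected"], record["submitted"])] += 1
    return counts
```

The interesting number is the “wrong” count: lots of wrong guesses would point to a dictionary or brute-force attack, while none at all would suggest something is reading the image properly.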


In the last few days I’ve been getting spam comments!

This is vaguely surprising, because I have a home-made CAPTCHA (more of an interesting programming exercise than something I desperately needed, but still). My vague understanding of these things is that CAPTCHA-aware spambots are usually narrowly targeted at a particular class of them that are either widely used, or guarding a particularly lucrative high-traffic site. So the fact that spam has been getting through means that one or more of the following is true:

  • My blog is far more popular than the number of real comments I get would seem to suggest.
  • My CAPTCHA resembles a widely-used class of CAPTCHAs closely enough that a generic attack on them is working against mine.
  • There’s some security hole in the CAPTCHA plugin or in WordPress, or some other way of posting a WordPress comment that I don’t know about, that’s allowing spambots to bypass the CAPTCHA.
  • A wide range of different people have recently taken an interest in my blog, all of whom have a peculiarly similar tendency to make comments that are irrelevant to the post in question (but would like to draw my attention to various services to earn me money, enhance my manliness, or provide me with downloads of varying legality).
  • Somewhere, a spambot has gained sentience.

I’ve changed the font in the CAPTCHA to test the first two hypotheses.



Hi there! Didja miss me?
