Wednesday, February 8th, 2006
There are always people who use some weird browser. CSS will drive you nuts. Header issues. Firefox extensions. Every broken interaction will hit your database and slow it down. It’ll drive you nuts. Need to get your browser shit fixed up straight off.
Don’t do it. Whatever you think you’re going to need to deal with is not going to be the problem. Read Cal Henderson’s and Brad Fitzpatrick’s presentations on how to make things go faster. They talk not about scaling but about how to deal with databases, and how to deal with SQL not being great. Think about splitting data over multiple machines. Things like one webserver talking to 8 dbs will blow up in your face. Test every single SQL query.
Set up a monitoring system because you’ll get paged at 2 am. No good being asleep if your db goes down because it’ll stay down til you wake up.
Understand your db. Lots of apps have tags, but that doesn’t map to SQL at all, so understand tricks and tips. Have different tables so you don’t have partial indexing.
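One common way to fit tags into SQL is a separate table with one row per (bookmark, tag) pair and an index on the tag. The sketch below uses Python's built-in sqlite3 module. The table and column names are made up for illustration and are not Del.icio.us's actual schema.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a separate table for tags."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bookmarks (
            id    INTEGER PRIMARY KEY,
            owner TEXT NOT NULL,
            url   TEXT NOT NULL
        );
        -- One row per (bookmark, tag) pair instead of a packed tag string.
        CREATE TABLE bookmark_tags (
            bookmark_id INTEGER NOT NULL REFERENCES bookmarks(id),
            tag         TEXT NOT NULL,
            PRIMARY KEY (bookmark_id, tag)
        );
        -- Lookups by tag are the common query, so index the tag column.
        CREATE INDEX idx_bookmark_tags_tag ON bookmark_tags(tag);
    """)
    return conn


def add_bookmark(conn, owner, url, tags):
    """Insert a bookmark and its tags; returns the new bookmark id."""
    cur = conn.execute(
        "INSERT INTO bookmarks (owner, url) VALUES (?, ?)", (owner, url)
    )
    bookmark_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO bookmark_tags (bookmark_id, tag) VALUES (?, ?)",
        [(bookmark_id, tag) for tag in set(tags)],
    )
    return bookmark_id


def urls_for_tag(conn, tag):
    """Return URLs carrying the given tag, oldest first."""
    rows = conn.execute(
        "SELECT b.url FROM bookmarks b "
        "JOIN bookmark_tags t ON t.bookmark_id = b.id "
        "WHERE t.tag = ? ORDER BY b.id",
        (tag,),
    )
    return [row[0] for row in rows]
```

A single-tag lookup like urls_for_tag(conn, "wifi") then hits the index rather than scanning a text column.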
Understand where latency is ok. Figure out where you can be sloppy, and be sloppy, e.g. RSS feeds don’t need to be instant.
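Being sloppy where latency is ok can be as simple as serving a slightly stale copy. A minimal sketch, assuming a feed that is allowed to lag by a fixed number of seconds; the class name and the rebuild callback are hypothetical, not anything from the talk.

```python
import time


class StaleOkCache:
    """Serve a cached value until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the cache can be tested
        self._store = {}        # key -> (stored_at, value)

    def get(self, key, rebuild):
        """Return the cached value, calling rebuild() only when stale."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = rebuild()       # the expensive part, e.g. a db query
        self._store[key] = (now, value)
        return value
```

With a ttl of 300, a feed requested a thousand times in five minutes costs one database hit instead of a thousand.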
Idiots will break stuff in ways you can’t yet imagine. You can’t predict what they will do.
Do stuff with Apache. Images off a different server, RSS. Throttle.
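The notes suggest throttling at the Apache layer. The idea itself can be sketched independently of the server as a token bucket: each client gets a small burst allowance that refills at a steady rate. This is an illustration of the general technique, not what Del.icio.us ran; the class and parameter names are assumptions.

```python
import time


class TokenBucket:
    """Allow up to `burst` requests at once, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock          # injectable for testing
        self.tokens = float(burst)  # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if throttled."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice there would be one bucket per client, keyed on something like IP address or API user.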
APIs were created early on in Del.icio.us. They help the ‘priesthood’ get a measure of comfort in your system. They want to be able to take their data and go home if you go offline. The easier the API is to get into and out of, the more it will be used. Make it simple. Del.icio.us is just XML.
Unique IDs in your db is a mistake for scaling. But do not expose that ID to the outside world, because some idiot is going to use that to try to scrape your database. That’s a hint that people want the data, but they aren’t going to wait for you to give it to them, so they will hammer your systems to get it.
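One way to avoid exposing a database ID is to hand out a random token and keep the mapping private, so nobody can walk the ID space by counting upwards. A minimal sketch, assuming an in-memory mapping; the class name is hypothetical and a real system would store the token alongside the row.

```python
import secrets


class PublicIds:
    """Map internal row ids to unguessable public tokens and back."""

    def __init__(self):
        self._to_public = {}
        self._to_internal = {}

    def public_for(self, internal_id):
        """Return the stable public token for an internal id."""
        if internal_id not in self._to_public:
            token = secrets.token_urlsafe(12)
            while token in self._to_internal:  # collisions are very unlikely
                token = secrets.token_urlsafe(12)
            self._to_public[internal_id] = token
            self._to_internal[token] = internal_id
        return self._to_public[internal_id]

    def internal_for(self, token):
        """Return the internal id for a token, or None if unknown."""
        return self._to_internal.get(token)
```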
Features significantly influence behaviours. Also, which ones do you leave out? If no tags, Del.icio.us wouldn’t have worked. Try not to add features that exist elsewhere. No need to come up with a new way to do things that are already being done, so don’t add messaging, because we have email already.
When people ask for stuff, that’s important info. Try to understand why they are asking for that. Extract the use case. E.g. people want boolean search on the tags, but hardly anyone searches on more than one tag at a time.
RSS is important in Del.icio.us. It’s another set of tools to get in and out of the system. Everything that could have a feed should have a feed. This was more important a year or two ago, when RSS was the big thing. Now it’s not quite so exciting, and maybe there’s another feature that will cause people to show up.
Need to understand the headers, caching, etc. If you can stash the timestamp then this can save an enormous amount of effort.
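Stashing the timestamp pays off through HTTP conditional requests: if the client sends If-Modified-Since and nothing has changed, answer 304 Not Modified and skip building the feed. A minimal sketch using only the standard library; the respond function and its arguments are hypothetical and not tied to any framework.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, if_modified_since, render):
    """Return (status, headers, body) for a conditional GET.

    last_modified: timezone-aware UTC datetime of the newest item.
    if_modified_since: the raw request header value, or None.
    render: callable that builds the full body (the expensive part).
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = None
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: treat as absent
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
    if since is not None and last_modified <= since:
        return 304, headers, b""  # nothing new, no body needed
    return 200, headers, render()
```

The saving is that render() never runs for a client that already has the current version.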
URLs are the primary vehicle for people to find your stuff. Don’t do session keys. Leave the underlying framework out. It doesn’t help the user.
Also allows you to expose some functionality to the priesthood who care.
Watch for interesting behaviour in the system. If people do things you didn’t expect or intend, that’s neat.
Choose a problem that you yourself have. Had a text file with 20,000 links in it. Couldn’t find stuff anymore. Had a URL, space, then descriptive text, e.g. wifi. That was the first tagging system.
Built a db system like Del.icio.us, but single user, for himself. Used that for a few years, then built it for other users. Del.icio.us came out at the end of 2003.
Don’t look for problems you don’t have, because someone else who is passionate about it will solve it better than you can.
You see teasers. Limited betas. All horrible. Every day you don’t have something out in the world you are losing information, feedback, users, reputation. Get it out there, get the release done asap.
Anything like RSS has some element of attention. This is useful, interesting behaviour. We do the ‘what’s been popular in the last 24 hours’, and people pay attention to that. That works reasonably well when the population is small and all biased in the same direction, so that things the group finds interesting the whole group will find interesting.
As the group gets larger, the bias will drift. But you can aggregate with a given tag and then can compensate for the decrease in bias.
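The ‘popular in the last 24 hours’ list, optionally narrowed to one tag, can be sketched as a windowed count. This is an illustration only; the event shape (saved_at, url, tags) is an assumption and not how Del.icio.us stores its data.

```python
from collections import Counter
from datetime import timedelta


def popular(events, now, tag=None, window=timedelta(hours=24), limit=10):
    """Count saves per URL inside the time window, optionally for one tag.

    events: iterable of (saved_at, url, tags) where tags is a set of strings.
    Returns a list of (url, count) pairs, most saved first.
    """
    cutoff = now - window
    counts = Counter()
    for saved_at, url, tags in events:
        if saved_at < cutoff or saved_at > now:
            continue  # outside the window, so old favourites drop out
        if tag is not None and tag not in tags:
            continue  # restrict to one tag's pile of attention
        counts[url] += 1
    return counts.most_common(limit)
```

Because only recent saves count, a link cannot sit at the top forever, and filtering by tag gives each interest group its own list as the overall population drifts.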
Figure out how to keep things on topic or fragment into different piles of attention.
Spam is attention theft. Spag - tag spam. People will try to get into any aggregation of attention.
We don’t do a top ten, because it’s biased in favour of stuff that’s been around forever. And, as soon as you do, people will try to get to the top of that list.
Understand what you’re building as you build it to avoid these things.
Don’t provide feedback when someone spams, and you figure out the fix. Don’t give them an error message because then they know they’re doing it wrong and will try again.
Tagging is not really about classification or organisation, it’s user interface. It’s a way to store your working state or context. Useful for recall. Ok for discovery because someone might tag similarly to you. Bad for distribution.
Not all metadata is tags. People ask for automatic metadata, but that’s not the value - the value is attention, that you saw it and decided that it was important enough to tag. Auto-tagging doesn’t help you do what you’re trying to do.
If you make it too easy… because there’s a small transaction cost then that adds value. But don’t make them do too much work.
Beware librarians who want an official list of tags.
Why are people there? Have to expect the user to be selfish. But build a system people like and this breeds evangelism.
Watch what you spend your time doing. If you spend a lot of time building a feature that no one uses then that’s a waste. Be careful.
Guesswork backed by numbers. If someone is using a feature in week one, do they also use it in week five? Measure the system itself. But in the data that the system collects, measure behaviour rather than claims. In Del.icio.us there are no stars, because why would you bookmark something that’s bad? So rely on what people are doing, not what they are claiming to the system.
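The week one versus week five question can be put as a number: of the people who used a feature in the first week, what fraction still used it in the later week? A minimal sketch, assuming usage has already been grouped into sets of user ids per week; the function name and data shape are made up for illustration.

```python
def still_using(usage, first_week=1, later_week=5):
    """Fraction of first-week users of a feature who also used it later.

    usage: dict mapping week number -> set of user ids seen using the feature.
    Returns 0.0 when nobody used the feature in the first week.
    """
    early = usage.get(first_week, set())
    late = usage.get(later_week, set())
    if not early:
        return 0.0
    # Only count people who were there in week one; newcomers don't matter here.
    return len(early & late) / len(early)
```

This measures what people did, not what they said: the input is usage logs, not a survey.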
User acceptance testing. Very important that everyone on the team sees the user testing. Don’t give people a list of To Dos and then watch them do that, because they behave very differently from how they do in real life. People don’t read, don’t follow instructions… people know this intuitively, but it’s far worse than you realise.
Speak the user’s language. Del.icio.us is about bookmarking. Bookmarking was a Netscape or Firefox thing. In IE it’s favourites. So make sure that you speak your user’s language or they’ll leave.
Don’t make users register before they get into your site. Give them as much functionality as possible, even give them anonymous access to start with. Logging in is a big barrier. People want a good idea of what they are going to get if they register because it’s a lot of work. People are afraid of giving out email addresses, spyware, etc. They want to know what they are going to get out of it. You can’t tell them, you have to show them. Let them wander round and get a feel for it. That’s the only way to get them into the system.
Present the appropriate gestures and verbs, save, copy, bookmark, etc.
Then when you have to register, make it as short as possible, and take them back to where they were. Don’t dump them on the front page again.
If you’re doing something different, then understand where you’re breaking away from normal behaviour. Design should be standard - tabs on top, nav, logo top left. What is the structure of the world? Emulate that as far as possible, but be careful of what your design implies, what it makes people expect.
It’s the user’s data, not your data. You get to use it, but it’s still their data. They have to be able to add, modify, delete their data. Up to and including removing themselves and their account from the system.
When Del.icio.us deletes a bookmark, the data’s purged from the system. Once it’s gone it’s gone.
Spent nothing on promoting Del.icio.us, ever. To do this, enable evangelism, so people want others in your system, want to tell people. So enable every communication method possible.
Email is tricky because you don’t want to be a spammer. But RSS. Think of every RSS reader as a client app. Viral vectors. Desktop apps can eat a lot of data through http, so figure every app out, and see if you can get in there.
There’s not ‘quite’ a Del.icio.us community, because Del.icio.us is a tool, and the communities are elsewhere. The communities can use the tool, but no need for Del.icio.us to have a community. There are flame wars and all that, interactions that you don’t want people to have.