I was away over the Easter long weekend and didn’t get to see the whole ‘amazon fail’ thing happen. Actually, I didn’t even really find out about it until today.

The story goes that sometime on Saturday Amazon suddenly stopped returning pro-lesbian/gay/bi/transgender (LGBT) material in search results – in some cases returning anti-LGBT material instead.

I’ll leave the morality of why this was a bad thing alone – other far more eloquent folks have written about it already – and instead offer my view of just how this can happen.

Some people have commented that they simply cannot believe this was a mistake, and that it must have been a deliberate act by someone at Amazon. Others have said that they cannot believe there aren’t more checks and balances against this sort of thing happening.

I don’t have any special knowledge of the specifics of this situation, but I still believe that it’s within the realm of possibility that this could have been a completely unintentional side-effect of another change.

Here’s a hypothetical situation which could explain what happened.

Let’s suppose that someone doing database administration on the product management side of things for Amazon has the ability to make direct or semi-direct changes to the data in the database.

Let’s also suppose that in order to prevent human error, the system that Amazon employees use has a delay of (say) one hour before changes are sent to the actual production database. That continual one-hour buffer would have some sort of checks for unusual behaviour, such as updates to more than some reasonable number of records at any one time.
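As a rough illustration of the kind of buffer I’m imagining, here is a minimal Python sketch. Everything in it is hypothetical: the one-hour hold comes from my supposition above, while the 100-row limit, the `PendingChange` class and the `review` function are names and numbers I’ve invented for this example, not a description of Amazon’s actual tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

HOLD_PERIOD = timedelta(hours=1)   # the supposed one-hour delay
MAX_ROWS_PER_CHANGE = 100          # assumed "reasonable number" of records


@dataclass
class PendingChange:
    description: str
    rows_affected: int
    submitted_at: datetime


def review(change: PendingChange, now: datetime) -> str:
    """Return 'flagged', 'held' or 'released' for a buffered change."""
    if change.rows_affected > MAX_ROWS_PER_CHANGE:
        return "flagged"    # unusually large update: a human should look
    if now - change.submitted_at < HOLD_PERIOD:
        return "held"       # still inside the one-hour buffer
    return "released"       # small and old enough: goes to production


submitted = datetime(2000, 1, 1, 9, 0)  # arbitrary timestamp for the example
small = PendingChange("mark two categories as adult", 2, submitted)
large = PendingChange("reprice every book", 50_000, submitted)

small_early = review(small, submitted + timedelta(minutes=30))
small_later = review(small, submitted + timedelta(hours=2))
large_later = review(large, submitted + timedelta(hours=2))
print(small_early, small_later, large_later)
```

The point to notice is that the check only counts the rows the change touches directly. A two-row update sails through once the hour is up, however many other records depend on those two rows.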

It’s not unlikely that Amazon have a rather complex categorisation system that allows products to be placed in any number of categories, and allows categories to belong to other categories.

From here it’s but a hop-skip-and-blunder into someone updating a series of categories to ensure they’re marked as adult. A command to mark any category with ‘lesbian’ and ‘sex’ in the title as Adult might seem fairly reasonable if you aren’t careful. If this is one small series of category updates, it may not trigger any alerts, even though tens of thousands of products are now categorised as Adult by association with those categories. (See the Data Normalization side note below.)
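To make that concrete, here is a small sketch using Python’s built-in `sqlite3` module. The schema, the category titles and the 500-products-per-category figure are all made up for illustration; I have no idea what Amazon’s real tables look like. It shows a single update that changes only two category rows, yet changes the Adult status of a thousand products.

```python
import sqlite3

# Hypothetical schema: none of these table or column names come from Amazon.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        is_adult INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE product_category (
        product_id INTEGER NOT NULL REFERENCES product(id),
        category_id INTEGER NOT NULL REFERENCES category(id)
    );
""")
conn.executemany(
    "INSERT INTO category (id, title) VALUES (?, ?)",
    [(1, "Lesbian Sex Guides"), (2, "Lesbian Sexuality Studies"), (3, "Cookery")],
)

# 500 made-up products in each category.
product_id = 0
for category_id in (1, 2, 3):
    for _ in range(500):
        product_id += 1
        conn.execute("INSERT INTO product (id, title) VALUES (?, ?)",
                     (product_id, f"Book {product_id}"))
        conn.execute("INSERT INTO product_category VALUES (?, ?)",
                     (product_id, category_id))


def count_adult_products(conn):
    """Count products that belong to at least one Adult category."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT pc.product_id) FROM product_category pc "
        "JOIN category c ON c.id = pc.category_id WHERE c.is_adult = 1"
    ).fetchone()
    return row[0]


before = count_adult_products(conn)
# The careless command: '%sex%' also matches "Sexuality".
cur = conn.execute(
    "UPDATE category SET is_adult = 1 "
    "WHERE title LIKE '%lesbian%' AND title LIKE '%sex%'"
)
categories_changed = cur.rowcount
after = count_adult_products(conn)
print(categories_changed, before, after)  # 2 0 1000
```

Any alert that watches only the number of rows written sees "2" here, which looks entirely harmless.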

Whenever a human is involved in something, there is always a chance for something to go wrong in entirely unexpected ways. The sign of intelligence, though, is learning from your mistakes and doing what you can to prevent them from recurring.

In Amazon’s case, this might mean adding a check to see if there is a major difference in the number of Adult products. But no matter how many checks and reviews you have, people will still manage to break things in new and creative ways.
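Such a check could be as simple as comparing the Adult product count before and after a pending change. This is only a sketch: the function name `adult_count_change_ok`, the 5% tolerance and the figures in the example are my own inventions, not anything Amazon has described.

```python
def adult_count_change_ok(before: int, after: int,
                          max_relative_change: float = 0.05) -> bool:
    """Return True if the Adult product count moved by no more than the
    allowed fraction (5% by default, an arbitrary assumption)."""
    if before == 0:
        # With nothing flagged beforehand, any new flag is worth a look.
        return after == 0
    return abs(after - before) / before <= max_relative_change


# Made-up numbers purely for illustration.
routine = adult_count_change_ok(20_000, 20_300)   # a 1.5% change
runaway = adult_count_change_ok(20_000, 75_000)   # a 275% change
print(routine, runaway)  # True False
```

Unlike a limit on rows written, this looks at the effect of the change rather than its size, so it would catch a tiny category update with a huge knock-on effect.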

Side note: Data Normalization

Because of the nature of computer databases, it’s encouraged (and very efficient) to group common information together, and reduce duplication of data by keeping only a reference to related information.

An example of this might be in a genealogy system: for any particular person, you have information about who the parents are. Instead of storing the Firstname, Lastname, and Date of Birth of each parent, you would store the unique identifier of each parent’s record. This act of storing just the unique identifier is part of the normalisation of data.

When it comes time to display that information about a person’s parents – you look up the record for each parent and retrieve any information needed then. 
This means you also have only one location to update information about any one person. For instance, if a person died, you could put the location and date of death on that person’s record. When you then need information about people who have living parents, you can cross-reference those records.
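Here is the genealogy example as a small sketch, again using Python’s built-in `sqlite3` module. The `person` table, its column names and the three people in it are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL,
        date_of_birth TEXT,
        date_of_death TEXT,
        place_of_death TEXT,
        mother_id INTEGER REFERENCES person(id),  -- reference, not a copy
        father_id INTEGER REFERENCES person(id)
    )
""")
conn.executemany(
    "INSERT INTO person (id, first_name, last_name, date_of_birth, "
    "mother_id, father_id) VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Ann", "Example", "1950-01-01", None, None),
        (2, "Bob", "Example", "1948-06-15", None, None),
        (3, "Cat", "Example", "1975-03-20", 1, 2),
    ],
)


def living_parents(person_id):
    """Look up the parents' own records and keep those with no death date."""
    rows = conn.execute(
        "SELECT parent.first_name FROM person AS child "
        "JOIN person AS parent "
        "  ON parent.id IN (child.mother_id, child.father_id) "
        "WHERE child.id = ? AND parent.date_of_death IS NULL "
        "ORDER BY parent.id",
        (person_id,),
    ).fetchall()
    return [first_name for (first_name,) in rows]


before = living_parents(3)
# One update, in one place: the parent's own record.
conn.execute(
    "UPDATE person SET date_of_death = ?, place_of_death = ? WHERE id = ?",
    ("2001-02-03", "Exampleville", 2),
)
after = living_parents(3)
print(before, after)  # ['Ann', 'Bob'] ['Ann']
```

Nothing on Cat’s record was touched, yet the answer to a question about Cat changed. That is exactly the property that makes normalised data efficient, and exactly what lets a small category update ripple out to a very large number of products.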
