Database of harassers

There are increasingly frequent proposals for a database of harassers, that is, some central storage of identifying information about people who perpetrate harassment of various kinds.

Database of what?

The proposal seems to be for a database in which one could look up a name, a pseudonym, an IP address or other semi-identifying information and at least find out whether it belongs to a known harasser or abuser. At the extreme end of data provision, one would receive complete copies of the existing reports and the name of the accused if known, or the entire database would be available for browsing.

Purpose of this page

This page is intended to be an overview of requirements for such a database, partly to discourage casual efforts. (Doing it badly is probably worse than not doing it at all.)

If you want to have a detailed discussion of such a database or start making specifications, please use Talk:Database of harassers instead.

Problems that need to be solved

This is a (possibly incomplete) list of problems that need to be considered before making such a database.

What good would it do?

What is the aim of the database?

1. to gather enough data to get the police to investigate a person seriously?
2. to identify harassers and abusers so that the geek community can shame them?
3. to identify harassers and abusers so that future victims can avoid them?
4. to help survivors understand what happened to them and build networks of supporters?

A concern with #1 and #2 is that the target audiences, the police and geeks in general, strongly resist acknowledging the seriousness of the problem, and there is no particular reason to think that more data will change that. #3 has the issue that it makes potential victims responsible for their own safety. #4 is a more understandable goal, but it's important to make sure the database actually meets that need.

Privacy issues

What information is stored about accuser and accused? How long is information stored for? Is it always public, never public, or public in some situations? How long is it public for? What happens in the event of someone recanting their report?

Leak and hack concerns

Such a database would be a major target for sophisticated hackers, both in order to destroy it and simply to have fun. If the database stores non-public information, what steps are taken to protect it from hacking? What about associated information, e.g., the email archives of the sysadmins or project leads? These may need equally strong protection.

What about deliberate leaks by, e.g., disaffected former staff or volunteers?

Who watches the watchers?

Who gets to access the database? What are these people like, and what reason does anyone have to trust them? What happens when a report is filed against one of them?

False reports

There are two possible types of malicious false reports:

false reports designed to smear a person (including as a form of retribution)
false reports designed to reduce the credibility of the database

While the rate of false reports of abuse is greatly exaggerated in popular discourse, false reports do occur and should be anticipated in a project of this size.

Separately, there's the issue of accusations of false reports, whether true or not, in order to defend an accused or generally smear the database.

Retribution

People naming harassers and abusers may be targeted for further abuse, either by the person they named or by others. At the same time, publishing accusations against people while allowing their accusers to remain anonymous is ethically tricky.

Legal issues

Incidents that involve the police and courts

If an incident is separately reported to police, how does the database deal with issues such as contempt of court and restrictions on reporting during a trial? Even if the database is not covered by such restrictions (because, say, it is not hosted in the same country), there is a risk of prejudicing trials.

If charges are dropped, or the alleged offender is tried and found not guilty, what happens to their entry in the database?

Defamation

There's at least a reasonable chance that the hosts of the database could be held liable for defaming (libelling) people listed in the database if the truth of the reports can't be sufficiently verified. (In fact, in some jurisdictions the truth alone isn't a defence: you need to show both truth and public interest to defend against defamation suits.) This will have a chilling effect as even successful defences against legal actions take a lot of time and money.

Related projects

Project Callisto is an open-source reporting system for victims of sexual violence to file an escrowed, encrypted report about their experiences, with the ability to alert victims if multiple reports are filed against the same perpetrator. It is deployed on several US college campuses.
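The matching idea described above (reports are held privately, and reporters are alerted only once more than one report names the same person) can be illustrated with a minimal sketch. This is not Project Callisto's actual design or code; the class name MatchEscrow, its methods, the threshold of two and the use of a keyed hash are all assumptions made purely for illustration. It covers only the matching step and does not store or encrypt report contents.

```python
import hashlib
import hmac
from collections import defaultdict
from typing import List


class MatchEscrow:
    """Hypothetical escrow: reveals a match only at a reporter threshold."""

    def __init__(self, secret_key: bytes, threshold: int = 2):
        self._key = secret_key
        self._threshold = threshold
        # Maps a keyed hash of the accused's identifier to the set of
        # distinct reporter ids who have named that identifier.
        self._reports = defaultdict(set)

    def _token(self, identifier: str) -> str:
        # Normalise, then apply HMAC-SHA256 so the raw identifier is
        # never stored in the table (an illustrative choice only).
        normalised = identifier.strip().lower().encode("utf-8")
        return hmac.new(self._key, normalised, hashlib.sha256).hexdigest()

    def file_report(self, reporter_id: str, accused_identifier: str) -> List[str]:
        """Record a report; return the reporters to alert, or [] if the
        number of distinct reporters is still below the threshold."""
        reporters = self._reports[self._token(accused_identifier)]
        reporters.add(reporter_id)
        if len(reporters) >= self._threshold:
            return sorted(reporters)
        return []


escrow = MatchEscrow(secret_key=b"example-key-not-for-real-use")
first = escrow.file_report("reporter-a", "someone@example.org")
second = escrow.file_report("reporter-b", "Someone@Example.org ")
```

In this sketch the first report returns an empty list and the second returns both reporter ids, because the two identifiers normalise to the same value. Even such a scheme leaves most of the problems listed above open: whoever holds the key can still be hacked or leak, and a malicious reporter could file under several reporter ids to force a match.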

Less formal versions of such a database already exist:

This wiki's Incidents documentation, which solves some of the problems (privacy especially) by largely summarising publicly available documentation elsewhere.
Websites that identify harassers