Can AI Let Justice Be Done?

Terence Mauri has won plaudits for his commentary on disruptive technology. He holds visiting positions at MIT and the London Business School, and his views are widely published. So when I saw an article about his predictions for the legal system I could be sure it would be thought-provoking. I wasn’t disappointed: “Robotic judges that can determine guilt will be ‘commonplace’ within 50 years”1. That’s quite a claim, and as Niels Bohr quipped, “Prediction is very difficult, especially about the future…”. But it seemed to me that this moonshot shone a light on a couple of traps that often catch those who aim AI at justice and individual rights.

Mauri’s hypothesis runs like this: data gathered from defendants -- audio, visual, thermal and more -- will be analysed in search of cues to deception, perhaps irregular speech patterns, facial expressions or body temperature variations.  This analysis will be able to detect cues imperceptible to humans, and the signs of deception will be detected with 99.9% accuracy.

There are several concerning failings with this hypothetical approach. The first is critical: Mauri seems to conflate justice with the detection of lies.  That is far too simplistic. Some liars are really good at it, for a start2. But justice, which a lawyer might call the correct application of the law to facts, is more than lie detection. In common-law systems, a crime is committed when a guilty act is perpetrated by a person with a “guilty mind”, or mens rea. So, part of what is going on in a criminal court, and in the mind of the jurors, is an attempt to answer the question of the state of mind of a defendant. Granted, a defendant with the required mens rea might have to lie to hide it, but those lies are unlikely to be binary flags on the question of state of mind.  For example, part of the mens rea for theft is the “intention to permanently deprive” another of the article stolen.  Jurors have to assess the accused’s intention, and whilst a detected lie might help to do so, that lie resides within a panoply of information including other lies, truths, statements, contradictions and inferences that is a human being and his or her evidence.

The law and its application give plenty more pause for thought. Take unlawful discrimination, which can occur without it being intended3. And what of mistake - a truthful but mistaken witness?  What about when the question to answer is whether actions were “reasonable”, whether a belief is “worthy of respect in a democratic society”, or whether a consequence of a negligent act was “foreseeable”?  These legal tests are subtle.

It is true though that machine learning systems are able to find patterns and features that elude human observers. Mauri might be pinning his hopes on AI of that ilk being able to extract the traces of guilt from the wealth of data about a case and its dramatis personae. If only he hadn’t fastened onto lying as a proxy for guilt. If AI is to help in determining guilt or wrongdoing, it will have to be more than just a big lie detector.

A second concern lies with the “acts of faith” that almost always underpin applications of AI to human behaviour. In Mauri’s case, perhaps he has in mind an AI judge that will take in every scrap of data about a defendant and squeeze the juice out of it without helpful labelled training data or hand-coded features. I suspect, though, that the extrapolation is most likely to go something like this: differences between liars and truth-tellers must show up in verbal and non-verbal cues, and in time AI will develop to the point of infallibly picking up those differences. Unfortunately, the science doesn’t encourage that act of faith. Vrij et al. (ibid.) make some critical observations: that there is an absence of nonverbal and verbal cues uniquely related to deceit -- there is no Pinocchio’s growing nose; that the differences between truth tellers and liars are typically small; that liars actively try to appear credible; and, as noted, that some are just very good at it.

It is curious to observe that psychological experiments suffer from defects akin to training set problems. In a typical experimental design, some participants are asked to perform a task and others act as the control.  Mann et al4, for example, sought to understand behavioural change when individuals smuggle illicit items. However, what is the true value of such studies if one is asking “ordinary” people to behave in a criminal manner?  One is observing how non-criminals behave when trying to act as criminals, while one actually wants to observe how criminals behave.

In ML applications we see systems trained on the wrong participants, too. Hiring software is a case in point: the platforms are typically trained on a cohort of employees believed to have the desirable characteristics for the job. That cohort is interviewed and the recordings used to machine-learn behavioural features such as eye contact, range of vocabulary, facial expressions, and vocal intonation. Then candidates undergo a video interview and are scored against the learned features. But let’s check what they are being measured against: the training cohort has “made it” by definition, got the job, and, like it or not, bears the imprint of societal bias; it has too few disabled people, and too many who made it through their connections more than their talent; it has been squeezed into the cultural mould of its employer; it is further down the career path, and its members are not nervous about the outcome. And all that on top of the questionable “acts of faith” assuming a link between behavioural features and competence.

I would not want the reader to think me pessimistic, far from it.  I just think extrapolating from current ML applications and approaches, even where they find great success, is too narrow.  My instinct is that AI will change the legal system and its operation profoundly.  But the profound changes will depend on research into deeply human topics: what does it mean to believe someone?  What is fairness and how can we detect its presence? And what is “gut feeling”, the seldom-confessed casting vote in human judgement?  My own act of faith is to think that these very human concepts are mysteries of complexity only, not of stuff that is out of scientific grasp.

Going back to ML applications, the concerns I have expressed might be connected to what Matthew Syed5 would call a lack of cognitive diversity. Thinking about AI applied to individual rights cannot be done well from a tech-dominant position. There are too few voices in the room (in the instant case a lawyer and a psychologist would help). All too often, developers rush to build elegant and sophisticated AI on foundations of sand: because the underlying science is questionable, or in flux, or not interrogated, or just missing; and because they are trying to answer the wrong questions.

Footnotes

1 First appearing in The Telegraph, 19 October 2020

2 Aldert Vrij, Par Anders Granhag, and Stephen Porter, Psychological Science in the Public Interest 11(3) 89–121

3 In English law, motive is generally irrelevant in assessing whether discrimination is lawful

4 Mann, S., Deeb, H., Vrij, A., Hope, L., and Pontigia, L. (2020). Detecting smugglers: identifying strategies and behaviours in individuals in possession of illicit objects. Appl. Cogn. Psychol. 34, 372–386. doi: 10.1002/acp.3622

5 “Rebel Ideas”, John Murray (publishers) 2019

Author Bio
Phil is an employment lawyer and former computational physicist. To his delight the latter helps him get under the hood of AI, and the former is a window into profound questions of equality and individual rights. He works to bridge the legal and technical worlds and to advise and comment on their intersection.

Citation
For attribution in academic contexts or books, please cite this work as:

Phil Lindan, "Can AI Let Justice Be Done?", The Gradient, 2021.

BibTeX citation:

@article{lindan2021roboticjudges,
author = {Lindan, Phil},
title = {Can AI Let Justice Be Done?},
journal = {The Gradient},
year = {2021},
howpublished = {\url{https://thegradient.pub/robotic-judges/} },
}
