The case before Judge Kevin Newsom of the 11th U.S. Circuit Court of Appeals involved a landscaper accused of negligence in installing an in-ground trampoline. The central question was whether the landscaper’s insurance policy, covering landscaping work, would apply to this installation.
Judge Newsom incorporated AI technology into his legal reasoning process, as detailed in his May 28 concurrence.
He posed two questions to ChatGPT and Google Gemini (large language models, or LLMs) about the ordinary meaning of “landscaping” and whether installing an in-ground trampoline fell within it.
Both LLMs affirmed that installing an in-ground trampoline could be considered landscaping, as it modifies the appearance and function of outdoor space.
On one view, Judge Newsom’s approach can be likened to consulting a dictionary. On another, he may have been using the AI system to establish what an average person might think. The latter interpretation is significant because it touches on how humans may increasingly make decisions about other humans through an AI filter in the future.
The law has a long tradition of looking to what an average person would do in order to assess the reasonableness of someone’s behavior. A prime example is the “man on the Bondi tram” test used in Australia, the local counterpart of the “man on the Clapham omnibus” used in Britain and other Commonwealth jurisdictions.
This legal construct serves as a benchmark for societal standards, positing a hypothetical individual of average intelligence and moral character, typically envisioned as a public transport commuter. Courts have long employed this concept to determine reasonable behavior or interpretation in legal scenarios.
The concept of the reasonable person has its roots in 19th-century English common law. The phrase “man on the Clapham omnibus” first appeared in a judgment over a century ago, when Collins M.R. attributed it to Sir Charles Bowen in the libel case McQuire v. Western Morning News [1903] 2 KB 100.
Since then, the term “reasonable man” or “reasonable person” has been prominently used as a “touchstone” in Australian tort law.1 For example, in McHale v. Watson (1966) 115 CLR 199, McTiernan A.C.J. referred to Lord Macmillan’s formulation:
This definition has been instrumental in shaping the Australian legal understanding of reasonableness. The usage and application of the reasonable person test have been continually articulated and clarified by the High Court.
Justice Mason further refined the concept of reasonableness in a landmark negligence case, Wyong Shire Council v. Shirt (1980) 146 CLR 40, stating:
In Papatonakis v. Australian Telecommunications Commission (1985) 156 CLR 7, Justice Deane elaborated upon the reasonable person test where a level of technical skill was required, considering that the occupier of a “make-shift telephone pole” could not, after a linesman fell, disclaim liability by appealing to the concept of an ordinary reasonable person:
The reference to the “hypothetical person on a hypothetical Bondi tram” is specific to Australian jurisprudence. There has also been a Victorian formulation of the “man on the Bourke Street tram”.3 Such terms are said to aid in “effectively look[ing] to what a person of good sense, possessed of the relevant facts, would consider acceptable” in the circumstances.4
These cases collectively establish the framework for applying the reasonableness test in Australian law, providing a robust foundation for judicial decision-making across various legal domains.
Maybe. There are compelling arguments for using AI systems to estimate the mean of human reasonable thinking. Some of these arguments include:
AI models trained on large datasets may provide a more comprehensive view of language use than any single human’s perspective on what an average person would think.
While not free from biases in training data, AI may offer perspectives less influenced by individual judges’ experiences.
AI models can potentially capture more recent language usage, adapting to evolving meanings faster than traditional resources.
AI could provide more uniform interpretations across similar cases, potentially enhancing legal predictability.
In many ways, the advantages of using an AI system to assist with assessing human reasonableness align with the potential use of AI systems in company boardrooms. See our thinking on Synthetic Directors here.
However, there are challenges:
AI models can produce incorrect information or “hallucinate” responses, raising reliability concerns. Human oversight (in this case, a judge) is essential to mitigate hallucinations and falsehoods in AI models such as LLMs. However, the ‘supply chain’ of a human decision becomes problematic if AI feeds sub-decisions (such as an interim finding of reasonableness) into the final decision or prepares the material placed before the final adjudicator. The presentation of material to a decision-maker is not a neutral affair and can sway the decision. This supply chain of decision-making was one of Stirling & Rose’s early criticisms of the EU AI Act: the Act treats AI judicial decision-making as a high-risk activity, yet in most cases the preparation of material to be given to a human decision-maker is not considered high-risk.
Incorporating AI into judicial decisions raises questions about process transparency and the assignment of responsibility. Fundamentally, this assessment turns on whether a question of reasonableness before the court is one of law or fact (or both) – easier to state in principle than to resolve in practice. From an administrative law perspective, determining whether a question is one of law or fact requires its own assessment of reasonableness. This goes beyond the landscaping issue facing Judge Newsom and touches on the core matter of accountability and transparency of decisions, which have traditionally been reserved for the judiciary rather than an opaque LLM. Even if using an LLM could be likened, narrowly, to consulting a dictionary to assess the meaning of a term, dictionaries have a clear, traceable history of word meanings across editions, and a human edits the dictionary – at least for now (see Human Insight below).
Overreliance on AI for language interpretation could diminish the nuanced understanding that human judges bring to the process.
Ensuring AI models capture diverse perspectives remains a challenge, similar to criticisms of traditional reasonableness tests. This is particularly problematic because almost all LLMs available as at the date of this article are trained largely on US-based data and exhibit a distinctly American voice. To return to Justice Spigelman’s reasoning in Waterways Authority, does this appropriately reflect the values and expectations of ordinary Australians? If not, what dataset, prompting, or training would be required to achieve this in a uniquely Australian context?
With the advent of agentic AI and robotic AI acting on behalf of human principals, assessing what constitutes the reasonable man on the Bondi tram may become more complex. We must consider whether, beyond just judicial decision-making, day-to-day decision-making will increasingly be a hybrid of delegated AI decision-making and human decision-making.
The linchpin of the test is fundamentally a rational individual who reflects the day-to-day populace. Accordingly, is the man on the tram a fully human construct, or will it become an embedded human–AI construct?
Stirling & Rose is an end-to-end corporate law advisory service for the lawyers, the technologists, the founders, and the policy makers.
We specialise in providing corporate advisory services to emerging technology companies.
We are experts in artificial intelligence, digital assets, smart legal contracts, regulation, corporate fundraising, AOs/DAOs, space, quantum, digital identity, robotics, privacy and cybersecurity.
When you pursue new frontiers, we bring the legal infrastructure.
Want to discuss the digital future?
Get in touch at info@stirlingandrose.com | 1800 178 218