How to turn audio clips into realistic lip-synced video


UW (University of Washington) | UW researchers create realistic video from audio files alone

University of Washington researchers at the UW Graphics and Image Laboratory have developed new algorithms that turn audio clips into realistic, lip-synced video, starting with an existing video of that person speaking on a different topic.

As detailed in a paper to be presented Aug. 2 at SIGGRAPH 2017, the team generated a highly realistic video of former president Barack Obama talking about terrorism, fatherhood, job creation and other topics, using audio clips of those speeches and existing weekly video addresses in which he originally spoke on different topics.

Realistic audio-to-video conversion has practical applications like improving video conferencing for meetings (streaming audio over the internet takes up far less bandwidth than video, reducing video glitches), or holding a conversation with a historical figure in virtual reality, said Ira Kemelmacher-Shlizerman, an assistant professor at the UW’s Paul G. Allen School of Computer Science & Engineering.


Supasorn Suwajanakorn | Teaser — Synthesizing Obama: Learning Lip Sync from Audio

This improves on previous audio-to-video conversion processes, which involved filming multiple people in a studio saying the same sentences over and over to capture how particular sounds correlate with different mouth shapes, a process that is expensive, tedious and time-consuming. The new machine learning tool may also help overcome the “uncanny valley” problem, which has dogged efforts to create realistic video from audio.

How to do it

A neural network first converts the sounds from an audio file into basic mouth shapes. Then the system grafts and blends those mouth shapes onto an existing target video and adjusts the timing to create a realistic, lip-synced video of the person delivering the new speech. (credit: University of Washington)

1. Find or record a video of the person (or use video chat tools like Skype to create a new video) for the neural network to learn from. There are millions of hours of video that already exist from interviews, video chats, movies, television programs and other sources, the researchers note. (Obama was chosen because there were hours of presidential videos in the public domain.)

2. Train the neural network to watch videos of the person and translate different audio sounds into basic mouth shapes.

3. Feed the audio of an individual’s speech into the system to generate realistic mouth shapes, which are then grafted onto and blended with the head of that person in the target video. A small time shift lets the neural network anticipate what the person is going to say next (see the sketch after this list).

4. Currently, the neural network is designed to learn on one individual at a time, meaning that Obama’s voice — speaking words he actually uttered — is the only information used to “drive” the synthesized video. Future steps, however, include helping the algorithms generalize across situations to recognize a person’s voice and speech patterns from less data: an hour of video to learn from, for instance, instead of 14 hours.
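The steps above match the paper’s abstract, which specifies a recurrent neural network that maps audio features to mouth shapes. Below is a minimal sketch of that idea in PyTorch; the feature sizes, the single-layer LSTM, and the TIME_SHIFT value are illustrative assumptions, not the authors’ exact configuration.

```python
# Minimal sketch (not the authors' code): an LSTM maps per-frame audio
# features to mouth-shape coefficients, with a small output delay so the
# network can "hear ahead" before committing to a mouth shape (step 3).
import torch
import torch.nn as nn

N_AUDIO_FEATURES = 28   # assumed MFCC-style feature size per frame
N_MOUTH_COEFFS = 18     # assumed PCA coefficients for the mouth shape
TIME_SHIFT = 5          # frames of look-ahead (assumption)

class AudioToMouth(nn.Module):
    def __init__(self, hidden=60):
        super().__init__()
        self.lstm = nn.LSTM(N_AUDIO_FEATURES, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, N_MOUTH_COEFFS)

    def forward(self, audio):          # audio: (batch, time, features)
        out, _ = self.lstm(audio)
        return self.proj(out)          # (batch, time, mouth coefficients)

def train_step(model, opt, audio, mouth):
    # Align targets so the mouth shape at frame t is predicted only after
    # the network has heard TIME_SHIFT frames of upcoming audio.
    pred = model(audio)[:, TIME_SHIFT:, :]
    target = mouth[:, :-TIME_SHIFT, :]
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

model = AudioToMouth()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
audio = torch.randn(2, 100, N_AUDIO_FEATURES)   # stand-in training batch
mouth = torch.randn(2, 100, N_MOUTH_COEFFS)
print(train_step(model, opt, audio, mouth))
```

The delayed-target trick in `train_step` is the code-level version of step 3’s small time shift: prediction lags the audio, so each mouth shape can be informed by what comes next.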

Fakes of fakes

So the obvious question is: Can you use someone else’s voice on a video (assuming enough videos)? The researchers said they decided against going down that path, but they didn’t say it was impossible.

Even more pernicious: the words (not just the voice) of the person in the original video could be faked using Princeton/Adobe’s “VoCo” software (when available) — simply by editing a text transcript of their voice recording — or the fake voice itself could be modified.

Or Disney Research’s FaceDirector could be used to edit recorded substitute facial expressions (along with the fake voice) into the video.

However, by reversing the process — feeding video into the neural network instead of just audio — one could also potentially develop algorithms that could detect whether a video is real or manufactured, the researchers note.

The research was funded by Samsung, Google, Facebook, Intel, and the UW Animation Research Labs. You can contact the research team at audiolipsync@cs.washington.edu.


Abstract of Synthesizing Obama: Learning Lip Sync from Audio

Given audio of President Barack Obama, we synthesize a high quality video of him speaking with accurate lip sync, composited into a target video clip. Trained on many hours of his weekly address footage, a recurrent neural network learns the mapping from raw audio features to mouth shapes. Given the mouth shape at each time instant, we synthesize high quality mouth texture, and composite it with proper 3D pose matching to change what he appears to be saying in a target video to match the input audio track. Our approach produces photorealistic results.

In a neurotechnology future, human-rights laws will need to be revisited

New forms of brainwashing include transcranial magnetic stimulation (TMS) to neuromodulate the brain regions responsible for social prejudice and political and religious beliefs, say researchers. (credit: U.S. National Library of Medicine)

New human rights laws to prepare for rapid advances in neurotechnology that may put “freedom of mind” at risk have been proposed in the open access journal Life Sciences, Society and Policy.

Four new human rights laws could emerge in the near future to protect against exploitation and loss of privacy, the authors of the study suggest: The right to cognitive liberty, the right to mental privacy, the right to mental integrity, and the right to psychological continuity.

Advances in neural engineering, brain imaging, and neurotechnology put freedom of the mind at risk, says Marcello Ienca, lead author and PhD student at the Institute for Biomedical Ethics at the University of Basel. “Our proposed laws would give people the right to refuse coercive and invasive neurotechnology, protect the privacy of data collected by neurotechnology, and protect the physical and psychological aspects of the mind from damage by the misuse of neurotechnology.”

Potential misuses

Sophisticated brain imaging and the development of brain-computer interfaces have moved away from a clinical setting into the consumer domain. There’s a risk that the technology could be misused and create unprecedented threats to personal freedom. For example:

  • Uses in criminal court as a tool for assessing criminal responsibility or even the risk of re-offending.*
  • Consumer companies using brain imaging for “neuromarketing” to understand consumer behavior and elicit desired responses from customers.
  • “Brain decoders” that can turn a person’s brain imaging data into images, text or sound.**
  • Hacking, allowing a third-party to eavesdrop on someone’s mind.***

International human rights laws currently make no specific mention of neuroscience. But as with the genetic revolution, the on-going neurorevolution will require consideration of human-rights laws and even the creation of new ones, the authors suggest.

* “A possibly game-changing use of neurotechnology in the legal field has been illustrated by Aharoni et al. (2013). In this study, researchers followed a group of 96 male prisoners at prison release. Using fMRI, prisoners’ brains were scanned during the performance of computer tasks in which they had to make quick decisions and inhibit impulsive reactions. The researchers followed the ex-convicts for 4 years to see how they behaved. The study results indicate that those individuals showing low activity in a brain region associated with decision-making and action (the Anterior Cingulate Cortex, ACC) are more likely to commit crimes again within 4 years of release (Aharoni et al. 2013). According to the study, the risk of recidivism is more than double in individuals showing low activity in that region of the brain than in individuals with high activity in that region. Their results suggest a “potential neurocognitive biomarker for persistent antisocial behavior”. In other words, brain scans can theoretically help determine whether certain convicted persons are at an increased risk of reoffending if released.” — Marcello Ienca and Roberto Andorno/Life Sciences, Society and Policy

** NASA and Jaguar are jointly developing a technology called Mind Sense, which will measure brainwaves to monitor the driver’s concentration in the car (Biondi and Skrypchuk 2017). If brain activity indicates poor concentration, then the steering wheel or pedals could vibrate to raise the driver’s awareness of the danger. This technology could help reduce the number of accidents caused by drivers who are stressed or distracted. However, it also theoretically opens the possibility for third parties to use brain decoders to eavesdrop on people’s states of mind. — Marcello Ienca and Roberto Andorno/Life Sciences, Society and Policy

*** Criminally motivated actors could selectively erase memories from their victims’ brains to prevent being identified by them later on, or simply to cause them harm. In a longer-term scenario, such tools could be used by surveillance and security agencies to selectively erase dangerous or inconvenient memories from people’s brains, as portrayed in the movie Men in Black with the so-called neuralyzer. — Marcello Ienca and Roberto Andorno/Life Sciences, Society and Policy


Abstract of Towards new human rights in the age of neuroscience and neurotechnology

Rapid advancements in human neuroscience and neurotechnology open unprecedented possibilities for accessing, collecting, sharing and manipulating information from the human brain. Such applications raise important challenges to human rights principles that need to be addressed to prevent unintended consequences. This paper assesses the implications of emerging neurotechnology applications in the context of the human rights framework and suggests that existing human rights may not be sufficient to respond to these emerging issues. After analysing the relationship between neuroscience and human rights, we identify four new rights that may become of great relevance in the coming decades: the right to cognitive liberty, the right to mental privacy, the right to mental integrity, and the right to psychological continuity.

Elon Musk wants to enhance us as superhuman cyborgs to deal with superintelligent AI

(credit: Neuralink Corp.)

It’s the year 2021. A quadriplegic patient has just had one million “neural lace” microparticles injected into her brain, making her the world’s first human with an internet communication system using a wireless implanted brain-mind interface — and the first superhuman cyborg. …

No, this is not a science-fiction movie plot. It’s the actual first public step — just four years from now — in Tesla CEO Elon Musk’s business plan for his latest new venture, Neuralink. It’s now explained for the first time on Tim Urban’s WaitButWhy blog.

Dealing with the superintelligence existential risk

Such a system would allow for radically improved communication between people, Musk believes. But for Musk, the big concern is AI safety. “AI is obviously going to surpass human intelligence by a lot,” he says. “There’s some risk at that point that something bad happens, something that we can’t control, that humanity can’t control after that point — either a small group of people monopolize AI power, or the AI goes rogue, or something like that.”

“This is what keeps Elon up at night,” says Urban. “He sees it as only a matter of time before superintelligent AI rises up on this planet — and when that happens, he believes that it’s critical that we don’t end up as part of ‘everyone else.’ That’s why, in a future world made up of AI and everyone else, he thinks we have only one good option: To be AI.”

Neural dust: an ultrasonic, low power solution for chronic brain-machine interfaces (credit: Swarm Lab/UC Berkeley)

To achieve this, Neuralink CEO Musk has met with more than 1,000 people, narrowing the field initially to eight experts, such as Paul Merolla, who spent the last seven years as the lead chip designer at IBM on their DARPA-funded SyNAPSE program to design neuromorphic (brain-inspired) chips with 5.4 billion transistors (1 million neurons and 256 million synapses per chip), and Dongjin (DJ) Seo, who while at UC Berkeley designed an ultrasonic backscatter system, called neural dust, for powering and communicating with implanted bioelectronics to record brain activity.*

Mesh electronics being injected through sub-100 micrometer inner diameter glass needle into aqueous solution (credit: Lieber Research Group, Harvard University)

Becoming one with AI — a good thing?

Neuralink’s goal is to create a “digital tertiary layer” to augment the brain’s current cortex and limbic layers — a radical high-bandwidth, long-lasting, biocompatible, bidirectional, communicative, non-invasively implanted system made up of micron-size (millionth of a meter) particles communicating wirelessly via the cloud and internet to achieve super-fast communication speed and increased bandwidth (carrying more information).

“We’re going to have the choice of either being left behind and being effectively useless or like a pet — you know, like a house cat or something — or eventually figuring out some way to be symbiotic and merge with AI. … A house cat’s a good outcome, by the way.”

Thin, flexible electrodes mounted on top of a biodegradable silk substrate could provide a better brain-machine interface, as shown in this model. (credit: University of Illinois at Urbana-Champaign)

But machine intelligence is already vastly superior to human intelligence in specific areas (such as Google DeepMind’s AlphaGo) and often inexplicable. So how do we know superintelligence has the best interests of humanity in mind?

“Just an engineering problem”

Musk’s answer: “If we achieve tight symbiosis, the AI wouldn’t be ‘other’ — it would be you, with a relationship to your cortex analogous to the relationship your cortex has with your limbic system.” OK, but then how does an inferior intelligence know when it’s achieved full symbiosis with a superior one — or when AI goes rogue?

Brain-to-brain (B2B) internet communication system: EEG signals representing two words were encoded into binary strings (left) by the sender (emitter) and sent via the internet to a receiver. The signal was then encoded as a series of transcranial magnetic stimulation-generated phosphenes detected by the visual occipital cortex, which the receiver then translated to words (credit: Carles Grau et al./PLoS ONE)

And what about experts in neuroethics, psychology, law? Musk says it’s just “an engineering problem. … If we can just use engineering to get neurons to talk to computers, we’ll have done our job, and machine learning can do much of the rest.”

However, it’s not clear how we could be assured our brains aren’t hacked, spied on, and controlled by a repressive government or by other humans — especially those with a more recently updated software version or covert cyborg hardware improvements.

NIRS/EEG brain-computer interface system using non-invasive near-infrared light for sensing “yes” or “no” thoughts, shown on a model (credit: Wyss Center for Bio and Neuroengineering)

In addition, the devices mentioned in WaitButWhy all require some form of neurosurgery, unlike Facebook’s research project to use non-invasive near-infrared light, as shown in this experiment, for example.** And getting implants for non-medical use approved by the FDA will be a challenge, to grossly understate it.

“I think we are about 8 to 10 years away from this being usable by people with no disability,” says Musk, optimistically. However, Musk does not lay out a technology roadmap for going further, as MIT Technology Review notes.

Nonetheless, Neuralink sounds awesome — it should lead to some exciting neuroscience breakthroughs. And Neuralink now has 16 job listings in San Francisco.

* Other experts: Vanessa Tolosa, Lawrence Livermore National Laboratory, one of the world’s foremost researchers on biocompatible materials; Max Hodak, who worked on the development of some groundbreaking BMI technology at Miguel Nicolelis’s lab at Duke University; Ben Rapoport, Neuralink’s neurosurgery expert, with a Ph.D. in Electrical Engineering and Computer Science from MIT; Tim Hanson, UC Berkeley post-doc and expert in flexible electrodes for stable, minimally invasive neural recording; Flip Sabes, professor at the UCSF School of Medicine and expert in cortical physiology, computational and theoretical modeling, and human psychophysics and physiology; and Tim Gardner, Associate Professor of Biology at Boston University, whose lab works on implanting BMIs in birds, to study “how complex songs are assembled from elementary neural units” and learn about “the relationships between patterns of neural activity on different time scales.”

** This binary experiment and the binary Brain-to-brain (B2B) internet communication system mentioned above are the equivalents of the first binary (dot–dash) telegraph message, sent May 24, 1844: “What hath God wrought?”

Global night-time lights provide unfiltered data on human activities and socio-economic factors

Night-time lights seen from space correlate to everything from electricity consumption and CO2 emissions, to gross domestic product, population and poverty. (credit: NASA)

Researchers from the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) and the Environmental Defense Fund (EDF) have developed an online tool that incorporates 21 years of night-time lights data to understand and compare changes in human activities in countries around the world.

The research is published in PLOS ONE.

The tool compares the brightness of a country’s night-time lights with the corresponding electricity consumption, GDP, population, poverty, and emissions of CO2, CH4, N2O, and F-gases since 1992, without relying on national statistics, which are often compiled with differing methodologies and motivations.

Consistent with previous research, the team found the highest correlations between night-time lights and GDP, electricity consumption, and CO2 emissions. Correlations with population, N2O, and CH4 emissions were slightly less pronounced, and, as expected, there was an inverse correlation between the brightness of lights and poverty.

“This is the most comprehensive tool to date to look at the relationship between night-time lights and a series of socio-economic indicators,” said Gernot Wagner, a research associate at SEAS and coauthor of the paper.

The data source is the Defense Meteorological Satellite Program (DMSP) dataset, which provides 21 years’ worth of night-time data. The researchers also used Google Earth Engine (GEE), a platform recently made available to researchers, to explore global aggregate relationships at national scales between DMSP data and a series of economic and environmental variables.
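At its core, the published comparison reduces to correlating each country’s lit-area time series with each indicator series. Here is a minimal pandas sketch of that analysis using synthetic stand-in numbers; the real tool draws on the DMSP dataset via GEE, and none of the values below are from the study.

```python
# Sketch: per-country correlation between night-time light brightness
# and socio-economic indicators. All numbers are synthetic stand-ins.
import pandas as pd

df = pd.DataFrame({
    "country": ["A"] * 4 + ["B"] * 4,
    "year":    [1992, 2000, 2008, 2013] * 2,
    "lights":  [1.0, 1.4, 1.9, 2.3,   3.0, 3.1, 3.4, 3.8],  # area lit
    "gdp":     [10, 14, 20, 24,       50, 52, 57, 63],
    "poverty": [40, 33, 25, 20,       12, 11, 10, 8],
})

indicators = ["gdp", "poverty"]
corr = df.groupby("country").apply(
    lambda g: g[indicators].corrwith(g["lights"])  # Pearson r per country
)
print(corr.round(2))  # gdp strongly positive; poverty negative (inverse)
```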


Abstract of Night-time lights: A global, long term look at links to socio-economic trends

We use a parallelized spatial analytics platform to process the twenty-one year totality of the longest-running time series of night-time lights data—the Defense Meteorological Satellite Program (DMSP) dataset—surpassing the narrower scope of prior studies to assess changes in area lit of countries globally. Doing so allows a retrospective look at the global, long-term relationships between night-time lights and a series of socio-economic indicators. We find the strongest correlations with electricity consumption, CO2 emissions, and GDP, followed by population, CH4 emissions, N2O emissions, poverty (inverse) and F-gas emissions. Relating area lit to electricity consumption shows that while a basic linear model provides a good statistical fit, regional and temporal trends are found to have a significant impact.

Brain-imaging headband measures how our minds mirror a speaker when we communicate

A cartoon image of brain “coupling” during communication (credit: Drexel University)

Drexel University biomedical engineers and Princeton University psychologists have used a wearable brain-imaging device called functional near-infrared spectroscopy (fNIRS) to measure brain synchronization when humans interact. fNIRS uses light to measure neural activity in the cortex of the brain (based on blood-oxygenation changes) during real-life situations and can be worn like a headband.

(KurzweilAI recently covered research with a fNIRS brain-computer interface that allows completely locked-in patients to communicate.)

A fNIRS headband (credit: Wyss Center for Bio and Neuroengineering)

Mirroring the speaker’s brain activity

The researchers found that a listener’s brain activity (in brain areas associated with speech comprehension) mirrors the speaker’s brain when he or she is telling a story about a real-life experience, with about a five-second delay. They also found that higher coupling is associated with better understanding.
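In signal terms, that coupling is a lagged correlation between the speaker’s and listener’s activity time courses. The numpy sketch below estimates such a delay; the 10 Hz sampling rate and the synthetic signals are stand-ins for illustration, not the study’s data or analysis pipeline.

```python
# Sketch: find the lag (in seconds) that maximizes speaker-listener
# correlation. `speaker` and `listener` stand in for preprocessed
# fNIRS time series.
import numpy as np

def best_lag(speaker, listener, fs=10.0, max_lag_s=10.0):
    """Return (lag_seconds, r) maximizing Pearson r when the listener
    signal is assumed to trail the speaker signal by `lag` samples."""
    best = (0.0, -1.0)
    for lag in range(int(max_lag_s * fs) + 1):
        a = speaker[: len(speaker) - lag]
        b = listener[lag:]
        r = np.corrcoef(a, b)[0, 1]
        if r > best[1]:
            best = (lag / fs, r)
    return best

fs = 10.0
t = np.arange(0, 120, 1 / fs)
speaker = np.sin(0.2 * np.pi * t) + 0.1 * np.random.randn(t.size)
listener = np.roll(speaker, int(5 * fs))   # 5-second delayed copy
print(best_lag(speaker, listener, fs))     # ~ (5.0, high r)
```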

The researchers believe the system can be used to offer important information about how to better communicate in many different environments, such as how people learn in classrooms and how to improve business meetings and doctor-patient communication. They also mentioned uses in analyzing political rallies and how people handle cable news.

“We now have a tool that can give us richer information about the brain during everyday tasks — such as person-to-person communication — that we could not receive in artificial lab settings or from single brain studies,” said Hasan Ayaz, PhD, an associate research professor in Drexel’s School of Biomedical Engineering, Science and Health Systems, who led the research team.

Traditional brain imaging methods like fMRI have limitations. In particular, fMRI requires subjects to lie down motionlessly in a noisy scanning environment. With this kind of setup, it’s not possible to simultaneously scan the brains of multiple individuals who are speaking face-to-face. That is why the Drexel researchers turned to a portable fNIRS system, which can probe the brain-to-brain coupling question in natural settings.

For their study, a native English speaker and two native Turkish speakers each told an unrehearsed, real-life story in their native language. The stories were recorded and the speakers’ brains were scanned using fNIRS. Fifteen English speakers then listened to the recordings, in addition to a story that was recorded at a live storytelling event.

The researchers targeted the prefrontal and parietal areas of the brain, which include cognitive and higher order areas that are involved in a person’s capacity to discern beliefs, desires, and goals of others. They hypothesized that a listener’s brain activity would correlate with the speaker’s only when listening to a story they understood (the English version). A second objective of the study was to compare the fNIRS results with data from a similar study that had used fMRI to compare the two methods.

They found that when fNIRS measured the changes in oxygenated and deoxygenated hemoglobin in the test subjects’ brains, the listeners’ brain activity matched the speakers’ only for the English-language stories.* These results were also consistent with the previous fMRI study.

The researchers believe the new research supports fNIRS as a viable future tool to study brain-to-brain coupling during social interaction. One can also imagine possible invasive uses in areas such as law enforcement and military interrogation.

The research was published in open-access Scientific Reports on Monday, Feb. 27.

* “During brain-to-brain coupling, activity in areas of prefrontal [in the speaker] and parietal cortex [in the listeners] previously reported to be involved in sentence comprehension were robustly correlated across subjects, as revealed in the inter-subject correlation analysis. As these are task-related (active listening) activation periods (not resting, etc.), the correlations reflect modulation of these regions by the time-varying content of the narratives, and comprise linguistic, conceptual and affective processing.” — Yichuan Liu et al./Scientific Reports)


Abstract of Measuring speaker–listener neural coupling with functional near infrared spectroscopy

The present study investigates brain-to-brain coupling, defined as inter-subject correlations in the hemodynamic response, during natural verbal communication. We used functional near-infrared spectroscopy (fNIRS) to record brain activity of 3 speakers telling stories and 15 listeners comprehending audio recordings of these stories. Listeners’ brain activity was significantly correlated with speakers’ with a delay. This between-brain correlation disappeared when verbal communication failed. We further compared the fNIRS and functional Magnetic Resonance Imaging (fMRI) recordings of listeners comprehending the same story and found a significant relationship between the fNIRS oxygenated-hemoglobin concentration changes and the fMRI BOLD in brain areas associated with speech comprehension. This correlation between fNIRS and fMRI was only present when data from the same story were compared between the two modalities and vanished when data from different stories were compared; this cross-modality consistency further highlights the reliability of the spatiotemporal brain activation pattern as a measure of story comprehension. Our findings suggest that fNIRS can be used for investigating brain-to-brain coupling during verbal communication in natural settings.

Trump considering libertarian reformer to head FDA

The Seasteading Institute wants to create new societies at sea, away from FDA (and other government) regulations. (credit: Seasteading Institute)

President-elect Donald Trump’s transition team is considering libertarian Silicon Valley investor Jim O’Neill, a Peter Thiel associate, to head the Food and Drug Administration, Bloomberg Politics has reported.

O’Neill, the Managing Director of Mithril Capital Management LLC, doesn’t have a medical background, but served in the George W. Bush administration as principal associate deputy secretary at the Department of Health and Human Services. He’s also a board member of the Seasteading Institute, a Thiel-backed venture to create new societies at sea, away from existing governments.

“We should reform FDA so there is approving drugs after their sponsors have demonstrated safety — and let people start using them, at their own risk, but not much risk of safety,” O’Neill said in a speech at the August 2014 Rejuvenation Biotechnology conference. “O’Neill also advocated anti-aging medicine in that speech, saying he believed it was scientifically possible to develop treatments that would reverse aging,” said Bloomberg.

O’Neill’s prospective nomination could also bring about “significant changes to medical cannabis policy and potentially address the regulations that have prevented medical cannabis research,” Mike Liszewski, the director of government affairs at Americans for Safe Access, told ATTN:.

Scott Gottlieb, M.D., a former FDA official and now at the American Enterprise Institute (AEI), is also reportedly under consideration, according to The Hill.

In a recent related announcement, Trump has selected Rep. Tom Price, M.D. (R., Ga.), a leader in the efforts to replace ObamaCare, to be his secretary of Health and Human Services. “His most frequent objection to [the Affordable Care Act] is that it interferes with the ability of patients and doctors to make medical decisions,” The New York Times notes. Price also proposes to deregulate the market for medical services, according to the AEI.

 

Terasem Colloquium in Second Life

The 2016 Terasem Annual Colloquium on the Law of Futuristic Persons will take place in Second Life in “Terasem sim” on Saturday, Dec. 10, 2016 at noon EST. The main themes: “Legal Aspects of Futuristic Persons: Cyber-Humans” and “A Tribute to the ‘Father of Artificial Intelligence,’ Marvin Minsky, PhD.”

Each year on December 10th, International Human Rights Day, Terasem conducts a Colloquium on the Law of Futuristic Persons. The event seeks to provide the public with informed perspectives regarding the legal rights and obligations of “futuristic persons” via VR events with expert presentations and discussions. “Terasem hopes to facilitate development of a body of law covering the rights and obligations of entities that transcend, and yet encompass, conventional conceptions of humanness,” according to Terasem Movement, Inc.

12:10–12:30PM — How Marvin Minsky Inspired Me to Have a Mindclone Living on an O’Neill Space Habitat
Martine Rothblatt, JD, PhD
Co-Founder, Terasem Movement, Inc.
Space Coast, FL
Avatar name: Vitology Destiny

12:30–12:50PM — Formal Interaction

12:50–1:10PM — The Emerging Law of Cyborgs
Woodrow “Woody” Barfield, PhD, JD, LLM
Author: Cyber-Humans: Our Future with Machines
Chapel Hill, NC
Avatar name: WoodyBarfield

1:10–1:30PM — Formal Interaction

1:30–1:50PM — Cyborgs and Family Law Challenges
Rich Lee
Human Enhancement & Augmentation
St. George, UT
Avatar name: RichLee78

1:50–2:10PM — Formal Interaction

2:10–2:30PM — Synthetic Brain Simulations and Mens Rea*
Stephen Thaler, PhD.
President & CEO, Imagination Engines, Inc.
St. Charles, MO
Avatar name: SteveThaler

* Mens Rea refers to criminal intent. Moreover, it is the state of mind indicating culpability which is required by statute as an element of a crime. — Cornell University Legal Information Institute

 

A deep-learning system to alert companies before litigation

(credit: Intraspexion, Inc.)

Imagine a world with less litigation.

That’s the promise of a deep-learning system developed by Intraspexion, Inc. that can alert company or government attorneys to litigation risks before the organization gets hit with an expensive lawsuit.

“These risks show up in internal communications such as emails,” said CEO Nick Brestoff. “In-house attorneys have been blind to these risks, so they are stuck with managing the lawsuits.”

Example of employment discrimination indicators buried in emails (credit: Intraspexion, Inc.)

Intraspexion’s first deep learning model has been trained to find the risks of employment discrimination. “What we can do with employment discrimination now we can do with other litigation categories, starting with breach of contract and fraud, and then scaling up to dozens more,” he said.
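Intraspexion has not published its model, but the task described is standard text classification over internal email. The toy PyTorch sketch below shows the general shape of such a risk classifier; the vocabulary, example emails, and every identifier in it are illustrative inventions, not Intraspexion’s system.

```python
# Toy sketch (not Intraspexion's code): a small neural classifier that
# scores emails for employment-discrimination litigation risk.
import torch
import torch.nn as nn

VOCAB = {"<unk>": 0, "fired": 1, "age": 2, "complaint": 3, "meeting": 4}

def encode(text):
    return torch.tensor([VOCAB.get(w, 0) for w in text.lower().split()])

class RiskClassifier(nn.Module):
    def __init__(self, vocab_size=len(VOCAB), dim=16):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, dim)  # averaged word vectors
        self.out = nn.Linear(dim, 1)                 # risk logit

    def forward(self, ids, offsets):
        return self.out(self.emb(ids, offsets)).squeeze(-1)

model = RiskClassifier()
emails = ["he was fired right after the age complaint",
          "reminder budget meeting at noon"]
ids = torch.cat([encode(e) for e in emails])
offsets = torch.tensor([0, len(encode(emails[0]))])
risk = torch.sigmoid(model(ids, offsets))  # untrained: ~0.5 per email
print(risk)  # after training on labeled emails, high-scoring messages
             # would be routed to in-house counsel for early review
```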

Brestoff claims that deep learning enables a huge paradigm shift for the legal profession. “We’re going straight after the behemoth of litigation. This shift doesn’t make attorneys better able to know the law; it makes them better able to know the facts, and to know them early enough to do something about them.”

And to prevent huge losses. “As I showed in my book, Preventing Litigation: An Early Warning System, using 10 years of cost data (aggregated as $1.6 trillion) and caseload data (about 4 million lawsuits, federal and state, for that same time frame), the average cost per case was at least about $350,000,” Brestoff explained to KurzweilAI in an email.

Brestoff, who studied engineering at Caltech before attending law school at USC, will present Intraspexion’s deep learning system in a talk at the AI World Conference & Exposition 2016, November 7–9 in San Francisco.

 

Will AI replace judges and lawyers?

(credit: iStock)

An artificial intelligence method developed by University College London computer scientists and associates has predicted the judicial decisions of the European Court of Human Rights (ECtHR) with 79% accuracy, according to a paper published Monday, Oct. 24 in PeerJ Computer Science.

The method is the first to predict the outcomes of a major international court by automatically analyzing case text using a machine-learning algorithm.*

“We don’t see AI replacing judges or lawyers,” said Nikolaos Aletras, who led the study at UCL Computer Science, “but we think they’d find it useful for rapidly identifying patterns in cases that lead to certain outcomes. It could also be a valuable tool for highlighting which cases are most likely to be violations of the European Convention on Human Rights.”

Judgments correlated with facts rather than legal arguments

(credit: European Court of Human Rights)

In developing the method, the team found that judgments by the ECtHR are highly correlated to non-legal (real-world) facts, rather than direct legal arguments, suggesting that judges of the Court are, in the jargon of legal theory, “realists” rather than “formalists.”

This supports findings from previous studies of the decision-making processes of other high level courts, including the U.S. Supreme Court.

The team of computer and legal scientists extracted case information published by the ECtHR in its publicly accessible database (applications made to the court were not available), explained UCL co-author Vasileios Lampos, PhD.

They identified English-language data sets for 584 cases relating to Articles 3, 6 and 8** of the Convention and applied an AI algorithm to find patterns in the text. To prevent bias and mislearning, they selected an equal number of violation and non-violation cases.

Predictions based on analysis of text

The most reliable factors for predicting the court’s decision were found to be the language used as well as the topics and circumstances mentioned in the case text (the “circumstances” section of the text includes information about the case factual background). By combining the information extracted from the abstract “topics” that the cases cover and “circumstances” across data for all three articles, an accuracy of 79% was achieved.
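As the paper’s first footnote below spells out, the classifiers are linear-kernel SVMs over N-gram and topic features. Here is a minimal scikit-learn reconstruction of the N-gram half of that setup, with placeholder case texts; the N-gram range is an assumption for illustration.

```python
# Sketch of the paper's setup: a linear SVM over contiguous word N-grams
# extracted from case text. The two "cases" below are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "circumstances: the applicant was detained without judicial review",
    "circumstances: the hearing took place within a reasonable time",
]
labels = [1, 0]  # 1 = violation found, 0 = no violation (balanced classes)

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 4)),  # word N-grams (range assumed)
    LinearSVC(),                          # linear kernel, per the paper
)
clf.fit(texts, labels)
print(clf.predict(["circumstances: the applicant was detained"]))
```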

“Previous studies have predicted outcomes based on the nature of the crime, or the policy position of each judge, so this is the first time judgments have been predicted using analysis of text prepared by the court. We expect this sort of tool would improve efficiencies of high-level, in-demand courts, but to become a reality, we need to test it against more articles and the case data submitted to the court,” added Lampos.

Researchers at the University of Sheffield and the University of Pennsylvania were also involved in the study.

* “We define the problem of the ECtHR case prediction as a binary classification task. We utilise textual features, i.e., N-grams and topics, to train Support Vector Machine (SVM) classifiers. We apply a linear kernel function that facilitates the interpretation of models in a straightforward manner.” — Authors of PeerJ Computer Science paper.

** Article 3 prohibits torture and inhuman and degrading treatment (250 cases); Article 6 protects the right to a fair trial (80 cases); and Article 8 provides a right to respect for one’s “private and family life, his home and his correspondence” (254 cases).


Abstract of Predicting Judicial Decisions of the European Court of Human Rights: A Natural Language Processing Perspective

Recent advances in Natural Language Processing and Machine Learning provide us with the tools to build predictive models that can be used to unveil patterns driving judicial decisions. This can be useful, for both lawyers and judges, as an assisting tool to rapidly identify cases and extract patterns which lead to certain decisions. This paper presents the first systematic study on predicting the outcome of cases tried by the European Court of Human Rights based solely on textual content. We formulate a binary classification task where the input of our classifiers is the textual content extracted from a case and the target output is the actual judgment as to whether there has been a violation of an article of the convention of human rights. Textual information is represented using contiguous word sequences, i.e. N-grams, and topics. Our models can predict the court’s decisions with a strong accuracy (79% on average). Our empirical analysis indicates that the formal facts of a case are the most important predictive factor. This is consistent with the theory of legal realism suggesting that judicial decision-making is significantly affected by the stimulus of the facts. We also observe that the topical content of a case is another important feature in this classification task and explore this relationship further by conducting a qualitative analysis.
