A living programmable biocomputing device based on RNA

“Ribocomputing devices” (yellow) developed by a team at the Wyss Institute can now be used by synthetic biologists to sense and interpret multiple signals in cells and logically instruct their ribosomes (blue and green) to produce different proteins. (credit: Wyss Institute at Harvard University)

Synthetic biologists at Harvard’s Wyss Institute for Biologically Inspired Engineering and associates have developed a living programmable “ribocomputing” device based on networks of precisely designed, self-assembling synthetic RNAs (ribonucleic acid). The RNAs can sense multiple biosignals and make logical decisions to control protein production with high precision.

As reported in Nature, the synthetic biological circuits could be used to produce drugs, fine chemicals, and biofuels or detect disease-causing agents and release therapeutic molecules inside the body. The low-cost diagnostic technologies may even lead to nanomachines capable of hunting down cancer cells or switching off aberrant genes.

Biological logic gates

Similar to a digital circuit, these synthetic biological circuits can process information and make logic-guided decisions, using basic logic operations — AND, OR, and NOT. But instead of detecting voltages, the decisions are based on specific chemicals or proteins, such as toxins in the environment, metabolite levels, or inflammatory signals. The specific ribocomputing parts can be readily designed on a computer.

E. coli bacteria engineered to be ribocomputing devices output a green-glowing protein when they detect a specific set of programmed RNA molecules as input signals (credit: Harvard University)

The research was performed with E. coli bacteria, which regulate the expression of a fluorescent (glowing) reporter protein when the bacteria encounter a specific complex set of intra-cellular stimuli. But the researchers believe ribocomputing devices can work with other host organisms or in extracellular settings.

Previous synthetic biological circuits have only been able to sense a handful of signals, giving them an incomplete picture of conditions in the host cell. They are also built out of different types of molecules, such as DNAs, RNAs, and proteins, that must find, bind, and work together to sense and process signals. Identifying molecules that cooperate well with one another is difficult and makes development of new biological circuits a time-consuming and often unpredictable process.

Brain-like neural networks next

Ribocomputing devices could also be freeze-dried on paper, leading to paper-based biological circuits, including diagnostics that can sense and integrate several disease-relevant signals in a clinical sample, the researchers say.

The next stage of research will focus on the use of RNA “toehold” technology* to produce neural networks within living cells: circuits capable of analyzing a range of excitatory and inhibitory inputs, averaging them, and producing an output once a particular threshold of activity is reached, similar to how a neuron averages incoming signals from other neurons.
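The thresholding behavior described above can be sketched in a few lines of Python. This is only a toy numerical model, not the researchers' design: the function name, the signed-average rule, and the example values are all assumptions made for illustration.

```python
def threshold_unit(excitatory, inhibitory, threshold):
    """Toy neuron-like unit (hypothetical model, not from the study).

    Excitatory inputs count as positive, inhibitory inputs as negative.
    The unit "fires" when the average of all signed inputs reaches the threshold.
    """
    signals = list(excitatory) + [-value for value in inhibitory]
    if not signals:
        return False  # no inputs, no output
    average = sum(signals) / len(signals)
    return average >= threshold


# Strong excitation with weak inhibition crosses the threshold...
print(threshold_unit([1.0, 0.8], [0.2], threshold=0.5))
# ...while balanced excitation and inhibition does not.
print(threshold_unit([1.0], [0.9], threshold=0.5))
```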

Ultimately, researchers hope to induce cells to communicate with one another via programmable molecular signals, forming a truly interactive, brain-like network, according to lead author Alex Green, an assistant professor at Arizona State University’s Biodesign Institute.

Wyss Institute Core Faculty member Peng Yin, Ph.D., who led the study, is also Professor of Systems Biology at Harvard Medical School.

The study was funded by the Wyss Institute’s Molecular Robotics Initiative, a Defense Advanced Research Projects Agency (DARPA) Living Foundries grant, and grants from the National Institutes of Health (NIH), the Office of Naval Research (ONR), the National Science Foundation (NSF) and the Defense Threat Reduction Agency (DTRA).

* The team’s approach evolved from its previous development of “toehold switches” in 2014: programmable hairpin-like nanostructures made of RNA. In principle, RNA toehold switches can control the production of a specific protein: when a desired complementary “trigger” RNA, which can be part of the cell’s natural RNA repertoire, is present and binds to the toehold switch, the hairpin structure breaks open. Only then will the cell’s ribosomes get access to the RNA and produce the desired protein.
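As a rough illustration of the "complementary trigger" idea in this footnote, the sketch below treats a switch as opening only when a trigger is the exact reverse complement of a binding site. This is a deliberately simplified assumption: real toehold switches depend on RNA folding thermodynamics, partial matches, and wobble pairs, none of which is modeled here, and all names and sequences are hypothetical.

```python
# Watson-Crick pairing for RNA bases (G-U wobble pairs are ignored in this toy model).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that would base-pair with `rna`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def switch_opens(trigger, binding_site):
    """Toy rule (assumption): the hairpin opens only on an exact complementary match."""
    return trigger == reverse_complement(binding_site)


def protein_produced(trigger_pool, binding_site):
    """The reporter is made if any RNA in the cell's pool opens the switch."""
    return any(switch_opens(trigger, binding_site) for trigger in trigger_pool)


# Hypothetical sequences, chosen only to exercise the functions.
print(protein_produced(["GGGA", "GCAU"], binding_site="AUGC"))
```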




Abstract of Complex cellular logic computation using ribocomputing devices

Synthetic biology aims to develop engineering-driven approaches to the programming of cellular functions that could yield transformative technologies. Synthetic gene circuits that combine DNA, protein, and RNA components have demonstrated a range of functions such as bistability, oscillation, feedback, and logic capabilities. However, it remains challenging to scale up these circuits owing to the limited number of designable, orthogonal, high-performance parts, the empirical and often tedious composition rules, and the requirements for substantial resources for encoding and operation. Here, we report a strategy for constructing RNA-only nanodevices to evaluate complex logic in living cells. Our ‘ribocomputing’ systems are composed of de-novo-designed parts and operate through predictable and designable base-pairing rules, allowing the effective in silico design of computing devices with prescribed configurations and functions in complex cellular environments. These devices operate at the post-transcriptional level and use an extended RNA transcript to co-localize all circuit sensing, computation, signal transduction, and output elements in the same self-assembled molecular complex, which reduces diffusion-mediated signal losses, lowers metabolic cost, and improves circuit reliability. We demonstrate that ribocomputing devices in Escherichia coli can evaluate two-input logic with a dynamic range up to 900-fold and scale them to four-input AND, six-input OR, and a complex 12-input expression (A1 AND A2 AND NOT A1*) OR (B1 AND B2 AND NOT B2*) OR (C1 AND C2) OR (D1 AND D2) OR (E1 AND E2). Successful operation of ribocomputing devices based on programmable RNA interactions suggests that systems employing the same design principles could be implemented in other host organisms or in extracellular settings.
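The 12-input expression quoted in the abstract can be written directly as Boolean code. The sketch below only evaluates the logic; it says nothing about how the RNA devices implement it. The function name and the dictionary-based input format are assumptions for illustration, with missing inputs treated as absent signals.

```python
def ribocomputing_expression(inputs):
    """Evaluate (A1 AND A2 AND NOT A1*) OR (B1 AND B2 AND NOT B2*)
    OR (C1 AND C2) OR (D1 AND D2) OR (E1 AND E2).

    `inputs` maps signal names to booleans; absent names count as False.
    """
    def present(name):
        return bool(inputs.get(name, False))

    return (
        (present("A1") and present("A2") and not present("A1*"))
        or (present("B1") and present("B2") and not present("B2*"))
        or (present("C1") and present("C2"))
        or (present("D1") and present("D2"))
        or (present("E1") and present("E2"))
    )


print(ribocomputing_expression({"A1": True, "A2": True}))               # True
print(ribocomputing_expression({"A1": True, "A2": True, "A1*": True}))  # False
```

In the second call the NOT input A1* is present, so the first clause is switched off and no other clause is satisfied.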

How to ‘talk’ to your computer or car with hand or body poses

Researchers at Carnegie Mellon University’s Robotics Institute have developed a system that can detect and understand body poses and movements of multiple people from a video in real time — including, for the first time, the pose of each individual’s fingers.

The ability to recognize finger or hand poses, for instance, will make it possible for people to interact with computers in new and more natural ways, such as simply pointing at things.

That will also allow robots to perceive what you’re doing, what mood you’re in, and whether you can be interrupted, for example. Your self-driving car could get an early warning that a pedestrian is about to step into the street by monitoring the pedestrian’s body language. The technology could also be used for behavioral diagnosis and rehabilitation for conditions such as autism, dyslexia, and depression, the researchers say.

This new method was developed at CMU’s NSF-funded Panoptic Studio, a two-story dome embedded with 500 video cameras, but the researchers can now do the same thing with a single camera and laptop computer.

The researchers have released their computer code. It’s already being widely used by research groups, and more than 20 commercial groups, including automotive companies, have expressed interest in licensing the technology, according to Yaser Sheikh, associate professor of robotics.

Tracking multiple people in real time, particularly in social situations where they may be in contact with each other, presents a number of challenges. Sheikh and his colleagues took a bottom-up approach, which first localizes all the body parts in a scene — arms, legs, faces, etc. — and then associates those parts with particular individuals.
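The two-stage, bottom-up idea (find all parts first, then assign them to people) can be sketched with a toy example. The article does not describe how the CMU system performs the association step, so the code below swaps in a much simpler technique, nearest-anchor assignment, purely for illustration. The data layout, names, and coordinates are hypothetical.

```python
import math


def associate_parts(anchors, parts):
    """Assign each detected part to the nearest person anchor.

    anchors: dict mapping a person id to an (x, y) reference point, e.g. a torso center.
    parts:   list of (part_name, x, y) detections found anywhere in the scene.
    Returns a dict mapping each person id to the list of part names assigned to it.
    """
    people = {person_id: [] for person_id in anchors}
    for name, x, y in parts:
        nearest = min(
            anchors,
            key=lambda pid: math.hypot(anchors[pid][0] - x, anchors[pid][1] - y),
        )
        people[nearest].append(name)
    return people


# Stage 1 (assumed already done): parts localized without knowing whose they are.
detections = [("left_hand", 1, 1), ("right_hand", 9, 1), ("face", 0, 2)]
# Stage 2: associate each part with an individual.
print(associate_parts({"person_1": (0, 0), "person_2": (10, 0)}, detections))
```

Nearest-anchor assignment breaks down when people overlap or touch, which is exactly the social-situation difficulty the researchers mention; it is shown here only to make the part-then-person ordering concrete.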

Sheikh and his colleagues will present reports on their multiperson and hand-pose detection methods at CVPR 2017, the Computer Vision and Pattern Recognition Conference, July 21–26 in Honolulu.

Radical new vertically integrated 3D chip design combines computing and data storage

Four vertical layers in new 3D nanosystem chip. Top (fourth layer): sensors and more than one million carbon-nanotube field-effect transistor (CNFET) logic inverters; third layer, on-chip non-volatile RRAM (1 Mbit memory); second layer, CNFET logic with classification accelerator (to identify sensor inputs); first (bottom) layer, silicon FET logic. (credit: Max M. Shulaker et al./Nature)

A radical new 3D chip that combines computation and data storage in vertically stacked layers — allowing for processing and storing massive amounts of data at high speed in future transformative nanosystems — has been designed by researchers at Stanford University and MIT.

The new 3D-chip design* replaces silicon with carbon nanotubes (sheets of 2-D graphene formed into nanocylinders) and integrates resistive random-access memory (RRAM) cells.

Carbon-nanotube field-effect transistors (CNFETs) are an emerging transistor technology that can scale beyond the limits of silicon MOSFETs (conventional chips), and promise an order-of-magnitude improvement in energy-efficient computation. However, experimental demonstrations of CNFETs so far have been small-scale and limited to integrating only tens or hundreds of devices (see earlier 2015 Stanford research, “Skyscraper-style carbon-nanotube chip design…”).

The researchers integrated more than 1 million RRAM cells and 2 million carbon-nanotube field-effect transistors in the chip, making it the most complex nanoelectronic system ever made with emerging nanotechnologies, according to the researchers. RRAM is an emerging memory technology that promises high-capacity, non-volatile data storage, with improved speed, energy efficiency, and density, compared to dynamic random-access memory (DRAM).

Instead of requiring separate components, the RRAM cells and carbon nanotubes are built vertically over one another, creating a dense new 3D computer architecture** with interleaving layers of logic and memory. By using ultradense through-chip vias (electrical interconnecting wires passing between layers), the high delay with conventional wiring between computer components is eliminated.

The new 3D nanosystem can capture massive amounts of data every second, store it directly on-chip, perform in situ processing of the captured data, and produce “highly processed” information. “Such complex nanoelectronic systems will be essential for future high-performance, highly energy-efficient electronic systems,” the researchers say.

How to combine computation and storage

Illustration of separate CPU (bottom) and RAM memory (top) in current computer architecture (images credit: iStock)

The new chip design aims to replace current chip designs, which separate computing and data storage, resulting in limited-speed connections.

Separate 2D chips have been required because “building conventional silicon transistors involves extremely high temperatures of over 1,000 degrees Celsius,” explains Max Shulaker, an assistant professor of electrical engineering and computer science at MIT and lead author of a paper published July 5, 2017 in the journal Nature. “If you then build a second layer of silicon circuits on top, that high temperature will damage the bottom layer of circuits.”

By contrast, carbon nanotube circuits and RRAM memory can be fabricated at much lower temperatures: below 200 degrees Celsius. “This means they can be built up in layers without harming the circuits beneath,” says Shulaker.

Overcoming communication and computing bottlenecks

As applications analyze increasingly massive volumes of data, the limited rate at which data can be moved between different chips is creating a critical communication “bottleneck.” And with limited real estate on increasingly miniaturized chips, there is not enough room to place chips side-by-side.

At the same time, embedded intelligence in areas ranging from autonomous driving to personalized medicine is now generating huge amounts of data, but silicon transistors are no longer improving at the historic rate that they have for decades.

Instead, three-dimensional integration is the most promising approach to continue the technology-scaling path set forth by Moore’s law, allowing an increasing number of devices to be integrated per unit volume, according to Jan Rabaey, a professor of electrical engineering and computer science at the University of California at Berkeley, who was not involved in the research.

Three-dimensional integration “leads to a fundamentally different perspective on computing architectures, enabling an intimate interweaving of memory and logic,” he says. “These structures may be particularly suited for alternative learning-based computational paradigms such as brain-inspired systems and deep neural nets, and the approach presented by the authors is definitely a great first step in that direction.”

The new 3D design provides several benefits for future computing systems, including:

  • Logic circuits made from carbon nanotubes can be an order of magnitude more energy-efficient compared to today’s logic made from silicon.
  • RRAM memory is denser, faster, and more energy-efficient compared to conventional DRAM (dynamic random-access memory) devices.
  • The dense through-chip vias (wires) can enable vertical connectivity that is 1,000 times more dense than conventional packaging and chip-stacking solutions allow, which greatly improves the data communication bandwidth between vertically stacked functional layers. For example, each sensor in the top layer can connect directly to its respective underlying memory cell with an inter-layer via. This enables the sensors to write their data in parallel directly into memory and at high speed.
  • The design is compatible in both fabrication and design with today’s CMOS silicon infrastructure.

Shulaker next plans to work with Massachusetts-based semiconductor company Analog Devices to develop new versions of the system.

This work was funded by the Defense Advanced Research Projects Agency, the National Science Foundation, Semiconductor Research Corporation, STARnet SONIC, and member companies of the Stanford SystemX Alliance.

* As a working-prototype demonstration of the potential of the technology, the researchers took advantage of the ability of carbon nanotubes to also act as sensors. On the top layer of the chip, they placed more than 1 million carbon nanotube-based sensors, which they used to detect and classify ambient gases for detecting signs of disease by sensing particular compounds in a patient’s breath, says Shulaker. By layering sensing, data storage, and computing, the chip was able to measure each of the sensors in parallel, and then write directly into its memory, generating huge bandwidth in just one device, according to Shulaker. The top layer could be replaced with additional computation or data storage subsystems, or with other forms of input/output, he explains.

** Previous R&D in 3D chip technologies and their limitations are covered here, noting that “in general, 3D integration is a broad term that includes such technologies as 3D wafer-level packaging (3DWLP); 2.5D and 3D interposer-based integration; 3D stacked ICs (3D-SICs), monolithic 3D ICs; 3D heterogeneous integration; and 3D systems integration.” The new Stanford-MIT nanosystem design significantly expands this definition.


Abstract of Three-dimensional integration of nanotechnologies for computing and data storage on a single chip

The computing demands of future data-intensive applications will greatly exceed the capabilities of current electronics, and are unlikely to be met by isolated improvements in transistors, data storage technologies or integrated circuit architectures alone. Instead, transformative nanosystems, which use new nanotechnologies to simultaneously realize improved devices and new integrated circuit architectures, are required. Here we present a prototype of such a transformative nanosystem. It consists of more than one million resistive random-access memory cells and more than two million carbon-nanotube field-effect transistors—promising new nanotechnologies for use in energy-efficient digital logic circuits and for dense data storage—fabricated on vertically stacked layers in a single chip. Unlike conventional integrated circuit architectures, the layered fabrication realizes a three-dimensional integrated circuit architecture with fine-grained and dense vertical connectivity between layers of computing, data storage, and input and output (in this instance, sensing). As a result, our nanosystem can capture massive amounts of data every second, store it directly on-chip, perform in situ processing of the captured data, and produce ‘highly processed’ information. As a working prototype, our nanosystem senses and classifies ambient gases. Furthermore, because the layers are fabricated on top of silicon logic circuitry, our nanosystem is compatible with existing infrastructure for silicon-based technologies. Such complex nano-electronic systems will be essential for future high-performance and highly energy-efficient electronic systems.

Graphene-based computer would be 1,000 times faster than silicon-based, use 100th the power

How a graphene-based transistor would work. A graphene nanoribbon (GNR) is created by unzipping (opening up) a portion of a carbon nanotube (CNT) (the flat area, shown with pink arrows above it). The GNR switching is controlled by two surrounding parallel CNTs. The magnitudes and relative directions of the control current, ICTRL (blue arrows), in the CNTs determine the rotation direction of the magnetic fields, B (green). The magnetic fields then control the GNR magnetization (based on the recent discovery of negative magnetoresistance), which causes the GNR to switch from resistive (no current) to conductive, resulting in current flow, IGNR (pink arrows). In other words, the GNR acts as a transistor gate. The magnitude of the current flow through the GNR functions as the binary gate output, with binary 1 representing the current flow of the conductive state and binary 0 representing no current (the resistive state). (credit: Joseph S. Friedman et al./Nature Communications)

A future graphene-based transistor using spintronics could lead to tinier computers that are a thousand times faster and use a hundredth of the power of silicon-based computers.

The radical transistor concept, created by a team of researchers at Northwestern University, The University of Texas at Dallas, University of Illinois at Urbana-Champaign, and University of Central Florida, is explained this month in an open-access paper in the journal Nature Communications.

Transistors act as on and off switches. A series of transistors in different arrangements act as logic gates, allowing microprocessors to solve complex arithmetic and logic problems. But the speed of computer microprocessors that rely on silicon transistors has been relatively stagnant since around 2005, with clock speeds mostly in the 3 to 4 gigahertz range.

Clock speeds approaching the terahertz range

The researchers discovered that by applying a magnetic field to a graphene ribbon (created by unzipping a carbon nanotube), they could change the resistance of current flowing through the ribbon. The magnetic field — controlled by increasing or decreasing the current through adjacent carbon nanotubes — increased or decreased the flow of current.

A cascading series of graphene transistor-based logic circuits could produce a massive jump, with clock speeds approaching the terahertz range — a thousand times faster.* They would also be smaller and substantially more efficient, allowing device-makers to shrink technology and squeeze in more functionality, according to Ryan M. Gelfand, an assistant professor in The College of Optics & Photonics at the University of Central Florida.

The researchers hope to inspire the fabrication of these cascaded logic circuits to stimulate a future transformative generation of energy-efficient computing.

* Unlike other spintronic logic proposals, these new logic gates can be cascaded directly through the carbon materials without requiring intermediate circuits and amplification between gates. That would result in compact circuits with reduced area that are far more efficient than with CMOS switching, which is limited by charge transfer and accumulation from RLC (resistance-inductance-capacitance) interconnect delays.


Abstract of Cascaded spintronic logic with low-dimensional carbon

Remarkable breakthroughs have established the functionality of graphene and carbon nanotube transistors as replacements to silicon in conventional computing structures, and numerous spintronic logic gates have been presented. However, an efficient cascaded logic structure that exploits electron spin has not yet been demonstrated. In this work, we introduce and analyse a cascaded spintronic computing system composed solely of low-dimensional carbon materials. We propose a spintronic switch based on the recent discovery of negative magnetoresistance in graphene nanoribbons, and demonstrate its feasibility through tight-binding calculations of the band structure. Covalently connected carbon nanotubes create magnetic fields through graphene nanoribbons, cascading logic gates through incoherent spintronic switching. The exceptional material properties of carbon materials permit Terahertz operation and two orders of magnitude decrease in power-delay product compared to cutting-edge microprocessors. We hope to inspire the fabrication of these cascaded logic circuits to stimulate a transformative generation of energy-efficient computing.

High-speed light-based systems could replace supercomputers for certain ‘deep learning’ calculations

(a) Optical micrograph of an experimentally fabricated on-chip optical interference unit; the physical region where the optical neural network program exists is highlighted in gray. A programmable nanophotonic processor uses an array of interconnected waveguides (similar in concept to a field-programmable gate array, or FPGA, integrated circuit), allowing the light beams to be modified as needed for a specific deep-learning matrix computation. (b) Schematic illustration of the optical neural network program, which performs matrix multiplication and amplification fully optically. (credit: Yichen Shen et al./Nature Photonics)

A team of researchers at MIT and elsewhere has developed a new approach to deep learning systems — using light instead of electricity, which they say could vastly improve the speed and efficiency of certain deep-learning computations.

Deep-learning systems are based on artificial neural networks that mimic the way the brain learns from an accumulation of examples. They can enable technologies such as face- and voice-recognition software, or scour vast amounts of medical data to find patterns that could be useful diagnostically, for example.

But the computations these systems carry out are highly complex and demanding, even for supercomputers. Traditional computer architectures are not very efficient for calculations needed for neural-network tasks that involve repeated multiplications of matrices (arrays of numbers). These can be computationally intensive for conventional CPUs or even GPUs.

Programmable nanophotonic processor

Instead, the new approach uses an optical device that the researchers call a “programmable nanophotonic processor.” Multiple light beams are directed in such a way that their waves interact with each other, producing interference patterns that “compute” the intended operation.

The optical chips using this architecture could, in principle, carry out dense matrix multiplications (the most power-hungry and time-consuming part in AI algorithms) for learning tasks much faster, compared to conventional electronic chips. The researchers expect a computational speed enhancement of at least two orders of magnitude over the state-of-the-art and three orders of magnitude in power efficiency.
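To make the "dense matrix multiplication" concrete, the sketch below shows what one neural-network layer computes on a conventional computer: a matrix-vector product followed by a nonlinearity. It is a plain software illustration, not a model of the optical chip. The ReLU nonlinearity and all names and numbers are assumptions chosen for brevity.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: one multiply-add per weight."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(values):
    """A common nonlinearity, used here only for illustration."""
    return [max(0.0, v) for v in values]


def layer_forward(weights, inputs):
    """One layer: the matrix product is the costly step the article refers to."""
    return relu(matvec(weights, inputs))


weights = [[1.0, -1.0],
           [0.5, 0.5]]
print(layer_forward(weights, [2.0, 3.0]))  # [0.0, 2.5]
```

For a layer with n inputs and m outputs, `matvec` performs n times m multiply-adds, and a deep network repeats this for every layer and every input, which is why this step dominates the workload.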

“This chip, once you tune it, can carry out matrix multiplication with, in principle, zero energy, almost instantly,” says Marin Soljacic, one of the MIT researchers on the team.

To demonstrate the concept, the team set the programmable nanophotonic processor to implement a neural network that recognizes four basic vowel sounds. Even with the prototype system, they were able to achieve a 77 percent accuracy level, compared to about 90 percent for conventional systems. There are “no substantial obstacles” to scaling up the system for greater accuracy, according to Soljacic.

The team says it will still take a lot more time and effort to make this system useful. However, once the system is scaled up and fully functioning, the low-power system should find many uses, especially for situations where power is limited, such as in self-driving cars, drones, and mobile consumer devices. Other uses include signal processing for data transmission and computer centers.

The research was published Monday (June 12, 2017) in a paper in the journal Nature Photonics (open-access version available on arXiv).

The team also included researchers at Elenion Technologies of New York and the Université de Sherbrooke in Quebec. The work was supported by the U.S. Army Research Office through the Institute for Soldier Nanotechnologies, the National Science Foundation, and the Air Force Office of Scientific Research.


Abstract of Deep learning with coherent nanophotonic circuits

Artificial neural networks are computational network models inspired by signal processing in the brain. These models have dramatically improved performance for many machine-learning tasks, including speech and image recognition. However, today’s computing hardware is inefficient at implementing neural networks, in large part because much of it was designed for von Neumann computing schemes. Significant effort has been made towards developing electronic architectures tuned to implement artificial neural networks that exhibit improved computational speed and accuracy. Here, we propose a new architecture for a fully optical neural network that, in principle, could offer an enhancement in computational speed and power efficiency over state-of-the-art electronics for conventional inference tasks. We experimentally demonstrate the essential part of the concept using a programmable nanophotonic processor featuring a cascaded array of 56 programmable Mach–Zehnder interferometers in a silicon photonic integrated circuit and show its utility for vowel recognition.
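The abstract's building block, a programmable Mach–Zehnder interferometer, is often modeled in textbooks as a 2×2 matrix acting on the light amplitudes in two waveguides. The sketch below uses one common convention (two 50:50 beamsplitters with a phase shifter before each); the paper's exact convention may differ, so the ordering and the parameter names `theta` and `phi` are assumptions.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def phase_shifter(angle):
    """Phase delay applied to the first waveguide only."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]


# Ideal lossless 50:50 beamsplitter (assumed symmetric convention).
BEAMSPLITTER = [[1 / math.sqrt(2), 1j / math.sqrt(2)],
                [1j / math.sqrt(2), 1 / math.sqrt(2)]]


def mzi(theta, phi):
    """Transfer matrix: beamsplitter * shift(theta) * beamsplitter * shift(phi)."""
    m = matmul2(BEAMSPLITTER, phase_shifter(phi))
    m = matmul2(phase_shifter(theta), m)
    return matmul2(BEAMSPLITTER, m)


def output_powers(theta, phi, amplitudes):
    """Optical power in each output waveguide for the given input amplitudes."""
    u = mzi(theta, phi)
    out = [u[i][0] * amplitudes[0] + u[i][1] * amplitudes[1] for i in range(2)]
    return [abs(value) ** 2 for value in out]


print(output_powers(0.0, 0.0, [1, 0]))
print(output_powers(math.pi, 0.0, [1, 0]))
```

Under this convention, `theta = 0` sends light entering the first waveguide entirely to the second output, and `theta = pi` keeps it in the first, while total power is conserved for any setting. Tuning the phases therefore sets the matrix entries, and a cascade of such units can compose larger matrices.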

Princeton/Adobe technology will let you edit voices like text

Technology developed by Princeton University computer scientists may do for audio recordings of the human voice what word processing software did for the written word and Adobe Photoshop did for images.

“VoCo” software, still in the research stage, makes it easy to add or replace a word in an audio recording of a human voice by simply editing a text transcript of the recording. New words are automatically synthesized in the speaker’s voice — even if they don’t appear anywhere else in the recording.

The system uses a sophisticated algorithm to learn and recreate the sound of a particular voice. It could one day make editing podcasts and narration in videos much easier, or in the future, create personalized robotic voices that sound natural, according to co-developer Adam Finkelstein, a professor of computer science at Princeton. Or people who have lost their voices due to injury or disease might be able to recreate their voices through a robotic system, but one that sounds natural.

An earlier version of VoCo was announced in November 2016. A paper describing the current VoCo development will be published in the July issue of the journal Transactions on Graphics (an open-access preprint is available).


How it works (technical description)

VoCo allows people to edit audio recordings with the ease of changing words on a computer screen. The system inserts new words in the same voice as the rest of the recording. (credit: Professor Adam Finkelstein)

VoCo’s user interface looks similar to other audio editing software such as the podcast editing program Audacity, with a waveform of the audio track and cut, copy and paste tools for editing. But VoCo also augments the waveform with a text transcript of the track and allows the user to replace or insert new words that don’t already exist in the track by simply typing in the transcript. When the user types the new word, VoCo updates the audio track, automatically synthesizing the new word by stitching together snippets of audio from elsewhere in the narration.

VoCo is based on an optimization algorithm that searches the voice recording and chooses the best possible combinations of phonemes (partial word sounds) to build new words in the user’s voice. To do this, it needs to find the individual phonemes and sequences of them that stitch together without abrupt transitions. The new word also needs to be fitted into the existing sentence so that it blends in seamlessly. Words are pronounced with different emphasis and intonation depending on where they fall in a sentence, so context is important.
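Choosing snippets that "stitch together without abrupt transitions" is a classic shortest-path problem, and one generic way to solve it is dynamic programming over candidate snippets. The sketch below is not VoCo's algorithm, which the article does not spell out. The cost functions, the snippet representation, and the pitch values are hypothetical, and the candidate lists are assumed to be non-empty.

```python
def best_snippet_sequence(candidates, target_cost, join_cost):
    """Pick one snippet per phoneme position with the lowest total cost.

    candidates:  list (one entry per phoneme) of lists of candidate snippets.
    target_cost: function(position, snippet) -> how poorly the snippet fits that slot.
    join_cost:   function(previous, snippet) -> how abrupt the transition sounds.
    Returns (total_cost, chosen_snippets).
    """
    # best[s] holds the cheapest (cost, path) that ends with snippet s.
    best = {s: (target_cost(0, s), [s]) for s in candidates[0]}
    for position in range(1, len(candidates)):
        new_best = {}
        for s in candidates[position]:
            cost, path = min(
                ((c + join_cost(p[-1], s), p) for c, p in best.values()),
                key=lambda item: item[0],
            )
            new_best[s] = (cost + target_cost(position, s), path + [s])
        best = new_best
    return min(best.values(), key=lambda item: item[0])


# Hypothetical snippets: (phoneme, pitch in Hz). Transitions with big pitch jumps cost more.
candidates = [[("k", 100), ("k", 140)], [("ae", 135), ("ae", 180)], [("t", 130)]]
cost, path = best_snippet_sequence(
    candidates,
    target_cost=lambda position, snippet: 0,
    join_cost=lambda a, b: abs(a[1] - b[1]),
)
print(cost, path)  # 10 [('k', 140), ('ae', 135), ('t', 130)]
```

A context-aware `target_cost` is where sentence position, emphasis, and intonation would enter; it is left at zero here to keep the example small.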

Advanced VoCo editors can manually adjust pitch profile, amplitude and snippet duration. Novice users can choose from a predefined set of pitch profiles (bottom), or record their own voice as an exemplar to control pitch and timing (top). (credit: Professor Adam Finkelstein)

For clues about this context, VoCo looks to an audio track of the sentence that is automatically synthesized in an artificial voice from the text transcript, one that sounds robotic to human ears. This recording is used as a point of reference in building the new word. VoCo then selects pieces of sound from the real human voice recording to match the word in the synthesized track, a technique known as “voice conversion,” which inspired the project name, VoCo.

In case the synthesized word isn’t quite right, VoCo offers users several versions of the word to choose from. The system also provides an advanced editor to modify pitch and duration, allowing expert users to further polish the track.

To test how effective their system was at producing authentic-sounding edits, the researchers asked people to listen to a set of audio tracks, some of which had been edited with VoCo and others that were completely natural. The fully automated versions were mistaken for real recordings more than 60 percent of the time.

The Princeton researchers are currently refining the VoCo algorithm to improve the system’s ability to integrate synthesized words more smoothly into audio tracks. They are also working to expand the system’s capabilities to create longer phrases or even entire sentences synthesized from a narrator’s voice.


Fake news videos?

Disney Research’s FaceDirector allows for editing recorded facial expressions and voice into a video (credit: Disney Research)

A key use for VoCo might be in intelligent personal assistants like Apple’s Siri, Google Assistant, Amazon’s Alexa, and Microsoft’s Cortana, or for using movie actors’ voices from old films in new ones, Finkelstein suggests.

But there are obvious concerns about fraud. It might even be possible to create a convincing fake video. Video clips with different facial expressions and lip movements (using Disney Research’s FaceDirector, for example) could be edited in and matched to associated fake words and other audio (such as background noise and talking), along with green screen to create fake backgrounds.

With billions of people now getting their news online and unfiltered, augmented-reality coming, and hacking way out of control, things may get even weirder. …

Zeyu Jin, a Princeton graduate student advised by Finkelstein, will present the work at the Association for Computing Machinery SIGGRAPH conference in July. The work at Princeton was funded by the Project X Fund, which provides seed funding to engineers for pursuing speculative projects. The Princeton researchers collaborated with scientists Gautham Mysore, Stephen DiVerdi, and Jingwan Lu at Adobe Research. Adobe has not announced availability of a commercial version of VoCo, or plans to integrate VoCo into Adobe Premiere Pro (or FaceDirector).


Abstract of VoCo: Text-based Insertion and Replacement in Audio Narration

Editing audio narration using conventional software typically involves many painstaking low-level manipulations. Some state of the art systems allow the editor to work in a text transcript of the narration, and perform select, cut, copy and paste operations directly in the transcript; these operations are then automatically applied to the waveform in a straightforward manner. However, an obvious gap in the text-based interface is the ability to type new words not appearing in the transcript, for example inserting a new word for emphasis or replacing a misspoken word. While high-quality voice synthesizers exist today, the challenge is to synthesize the new word in a voice that matches the rest of the narration. This paper presents a system that can synthesize a new word or short phrase such that it blends seamlessly in the context of the existing narration. Our approach is to use a text to speech synthesizer to say the word in a generic voice, and then use voice conversion to convert it into a voice that matches the narration. Offering a range of degrees of control to the editor, our interface supports fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and even guidance by the editor’s own voice. The paper presents studies showing that the output of our method is preferred over baseline methods and often indistinguishable from the original voice.

Best of MOOGFEST 2017

The Moogfest four-day festival in Durham, North Carolina next weekend (May 18 — 21) explores the future of technology, art, and music. Here are some of the sessions that may be especially interesting to KurzweilAI readers. Full #Moogfest2017 Program Lineup.

Culture and Technology

(credit: Google)

The Magenta by Google Brain team will bring its work to life through an interactive demo plus workshops on the creation of art and music through artificial intelligence.

Magenta is a Google Brain project to ask and answer the questions, “Can we use machine learning to create compelling art and music? If so, how? If not, why not?” It is first a research project to advance the state of the art and creativity in music, video, image, and text generation; second, Magenta is building a community of artists, coders, and machine learning researchers.

The interactive demo will go through an improvisation along with the machine learning models, much like the AI Jam Session. The workshop will cover how to use the open source library to build and train models and interact with them via MIDI.

Technical reference: Magenta: Music and Art Generation with Machine Intelligence


TEDx Talks | Music and Art Generation using Machine Learning | Curtis Hawthorne | TEDxMountainViewHighSchool


Miguel Nicolelis (credit: Duke University)

Miguel A. L. Nicolelis, MD, PhD will discuss state-of-the-art research on brain-machine interfaces, which make it possible for the brains of primates to interact directly and in a bi-directional way with mechanical, computational and virtual devices. He will review a series of recent experiments using real-time computational models to investigate how ensembles of neurons encode motor information. These experiments have revealed that brain-machine interfaces can be used not only to study fundamental aspects of neural ensemble physiology, but they can also serve as an experimental paradigm aimed at testing the design of novel neuroprosthetic devices.

He will also explore research that raises the hypothesis that the properties of a robot arm, or other neurally controlled tools, can be assimilated by brain representations as if they were extensions of the subject’s own body.

Theme: Transhumanism


Dervishes at Royal Opera House with Matthew Herbert (credit: ?)

Andy Cavatorta (MIT Media Lab) will present a conversation and workshop on a range of topics including the four-century history of music and performance at the forefront of technology. Known as the inventor of Bjork’s Gravity Harp, he has collaborated on numerous projects to create instruments using new technologies that coerce expressive music out of fire, glass, gravity, tiny vortices, underwater acoustics, and more. His instruments explore technologically mediated emotion and opportunities to express the previously inexpressible.

Theme: Instrument Design


Berklee College of Music

Michael Bierylo (credit: Moogfest)

Michael Bierylo will present his Modular Synthesizer Ensemble alongside the Csound workshops from fellow Berklee Professor Richard Boulanger.

Csound is a sound and music computing system originally developed at the MIT Media Lab. It can most accurately be described as a compiler: software that takes textual instructions in the form of source code and converts them into object code, which is a stream of numbers representing audio. Although it has a strong tradition as a tool for composing electro-acoustic pieces, it is used by composers and musicians for any kind of music that can be made with the help of a computer. It has traditionally been used in a non-interactive, score-driven context, but nowadays it is mostly used in a real-time context.
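To make the idea of "text in, stream of numbers out" concrete, here is a minimal Python sketch. It is not Csound and does not use Csound syntax; the two-column score format, the function names, and the 44,100 Hz sample rate are all assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 44100  # samples per second (an assumed, CD-style rate)

def render_tone(freq_hz, duration_s, amplitude=0.5):
    """Render one sine tone as a list of floating-point audio samples."""
    n_samples = int(SAMPLE_RATE * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
        for n in range(n_samples)
    ]

def render_score(score_text):
    """Parse lines of 'frequency duration' and concatenate the rendered tones."""
    samples = []
    for line in score_text.strip().splitlines():
        freq, dur = (float(x) for x in line.split())
        samples.extend(render_tone(freq, dur))
    return samples

# Hypothetical two-note score: frequency in Hz, duration in seconds.
score = """
440 0.5
660 0.25
"""
audio = render_score(score)
```

The resulting list could be written to a sound file or sent to an audio device; a real system such as Csound adds instruments, envelopes, and real-time control on top of this basic idea.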

Michael Bierylo serves as the Chair of the Electronic Production and Design Department, which offers students the opportunity to combine performance, composition, and orchestration with computer, synthesis, and multimedia technology in order to explore the limitless possibilities of musical expression.


Berklee College of Music | Electronic Production and Design (EPD) at Berklee College of Music


Chris Ianuzzi (credit: William Murray)

Chris Ianuzzi, a synthesist of Ciani-Musica and past collaborator with pioneers such as Vangelis and Peter Baumann, will present a daytime performance and sound exploration workshops with the B11 brain interface and the NeuroSky brainwave-sensing headset.

Theme: Hacking Systems


Argus Project (credit: Moogfest)

The Argus Project from Gan Golan and Ron Morrison of NEW INC is a wearable sculpture, video installation and counter-surveillance training, which directly intersects the public debate over police accountability. According to ancient Greek myth, Argus Panoptes was a giant with 100 eyes who served as an eternal watchman, both for – and against – the gods.

By embedding an array of camera “eyes” into a full body suit of tactical armor, the Argus exo-suit creates a “force field of accountability” around the bodies of those targeted. While some see filming the police as a confrontational or subversive act, it is in fact a deeply democratic one. The act of bearing witness to the actions of the state, and showing them to the world, strengthens our society and institutions. The Argus Project is not so much about an individual hero, but the Citizen Body as a whole. A presentation about the project will take place between music acts on the Protest Stage.

Argus Exo Suit Design (credit: Argus Project)

Theme: Protest


Found Sound Nation (credit: Moogfest)

Democracy’s Exquisite Corpse from Found Sound Nation and Moogfest, an immersive installation housed within a completely customized geodesic dome, is a multi-person instrument and music-based round-table discussion. Artists, activists, innovators, festival attendees and community engage in a deeply interactive exploration of sound as a living ecosystem and primal form of communication.

Within the dome, there are 9 unique stations, each with their own distinct set of analog or digital sound-making devices. Each person’s set of devices is chained to the person sitting next to them, so that everybody’s musical actions and choices affect the person next to them, and thus affect everyone else at the table. This instrument is a unique experiment in how technology and the instinctive language of sound can play a role in the shaping of a truly collective unconscious.

Theme: Protest


(credit: Land Marking)

Land Marking, from Halsey Burgund and Joe Zibkow of MIT Open Doc Lab, is a mobile-based music/activist project that augments the physical landscape of protest events with a layer of location-based audio contributed by event participants in real-time. The project captures the audioscape and personal experiences of temporary, but extremely important, expressions of discontent and desire for change.

Land Marking will be teaming up with the Protest Stage to allow Moogfest attendees to contribute their thoughts on protests and tune into an evolving mix of commentary and field recordings from others throughout downtown Durham. Land Marking is available on select apps.

Theme: Protest


Taeyoon Choi (credit: Moogfest)

Taeyoon Choi, an artist and educator based in New York and Seoul, will lead a Sign Making Workshop as one of the Future Thought leaders on the Protest Stage. His art practice involves performance, electronics, drawings and storytelling that often leads to interventions in public spaces.

Taeyoon will also participate in the Handmade Computer workshop to build a 1 Bit Computer, which demonstrates how binary numbers and boolean logic can be configured to create more complex components. On their own these components aren’t capable of computing anything particularly useful, but a computer is said to be Turing complete if it includes all of them, at which point it has the extraordinary ability to carry out any possible computation. He has participated in numerous workshops at festivals around the world, from Korea to Scotland, but primarily at the School for Poetic Computation (SFPC), an artist-run school co-founded by Taeyoon in NYC. Taeyoon Choi’s Handmade Computer projects.
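As a rough illustration of how boolean logic can be combined into more complex components, the Python sketch below builds a one-bit full adder out of nothing but NAND gates and then chains adders to add multi-bit binary numbers. It is a generic textbook construction, not a description of the workshop's hardware; all function names are hypothetical.

```python
def nand(a, b):
    """The only primitive gate: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1

# Every other gate is wired up from NAND alone.
def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor(a, b):
    return and_(or_(a, b), nand(a, b))

def full_adder(a, b, carry_in):
    """Add three bits; return (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out

def add_bits(x_bits, y_bits):
    """Add two equal-length bit lists, least significant bit first."""
    carry = 0
    result = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    result.append(carry)  # final carry becomes the top bit
    return result

# 3 (binary 011) + 5 (binary 101), written least significant bit first.
total = add_bits([1, 1, 0], [1, 0, 1])  # [0, 0, 0, 1], i.e. binary 1000 = 8
```

Chaining one-bit adders this way is one example of a larger component emerging from simple gates; memory and control logic can be built up in a similar spirit.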

Theme: Protest


(credit: Moogfest)

irlbb, from Vivan Thi Tang, connects individuals after IRL (in real life) interactions and creates community that otherwise would have been missed. With a customized beta of the app for Moogfest 2017, irlbb presents a unique engagement opportunity.

Theme: Protest


Ryan Shaw and Michael Clamann (credit: Duke University)

Duke Professors Ryan Shaw and Michael Clamann will lead a daily science pub talk series on topics that include future medicine, humans and anatomy, and quantum physics.

Ryan is a pioneer in mobile health (the collection and dissemination of information using mobile and wireless devices for healthcare), working with faculty at Duke’s Schools of Nursing, Medicine and Engineering to integrate mobile technologies into first-generation care delivery systems. These technologies afford researchers, clinicians, and patients a rich stream of real-time information about individuals’ biophysical and behavioral health in everyday environments.

Michael Clamann is a Senior Research Scientist in the Humans and Autonomy Lab (HAL) within the Robotics Program at Duke University, an Associate Director at UNC’s Collaborative Sciences Center for Road Safety, and the Lead Editor for Robotics and Artificial Intelligence for Duke’s SciPol science policy tracking website. In his research, he works to better understand the complex interactions between robots and people and how they influence system effectiveness and safety.

Theme: Hacking Systems


Dave Smith (credit: Moogfest)

Dave Smith, the iconic instrument innovator and Grammy-winner, will lead Moogfest’s Instruments Innovators program and host a headlining conversation with a leading artist revealed in next week’s release. He will also host a masterclass.

As the original founder of Sequential Circuits in the mid-70s, Dave designed the Prophet-5, the world’s first fully programmable polyphonic synth and the first musical instrument with an embedded microprocessor. From the late 1980s through the early 2000s, he worked to develop next-level synths with the likes of the Audio Engineering Society, Yamaha, Korg, and Seer Systems (for Intel). Realizing the limitations of software, Dave returned to hardware and started Dave Smith Instruments (DSI), which released the Evolver hybrid analog/digital synthesizer in 2002. Since then the DSI product lineup has grown to include the Prophet-6, OB-6, Pro 2, Prophet 12, and Prophet ’08 synthesizers, as well as the Tempest drum machine, co-designed with friend and fellow electronic instrument designer Roger Linn.

Theme: Future Thought


Dave Rossum, Gerhard Behles, and Lars Larsen (credit: Moogfest)

E-mu Systems Founder Dave Rossum, Ableton CEO Gerhard Behles, and LZX Founder Lars Larsen will take part in conversations as part of the Instruments Innovators program.

Driven by the creative and technological vision of electronic music pioneer Dave Rossum, Rossum Electro-Music creates uniquely powerful tools for electronic music production and is the culmination of Dave’s 45 years designing industry-defining instruments and transformative technologies. Starting with his co-founding of E-mu Systems, Dave provided the technological leadership that resulted in what many consider the premier professional modular synthesizer system, the E-mu Modular System, which became an instrument of choice for numerous recording studios, educational institutions, and artists as diverse as Frank Zappa, Leon Russell, and Hans Zimmer. In the following years, he worked on developing Emulator keyboards and racks (e.g., Emulator II), Emax samplers, the legendary SP-12 and SP-1200 (sampling drum machines), the Proteus sound modules and the Morpheus Z-Plane Synthesizer.

Gerhard Behles co-founded Ableton in 1999 with Robert Henke and Bernd Roggendorf. Prior to this he had been part of electronic music act “Monolake” alongside Robert Henke, but his interest in how technology drives the way music is made diverted his energy towards developing music software. He was fascinated by how dub pioneers such as King Tubby ‘played’ the recording studio, and began to shape this concept into a music instrument that became Ableton Live.

LZX Industries was born in 2008 out of the Synth DIY scene when Lars Larsen of Denton, Texas and Ed Leckie of Sydney, Australia began collaborating on the development of a modular video synthesizer. At that time, analog video synthesizers were inaccessible to artists outside of a handful of studios and universities. It was their continuing mission to design creative video instruments that (1) stay within the financial means of the artists who wish to use them, (2) honor and preserve the legacy of 20th century toolmakers, and (3) expand the boundaries of possibility. Since 2015, LZX Industries has focused on the research and development of new instruments, user support, and community building.


Science

ATLAS detector (credit: Kaushik De, Brookhaven National Laboratory)

ATLAS @ CERN. The full ATLAS @ CERN program will be led by Duke University Professors Mark Kruse and Katherine Hayles along with ATLAS @ CERN Physicist Steven Goldfarb.

The program will include a “Virtual Visit” to the Large Hadron Collider (the world’s largest and most powerful particle accelerator) via a live video session, a half-day workshop analyzing and understanding LHC data, and a “Science Fiction versus Science Fact” live debate.

The ATLAS experiment is designed to exploit the full discovery potential and the huge range of physics opportunities that the LHC provides. Physicists test the predictions of the Standard Model, which encapsulates our current understanding of what the building blocks of matter are and how they interact, resulting in such discoveries as the Higgs boson. By pushing the frontiers of knowledge, it seeks to answer fundamental questions such as: What are the basic building blocks of matter? What are the fundamental forces of nature? Could there be a greater underlying symmetry to our universe?

“Atlas Boogie” (referencing Higgs Boson):

ATLAS Experiment | The ATLAS Boogie

(credit: Kate Shaw)

Kate Shaw (ATLAS @ CERN), PhD, in her keynote, titled “Exploring the Universe and Impacting Society Worldwide with the Large Hadron Collider (LHC) at CERN,” will dive into the present-day and future impacts of the LHC on society. She will also share findings from the work she has done promoting particle physics in developing countries through her Physics without Frontiers program.

Theme: Future Thought


Arecibo (credit: Joe Davis/MIT)

In his keynote, Joe Davis (MIT) will trace the history of several projects centered on ideas about extraterrestrial communications that have given rise to new scientific techniques and inspired new forms of artistic practice. He will present his “swansong” — an interstellar message that is intended explicitly for human beings rather than for aliens.

Theme: Future Thought


Immortality bus (credit: Zoltan Istvan)

Zoltan Istvan (Immortality Bus), the former U.S. Presidential candidate for the Transhumanist party and leader of the Transhumanist movement, will explore the path to immortality through science with the purpose of using science and technology to radically enhance the human being and human experience. His futurist work has reached over 100 million people, some of it due to the Immortality Bus, which he recently drove across America with embedded journalists aboard. The bus is shaped like a giant coffin to raise life extension awareness.


Zoltan Istvan | 1-min Hightlight Video for Zoltan Istvan Transhumanism Documentary IMMORTALITY OR BUST

Theme: Transhumanism/Biotechnology


(credit: Moogfest)

Marc Fleury and members of the Church of Space (Park Krausen, Ingmar Koch, and Christof Veillon) return to Moogfest for a second year to present an expanded and varied program with daily explorations in modern physics with music and the occult, Illuminati performances, theatrical rituals to ERIS, and a Sunday Mass in their own dedicated “Church” venue.

Theme: Techno-Shamanism

#Moogfest2017

A deep-learning tool that lets you clone an artistic style onto a photo

The Deep Photo Style Transfer tool lets you add artistic style and other elements from a reference photo onto your photo. (credit: Cornell University)

“Deep Photo Style Transfer” is a cool new artificial-intelligence image-editing software tool that lets you transfer a style from another (“reference”) photo onto your own photo, as shown in the above examples.

An open-access arXiv paper by Cornell University computer scientists and Adobe collaborators explains that the tool can transpose the look of one photo (such as the time of day, weather, season, and artistic effects) onto your photo while keeping the result photorealistic, rather than making it look like a painting.

The algorithm also handles extreme mismatch of forms, such as transferring a fireball to a perfume bottle. (credit: Fujun Luan et al.)

“What motivated us is the idea that style could be imprinted on a photograph, but it is still intrinsically the same photo,” said Cornell computer science professor Kavita Bala. “This turned out to be incredibly hard. The key insight finally was about preserving boundaries and edges while still transferring the style.”

To do that, the researchers created deep-learning software that can add a neural network layer that pays close attention to edges within the image, like the border between a tree and a lake.

The software is still in the research stage.

Bala, Cornell doctoral student Fujun Luan, and Adobe collaborators Sylvain Paris and Eli Shechtman will present their paper at the Conference on Computer Vision and Pattern Recognition on July 21–26 in Honolulu.

This research is supported by a Google Faculty Research Award and NSF awards.


Abstract of Deep Photo Style Transfer

This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style. Our approach builds upon the recent work on painterly transfer that separates style from the content of an image by considering different layers of a neural network. However, as is, this approach is not suitable for photorealistic style transfer. Even when both the input and reference images are photographs, the output still exhibits distortions reminiscent of a painting. Our contribution is to constrain the transformation from the input to the output to be locally affine in colorspace, and to express this constraint as a custom fully differentiable energy term. We show that this approach successfully suppresses distortion and yields satisfying photorealistic style transfers in a broad variety of scenarios, including transfer of the time of day, weather, season, and artistic edits.
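To give a feel for what "affine in colorspace" means, the following Python sketch applies a single affine map (a 3×3 matrix plus an offset) to every RGB pixel in a small patch. It only illustrates the kind of transformation the constraint permits; it does not reproduce the paper's energy term or optimization, and the matrix, offset, and pixel values are invented for the example.

```python
def apply_affine(pixel, A, b):
    """Map one RGB pixel through output = A * pixel + b."""
    return tuple(
        sum(A[i][j] * pixel[j] for j in range(3)) + b[i]
        for i in range(3)
    )

# Hypothetical transform for one local patch: boost red, halve blue.
A = [[1.25, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.5]]
b = [0.125, 0.0, 0.0]

# Two identical pixels and one different pixel (values in the range 0..1).
patch = [(0.25, 0.5, 0.5), (0.25, 0.5, 0.5), (0.5, 0.25, 1.0)]
styled = [apply_affine(p, A, b) for p in patch]

# An affine map sends the midpoint of two colors to the midpoint of their outputs.
midpoint_in = tuple((u + v) / 2 for u, v in zip(patch[0], patch[2]))
midpoint_out = tuple((u + v) / 2 for u, v in zip(styled[0], styled[2]))
```

Because identical input colors stay identical and color gradients stay straight lines within the patch, a map like this can change the overall look without inventing new edges or textures, which is the intuition behind suppressing painting-like distortions.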


Future ‘lightwave’ computers could run 100,000 times faster

Terahertz pulses in semiconductor crystal (credit: Fabian Langer, Regensburg University)

Using extremely short pulses of terahertz (THz) radiation instead of electrical currents could lead to future computers that run ten to 100,000 times faster than today’s state-of-the-art electronics, according to an international team of researchers, writing in the journal Nature Photonics.

In a conventional computer, electrons moving through a semiconductor occasionally run into other electrons, releasing energy in the form of heat and slowing them down. With the proposed “lightwave electronics” approach, electrons could be guided by ultrafast THz pulses (the part of the electromagnetic spectrum between microwaves and infrared light). That means the travel time can be so short that the electrons would be statistically unlikely to hit anything, according to senior author Rupert Huber, a professor of physics at the University of Regensburg who led the experiment.

In the experiment, the researchers shined THz pulses into a crystal of the semiconductor gallium selenide.* These pulses were ultra-short (less than 100 femtoseconds, or 100 quadrillionths of a second). Each pulse popped electrons in the semiconductor into a higher energy level — which meant that they were free to move around.

When the electrons emitted light as they came down from the higher energy level, they emitted much shorter pulses than the electromagnetic radiation going in — just a few femtoseconds long — quick enough to read and write information to electrons at ultra-high speed.

But first, researchers need to be able to control electrons in a semiconductor. This work takes a step toward this by mobilizing groups of electrons inside a semiconductor crystal.

Quantum computation

Because femtosecond pulses are fast enough to trap an electron between being put into an excited state and coming down from that state, they can potentially also be used for quantum computations, using electrons in excited states as qubits. The researchers managed to launch one electron simultaneously via two excitation pathways, which is not classically possible.

An electron is small enough that it behaves like a wave as well as a particle, and when it is in an excited state, its wavelength changes. Because the electron was in two excited states at once, those two waves interfered with one another and left a fingerprint in the femtosecond pulse that the electron emitted.

The research is funded by the European Research Council and the German Research Foundation.

* “We generated high harmonics by irradiating a 40-μm-thick crystal of gallium selenide with intense, multi-THz pulses. These pulses were obtained by difference frequency mixing of two phase-correlated near-infrared pulse trains from a dual optical parametric amplifier pumped by a titanium sapphire amplifier. … The centre frequency was tunable and set to 33 THz in the experiments.” — F. Langer et al./Nature Photonics
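To put these time scales side by side, the short Python sketch below converts the 33 THz centre frequency quoted above into an oscillation period in femtoseconds and compares it with the 100-femtosecond upper bound on pulse length mentioned in the article. The function and constant names are illustrative.

```python
FEMTOSECOND = 1e-15  # seconds
TERAHERTZ = 1e12     # cycles per second

def period_fs(frequency_thz):
    """Return the duration of one oscillation cycle in femtoseconds."""
    period_seconds = 1.0 / (frequency_thz * TERAHERTZ)
    return period_seconds / FEMTOSECOND

cycle = period_fs(33)          # about 30.3 fs per cycle
cycles_in_pulse = 100 / cycle  # about 3.3 cycles fit in 100 fs
```

In other words, a pulse shorter than 100 femtoseconds at this frequency contains only about three oscillations of the field, which gives a sense of how brief the driving pulses are.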

Abstract of Symmetry-controlled temporal structure of high-harmonic carrier fields from a bulk crystal

High-harmonic (HH) generation in crystalline solids marks an exciting development, with potential applications in high-efficiency attosecond sources, all-optical bandstructure reconstruction and quasiparticle collisions. Although the spectral and temporal shape of the HH intensity has been described microscopically, the properties of the underlying HH carrier wave have remained elusive. Here, we analyse the train of HH waveforms generated in a crystalline solid by consecutive half cycles of the same driving pulse. Extending the concept of frequency combs to optical clock rates, we show how the polarization and carrier-envelope phase (CEP) of HH pulses can be controlled by the crystal symmetry. For certain crystal directions, we can separate two orthogonally polarized HH combs mutually offset by the driving frequency to form a comb of even and odd harmonic orders. The corresponding CEP of successive pulses is constant or offset by π, depending on the polarization. In the context of a quantum description of solids, we identify novel capabilities for polarization- and phase-shaping of HH waveforms that cannot be accessed with gaseous sources.

Brain has more than 100 times higher computational capacity than previously thought, say UCLA scientists

Neuron (blue) with dendrites (credit: Shelley Halpain/UC San Diego)

The brain has more than 100 times higher computational capacity than was previously thought, a UCLA team has discovered.

Obsoleting neuroscience textbooks, this finding suggests that our brains are both analog and digital computers and could lead to new approaches for treating neurological disorders and developing brain-like computers, according to the researchers.

Illustration of neuron and dendrites. Dendrites receive electrochemical stimulation (via synapses, not shown here) from neurons (not shown here), and propagate that stimulation to the neuron cell body (soma). A neuron sends electrochemical stimulation via an axon to communicate with other neurons via telodendria (purple, right) at the end of the axon and synapses (not shown here). (credit: Quasar/CC).

Dendrites have been considered simple passive conduits of signals. But by working with animals that were moving around freely, the UCLA team showed that dendrites are in fact electrically active — generating nearly 10 times more spikes than the soma (neuron cell body).

Fundamentally changes our understanding of brain computation

The finding, reported in the March 9 issue of the journal Science, challenges the long-held belief that spikes in the soma are the primary way in which perception, learning and memory formation occur.

“Dendrites make up more than 90 percent of neural tissue,” said UCLA neurophysicist Mayank Mehta, the study’s senior author. “Knowing they are much more active than the soma fundamentally changes the nature of our understanding of how the brain computes information.”

“This is a major departure from what neuroscientists have believed for about 60 years,” said Mehta, a UCLA professor of physics and astronomy, of neurology and of neurobiology.

Because the dendrites are nearly 100 times larger in volume than the neuronal centers, Mehta said, the large number of dendritic spikes taking place could mean that the brain has more than 100 times the computational capacity previously thought.

Study with moving rats made discovery possible

Previous studies have been limited to stationary rats, because scientists have found that placing electrodes in the dendrites themselves while the animals were moving actually killed those cells. But the UCLA team developed a new technique that involves placing the electrodes near, rather than in, the dendrites.

Using that approach, the scientists measured dendrites’ activity for up to four days in rats that were allowed to move freely within a large maze. Taking measurements from the posterior parietal cortex, the part of the brain that plays a key role in movement planning, the researchers found far more activity in the dendrites than in the somas — approximately five times as many spikes while the rats were sleeping, and up to 10 times as many when they were exploring.

Looking at the soma to understand how the brain works has provided a framework for numerous medical and scientific questions — from diagnosing and treating diseases to how to build computers. But, Mehta said, that framework was based on the understanding that the cell body makes the decisions, and that the process is digital.

“What we found indicates that such decisions are made in the dendrites far more often than in the cell body, and that such computations are not just digital, but also analog,” Mehta said. “Due to technological difficulties, research in brain function has largely focused on the cell body. But we have discovered the secret lives of neurons, especially in the extensive neuronal branches. Our results substantially change our understanding of how neurons compute.”

Funding was provided by the University of California.

Complete neuron cell diagram (credit: LadyofHats/CC)


Abstract of Dynamics of cortical dendritic membrane potential and spikes in freely behaving rats

Neural activity in vivo is primarily measured using extracellular somatic spikes, which provide limited information about neural computation. Hence, it is necessary to record from neuronal dendrites, which generate dendritic action potentials (DAP) and profoundly influence neural computation and plasticity. We measured neocortical sub- and suprathreshold dendritic membrane potential (DMP) from putative distal-most dendrites using tetrodes in freely behaving rats over multiple days with a high degree of stability and sub-millisecond temporal resolution. DAP firing rates were several fold larger than somatic rates. DAP rates were modulated by subthreshold DMP fluctuations which were far larger than DAP amplitude, indicating hybrid, analog-digital coding in the dendrites. Parietal DAP and DMP exhibited egocentric spatial maps comparable to pyramidal neurons. These results have important implications for neural coding and plasticity.