Technology and innovation: artificial intelligence and the creative process

The creative process and human history are closely connected. Humans express their creativity by bringing to life what can be imagined. It would nonetheless seem that machines are capable of a certain form of creativity as well.

The question of machine creativity is nothing new. During the last 50 years, all manner of experimentation has taken place. However, 2018 will be remembered as a year when this question was at the forefront, as important developments in the field were achieved. How should the audiovisual industry respond to these developments in the medium to long term?

Whether related to still images, moving images, or sound, current experimentation is triggering a redefinition of the boundaries of creativity, both human and machine, and prompting us to work collaboratively. Consumers can already take part in the momentum building around explorations in co-creation through interaction with smart speakers. Could these be the first steps toward a democratization of these processes?

The boundaries of creativity are becoming blurred

The first auction of a painting produced using artificial intelligence (AI) rekindled the debate surrounding the role of technology in creation.

The Portrait of Edmond de Belamy was created by French art collective Obvious using a GAN (generative adversarial network) algorithm trained on a database of 15,000 canvases painted between the 14th and 20th centuries.

Like all others created using GAN algorithms, this work is part of a new approach that applies artificial intelligence to art. In the past, an artist would program a computer based on a desired result. Today, thanks to machine learning, we can reference thousands of examples of a certain kind of work to train an algorithm to generate new works based on the same aesthetic codes.

Although we may be tempted to believe that a computer can create on its own, it actually still requires human intervention, as Ahmed Elgammal, director of Rutgers University’s Art and Artificial Intelligence Laboratory, points out. Humans choose the examples of artworks that allow the algorithms to ‘learn.’ In the case of the Portrait of Edmond de Belamy, the result is a portrait inspired by the classical forms used as examples. An additional human intervention is needed to categorize the results and establish which are the most appropriate.

So, is it really art? For Elgammal, there is no doubt that the artistic value lies more in the process than in the final result. All choices, human and non-human, lead to a result that is considered art. And who is the artist: the human who programs the computer or the machine that generates a new image based on past examples? Incidentally, in the case of the Portrait of Edmond de Belamy, the artist Robbie Barrat maintains that he developed the algorithm and the examples used to generate the work.

Machines refining the art of storytelling

A team of researchers at the University of California, Santa Barbara recently developed a neural network capable of generating abstract stories from photo streams. The artificial intelligence they developed goes beyond simple description: it makes inferences from a given image and can imagine a story that is not immediately evident in it. According to the researchers, it passes the Turing test three times out of five, meaning that in those cases it is impossible to tell whether the story was written by a human or a machine.

It bears repeating that this technology is in its infancy: it still cannot equal the human imagination in the telling of complex stories. However, future developments, such as a better understanding of human social structures, could change things.

Machine-made images: from still to moving pictures

We’ve seen what machines can do with still images, but what about moving images?

When it comes to AI, the trials thus far have not been very conclusive. For example, in 2016, Oscar Sharp and Ross Goodwin’s Benjamin algorithm failed to impress with its first work, Sunspring. The following year, It’s No Game did not fare much better.

In contrast, the pair’s 2018 work, Zone Out, garnered considerably more attention while generating a certain amount of concern. For this film, made using the same type of neural network as the Portrait of Edmond de Belamy, Benjamin wrote the script, selected the scenes from thousands of films and gaming sequences, and placed actors’ faces on the appropriate characters (using the same face-swapping technology that deepfakes used to produce fake pornographic videos with the faces of celebrities).

Benjamin’s creators are looking to artificial intelligence to increase human capacity, as opposed to trying to replace it. Their film represents one more step toward the automation of video production. Is a work created entirely by artificial intelligence legitimate? Could it compete against human-made works? Will we be able to tell the difference between a film made by a machine and one made by a human? Is this even an important question? Experimentation in creative AI raises a multitude of questions.

WHAT IS DEEPFAKE?

The term deepfake is used to describe videos created using GAN algorithms. The principle involves generating new data from pre-existing sources. For example, by using thousands of videos featuring the same individual, the computer generates something similar, without it being a copy of any of the pre-existing videos. Until the end of 2017, the technique was mainly used in research settings. Then a Reddit user produced a fake pornographic video in which porn stars’ faces were replaced with those of celebrities. Now that the general public has access to this constantly improving technology, some people are concerned that it will become more and more difficult to differentiate real videos from forgeries.

Artificial intelligence in audio production

Similar experiments in AI-generated storytelling are being carried out in podcasts, which contain no visual references. Sheldon County is a podcast produced by PhD student James Ryan from the University of California, Santa Cruz. Through the use of artificial intelligence, he has been able to generate a practically unlimited number of procedural narratives. The process itself is not new, having already been used in the video game industry for games like No Man’s Sky to automatically generate new environments to explore.

The principle behind Sheldon County is the same but used in an audio-only application. Set in a fictitious American county over a period of 150 years, the podcast tells the stories of the county’s residents. Each fabricated county is populated by its own AI-generated characters, each with their own stories and motivations. From the outset, each listener is assigned a county, and the plot of the podcast series is built around that county’s characters. In theory, every listener has access to a different story.
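To make the idea of procedural generation more concrete, here is a minimal sketch in Python. It is not James Ryan's system: the word lists, the function names (generate_county, describe) and the idea of tying a county to a numeric seed are assumptions made purely for illustration. It only shows how a single number can deterministically produce a cast of characters, so that two listeners with different numbers can receive different stories while the same number always rebuilds the same county.

```python
import random

# Hypothetical vocabularies, invented for this illustration only.
NAMES = ["Ada", "Silas", "Mabel", "Otis", "Clara", "Enoch"]
OCCUPATIONS = ["farmer", "teacher", "blacksmith", "doctor", "shopkeeper"]
MOTIVATIONS = ["wealth", "revenge", "love", "respect", "escape"]

TIMELINE_YEARS = 150  # the span of years covered by the stories


def generate_county(seed, n_characters=3):
    """Return a reproducible list of characters for the county tied to `seed`."""
    rng = random.Random(seed)  # private generator: same seed, same county
    characters = []
    for _ in range(n_characters):
        characters.append({
            "name": rng.choice(NAMES),
            "occupation": rng.choice(OCCUPATIONS),
            "motivation": rng.choice(MOTIVATIONS),
            # Offset (0 to 149) within the timeline at which the character appears.
            "first_year": rng.randrange(TIMELINE_YEARS),
        })
    return characters


def describe(character):
    """Turn one character record into a one-line story hook."""
    return (f"{character['name']}, a {character['occupation']} driven by "
            f"{character['motivation']}, enters the story in year "
            f"{character['first_year'] + 1}.")


if __name__ == "__main__":
    # The seed value is arbitrary; any integer yields its own county.
    for person in generate_county(seed=1861):
        print(describe(person))
```

Calling generate_county twice with the same seed returns the same characters, which is what would allow a listener to come back to "their" county. A real system would of course model far richer relationships and events than these three attributes.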

Ryan is more interested in the public’s participation than in the underlying technology, since without an audience, the stories do not exist. On October 31, 2017, MIT released Shelley, an AI horror story generator which uses the same kind of human-machine collaboration. Starting from a “random seed”, an arbitrary number used to start each adventure, Shelley initiated stories that were then elaborated using ideas collected on Twitter.

In both examples, human contribution is essential to machine creativity, whether at the concept stage of a project, in setting creative parameters and selecting pre-existing works, or in judging the value of the result. What if this technology could be used to help more of us create, based on our individual imaginations? This vision of increased collaboration with computers as part of the creative process strikes us as a trend that is essentially focused on machines enhancing human capacity. Content producers can push the boundaries of human creativity through technology while focusing less on perceiving machines as a threat. Although this type of collaboration is still at the experimental stage, creators will need to adapt to it as it becomes more and more prevalent.

Interactive audio content: a renaissance in group listening

Back to the future of sound

Audio content continues to carve out its place in the habits of Canadian consumers. In last year’s trends report, we noted the increasing popularity of podcasts in the United States and Canada.

On this side of the border, 61% of adults are familiar with the term “podcast”, compared to a slightly higher 64% in the United States (Edison Research). In Canada, it’s no great surprise that 18–34-year-olds show the most enthusiasm for this type of audio content: 41% of this age group listen to a podcast every month, while the Canadian average stands at 28%.

However, a comparison between the United States and Canada shows that we do not consume podcasts in the same way.

Many Canadians still listen to podcasts on their computers (40%), although the most common method is a mobile device (57%). Americans lean more heavily toward mobile devices (68%), while nearly 30% of fans use their computers. Our differences can also be seen in the locations we choose for listening. More Canadians (61%) than Americans (49%) prefer to do their listening at home, where new devices have made remarkable inroads during the last year. Indeed, among American consumers who frequently listen to podcasts, 24% regularly listen to this type of programming using a smart speaker. It is important to point out that, unlike Canadians, they have had access to this technology since 2014.

Smart speakers and creative content

Canadians have embraced smart speakers.

Smart speakers have been available in Canada since June 2017, and 8% of the national population already say they own one (Edison Research). Moreover, half of smart speaker owners keep the device in their living room (MTM), a placement that may encourage uses beyond the purely practical (such as asking for a weather forecast, posing a general knowledge question, or listening to the news), including collective listening and interactive creative content.

The use of smart speakers is reminiscent of the beginnings of radio, whose history is closely tied to collective listening. Whether in a public setting or a more intimate family gathering, radio broadcasting allowed multi-generational groups to gather around a single technology. Through smart speakers, audio seems to have maintained its power to unite.

Our 2018 report described several different interactive storytelling initiatives developed for Amazon’s Alexa. The industry has continued to experiment in this area by launching interactive productions related to series and films. Netflix promoted its Lost in Space series via an interactive audio adventure designed for Google Home.

Interactive audio stories targeting the kids’ market

The range of interactive audio content available for children has exploded over the last year.

Amazon has even launched a children’s version of its Echo smart speaker with integrated parental controls. BBC Kids, LEGO and Nickelodeon are among the companies that have produced content for this type of device.

In the last year, Universal and Earplay also launched Jurassic World Revealed for Alexa. Earplay, in coproduction with American public radio station WBUR, released You and the Beanstalk, a tale for 6–12-year-olds.

In Canada, Groupe Média TFO launched its Boukili Audio application, an interactive game to help 4–8-year-olds learn French. Additionally, Toronto’s Storyflow launched a platform and hub for interactive voice entertainment and started developing choose-your-own-adventure style interactive stories for children. Targeted at families, the stories let kids interact with characters and choose how the action unfolds.

Carefully controlled content

Of course, the use of voice-activated assistants in general and smart speakers in particular only requires us to use our voices.

In contrast with many other technologies that demand minimum levels of competency in reading and writing or a certain fluency with screens and keyboards, one needs only to be able to speak to use a smart speaker. Consequently, smart speakers are accessible to a very wide audience, which includes children.

However, we still do not know what effects this type of content will have on young audiences. Sara DeWitt, vice president of digital at PBS Kids, wonders if children truly understand to whom or to what they are speaking. It is therefore crucial that care be taken in the creation of interactive audio stories targeting children and youth.

Beyond these important considerations, the increasing popularity of these devices translates into numerous opportunities for experimentation in content creation.