Andre Torsvik, Vizrt VP of product marketing, analyses possible uses for AI in TV production


Let’s start with a confession: Not a single word in this article was written by an AI. Not a sentence, syllable or comma was put on the page by a cold, sensible, algorithm-based, logical but oh-so-mystical brain in the clouds. Perhaps unfortunately, they were all written by me.

What makes this interesting is this: The inflection point for AI comes when the natural question after that statement is: “But why not?” Why would you deny yourself access to something perfectly suited to the task at hand? Why would you waste time creating a structure for an argument, risk being inaccurate, risk using less than SEO-perfect language? Why, in short, would you not pick up and use the right tool for the right task?

And this is the crux of the matter: only when we move past the hype of AI, the faffing about with six-fingered hand pictures in perfect Leibovitz style, Hamlet rewritten by punk poets or Forrest Gump recast with John Travolta (OK, the latter might have mileage in a future where we get to see all the might-have-beens), will we get to the real meat of AI as a tool. Whether it is boon or bane, miracle or threat, is irrelevant – the operative word for AI is tool.

An interesting and perhaps surprising example of this is broadcasting and media technology. We are not always the most forward-leaning of industries: we’re not bad, but there’s still an awful lot of copper in the walls, and the ceiling, and the floor. Even so, the industry has for some time been deploying precisely targeted applications of AI and ML to solve specific problems.

To be clear, I am talking here about the application of Artificial Narrow Intelligence – a set of tools deployed to address a specific issue. This sidesteps the entire discussion of Artificial General Intelligence, whose arrival is predicted for anywhere between 2030 and 2300, and of what it would mean for something like media production, simply because it does not exist and we cannot know – not truly – until it does.

Applications of AI in media production have been most successful where we have been able to utilize the things machines traditionally do well: speed, accuracy (at the same time!), the ability to ignore irrelevant data and the ability to process vast amounts of data, to name a few. Let’s look at a few examples that come straight from the world of live, visual-based content production, rather than from editing or the use of AI to search databases, useful as that is.

Keying:

Keying is hard. Everyone who has ever tried to get a good key in a studio, let alone in an outside broadcast, knows this. Take the latter case: let’s say you are trying to add some virtual elements to a sports field. You must contend with changing light conditions, moving people, rain, snow, sleet, fog or even, in a now-infamous example, the bald head of a referee. For a human operator to constantly analyze and adjust the entire picture is a tall order. It also takes time, and in that time, errors happen. The quality of the production suffers, and important ad revenues may be lost.
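
To make the problem concrete, here is a minimal sketch of a fixed-threshold green-screen keyer and what a change in light does to it. Everything in it is an assumption made for illustration: the function names, the threshold values and the pixel values are invented, and it is in no way a description of how any real keyer, Vizrt's included, works.

```python
# Hypothetical fixed-threshold colour keyer, for illustration only.

def is_background(pixel, min_green=150, dominance=1.4):
    """Return True if an (r, g, b) pixel looks like green screen."""
    r, g, b = pixel
    return g >= min_green and g >= dominance * max(r, b, 1)

def key_mask(pixels, **settings):
    """Map a list of pixels to booleans (True = keyed out)."""
    return [is_background(p, **settings) for p in pixels]

def dim(pixel, percent):
    """Simulate a drop in light by scaling every channel to `percent`."""
    return tuple(c * percent // 100 for c in pixel)

# Two green-screen pixels and one skin-tone pixel; all values are invented.
frame = [(40, 200, 50), (35, 180, 45), (210, 160, 140)]
cloudy_frame = [dim(p, 60) for p in frame]

bright_mask = key_mask(frame)                          # settings tuned for sunshine
cloudy_mask = key_mask(cloudy_frame)                   # same settings, cloud rolls in
retuned_mask = key_mask(cloudy_frame, min_green=100)   # operator adjusts by hand

print(bright_mask)   # [True, True, False]
print(cloudy_mask)   # [False, False, False]
print(retuned_mask)  # [True, True, False]
```

With the original settings the key is clean in bright light and disappears entirely once the light drops, until someone re-tunes the threshold. That re-tuning, repeated across a whole picture in changing conditions, is the work a human operator struggles to keep up with.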

Vizrt has been working on AI-driven keying for years, and we have applied machine learning to teach an AI to do this accurately, quickly and repeatedly. Look at how much cleaner a key an AI keyer can get compared with a manually operated colour keyer.

AI colour keying

Tracking:

Tracking is also hard. That is why so many ways of achieving it exist, from mechanical tracking through optical to image-based, all in order to follow an object in virtual space so you can place graphics accurately and quickly, either in relation to or onto an item or person in your shot. It is one thing if you can tag people with GPS trackers or similar, quite another if you must rely on image-based work. If you then have multiple similar objects moving quickly through an image, and add in occlusions, poor image conditions and other disturbing factors, it’s hardly surprising that last year we introduced the Object Tracker. Object Tracker is an ML-based system that lets you track pretty much anything on screen and add graphics to it, provided you teach the AI to recognize it first.

Replacement:

And talking about hard – it’s one thing to key and track existing objects to highlight them, quite another to completely replace one object or surface with another in a complex visual environment. This combines the two preceding challenges to some degree, and adds in what might be termed a moral dilemma: is reality real, once you replace parts of it and the audience is none the wiser? For the past several years, this has been possible using different technologies, including Vizrt visual tech.

Field-side advertising boards have been replaceable in any weather and light conditions, in real time, using nothing more than image-based technology. This is an application of AI to a specific challenge – how to make ads relevant to viewers, even when those ads are part of the real world.

And what’s next? There are some brilliant opportunities out there to leverage AI across multiple workflows in media production, and all of them are already at some stage of development or real-world deployment.

Auto generation of 3D models and textures

The creation of virtual sets can be incredibly time-consuming. And if they are to work, I don’t think it makes sense to leave their creation entirely to Stable Diffusion. However, what if we could automate and speed up parts of the job? What if we could do prompt-based generation of a variety of 3D objects and textures, created from samples in a vast library?

This could save hours of searching, creation and heartache, and allow the artist to focus on what they should in a virtual world: the bigger picture, and how it can be used to tell a story.

OpenAI released Point-E last year, and while the results are not up to those created by a creative genius using cutting-edge tools, they are done in a fraction of the time. What if they are seen as drafts, a smorgasbord of options to choose from before you decide where to spend energy actually creating?

Accurate predictive analysis applied to sports graphics

Already tested in the field, this should be taking center stage in sports production – soon. Nielsen say 51% of fans check stats while watching live sports. What if, based on statistics, conditions on the field, weather reports and any number of other factors, AI could give us a likely estimate of a team’s next move, a player’s play or a referee’s decision? Imagine the possibilities for interactions with the audience, for sports betting, for engagement.
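
As a toy illustration of what “a likely estimate based on statistics” can mean at its very simplest, the sketch below counts what a team did in similar situations in the past and turns those counts into proportions. The match history, the situations and the function names are all invented for the example; a production system would draw on far more data and far richer models than a frequency table.

```python
from collections import Counter, defaultdict

# Invented history: (set piece, weather) -> what the team did next.
HISTORY = [
    (("corner", "rain"), "short pass"),
    (("corner", "rain"), "cross"),
    (("corner", "rain"), "cross"),
    (("corner", "rain"), "cross"),
    (("corner", "dry"), "short pass"),
    (("corner", "dry"), "cross"),
]

def count_plays(history):
    """Tally how often each play followed each situation."""
    counts = defaultdict(Counter)
    for situation, play in history:
        counts[situation][play] += 1
    return counts

def estimate_next_play(counts, situation):
    """Return the observed share of each play for a situation.

    An empty dict means the situation has never been seen, so there is
    nothing to base an estimate on.
    """
    seen = counts.get(situation)
    if not seen:
        return {}
    total = sum(seen.values())
    return {play: n / total for play, n in seen.items()}

counts = count_plays(HISTORY)
rain_corner = estimate_next_play(counts, ("corner", "rain"))
print(rain_corner)  # {'short pass': 0.25, 'cross': 0.75}
```

Even this crude version shows the shape of the idea: the more situations and conditions you record, the more specific the estimate you can put on screen.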

Automatic translation and re-dubbing of content

The building blocks for this are already available: Amazon and many others offer tools for auto-translation and text-to-speech to use in your deployments, featuring amazingly lifelike voices with tempo, timing and timbre well beyond what was possible only a few years ago. What if this were combined with natural language processing on the level of ChatGPT? Content could be repurposed for international consumption literally at the click of a button.

So, why did I not use AI to write this article? Beyond making a point, beyond the fact that maybe some of the creation is in the process, it probably would not have helped me all that much. I am referring to a relatively narrow field and postulating theoretical ideas. Adding ML to such a case would not have given me speed or accuracy.

And if it were a black box using “common sources”, I am not confident it would not commit plagiarism. I do not have readily available a tool that I fully trust, that I know does better than me and that does not currently suffer from quite severe stigma in some circles. If I did have such a tool, I would have used it. In the world of broadcast, many of these tools already exist, and I never cease to be amazed at how many people are not already using them.


Andre Torsvik is VP of product marketing at Vizrt