We all interact with deepfakes in our daily lives, says Laura Lopez Fuentes, a Spanish research associate at the University of Luxembourg’s Computer Vision, Imaging and Machine Intelligence (CVI2) Research Group.
Can you explain what deepfakes are, and how common they are?
When people hear about deepfakes they think of artificially generated or modified faces, but the term is much broader than that. A deepfake is any digital content that has been modified or fully generated with deep learning, a very popular artificial intelligence technique. It is true that the most well-known deepfake algorithms are the ones that specialise in faces, but there is also research on generating fake cat and dog pictures, generating fake sounds or voices, or using real voices to generate fake content. All of these are deepfakes too.
Although some people use them for malicious purposes, deepfakes also have many non-malicious applications. For example, this technology is used to automatically dub videos into different languages, which saves a lot of money and resources. Deepfakes are also used in research to artificially generate new images when there is a shortage of training data for other artificial intelligence algorithms.
Both deepfakes and, to a certain extent, fake news can be detected by certain types of software – can you tell us a bit more about this technology?
In research we talk about two branches: one that tries to improve the generation of deepfakes, and another that tries to determine whether digital content has been manipulated or generated with deep learning. At the CVI2 research group of SnT, we are working on the branch that tries to detect deepfake data. Deepfakes are generated with deep learning, and most deepfake detection algorithms are also based on deep learning. These algorithms try to exploit the subtle defects that the generation algorithms produce, or to find the traces that these algorithms leave in the data. In most cases, the detection algorithms are trained by exposing them to both deepfakes and real data, so that they learn which characteristics differentiate the two.
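The training idea described above can be illustrated with a deliberately simplified sketch. This is not the group’s actual method: real detectors are deep neural networks trained on images, whereas here each “sample” is just a list of numbers and the tell-tale trace is a single hand-picked statistic. The point is only the workflow: expose a detector to both classes, learn what separates them, then classify unseen data.

```python
import random
import statistics

random.seed(0)

# Toy assumption: "real" samples are noisy, "generated" samples are
# unnaturally smooth, so their spread (standard deviation) differs.
def make_real():
    return [random.gauss(0.5, 0.2) for _ in range(64)]

def make_fake():
    return [random.gauss(0.5, 0.05) for _ in range(64)]

def feature(sample):
    # The single characteristic our toy detector looks at.
    return statistics.pstdev(sample)

# "Training": expose the detector to both real and fake data, then put
# the decision threshold halfway between the two class averages.
real_train = [feature(make_real()) for _ in range(200)]
fake_train = [feature(make_fake()) for _ in range(200)]
threshold = (statistics.mean(real_train) + statistics.mean(fake_train)) / 2

def is_fake(sample):
    return feature(sample) < threshold

# Evaluate on fresh, unseen samples.
tests = [(make_real(), False) for _ in range(100)] + \
        [(make_fake(), True) for _ in range(100)]
accuracy = sum(is_fake(s) == label for s, label in tests) / len(tests)
print(f"toy detector accuracy: {accuracy:.2f}")
```

In a real system the hand-picked variance feature would be replaced by features learned end-to-end by a deep network, but the train-on-both-classes structure is the same.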
“If we are able to design a detector that responds well in all cases it would be much easier to use and be extended to the general public.” Laura Lopez Fuentes
What other tools exist to tackle the threat of deepfakes?
Fake news is not something new; we have been surrounded by it for years. We all have a WhatsApp group where we receive surprising or hard-to-believe news shared by people we know, who pass it on because they believed it.
The first thing is to use critical thinking: if it sounds too good or too bad to be true, it may well not be true. The second thing is to do some research. For example, even if it is a screenshot of a popular and trustworthy newspaper, we can visit the newspaper’s official website to check whether that piece of news appears there. Also, since fake news spreads very fast, there is generally some information on the internet about its veracity.
Trust in journalism is in a state of crisis because of fake news. How do you think these tools might help media companies as well as consumers to separate truth from lies – to be able to trust our senses?
As far as I know, at the moment most fake news content is not generated with artificial intelligence, and these types of fake news have been around for a long time. However, as deepfake technology gets better and more widely available, it is increasingly being used to make fake news more credible by supporting the information with images or videos. In many cases people do not question a piece of information if it comes with some visual support. Also, the fact that fake news tends to spread very fast makes it possible to receive the same piece of information from several sources, which makes it even more credible. Thankfully, people are becoming more aware of the fact that this type of information can be manipulated. I believe it is the perfect time for fake news detection applications to emerge, since a lot of people would use them, especially journalists who are interested in knowing whether information is true or not. Some of these tools are already available, although most of the general public does not know them yet.
Describe to us the process of developing these tools, and what it is that you and your team do. What is the goal of your work?
When developing these tools, the first step is always to research what has been done on deepfake generation and detection, what the current limitations of deepfake detection algorithms are, and what our industrial partners need.
In our case, we are working directly with POST Luxembourg, InTECH and CoDare in a project funded by ESA with the goal of restoring trust in digital assets by certifying their proof of existence and integrity, to provide a label of trust for any digital content. Due to the nature of the project, we are specifically interested in developing an algorithm that is able to detect deepfakes regardless of the algorithm that generated them and the type of digital content we are working with (images, videos, text or sound).
Along similar lines, we are working on another project, funded by the FNR and in collaboration with POST, in which, among other things, we combine cross-modal information (sound and video) to detect deepfake videos. We believe that exploring this direction is very relevant, since most deepfake detection algorithms at the moment are very specific to one type of deepfake generation algorithm and fail when exposed to other types of deepfake.
So, if we are able to design a detector that responds well in all cases it would be much easier to use and be extended to the general public. Moreover, this algorithm would also be more robust to future deepfake generation algorithms, so it would not become obsolete so quickly.
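One simple way to picture combining cross-modal information is late fusion: run one detector on the audio track and another on the video frames, then merge their scores. The sketch below is a hypothetical illustration of that idea only; the function names, the weighted average, and the 0.5 threshold are all assumptions for the example, not the group’s actual architecture (which may fuse modalities much earlier, inside the network).

```python
# Hypothetical late-fusion sketch. We assume two pre-existing detectors,
# one for audio and one for video, each returning a "fakeness"
# probability in [0, 1]; here we only model how their outputs combine.
def fuse_scores(audio_score: float, video_score: float,
                audio_weight: float = 0.5) -> float:
    """Weighted average of the per-modality fakeness scores."""
    return audio_weight * audio_score + (1 - audio_weight) * video_score

def classify(audio_score: float, video_score: float,
             threshold: float = 0.5) -> bool:
    """True means the video is flagged as a deepfake."""
    return fuse_scores(audio_score, video_score) >= threshold

# A clip whose audio looks clean (0.2) but whose frames look heavily
# manipulated (0.9) is still flagged, because the fused score is 0.55.
print(classify(audio_score=0.2, video_score=0.9))
```

The benefit of using both modalities is that a forgery only has to slip past one detector to fool a single-modality system, whereas here a strong signal in either channel can still trigger a detection.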
“[…] in the future I think that a lot of social media platforms will have this technology integrated and posts will automatically be labelled as containing deepfake data.” Laura Lopez Fuentes
How much do AI and machine learning have to do with this – how would you break down the interconnectedness?
There is a clear connection between artificial intelligence and deepfakes, since deepfakes by definition are generated with deep learning, a very popular artificial intelligence technique. However, the relationship between fake news and artificial intelligence is not that clear, since most fake news is not generated using deepfakes.
How widespread are fake news and deepfakes?
I think everyone interacts with deepfakes in their daily lives, even if they do not know it. When we take pictures with our smartphones, the effects that we can apply to the images or videos fall under the definition of deepfakes because they are images that have been modified through deep learning. The same happens with a lot of smartphone applications like TikTok or Snapchat, where we can add sophisticated filters to our faces or even swap our face with someone else’s.
When it comes to fake news, I think that anyone who has social media or instant messaging applications has been exposed to fake news, even if it was not necessarily generated with deep learning.
Do you see your/this type of work helping fields such as journalism or other sciences in the future?
Yes, I think that this type of work will help journalists as well as the general public to determine whether a certain piece of digital content is truthful. Most deepfake algorithms are open source, so anyone can generate a deepfake; but similarly, most deepfake detection algorithms are also open source, so in theory they could be used by journalists to improve their due diligence. However, in practice it is not that easy, because these open-source algorithms do not have a user-friendly interface, and there are many different algorithms, so some prior knowledge of which algorithm to use is needed. I think that easier-to-use platforms will soon be developed, either by companies or the scientific community, such as the one we are currently developing in our group in our project FakeDeTer, in collaboration with POST and with funding from the FNR.
Further in the future, I think that a lot of social media platforms will have this technology integrated, and posts will automatically be labelled as containing deepfake data. We can already see examples of how we are moving in that direction on some social media platforms. For example, during the pandemic the amount of fake news, rumours and conspiracy theories increased dramatically, and Twitter created a “COVID-19 misleading information policy” under which tweets containing false information about COVID-19 could be removed or labelled with a warning message. Even Donald Trump had a tweet labelled as containing incorrect information about COVID-19.
Interview courtesy of our content partner Silicon Luxembourg