Artificial Intelligence and Automation

Introduction

According to researchers Andreas Kaplan and Michael Haenlein, “artificial intelligence” (AI) refers to a system’s ability to correctly interpret external data, learn from those data, and apply those learnings to achieve specific goals through flexible adaptation.

As their definition suggests, the intelligence in AI is based on some form of “learning” that usually comes from data covering events or instances that occurred in the past. AI then uses that past learning to adapt to some related phenomenon in the present, conducting itself in a way that we would consider “intelligent” had that same task been performed by a human being. Finally, AI generally continues to learn as the present becomes the past, and continues to adapt as the future becomes the present.

Artificial intelligence plays a hidden but significant role in our daily lives. When you type half a phrase into the Google search bar and Google suggests several ways you might finish it, a predictive algorithm is at play. When you look for flights online, an intelligent (well, not always) algorithm may try to guess your budget based on your browsing history and suggest a nearby hotel within that price range. When Facebook recommends tagging yourself in a photo because it recognizes your face, it’s because an intelligent algorithm has already analyzed your face many times. Moreover, we are increasingly seeing applications of AI in many different facets of journalism.

AI in Journalism

As with other segments of everyday life, artificial intelligence is no longer just some niche idea within the news industry. AI is already being used, in varying ways and to different extents, within every major stage of news production, from helping journalists find ‘the story’ to automating personalized distribution of news content.

Generating Story Ideas

Coming up with a unique and compelling story idea is one of the most creative parts of journalism. The world is filled with newsworthy events — each of which can be approached through a large array of story angles — but journalists only have the capacity to tackle a relatively small number of them. Moreover, audiences have limited time and attention, and can only consume a sliver of the content that is already out there. To stand out, journalists must identify the most interesting and informative angles for the most interesting and important events. While AI still struggles to come up with unique and compelling ideas of its own (and is generally limited to phenomena that have been digitized and structured), it is nevertheless being used by journalists to manage information overload and to help them find ‘the story’ in large troves of documents.

For example, in 2016, the Atlanta Journal-Constitution published a series of stories that examined sexual abuse of patients by their doctors. While a human journalist decided that this was an important story to explore — it would go on to be a Pulitzer Prize finalist — intelligent algorithms still played a major role in helping journalists find what was most interesting and significant about the story. An algorithm reviewed a set of disciplinary documents that had been manually identified (by a human) as involving sexual misconduct and learned the sorts of words and phrases commonly used in such documents to describe that misconduct. (After all, there usually wasn’t a clear sentence in the disciplinary documents that read, “this person sexually abused someone.”) The algorithm then applied that learning to more than 100,000 documents that had not been reviewed by a human, assigning each one a probability that it involved a case of physician sexual misconduct. This helped point the journalists at the Atlanta Journal-Constitution toward doctors, clinics, and areas where that abuse was most prevalent or disproportionate.
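
The Journal-Constitution has described this workflow only in general terms, but its core (train a classifier on hand-labeled documents, then use it to score the unread ones) is a standard supervised-learning pattern. A minimal sketch using scikit-learn, with placeholder documents and labels rather than the newspaper’s actual data or code:

```python
# A minimal sketch of supervised document triage, loosely modeled on the
# workflow described above. The documents and labels are placeholders;
# this is not the Journal-Constitution's actual pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# 1. Documents a human has already labeled (1 = involves misconduct, 0 = not).
labeled_texts = [
    "board suspends license after complaint of inappropriate contact ...",
    "routine renewal of medical license following continuing education ...",
]
labels = [1, 0]

# 2. Learn which words and phrases distinguish the two classes.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X_train = vectorizer.fit_transform(labeled_texts)
model = LogisticRegression().fit(X_train, labels)

# 3. Score unread documents so journalists can review the likeliest ones first.
unread_texts = ["order details repeated complaints of inappropriate contact ..."]
X_unread = vectorizer.transform(unread_texts)
scores = model.predict_proba(X_unread)[:, 1]  # probability of class 1
for text, score in sorted(zip(unread_texts, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {text[:60]}")
```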

As another example, in 2019, a whistleblower secretly leaked documents from a law firm in Mauritius to a group of investigative journalists. The documents provided a rare look at how multinational companies avoided paying taxes when they did business in Africa, the Middle East, and Asia. There were more than 200,000 documents in that leak, with some of those documents being hundreds of pages long. To help journalists make sense of that information, the digital news outlet Quartz built a machine learning algorithm that helped journalists locate the most relevant portions of the most relevant documents. Specifically, when the journalist found a document of interest, the algorithm would analyze it and link the journalist to other documents in the leaked trove that had similar kinds of information or involved the same individuals and organizations.
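
Quartz has not published the internals of its tool, but ‘find documents like this one’ is a classic similarity problem. One common approach is to represent each document as a TF-IDF vector and rank the rest of the trove by cosine similarity. A bare-bones sketch with an invented corpus:

```python
# A sketch of "more like this" document retrieval using TF-IDF vectors and
# cosine similarity. The corpus is invented; Quartz's actual system may
# work quite differently.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Shell company registered in Mauritius routes licensing payments ...",
    "Tax treaty memo for a subsidiary operating in East Africa ...",
    "Board minutes discussing transfer pricing and Mauritius routing ...",
]
vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(corpus)

def related_documents(doc_index, top_n=5):
    """Return indices of the documents most similar to corpus[doc_index]."""
    sims = cosine_similarity(doc_vectors[doc_index], doc_vectors).ravel()
    sims[doc_index] = -1.0  # exclude the query document itself
    return sims.argsort()[::-1][:top_n]

print(related_documents(0))  # documents to suggest alongside corpus[0]
```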

Sourcing and Verifying Information

As journalists flesh out their story ideas, they frequently turn to different sources of information, from expert human sources to databases. When sourcing information, journalists typically look for the most authoritative source. That authority may come by virtue of a person’s position (like the CEO of a company), their area of expertise (like an academic who studies a very specific thing), or their proximity to a phenomenon (like a person who saw a car crash). However, there are often multiple sources who can speak with authority about a topic. For example, there are several academics who study police funding — and even though they’re all experts, each may bring a different perspective to the table.

News organizations are already using AI to help them not only identify potential sources but also correct systematic sourcing biases that may arise in the course of reporting. For example, in 2018, the Financial Times began using AI to review stories and warn journalists if they were relying too heavily on male sources. The Financial Times’ technology was not particularly sophisticated; it mostly guessed a source’s gender based on their name and the pronouns used to refer to them. But it was often enough to make a journalist reflect: many biases operate at a subconscious level, and a prompt to reflect is sometimes enough to mitigate their worst effects. A more sophisticated technology produced by the Ryerson School of Journalism, JeRI, similarly aims to score the institutional power of sources by weighing factors including their placement in a story and frequency of attribution. This helps journalists see if some people, organizations, and locations are receiving too much attention.
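
The Financial Times has described its tool only at a high level. A toy version of the same idea, counting gendered pronouns and checking first names against a list, might look like the following. The name lists here are tiny placeholders; a real tool would use far larger ones and handle ambiguous names:

```python
# A toy pronoun-and-name heuristic in the spirit of the tool described
# above. Name lists are placeholders, not the Financial Times' actual data.
import re
from collections import Counter

MALE_NAMES = {"james", "robert", "michael"}      # placeholder list
FEMALE_NAMES = {"mary", "patricia", "jennifer"}  # placeholder list

def gender_signal(story_text):
    """Crudely estimate the gender balance of sources in a story."""
    counts = Counter()
    for word in re.findall(r"[a-z']+", story_text.lower()):
        if word in ("he", "him", "his") or word in MALE_NAMES:
            counts["male"] += 1
        elif word in ("she", "her", "hers") or word in FEMALE_NAMES:
            counts["female"] += 1
    total = sum(counts.values()) or 1
    return {k: v / total for k, v in counts.items()}

draft = "James said the plan failed. He added that his agency disagreed."
if gender_signal(draft).get("male", 0) > 0.8:
    print("Heads up: this draft may rely heavily on male sources.")
```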

The use of AI isn’t just limited to sourcing. It is also being used to help journalists both interview and make use of interviews. For example, United Robots, a Swedish tech company, offers technology that can automate the process of conducting (simple) interviews. In the case of a soccer match, the technology is able to analyze a game recap, identify potential questions for the winning and losing coaches, send those coaches text messages with the questions, and integrate the coaches’ responses into the game recap. All of this can occur with limited human supervision. AI has also been used by journalists to parse interview transcripts (some of which are computer-generated) and identify the most interesting aspects of the interview, or to link a source’s present remarks with things they’ve said elsewhere in the past.

News organizations are also using AI to fact-check information, either by intelligently linking new stories and claims to existing, human-led fact-checks or by attempting to rate a claim independently by looking up information in knowledge bases. For example, Storyful, a company that specializes in fact-checking information on social media, partnered with Google in 2020 to create Source, a tool that automatically looks up an image’s public history (to ascertain its origin), checks it for artifacts suggestive of digital manipulation, and extracts and translates any text (e.g., from a meme). This helps journalists more quickly ascertain whether an image is likely to be a hoax or part of a disinformation campaign.

Producing Stories

A great deal of newswriting is formulaic. The inverted pyramid remains the dominant narrative structure for breaking news, and journalists often adhere to, and repeat, certain sets of words, phrases, and syntax (e.g., “said” and “according to”). Unsurprisingly, AI is already used by some leading news organizations (e.g., Bloomberg News in the United States, The Canadian Press in Canada, and Helsingin Sanomat in Finland) to produce tens of thousands of seemingly human-produced news stories each year with limited supervision.

For example, The Associated Press uses AI to review hundreds of thousands of public filings to the Securities and Exchange Commission and turn them into news stories. The process is largely automated. It begins by accessing the SEC’s system and downloading all new filings, such as a quarterly earnings report. It then translates a filing into a structured dataset, looking for expected markers like the company’s reported revenue for that quarter and trying to make sense of unstructured information (e.g., the company’s forecast for the next quarter or its expected threats). It then analyzes that filing, sometimes comparing it to previous data (e.g., the previous quarter’s revenue and forecast) to identify the most interesting data points. It then inserts those data points into one of many potential story templates previously written by human journalists. (You can think of these templates as a Mad Libs of sorts.) Finally, it publishes the story on The Associated Press’ newswire. The entire process is largely unsupervised — after setting up the algorithm and writing the templates, the AP journalists can take their hands off the wheel.
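
The AP’s system is proprietary, but the ‘Mad Libs’ core of template-based story generation is simple to illustrate. A minimal sketch in which the templates, figures, and selection rule are all invented:

```python
# A minimal "Mad Libs" story generator: pick the template that fits the
# data, then fill in the blanks. Templates and figures are invented.
TEMPLATES = {
    "up": ("{company} reported quarterly revenue of ${revenue}M, "
           "up {change:.1f}% from the prior quarter."),
    "down": ("{company} reported quarterly revenue of ${revenue}M, "
             "down {change:.1f}% from the prior quarter."),
}

def write_brief(filing, prior):
    """Compare the new filing to prior data and fill the matching template."""
    change = 100.0 * (filing["revenue"] - prior["revenue"]) / prior["revenue"]
    template = TEMPLATES["up" if change >= 0 else "down"]
    return template.format(company=filing["company"],
                           revenue=filing["revenue"],
                           change=abs(change))

print(write_brief({"company": "Acme Corp.", "revenue": 212.0},
                  {"company": "Acme Corp.", "revenue": 200.0}))
# Acme Corp. reported quarterly revenue of $212.0M, up 6.0% from the prior quarter.
```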

AI is also used to personalize segments of human-produced stories, such as by localizing or rewriting portions of a human-crafted narrative to appeal to each individual user. For example, The New York Times has used AI to identify a reader’s location and personalize a human-produced story about air quality around the world by adding a computer-generated paragraph about the air quality in the reader’s location.

Additionally, journalistic outlets like the British Broadcasting Corporation (BBC) have used AI to summarize stories written by humans and machines alike, producing either bullet-point rundowns or shortened versions of those stories. Those summaries can serve as the basis for companion products, such as an auto-generated news roundup for a voice assistant like Amazon’s Alexa. The BBC has even experimented with automating the transformation of a text story into a multi-panel visual story that can be instantly shared on visually oriented platforms like Instagram.

AI is not just being used in text-oriented journalism, though. News organizations have experimented with using “deep fake” technology to automate broadcasts featuring both semi-human and entirely synthetic anchors. For example, Reuters has prototyped a fully automated sports news summary system that pairs automated game summaries with photographs of the key moments being described, all of which is presented by synthetic footage of a real news presenter (i.e., a “deep fake” based on pre-recorded footage of the presenter). Put another way, Reuters’ technology is able to combine the words describing the event with relevant pictures, and present the package through a realistic-looking anchor. China’s Xinhua News Agency has prototyped similar technology that is able to operate outside the realm of sports and that uses an entirely computer-generated anchor.

These technologies represent advances on existing uses of AI in multimedia journalism. For example, AI is already frequently used to help editors ‘tag’ audiovisual content (e.g., label the people, objects, and locations in a picture) in order to help them more easily find relevant photos later on.

Distributing Stories

News consumers regularly engage with news distributed via intelligent algorithms. For example, when a person visits a news outlet’s website, they often encounter several widgets on the sides of the article that direct the person to other articles they might want to read next. Those recommendations may be tailored specifically to that person, based on the stories they have previously read on that outlet’s site, or even elsewhere on the Web. For example, if that person usually reads political news stories or watches videos about the Boston Celtics, the journalistic outlet’s algorithms may point the person toward more stories about politics or the Celtics.
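
Production recommenders are rarely public and typically blend many signals, but a bare-bones, content-based version of the idea (profile a reader by the stories they have read, then rank unread stories by similarity to that profile) can be sketched as follows; the articles and reading history are invented:

```python
# A sketch of content-based "read next" recommendation. Articles and the
# reader's history are invented; real systems also blend collaborative
# filtering, recency, and editorial rules.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [
    "Senate debates new budget bill amid partisan standoff",
    "Celtics edge Knicks in overtime thriller at the Garden",
    "City council approves funding for new transit line",
    "Celtics star questionable for playoff opener with ankle injury",
]
read_history = [1]  # indices of articles this reader has already read

vectorizer = TfidfVectorizer(stop_words="english")
vectors = vectorizer.fit_transform(articles)

# Profile the reader as the average of the stories they have read, then
# rank the unread stories by similarity to that profile.
profile = np.asarray(vectors[read_history].mean(axis=0))
scores = cosine_similarity(profile, vectors).ravel()
unread = [i for i in range(len(articles)) if i not in read_history]
for i in sorted(unread, key=lambda i: -scores[i]):
    print(f"{scores[i]:.2f}  {articles[i]}")
```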

Such recommendation algorithms are used by digitally native and legacy outlets alike. For example, even The New York Times, an organization that takes great pride in its journalistic expertise and editorial stewardship, has relinquished some of its agenda-setting power by adopting personalized distribution via the “For You” section of its app. Some organizations, including publications owned by Sweden’s Bonnier News Local, allow intelligent algorithms to take the lead in organizing the news content appearing on those publications’ home pages, leaving human editors to play more of a supervisory role (i.e., to occasionally overrule the algorithm’s editorial decision-making).

More broadly, in an information ecosystem that is arguably oversupplied with content, there is an economic imperative for some news organizations to employ AI to intelligently distribute their products in tailored ways across platforms and to multiple market segments. This includes adapting their existing content to fit the expectations of different platforms, such as by creating a shorter automated news summary for TikTok than what might appear on YouTube. It also involves adapting promotional messages to take advantage of different platforms’ technical affordances, such as by identifying trending hashtags and automatically applying the relevant ones to the journalistic outlet’s work.

Ethical and Legal Challenges

AI is not some neutral thing. It is not only shaped by its creators but also shapes individual and collective human behaviors through the ways it is put to use. It is thus crucial to recognize that market-leading technologies at the intersection of AI and journalism are being developed by people and companies with backgrounds outside of journalism, and to wrestle with the positive and negative implications of that. For example, AI technologies are often benchmarked through notions of efficiency and scalability, which stands in stark contrast to the ideals that shape understandings of ‘quality’ journalism.

AI technologies can also become highly biased: depending on how (and by whom) the AI is developed and what it is trained upon, it can adopt and replicate (at scale) a number of human flaws. For example, Amazon created an AI-based hiring tool that ended up disproportionately rejecting female applicants. A 2021 investigation by The Markup, a non-profit, data-driven journalistic outlet, found that an algorithm used by a large government agency was more likely to suggest denying home loans to people of color than to White people with similar characteristics. That was just one of many investigations by The Markup that have repeatedly shown that ‘color-blind’ algorithms are anything but.

Within the context of journalism, AI can unintentionally reproduce problematic depictions and promote inaccurate stereotypes. For example, an automated story about a decrease in the number of immigrants entering the United States may automatically embed a stock photo of immigrants being detained by immigration authorities. That, in turn, might promote and perpetuate the association between immigration and criminality, simply because the algorithm has learned that previous stories about immigration tended to focus on elements of legality and crime. Journalists and their outlets must therefore remain mindful of how they are employing artificial intelligence, and how such applications may advance or detract from their mission to represent truth.

The legal landscape in the United States also remains unclear about key considerations involving the application of AI to public communication. For example, legal standards in the U.S. make it difficult to hold algorithms (and their creators) liable for libel, leaving unclear who can be held legally responsible for defamation when the communication is enacted by a machine. Moreover, even established case law is now being reexamined as AI proliferates. This includes safe-harbor provisions that have shielded operators of digital infrastructure (and portions of news websites) from certain liabilities.

AI and the Future of Journalism

The future of journalism will involve greater human-machine interaction. While humans will likely remain at the center of news production, the work they do and the ways they go about it will look different. This will invariably result in the downsizing of certain roles and aspects of the job, but it will also create new roles and possibilities — from developing new technologies to managing knowledge systems to specializing in new beats and formats.

At the same time, it is important to recognize that although the aforementioned applications of AI in journalism show promise, the technology is still limited in important ways. For example, although algorithms already produce hundreds of thousands of business and sports news stories, their applications remain largely limited to news briefs and game recaps. An algorithm would have a much harder time writing a story that gets at how a CEO’s sex scandal might affect their company, or producing a feature on why a player decided to sit out the season due to health concerns. Put another way, only a small share of the news stories people regularly consume can be produced with present AI technology, and those stories generally rely on either pre-existing data or phenomena that can be easily translated into structured data. And the products of AI-led processes are usually relatively basic: they use fairly simple expressions to convey information and offer only surface-level analyses. They do not feature compelling leads or writerly panache, and they don’t win Pulitzers on their own.

Nevertheless, there is good reason for journalistic outlets to invest in AI, and good reason to believe such investments will only increase. Journalists and journalistic outlets that fail to adapt to this changing environment risk falling behind competitors who are able to leverage AI to increase efficiencies, scale upward, and even improve the quality of their work. Algorithms can generate news stories far more quickly than human journalists, and they can be useful aids for creating the first documentation of an event. Put another way, they can free up human journalists to focus on more meaningful follow-up stories and draw some attention to stories that might otherwise not receive any coverage (for lack of human resources).

However, it is not just news producers who must adapt. The future of journalism will also likely be filled with ‘junk,’ from misinformation and disinformation to more fundamental issues involving information overload resulting from an influx of automated communication. That, in turn, will require individuals to adapt their existing media literacies and seek out their own trusted, intelligent assistants to help separate signal from noise.


Key Takeaways

  • Artificial intelligence refers to a system that is capable of learning from the past, adapting to the present, and acting in a way that would be considered “intelligent” if a human being did it.

  • AI is today used in some fashion in every stage of news production, from coming up with story ideas to distributing news content.

  • AI is not a neutral technology; it takes on values from its creators and the objects it is tasked to learn from. This introduces a number of ethical and legal dilemmas that journalistic actors must be mindful of.

  • Humans will likely remain at the center of news production. However, the work they do and the ways they go about it will look different.