VALiD - Visual Analytics in Data-driven Journalism

Mastering the Information Overload. Processing complex data is essential for quality in data-driven journalism. This research project combines data and visual analytics, with a focus on journalistic needs.

Short Description

We live in a world in which it is increasingly important to understand complex socio-economic and ecological phenomena in order to make well-informed decisions. Traditionally, journalists play an important role in this endeavor by uncovering hidden patterns and relationships to inform, enlighten, and entertain.

With the ever-growing amount and availability of data, it becomes crucial for journalists to use elements of data analysis and visualization in their work. This trend has given rise to the field of Data-Driven Journalism (DDJ), which involves computer-supported, data-based reasoning as well as interactive visualization.

The VALiD project combines Data Journalism and Visual Analytics. The core principle of Visual Analytics is to integrate the outstanding visual perception and reasoning capabilities of humans with the strengths of automated data analysis by computers. Visual Analytics thus aims to make large and complex data more comprehensible and to facilitate new insights.

Although news organizations such as the New York Times or the Guardian apply data journalism, the majority of journalists still face significant obstacles that hamper the use of data in their work. Often, newsroom workflows do not cover data journalism processes, available tools require advanced technical expertise, or dealing with complex, heterogeneous data is not supported.

Mitigating these obstacles is the main goal of the VALiD project. Following a user-centered and problem-driven research process, the researchers involved design techniques to support data journalists in dealing with complex, heterogeneous data, and develop a set of guidelines and best practices for data journalism workflows. Because heterogeneous data is a broad field with many facets, we focus on two types specifically. First, we investigate textual data over time, such as transcripts of parliamentary debates. Second, we analyze dynamic networks combined with quantitative flows, for example data on government advertising in the media.