
Sink or swim: Can we stay afloat in an endless sea of data?

    Data expert and author Dan Gaylin discusses what data democratisation means and the importance of a data-savvy society.

    We are drowning in data. That is not a statement designed for dramatics; we have – ironically – the data to prove it.

    According to Statista, the total amount of data created, captured, copied and consumed globally is projected to grow to more than 394 zettabytes by 2028.

    To put this into context, a single zettabyte translates to roughly 1,000 exabytes, or 1bn terabytes, or 1trn gigabytes.
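    For anyone who wants to sanity-check those conversions, a few lines of Python will do it. This is a minimal sketch using decimal (SI) units, where each prefix steps up by a factor of 1,000; binary units, which step by 1,024, would give slightly different figures.

        # Back-of-the-envelope check of the conversions above,
        # using decimal (SI) units: 1 zettabyte = 10**21 bytes.
        ZETTABYTE = 10**21
        EXABYTE = 10**18
        TERABYTE = 10**12
        GIGABYTE = 10**9

        print(f"{ZETTABYTE // EXABYTE:,} exabytes")    # 1,000
        print(f"{ZETTABYTE // TERABYTE:,} terabytes")  # 1,000,000,000 (1bn)
        print(f"{ZETTABYTE // GIGABYTE:,} gigabytes")  # 1,000,000,000,000 (1trn)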

    In the history of computer science, the period known as the zettabyte era started around the mid-2010s: global IP traffic – all the digital data passing over IP networks – is estimated to have reached 1.2 zettabytes in 2016.

    Considering how long it took to reach a single zettabyte, how quickly we reached almost 150 zettabytes in 2024, and the fact that we’re expected to more than double that in the next four years, I think drowning in data is a fair analogy.
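    (For the curious, the growth rate implied by those figures can be sketched in a few more lines of Python. The 149-zettabyte starting point is an approximation of the “almost 150” cited above, so the result is indicative rather than exact.)

        # Rough sketch of the growth implied by the figures above.
        # Assumes ~149 ZB in 2024 (the "almost 150" cited) and the
        # 394 ZB Statista projection for 2028.
        zb_2024 = 149
        zb_2028 = 394
        years = 2028 - 2024

        growth = zb_2028 / zb_2024               # ~2.64x, i.e. "more than double"
        annual_rate = growth ** (1 / years) - 1  # compound annual growth rate

        print(f"{growth:.2f}x over {years} years")  # 2.64x over 4 years
        print(f"~{annual_rate:.1%} per year")       # ~27.5% per year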

    The explosion of automation and generative artificial intelligence (GenAI) means data creation will only go one way, and while knowledge is power, the eye-watering volume of data and the speed at which it is being created don’t necessarily lend themselves to an automatic increase in data literacy.

    In fact, despite technological advancements, research shows the global digital divide is widening rather than narrowing.

    Dan Gaylin is president and CEO of NORC at the University of Chicago. Formerly known as the National Opinion Research Center, NORC is a non-partisan global research organisation that uses data and analysis to study all aspects of the human experience.

    With 35 years of experience spanning government, private consulting and not-for-profit research organisations, Gaylin is an internationally recognised authority on issues related to the effective use of data and information.

    He is also the author of ‘Fact Forward: The Perils of Bad Information and the Promise of a Data-Savvy Society’, which is due to be published in April 2025.

    Image: Dan Gaylin

    In an interview with SiliconRepublic.com, Gaylin said the flow of data is no longer linear from producer to disseminator to consumer, leading to data democratisation.

    “Data democratisation comprises the proliferation of multiple types of data on virtually any subject and the greatly increased availability and access to data that the general public enjoys relative to just a few decades ago,” he said.

    “It is the direct result of digital technology. The essential outcome of data democratisation is that anyone can now play the role of a data creator, analyser, disseminator and consumer through their computer or smartphone.

    “On a wider, more societal level, this data democratisation reshapes the social construct of expertise, knowledge, and accepted truth.”

    What a digital divide looks like

    While technology advances in many parts of the world, developing countries and underprivileged communities continue to be left behind. Gaylin said this means that those with greater access to this data – and the skills to use and understand it – are likely to be better informed than those without.

    “Extending that to comparisons of richer and poorer countries is straightforward. Developing countries or poor communities have much to gain from access to trustworthy and reliable data. They may be even more at risk from faulty or manipulated data,” he said.

    “For instance, access to reliable information related to civic participation could help support fledgling democracies, or how access to trustworthy public health information might help reduce the adverse effects of a health emergency.”

    He also said that regulation around such data can address privacy concerns, leading to greater trust in the broader data ecosystem. “And those government and corporate actors can foster that trust by publishing and adhering to – being transparent about – how they’re using and protecting private data. So, a culturally appropriate regulatory framework is part of having a solid data infrastructure.”

    The good, the bad…

    There was a time – probably in the early 2010s, when the world welcomed Web 2.0 – when the goal was to accrue as much data as possible. The more information the better, right? Now, coming back to the notion that we’re drowning in all that data, there is a distinction between good and bad data.

    Gaylin said good data is data that is reliable, trustworthy and fit for purpose. “When we use good data to inform our important decisions, those decisions are much more likely to produce effective outcomes,” he said.

    On the other side of the coin, bad data is the opposite of this: some combination of unreliable, untrustworthy and useless, and sometimes data that has been purposefully designed to manipulate or mislead.

    “When decision-makers are guided by ‘bad data,’ they make suboptimal or even harmful decisions,” said Gaylin. “For example, the data that led to the publication of a paper linking vaccines and autism was a tiny sample and the analysis was flawed. The paper was ultimately withdrawn, but not before it helped create a widespread mistrust of vaccines.”

    Unfortunately, as we live in a world where data is everywhere and, as Gaylin said, no longer comes at us in a linear fashion, users are forced to become much more data-savvy.

    With the enormous quantity of data we experience every day, each of us has a responsibility to develop data-savvy skills that enable us to differentiate between good and bad data, to use good data effectively, and to avoid being misled or manipulated by bad data.

    …and the ugly

    Not only is ‘bad data’ everywhere, but the platforms that produce it are leaning away from content moderation responsibilities that, at one time, at least attempted to protect us.

    At the beginning of this year, social media giant Meta announced it was ending third-party fact-checking on its platforms, opting instead for community notes, similar to X.

    The move will remove guardrails provided by the platform, instead allowing users to write and rate community notes. The platform’s founder and CEO Mark Zuckerberg pointed to the US election as a major influence on the company’s decision and criticised “governments and legacy media” for allegedly pushing “to censor more and more”.

    Fact-checking in social media has always been a challenge, even with moderators in place, because of the sheer volume of content – not to mention that opinions and points of view get muddled with factual information.

    “When information is shared on social media, opinions can be presented as facts, and facts will get muddied with opinions. This makes for a very daunting and perhaps simply untenable situation for fact-checking. It’s inevitable that the fact-checkers themselves – their possible biases or motives – become the subject and may be labelled as censors by social media users. So, I am not surprised that Meta found itself in this pickle,” said Gaylin.

    “That said, I think the positive language that Meta used to couch the change is misleading. Meta’s argument is essentially that they are now crowd-sourcing the fact-checking to the metaverse. That sounds good, but all it really means is that Meta is letting people weigh in with their views about other posts, which is what they’ve always done,” he said.

    “In the end, it’s always the user’s responsibility to carefully review where data come from, who is analysing them and presenting conclusions and who is sharing those conclusions.”

    A data-savvy world

    It’s difficult to put the responsibility on the end user. After all, we live in a society where data creation is far beyond our control and where the technological advancements happening around us have the ability to do as much harm as good.

    Outside of the individual, he said there are several steps society needs to take to stay smart about data, including enhancing standards for data formats and privacy, developing an education system geared towards data literacy, having governments produce more public data and expand the areas it covers while enhancing access to it, and having new media improve their role in responsible data dissemination.

    With all that in mind, having data literacy skills remains vital for individuals, especially as we continue down a road where reality is often stranger than fiction and fake news can seem disturbingly real.

    Gaylin said it’s important that individuals have an understanding of where to find, how to access and how to consume data, as well as the ability to discern good data from bad data.

    “Data-savvy individuals are responsible and effective data users in a world where data is now woven into the fabric of our daily activities.”
