Beware the Hype Over Big Data Analytics

I originally wrote this piece in 2012 as a rebuttal to a Seeking Alpha article arguing that investors should be "very, very wary" of big data analytics companies. I thought the skepticism was misplaced. Fourteen years later, it turns out I was right about the substance but wrong about the timeline -- and I've watched the exact same hype cycle play out again with AI.

What I Argued in 2012

Seeking Alpha's case boiled down to three points: that collecting and analyzing large quantities of data wasn't new, that organizations hadn't gotten "Moneyball" results from analytics, and that companies could do it all in-house anyway. I pushed back on each one.

On the first point, I argued that while large-scale data analysis wasn't new in concept, the economics had changed completely. Open source software like Hadoop and Cassandra, combined with on-demand cloud computing, had removed the multimillion-dollar barrier to entry. Once plummeting costs met rocketing demand, the handful of large companies that had been doing this before became irrelevant.

On the "where are the results?" question, I compared data to a natural resource. "You can't wear wool that you've just shaved from a sheep," I wrote, "and you can't get value by looking at raw database tables." The processing and refining step was where the industry was still figuring things out. Fair enough.

On doing it in-house, I agreed that many analytics problems could be solved by someone trained in Excel. But I pointed out that the article was supposed to be about big data -- the kind that makes Excel crash and burn.

The wool analogy holds for AI in 2026. Raw models don't solve business problems. The processing, refining, and integration work is where value gets created -- and where most companies still struggle.

The Hype Cycle Repeated Itself

Here's what's eerie. Take every argument from that 2012 Seeking Alpha article, replace "big data" with "AI," and you have a 2024 thinkpiece. The pattern is identical:

"AI isn't new" -- true, but the economics changed with transformer architectures and cloud-scale compute. "Companies aren't getting results" -- true for many, because they're still figuring out the integration work. "You can do it in-house" -- sort of, but the hard part isn't running a model, it's building the data infrastructure and operational processes around it.

Gartner's hype cycle model describes exactly this arc. Big data hit the "Peak of Inflated Expectations" around 2012-2013, slid through the "Trough of Disillusionment" around 2015-2016, and reached the "Plateau of Productivity" around 2018-2019. Generative AI hit its own peak in 2023-2024. As I write this in early 2026, we're squarely in the trough. The companies that overpromised are getting punished. The ones that did the hard work of building operational systems are quietly winning.

What Changed (And What Didn't)

The big data market matured exactly as the fundamentals suggested it would. The global big data analytics market grew from about $15 billion in 2013 to over $300 billion by 2025. Hadoop, the technology everyone was excited about in 2012, is essentially dead -- replaced by cloud-native platforms like Snowflake, Databricks, and BigQuery. The underlying need for large-scale data analysis turned out to be very real. The specific technologies were disposable.

The talent gap I mentioned in 2012 never fully closed, either. We went from "data scientist is the sexiest job of the 21st century" (Harvard Business Review, 2012) to a shortage of ML engineers and AI specialists that looks almost identical. Companies are again paying ridiculous salaries for people with the right skills, and vendors are again promising that their AI education tools will let "anyone" do AI without specialists. Sound familiar?

What didn't change is the fundamental dynamic: technology that makes data useful is valuable, but only if you can actually integrate it into business operations. That was true for BI, true for big data, and it's true for AI. The data readiness requirements haven't gotten easier -- they've gotten more demanding.

The Real Lesson

The Seeking Alpha authors weren't wrong to question the hype. They were wrong about what it meant. Hype doesn't mean the underlying technology is worthless. It means the gap between promise and delivery is temporarily wide. That gap always narrows as companies build the operational infrastructure to turn technology into results.

If you're evaluating AI investments today with the same skepticism that article applied to big data in 2012, I'd tell you the same thing I said then: the fundamentals are real, the timeline is longer than the vendors claim, and the winners will be the companies that do the boring integration work while everyone else is chasing the hype.