The Rise of DataOps – Atlan

Data is getting even bigger, and traditional data management just doesn’t work. DataOps is on the rise, promising to tame today’s chaos and context challenges.

Let’s face it: traditional data management doesn’t work. Today, 75% of executives don’t trust their own data, and only 27% of data projects are successful. These are dismal numbers in what has been called the “golden age of data”.

As data keeps growing in size and complexity, we’re struggling to keep it under control. To make things worse, data teams and their members, tools, infrastructure, and use cases are becoming more diverse at the same time. The result is data chaos like we’ve never seen before.

DataOps has been around for several years, but right now it’s on fire because it promises to solve this problem. Just a week apart, Forrester and Gartner recently made major shifts toward recognizing the importance of DataOps.

On June 23 of this year, Forrester released the latest version of its Wave report about data catalogs, but instead of being about “Machine Learning Data Catalogs” as usual, they renamed the category to “Enterprise Data Catalogs for DataOps”. A week later, on the 30th, Gartner released its 2022 Hype Cycle, predicting that DataOps will fully penetrate the market in 2-5 years and moving it from the far left side of the curve to its “Peak of Inflated Expectations”.

Google Trends data on global searches for “DataOps” since 2015. The y-axis shows “interest over time”, or a normalized version of search interest. 100 represents peak popularity for the term in the given time and region.

But the rise of DataOps isn’t just coming from analysts. At Atlan, we work with modern data teams around the world. I’ve personally seen DataOps go from an unknown to a must-have, and some companies have even built entire strategies, functions, and roles around DataOps. While the results vary, I’ve seen incredible improvements in data teams’ agility, speed, and outcomes.

In this blog, I’ll break down everything you should know about DataOps: what it is, why you should care about it, where it came from, and how to implement it.

What is DataOps?

The first, and perhaps most important, thing to know about DataOps is that it’s not a product. It’s not a tool. In fact, it’s not anything you can buy, and anyone trying to tell you otherwise is trying to trick you.

Instead, DataOps is a mindset or a culture: a way to help data teams and people work together better.

DataOps can be a bit hard to grasp, so let’s start with a few well-known definitions.

DataOps is a collaborative data management practice focused on improving the communication, integration and automation of data flows between data managers and data consumers across an organization.


DataOps is the ability to enable solutions, develop data products, and activate data for business value across all technology tiers from infrastructure to experience.


DataOps is a data management method that emphasizes communication, collaboration, integration, automation and measurement of cooperation between data engineers, data scientists and other data professionals.

Andy Palmer

As you can tell, there’s no standard definition for DataOps. However, you’ll see that everyone talks about DataOps in terms of being beyond tech or tools. Instead, they focus on terms like communication, collaboration, integration, experience, and cooperation.

In our mind, DataOps is really about bringing today’s increasingly diverse data teams together and helping them work across equally diverse tools and processes. Its principles and processes help teams drive better data management, save time, and reduce wasted effort.

Why should you care about DataOps?

The short answer: It helps you tame the data chaos that every data person knows all too well.

Now for the longer, more personal answer…

At Atlan, we started as a data team ourselves, solving social good problems with large-scale data projects. The projects were really cool: we got to work with organizations like the UN and Gates Foundation on large-scale projects affecting millions of people.

But internally, life was chaos. We dealt with every fire drill that could possibly exist, leading to long chains of frustrating phone calls and hours spent trying to figure out what went wrong. As a data leader myself, this was a personally vulnerable time, and I knew it couldn’t continue.

We put our minds to solving this problem, did a bunch of research, and came across the idea of “data governance”. We were an agile, fast-paced team, and traditional data governance didn’t seem like a fit for us. So we came together, reframed our problems as “How Might We” questions, and started an internal project to solve these questions with new tooling and practices. By bringing inspiration from diverse industries back to the data world, we stumbled upon what we now know as DataOps.

It was during this time that we saw what the right tooling and culture can do for a data team. The chaos decreased, the same massive data projects became exponentially faster and easier, and the late-night calls became wonderfully rare. As a result, we were able to accomplish far more with far less. Our favorite example: we built India’s national data platform in just 12 months with an eight-member team, many of whom had never pushed a line of code to production before.

We later wrote down our learnings in our DataOps Culture Code, a set of principles to help a data team work together, build trust, and collaborate better.

That’s ultimately what DataOps does, and why it’s all the rage today: it helps data teams stop wasting time on the endless interpersonal and technical speed bumps that stand between them and the work they love to do. And in today’s economy, anything that saves time is priceless.

Part of our DataOps Culture Code

The four fundamental ideas behind DataOps

Some people like to say that data teams are just like software teams, and they try to apply software principles directly to data work. But the reality is that they couldn’t be more different.

In software, you have some level of control over the code you work with. After all, a human somewhere is writing it. But in a data team, you often can’t control your data, because it comes from diverse source systems in a variety of constantly changing formats. If anything, a data team is more like a manufacturing team, transforming a heap of unruly raw material into a finished product. Or perhaps a data team is more like a product team, taking that product to a wide variety of internal and external end consumers.

The way we like to think about DataOps is, how can we take the best learnings from other teams and apply them to help data teams work together better? DataOps combines the best parts of Lean, Product Thinking, Agile, and DevOps, and applies them to the field of data management.

The four fundamental ideas behind DataOps


Lean

Key idea: Reduce waste with Value Stream Mapping.

Though its roots go back to Benjamin Franklin’s writings from the 1730s, Lean comes from Toyota’s work in the 1950s. In the shadow of World War II, the auto industry, and the world as a whole, was getting back on its feet. For car manufacturers everywhere, employees were overworked, orders delayed, costs high, and customers unhappy.

To solve this, Toyota created the Toyota Production System (TPS), a framework for conserving resources by eliminating waste. It tried to answer the question, how can you deliver the highest quality good with the lowest cost in the shortest time? One of its key ideas is to eliminate the eight types of waste in manufacturing wherever possible (overproduction, waiting time, transportation, underutilized workers, and so on) without sacrificing quality.

The TPS was the precursor to Lean, coined in 1988 by businessman John Krafcik and popularized in 1996 by researchers James Womack and Daniel Jones. Lean focused on the idea of Value Stream Mapping. Just like you’d map a manufacturing line with the TPS, you map out a business activity in excruciating detail, identify waste, and optimize the process to maintain quality while eliminating waste. If a part of the process doesn’t add value for the customer, it’s waste, and all waste should be eliminated.

What does a Value Stream Mapping actually look like? Let’s start with an example in the real world.

Lean: Example Value Stream Mapping
The Value Stream Mapping for ordering a coffee in a cafe

Say that you own a cafe, and you want to improve how your customers order a cup of coffee. The first step is to map out everything that happens when a customer orders a coffee: taking the order, accepting payment, making the coffee, handing it to the customer, etc. For each of these steps, you then note what can go wrong and how long the step can take. For example, a customer might have trouble finding where to order, then spend up to 7 minutes waiting in line once they get there.
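To make the mapping concrete, here is a minimal Python sketch of the cafe example. It is only an illustration, not a standard Lean tool: the `Step` class, the two helper functions, and every duration except the 7-minute queue are assumptions made up for this sketch. It adds up working time and waiting time across the steps, then points at the step where customers wait longest.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    work_minutes: float  # time spent on work the customer actually values
    wait_minutes: float  # time the customer spends waiting (waste)


def summarize(steps):
    """Return total lead time, total waiting time, and the value-adding share."""
    work = sum(s.work_minutes for s in steps)
    wait = sum(s.wait_minutes for s in steps)
    lead_time = work + wait
    efficiency = work / lead_time if lead_time else 0.0
    return lead_time, wait, efficiency


def biggest_waste(steps):
    """Return the step with the most waiting time, the first candidate to fix."""
    return max(steps, key=lambda s: s.wait_minutes)


# Hypothetical durations, except the 7-minute queue from the example above.
cafe = [
    Step("Find where to order", 0.0, 1.0),
    Step("Take the order", 1.0, 7.0),
    Step("Accept payment", 0.5, 0.5),
    Step("Make the coffee", 3.0, 2.0),
    Step("Hand over the coffee", 0.5, 0.0),
]

lead_time, wait, efficiency = summarize(cafe)
print(f"Lead time: {lead_time} min, of which waiting: {wait} min")
print(f"Value-adding share: {efficiency:.0%}")
print("Biggest source of waiting:", biggest_waste(cafe).name)
```

With these made-up numbers, most of the customer’s time goes to waiting rather than to anything they value, which is exactly the kind of waste the mapping is meant to expose.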

How does this idea apply to data teams? Data teams are similar to manufacturing teams. They both work with raw material (i.e. source data) until it becomes a product (i.e. the “data product”) and reaches customers (i.e. data consumers or end users).

So if a supply chain has its own value streams, what would data value streams look like? How can we apply these same principles to a Data Value Stream Mapping? And how can we optimize them to eliminate waste and make data teams more efficient?

Product thinking

Key idea: Ask what job your product is really accomplishing with the Jobs To Be Done framework.

The core concept in product thinking is the Jobs To Be Done (JTBD) framework, popularized by Anthony Ulwick in 2005.

The easiest way to understand this idea is through the Milkshake Theory, a story from Clayton Christensen. A fast food restaurant wanted to increase the sales of their milkshakes, so they tried a lot of different changes, such as making them more chocolatey, chewier, and cheaper than competitors. However, nothing worked and sales stayed the same.

Next, they sent people to stand in the restaurant for hours, collecting data on customers who bought milkshakes. This led them to realize that nearly half of their milkshakes were sold to single customers before 8 am. But why? When they came back the next morning and talked to these people, they learned that these people had a long, boring drive to work and needed a breakfast that they could eat in the car while driving. Bagels were too dry, doughnuts too messy, bananas too quick to eat… but a milkshake was perfect, since it takes a while to drink and keeps people full all morning.

Once they realized that, for these customers, a milkshake’s purpose or “job” was to provide a satisfying, convenient breakfast during their commute, they knew they needed to make their milkshakes more convenient and filling, and sales increased.

The JTBD framework helps you build products that people love, whether it’s a milkshake or a dashboard. For example, a product manager’s JTBD might be to prioritize different product features to achieve business outcomes.

How does this idea apply to data teams? In the data world, there are two main types of customers: “internal” data team members who need to work more effectively with data, and “external” data consumers from the larger organization who use products created by the data team.

We can use the JTBD framework to understand these customers’ jobs. For example, an analyst’s JTBD might be to provide the analytics and insights for these product prioritization decisions. Then, once you create a JTBD, you can create a list of the tasks it takes to achieve it. Each of these tasks is a Data Value Stream, and can be mapped out and optimized using the Value Stream Mapping process above.


Agile

Key idea: Increase velocity with Scrum and prioritize MVPs over finished products.

If you’ve worked in tech or any “modern” company, you’ve probably used Agile. Created in 2001 with the Agile Software Development Manifesto, Agile is a framework for software teams to plan and track their work.

The core idea in Agile is Scrum, an iterative product management framework based on the idea of creating an MVP, or minimum viable product.

Here’s an example: if you wanted to build a car, where should you start? You could start with conducting interviews, finding suppliers, building and testing prototypes, and so on… but that will take a long time, during which the market and world will have changed, and you may end up creating something that people don’t actually like.

Agile: Benefits of a Minimum Viable Product (MVP)
Six ways that an MVP helps improve development

An MVP is about shortening the development process. To create an MVP, you ask what the JTBD is: is it really about creating a car, or is it about providing transportation? The first, fastest product to solve this job could be a bike rather than a car.

The goal of Scrum is to create something as quickly as possible that can be taken to market and used to gather feedback from users. If you focus on finding the minimal solution, rather than creating the ideal or dream solution, you can learn what users actually want when they test your MVP, because they usually can’t express what they actually want in interviews.

How does this idea apply to data teams? Many data teams work in a silo from the rest of the organization. When they are assigned a project, they’ll often work for months on a solution and roll it out to the company only to learn that their solution was wrong. Maybe the problem statement they were given was incorrect, or they didn’t have the context they needed to design the right solution, or maybe the organization’s needs changed while they were building their solution.

How can data teams use the MVP approach to reduce this time and come to an answer quicker? How can they build a shipping mindset and get early, frequent feedback from stakeholders?

Agile can be used to open up siloed data teams and improve how they work with end data consumers. It can help data teams find the right data, bring data models into production, and release data products faster, allowing them to get feedback from business users and iteratively improve and adapt their work as business needs change.


DevOps

Key idea: Improve collaboration with release management, CI/CD, and monitoring.

DevOps was born in 2009 at the Velocity Conference, where engineers John Allspaw and Paul Hammond presented on improving “dev & ops cooperation”.

The traditional thinking at the time was that software moved in a linear flow: the development team’s job is to add new features, then the operations team’s job is to keep the features and software stable. However, this talk introduced a new idea: both dev and ops’ job is to enable the business.

DevOps turned the linear development flow into a circular, interconnected one that breaks down silos between these two teams. It helps teams work together across two diverse functions through a set process. Ideas like release management (enforcing set “shipping standards” to ensure quality), operations and monitoring (creating monitoring systems to alert when things break), and CI/CD (continuous integration and continuous delivery) make this possible.

DevOps toolchain
The DevOps toolchain (created by Kharnagy on Wikipedia)

How does this idea apply to data teams? In the data world, it’s easy for data engineers and analysts to function independently (e.g. engineers manage data pipelines, while analysts build models) and blame each other when things inevitably break. Rather than producing solutions, this just leads to bickering and resentment. Instead, it’s important to bring them together under a common goal: making the business more data-driven.

For example, your data scientists may currently depend on either engineering or IT to deploy their work, from exploratory data analysis to machine learning models. With DataOps, they can deploy their models themselves and perform analysis quickly, with no more dependencies.
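One way to picture how “shipping standards” and monitoring could carry over to data work is an automated check that a batch of data must pass before it is released. The Python sketch below is purely illustrative: the function names, the column names, and the rules are assumptions invented for this example, not part of any particular DataOps tool.

```python
def check_rows(rows, required_columns, not_null):
    """Return a list of human-readable problems found in a batch of rows."""
    problems = []
    for i, row in enumerate(rows):
        missing = [c for c in required_columns if c not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
        for col in not_null:
            if col in row and row[col] is None:
                problems.append(f"row {i}: {col} is null")
    return problems


def release_gate(rows, required_columns, not_null):
    """Raise if the batch breaks the agreed shipping standard, else return True."""
    problems = check_rows(rows, required_columns, not_null)
    if problems:
        raise ValueError("; ".join(problems))
    return True


# Hypothetical batch: one clean row, one null amount, one missing column.
batch = [
    {"order_id": 1, "amount": 12.5},
    {"order_id": 2, "amount": None},
    {"order_id": 3},
]

problems = check_rows(batch, ["order_id", "amount"], ["amount"])
for problem in problems:
    print(problem)

# A clean batch passes the gate and can be shipped.
clean = [{"order_id": 1, "amount": 12.5}]
print(release_gate(clean, ["order_id", "amount"], ["amount"]))
```

Run as part of a CI/CD pipeline, a gate like this would let engineers and analysts agree on the standard once and then ship independently, instead of discovering broken data after the fact.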

Going from DevOps to DataOps

Note: I can’t emphasize this enough. DataOps isn’t just DevOps with data pipelines. The problem that DevOps solves is between two highly technical teams, software development and IT. DataOps solves complex problems to help an increasingly diverse set of technical and business teams create complex data products, everything from a pipeline to a dashboard or documentation.

How do you actually implement DataOps?

Every other domain today has a focused enablement function. For example, SalesOps and Sales Enablement focus on improving productivity, ramp time, and success for a sales team. DevOps and Developer Productivity Engineering teams are focused on improving collaboration between software teams and productivity for developers.

Why don’t we have a similar function for data teams? DataOps is the answer.

Identify the end consumers

Rather than executing data projects, the DataOps team or function helps the rest of the organization achieve value from data. It focuses on creating the right tools, processes, and culture to help other people succeed at their work.

The key consumers of DataOps

Create a dedicated DataOps function

A DataOps strategy is most effective when it has a dedicated team or function behind it. There are two key personas in this function:

  • DataOps Enablement Lead: They understand data and users, and are great at cross-team collaboration and bringing people together. DataOps Enablement Leads often come from backgrounds like Information Architects, Data Governance Managers, Library Sciences, Data Strategists, Data Evangelists, and even extroverted Data Analysts and Engineers.
  • DataOps Enablement Engineer: They’re the automation brain in the DataOps team. Their key strength is sound knowledge of data and how it flows between systems and teams, acting as both advisors and executors on automation. They’re often former Developers, Data Architects, Data Engineers, and Analytics Engineers.
How WeWork structured their DataOps function around two key personas

Map out value streams, reduce waste, and improve collaboration

At the beginning of a company’s DataOps journey, DataOps leaders can use the JTBD framework to identify common data “jobs” or tasks, also known as Data Value Streams. Then, with Lean, they can do a Value Stream Mapping exercise to identify and eliminate wasted time and effort in these processes.

Meanwhile, the Scrum ideology from Agile helps data teams understand how to build data products more efficiently and effectively, while ideas from DevOps show how they can collaborate better with the rest of the organization on these data products.

What data teams can learn from the four movements behind DataOps

Creating a dedicated DataOps strategy and function is far from easy. But if you do it right, DataOps has the potential to solve some of today’s biggest data challenges, save time and resources across the organization, and increase the value you get from data.

In our next blogs, we’ll dive deeper into the “how” of implementing a DataOps strategy, based on best practices we’ve seen from the teams we’ve worked with: how to identify data value streams, how to build a shipping mindset, how to create a better data culture, and more. Stay tuned, and let me know if you have any burning questions I should cover!

To get future DataOps blogs in your inbox, sign up for my newsletter: Metadata Weekly

Header photo by Chris Liverani on Unsplash
