While the term “big data” has been tossed around in the IT industry since 1997, it is only just beginning to surface in the energy world. That is fitting, because massive amounts of data are ubiquitous in the oil and gas industry (e.g., 3D seismic data, well logs, GIS data, simulations, and the copious detail collected at a drillsite). New advances in oilfield data acquisition, such as wired drillpipe, only make the data volumes and rates that much larger.

Technology research firm Gartner defines big data as “high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.” Big data itself is not new; what is new are the approaches to handling it and the innovative processing techniques now being applied.

The oil and gas industry allocates a lot of resources to “structured” data – data held in relational databases and other easily queried structures – and does a pretty good job of managing them. The industry’s record for “unstructured” data is not quite as good; companies typically have document-management systems to store and retrieve documents and data too large or complex to be broken apart and stored as records in a database (e.g., seismic data files or well log files).

Some of these unstructured data have been scanned and turned into a resource that can be queried more easily, but the amount of processing one can do in place over the entire dataset is limited.

The industry’s record for using data the IT world calls “semi-structured” is really poor. These are data such as seismic observer sheets, drilling morning reports, or archives of raw XML files; working with them requires a blend of structured and unstructured methods that is rarely applied. The big data approach would change the rules for oil and gas industry data management. It tells us to:

  • Analyze the data where they lie; don’t move them to the analysis. If you can easily move a dataset to your PC to analyze it, was it really that big?
  • Not wait until every “i” is dotted. Value can be obtained now, before the data are mapped and loaded into that big relational database, and additional value can still be extracted later (see the sketch after this list);
  • Never throw anything away! New techniques are emerging every day that can turn “useless” data into a competitive advantage;
  • Bring other concepts into the mix, including data you’re not sure will have any bearing on the problem, because the new techniques can handle the load; and
  • Let the data do the talking. A new generation of data scientists can drive these applications and add significant value.
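To make the “get value now” point concrete, here is a minimal sketch, assuming a folder of raw XML drilling morning reports (the file layout and tag names are hypothetical, not a real industry schema). It pulls a few useful fields straight out of the archive with no schema design, database load, or ETL step in between:

```python
import glob
import xml.etree.ElementTree as ET

# Hypothetical archive of semi-structured morning reports;
# the tag names below are illustrative, not a real schema.
for path in glob.glob("morning_reports/*.xml"):
    root = ET.parse(path).getroot()
    well = root.findtext("well/name", default="unknown")
    depth = root.findtext("operations/current_depth", default="n/a")
    npt = root.findtext("operations/npt_hours", default="0")
    # Immediate value: a cross-well summary, long before anything
    # is mapped and loaded into a relational database.
    print(f"{well}: depth={depth} ft, nonproductive time={npt} h")
```

The same files can still be mapped into the corporate database later; the point is that useful answers do not have to wait for that step.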

Some of the processing techniques used to deal with big data were pioneered by the oil and gas industry, or at least used extensively in it (specialized hardware, visualization, and simulation, for example), but others are new to us. One method is complex event processing (CEP), which recognizes patterns across multiple data sources in real time. The goal of CEP is to infer meaningful events from trends across multiple input channels and to respond to them as quickly as possible, which requires defining the patterns of interest in advance.
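As a toy illustration of the CEP idea, here is a minimal sketch in Python (the channels, values, and the pattern itself are invented for illustration, not a real detection algorithm): two input channels are watched over a sliding window, and a pattern defined in advance, rising pressure while flow rate falls, triggers an immediate response.

```python
from collections import deque

WINDOW = 5  # number of recent readings kept per channel

pressure = deque(maxlen=WINDOW)   # channel 1 (e.g., standpipe pressure)
flow_rate = deque(maxlen=WINDOW)  # channel 2 (e.g., return flow rate)

def rising(ch):
    vals = list(ch)
    return len(vals) == WINDOW and all(b > a for a, b in zip(vals, vals[1:]))

def falling(ch):
    vals = list(ch)
    return len(vals) == WINDOW and all(b < a for a, b in zip(vals, vals[1:]))

def on_reading(p, q):
    """Ingest one reading per channel and test the predefined pattern at once."""
    pressure.append(p)
    flow_rate.append(q)
    # The "complex event": a trend spanning multiple channels simultaneously.
    if rising(pressure) and falling(flow_rate):
        print("ALERT: pressure rising while flow falls -- respond now")

# Simulated real-time feed (values made up for illustration).
for p, q in [(100, 50), (102, 49), (104, 48), (106, 47), (108, 46)]:
    on_reading(p, q)
```

The patterns must be written down before the data arrive, which is exactly the constraint the paragraph above describes.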

There is also a new category of software, called search-based applications, that mines search indexes to discover relationships in information that would be impractical to subject to statistical correlation because of the size or constantly changing nature of the data.
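A minimal sketch of that idea, with a toy inverted index (the report snippets are invented): relationships such as which problems co-occur fall out of simple set operations on the index, with no statistical model involved.

```python
from collections import defaultdict

# Invented snippets standing in for indexed report text.
docs = {
    "report_1": "stuck pipe while tripping in shale section",
    "report_2": "lost circulation in shale section during cementing",
    "report_3": "stuck pipe after connection with high torque",
}

# Inverted index: term -> set of documents containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def related(term_a, term_b):
    """Documents where both terms co-occur, mined straight from the index."""
    return index[term_a] & index[term_b]

print(related("stuck", "shale"))    # -> {'report_1'}
print(related("shale", "section"))  # -> {'report_1', 'report_2'}
```

Because the index can be updated incrementally as documents arrive, this approach tolerates the constantly changing data that the statistical route struggles with.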

Operators and oilfield service companies are at the forefront of new data-intensive technologies. Embracing big data and all it has to offer is the best hope for managing and analyzing these data to get immediate value.