To perform seismic analysis on the scale required today, geoscientists must be more than experts in geological formations. They must also learn low-level software programming to quickly analyze large datasets on high-performance computing (HPC) resources. High-level programming environments such as MATLAB enable geoscientists to prototype sophisticated seismic analysis algorithms on their desktops. More importantly, within the same environment, geoscientists can exploit the power of computer clusters by making easy changes to their code. This results in code that is easy to understand and maintain and reduces the time required to take an idea from exploration to implementation.

Seismic analysis challenges
Recent advances in seismic interferometry and accurate depth migration algorithms have enabled geoscientists to image complex geological formations, including faults, folds, and salt bodies. The rapidly evolving field of exploration seismology is marked by continued algorithmic advances and by the processing and analysis of large datasets that push the limits of HPC.

MATLAB code implementation of the Kirchhoff migration algorithm (left) and the same code with PARFOR (right) are shown for parallel execution on multicore systems or clusters. (Images courtesy of MathWorks)

The challenges in employing seismic analysis to map and qualify geological formations are well known to exploration geoscientists. First, seismic mapping requires the processing of numerous large shot record data files. In addition, modern migration algorithms are complex and growing as they evolve to exploit new computational resources and incorporate advances in imaging and inversion methods. Efficiently running these algorithms on large volumes of data requires access to HPC systems and expertise in programming them. The hardware platform must be selected carefully because it can require hardware-specific programming to realize meaningful performance benefits. Successfully addressing these challenges often demands expertise in seismic imaging theory as well as highly specialized software skills, such as the ability to write the low-level message passing interface (MPI) routines needed for parallel mathematical computation on large datasets.

A MATLAB script shows the process for running on a single desktop (left) or on a 128-node cluster using the matlabpool command (right).

Seismic analysis performance
A key to efficient data analysis is to leverage the performance advantages of high-performance systems without imposing undue programming complexity or an expensive recoding step. The programming environment should support scalability from a simple multicore desktop to cluster and grid configurations with minimal modifications to the application. This facilitates code maintainability and portability and enables scientists to rapidly update applications to take advantage of HPC architecture innovations and resources as they become available.

Many computing languages offer parallel computing capabilities. MATLAB, for example, offers functionality ranging from implicit multithreading on multicore computers to explicit techniques that require code annotations but give more control for optimized parallelism. In MATLAB, an application can often be parallelized with a single code edit: changing a FOR loop to a PARFOR loop. With PARFOR applied to the time-intensive loop, the parallelized code can run on multiple computing cores or, if desired, on a single desktop.
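The FOR-to-PARFOR change can be sketched as follows. This is an illustrative example, not the article's actual code: `migrate_shot`, `shots`, `velocityModel`, and the array dimensions are hypothetical stand-ins for a Kirchhoff migration kernel and its inputs.

```matlab
% Serial version: migrate each shot record in turn.
%
%   for k = 1:numShots
%       image = image + migrate_shot(shots(k), velocityModel);
%   end

% Parallel version: the single edit below lets MATLAB distribute
% loop iterations across the available workers. PARFOR requires the
% iterations to be independent; the accumulation into 'image' is
% handled by MATLAB as a reduction variable.
image = zeros(nz, nx);   % migrated image accumulator (dimensions illustrative)
parfor k = 1:numShots
    image = image + migrate_shot(shots(k), velocityModel);
end
```

If no worker pool is open, a PARFOR loop simply runs serially on the desktop, which is what allows the same code to serve both cases.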

The MATLAB matlabpool command abstracts the complexity of the underlying parallel execution environment, enabling scientists to connect to a variety of HPC resources without requiring custom coding. This streamlines the transition from a desktop, single-core application to a 128-node cluster application.
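A minimal sketch of that transition is shown below. The pool sizes and the cluster configuration name `myCluster` are illustrative assumptions; a real configuration would be set up for the site's scheduler. (In later MATLAB releases, matlabpool was superseded by parpool, but the pattern is the same.)

```matlab
% Open a pool of workers on a multicore desktop
% (pool size shown is illustrative):
matlabpool open local 4

% ... run the PARFOR-based migration, unchanged ...

matlabpool close

% The identical analysis code can target a cluster by opening the
% pool against a cluster configuration instead; 'myCluster' is a
% hypothetical configuration name:
matlabpool open myCluster 128
```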

Streamlined seismic analysis workflow
With a simplified process requiring few code annotations for HPC optimizations, geoscientists can develop the complete application in a high-level computing language. This more flexible HPC programming paradigm enables a new, streamlined workflow in which the recoding step is eliminated. The workflow offers several advantages, including code that is less complex, more maintainable, and more easily ported to other HPC environments.

The original 2-D image of a salt formation (left) is compared with a 2-D seismic image reconstruction using Kirchhoff migration in MATLAB (right).

The Benchmark Salt survey dataset is a synthetic dataset used to model several challenging geological formations. Using MATLAB, scientists can reconstruct seismic images of these formations from the dataset with Kirchhoff migration, all within a single environment.

This programming environment provides capabilities that streamline the identification of salt bodies and the oil and gas deposits often found nearby. For example, a flexible SEG Y reader data object creates a reference to the shot record files that can be loaded into a cluster’s memory as a distributed array or mapped to the file and read in only when used in the analysis. The lengthy records are accessed as an array, abstracting the physical location of data in distributed memory or file read/write access and allowing the geophysicist to focus on the physics of the problem rather than the underlying computing infrastructure.
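The data-object pattern described above can be sketched as follows. The article does not name the reader object, so `SegyReader` and all field and variable names here are hypothetical; whether a particular reader supports conversion to a distributed array depends on how it is implemented.

```matlab
% Hypothetical SEG Y reader object: construction only records a
% reference to the shot record file; no trace data is read yet.
shots = SegyReader('salt_survey_shots.segy');

% Option 1: leave the data on disk and read lazily. Ordinary array
% indexing pulls in only the requested shot record when it is used.
record = shots(:, :, 17);          % traces for shot 17, read on demand

% Option 2: load the records into the cluster's aggregate memory as
% a distributed array, partitioned across workers, for repeated
% access during migration.
shotData = distributed(shots);
```

In both cases the geophysicist indexes the records as an ordinary array; the object hides whether that index resolves to a file read or to a slice of distributed memory.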

Evolving environments
Technical computing languages and development environments are evolving to take full advantage of HPC processing speed and scale without the need to write complex and hardware architecture-specific code. Using advanced tools, geoscientists can focus on their seismic models and algorithms – not low-level programming – and scale their applications from desktops to clusters without rewriting code for specific HPC architectures.