How many of us seismologists end up feeling like accountants instead of scientists when faced with the analysis of a large dataset of waveforms? Much of our time is devoted to the bookkeeping task of finding and accessing waveforms and the resulting reduced parameters. Once some analysis has been done and additional processing is complete, what happens when one wants to go back and review different sections or compare results based on different initial analysis parameters? For the Electronic Seismologist (ES) this can turn into a nightmare of saved intermediate directories and files. Most of us have a set of programs, file formats, and conventions which help keep track of our original and reduced data. These are usually highly dependent on the specific analysis tools used and need significant adaptation when a new tool is added to the analysis arsenal.
There are some nice analysis packages, such as the venerable SAC program from LLNL, which solve part of the bookkeeping issue by including reduced parameters directly in the waveform files and using built-in analysis tools. A different approach is taken by the Datascope Seismic Application Package from the IRIS JSPC, where a set of analysis tools has been written as part of an integrated database-like file organization system. Both of these very powerful analysis systems have been written and maintained by small, talented teams who developed most parts of their systems themselves. The ES hopes to provide future columns on these two packages, as well as others which catch his attention. In the meantime, he is impressed with a slightly different approach taken by a group of researchers in Germany who wanted to use a set of well-known and easily available analysis tools developed by others but hooked together with a good bookkeeping system. To the ES this sounded like a giant of a task.
While the ES has no practical experience with this system (yet), his review of the very complete and detailed manual piqued his interest. At first glance, the only drawback he readily sees is "GIANT"'s dependence on a commercial database management system, something which, probably because of his ignorance and tight budgets, the ES mostly tries to avoid. In any case, it's a significant and clever mix of analysis tools, and others may also be interested in it. The ES convinced the authors of the "GIANT" system to be the guest authors for this issue's column.
The GIANT Analysis System (Graphical Interactive Aftershock Network Toolbox)
Andreas Rietbrock FR Geophysik, Freie Universität Berlin, Malteserstr. 74-100, D-12249, Berlin, Germany Tel: +49-30-830-433, Fax: +49-30-7758056, e-mail: email@example.com.
Frank Scherbaum Institut für Geowissenschaften, Universität Potsdam, POB 601553, D-14415 Potsdam, Germany, Tel.: +49-331-977-2681, Fax +49-331-977-2087, e-mail: firstname.lastname@example.org
INTRODUCTION
State-of-the-art seismic networks consist of increasingly large numbers (>50) of digital seismic stations. Since most of them are equipped with high dynamic range data acquisition systems recording at high sampling rates, large amounts of digital data are produced on a daily basis. Analyzing large data volumes is an increasingly demanding task. Without efficient organization of the recorded waveforms one can easily spend a lot of time simply searching for a particular trace.
The GIANT system was designed primarily for the analysis of large-scale portable experiments in seismology such as aftershock monitoring campaigns. Naturally, it can also be used for the analysis of permanent networks. Originally, the motivation for its development came from the attempt to analyze the very heterogeneous Loma Prieta aftershock data set which was collected by various agencies. At the time, conventional processing was out of the question. The main goal for the design of the GIANT system was to find a reasonable way to deal quickly and thoroughly with large volumes of heterogeneous portable network data. The following is a short overview of the basic concepts of GIANT. A complete manual with further information as well as the software itself is available on the Web (http://www.geo.uni-potsdam.de/Software/service_g.htm). The GIANT system is currently running on SunOS, Solaris and Linux and uses the X11 windowing system.
BASIC DESIGN CONSIDERATIONS
An aftershock experiment can rapidly generate hundreds of Megabytes or even Gigabytes of digital waveform data. Under normal circumstances it is impossible for a single individual to analyze such volumes alone. Commonly, detailed analysis is done by a group or even several groups of seismologists, each focusing on those parts of the analysis that concern their special interests. The left side of Figure 1 shows the kinds of parameters that are typically extracted from aftershock or local earthquake datasets. The right side shows the kinds of models that are usually derived from them, such as the fault structure, the stress tensor, a velocity model, and attenuation. These analyses are often completely decoupled and might even be done by different groups. For example, group A may determine the locations and focal mechanisms using one crustal model, group B may be performing spectral analysis with a different model, and group C may be doing the coda analysis with assumptions of their own. Meanwhile group D may have developed a 3D velocity model and obtained different locations for the earthquakes than group A. This would in principle require that all the focal mechanisms be redone by group A using the new model.
With this kind of decoupled approach, it is very hard to obtain consistent results for different crustal properties within the region of interest. Usually it is very hard to tell if the differences in the results are pure artifacts of the analysis procedure or provide insights into the physics of the earth. While this problem may be insignificant for certain questions, it is definitely not for all, especially not if we are interested in high resolution analysis of small-scale features.
Being able to obtain consistent models for a wide variety of questions was the driving force and the primary goal for the design of our new analysis system. To illustrate some of the mutual interdependencies we look at the simple task of locating earthquakes and determining fault-plane solutions. In the first step of any location procedure a velocity model is assumed. This can result from prior analysis or from some educated guess. During the course of the analysis the velocity model might be refined according to observed travel-time residuals. Subsequently, focal mechanisms will be computed. Very often wrong polarities will be found, which may be caused by poor locations or even by an inappropriate velocity model. For the determination of improved fault plane solutions it is necessary to refine the velocity model. This, however, would require locating the events once again, which in turn could lead to different hypocentral coordinates and therefore to different fault plane solutions. These backward/forward dependencies are often found during the analysis of large earthquake datasets.
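The backward dependencies described above can be made explicit with a small dependency table. The following is only an illustrative sketch, not GIANT's internal mechanism: the product names and the `DEPENDENTS` table are hypothetical, chosen to mirror the velocity model, location, and fault-plane solution chain in the text. Given a changed input, it lists which derived results have become stale and need to be recomputed.

```python
from collections import deque

# Hypothetical dependency table (an assumption for illustration):
# each key lists the products that are derived directly from it.
DEPENDENTS = {
    "velocity_model": ["locations"],
    "phase_picks": ["locations", "focal_mechanisms"],
    "locations": ["focal_mechanisms"],
    "focal_mechanisms": [],
}


def stale_after_change(changed, dependents=DEPENDENTS):
    """Return the derived products to recompute after `changed` is modified.

    Products are listed in breadth-first order of discovery, each one once.
    """
    stale = []
    queue = deque(dependents.get(changed, []))
    while queue:
        item = queue.popleft()
        if item in stale:
            continue  # already scheduled for recomputation
        stale.append(item)
        queue.extend(dependents.get(item, []))
    return stale


# Refining the velocity model invalidates the locations and, through them,
# the fault-plane solutions.
print(stale_after_change("velocity_model"))
```

The forward half of the loop, where poor fault-plane solutions prompt the analyst to refine the velocity model, is a judgment call and is deliberately left out of the table.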
During a complete waveform analysis, no matter what individual steps are involved, a heap of parameters will evolve which provides constraints for the subsequent analysis steps. With data volumes on the order of Gigabytes, this requires a clever way of interaction between the data, the parameter heap and the analyst. It would be desirable to do this completely automatically. Although automated processing has come a long way already, one cannot rely on an automated approach for all parts of data processing. On the other hand, completely interactive processing of huge volumes of data is also out of the question. For these reasons we used a semi-automated approach based on the assumption that a small number of interactively determined parameters is sufficient to start the rest of an in-depth analysis which needs little manual interaction. The main goal was to reduce the number of interactive analysis steps to the absolute minimum. In order to deal efficiently with huge amounts of data, we chose a database approach for organizing the recorded waveforms based on the dbVista Data Base Manager (Raima Corporation, 1991). Quick extraction of the digital seismograms and all the complex organizational tasks are performed through a graphical user interface, which acts as the root window of the GIANT processing system. Under control of GIANT a number of well-established analysis tools are incorporated and communicate through GIANT with the waveform database and the parameter heap (Figure 2).
DETERMINATION OF SOURCE AND WAVEFORM PARAMETERS
To minimize the need for manual interactive analysis we attempt to determine the most essential parameters during the first step, which is the 1D-location of events. Location parameters are subsequently used as starting values for further analysis processes. In this context the recorded seismogram is described as a set of seismic phases which can be characterized by a few simple parameters. These parameters, which are determined interactively, are the start time of the phase and its corresponding uncertainties, the maximum and minimum amplitudes with their corresponding times, the rise time of the phase, and the phase end time. Since most of the subsequent analysis steps are based on these parameters, consistent determination is essential at this point. This is achieved by checking if they give consistent 1D locations and, if possible, consistent fault-plane solutions. In other words, it is useful to check the location and the determined wavelet parameters against synthetic travel times and against a fault plane solution. Both P-wave and S-wave phases can be used, with the onset of the S-wave being determined on the horizontal components rotated into a ray-based coordinate system.
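The phase parameters listed above map naturally onto a small record type. The sketch below is an assumption made for illustration and does not reproduce GIANT's actual data layout: the class name `PhasePick`, its field names, the representation of the onset uncertainty as an earliest/latest bound, and the consistency rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PhasePick:
    """One seismic phase on one trace (hypothetical layout).

    All times are in seconds relative to a common reference time.
    """
    station: str
    phase: str            # "P" or "S"
    onset: float          # picked start time of the phase
    onset_early: float    # earliest plausible onset (assumed uncertainty bound)
    onset_late: float     # latest plausible onset (assumed uncertainty bound)
    max_amplitude: float
    max_time: float       # time of the maximum amplitude
    min_amplitude: float
    min_time: float       # time of the minimum amplitude
    rise_time: float
    end_time: float       # end of the phase

    def residual(self, synthetic_onset: float) -> float:
        """Observed minus synthetic onset time."""
        return self.onset - synthetic_onset

    def is_consistent(self, synthetic_onset: float) -> bool:
        """Assumed rule: the synthetic onset lies within the pick's bounds."""
        return self.onset_early <= synthetic_onset <= self.onset_late


# Made-up values, used only to show the calls.
pick = PhasePick("STA1", "P", 12.40, 12.35, 12.45,
                 850.0, 12.52, -640.0, 12.61, 0.12, 13.10)
print(pick.residual(12.43), pick.is_consistent(12.43))
```

Keeping the uncertainty bounds with each pick lets a check against synthetic travel times be phrased per phase, which is one simple way to flag picks that disagree with the current 1D location.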
In practice the analyst performs this set of tasks under visual control (Figure 3). Displayed on the analyst's screen are the waveforms and determined phase parameters, the synthetic arrival times, the station geometry relative to the epicenter, the velocity model with source depth, and the fault-plane solution. In the left upper panel a typical GIANT main window is shown for an event which has just been located and for which the fault plane solution was determined. In this window all available seismic waveforms are shown sorted by time (X-axis) and station name (Y-axis). Each symbol represents a recorded waveform. The length of the waveform is indicated by the length of the individual black lines. The vertical lines indicate the selected time range. A selection of the waveforms can be performed by simple mouse actions. In the middle upper panel the fault plane solution as computed with FPFIT is shown. The upper right panel displays the corresponding station geometry. The lower left panel shows two seismograms of the selected event. The upper part of this window displays whole seismograms, while in the lower part only the time window around the first P-wave arrival is shown (zoom window). Superimposed are the picked onset times for the P-wave (indicated by a colon) and the theoretical onset times (without a colon). The number of traces which will be shown together can be arbitrarily selected. The same holds for the length of the zoom window. In the right lower panel the actual velocity model is displayed with focal depth superimposed.
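The main-window selection described above, with waveforms laid out by time and station and a time range picked with the mouse, amounts to an interval-overlap query. The following minimal sketch uses an in-memory list as a stand-in for the waveform database; the `Segment` type and `select_segments` function are hypothetical names and do not correspond to the dbVista schema or to GIANT's code.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One recorded waveform: a station name and its time span (hypothetical)."""
    station: str
    start: float  # seconds since an arbitrary reference time
    end: float


def select_segments(segments, t0, t1, stations=None):
    """Return segments overlapping [t0, t1], sorted by station, then start time.

    If `stations` is given, only segments from those stations are returned.
    """
    hits = [
        s for s in segments
        if s.start <= t1 and s.end >= t0
        and (stations is None or s.station in stations)
    ]
    return sorted(hits, key=lambda s: (s.station, s.start))


# Made-up catalogue of four recordings from three stations.
catalogue = [
    Segment("B", 0.0, 10.0),
    Segment("A", 5.0, 15.0),
    Segment("A", 20.0, 30.0),
    Segment("C", 12.0, 18.0),
]
for seg in select_segments(catalogue, 8.0, 13.0):
    print(seg)
```

A real database would answer the same query from an index rather than a linear scan, but the overlap condition is the same.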
After the first determination of waveform parameters (onset times, polarities, amplitudes, ...) any subsequent changes cause a re-computation of the parameters derived from them. The earthquake may be relocated, synthetic travel times recomputed and all plots updated to reflect any changes. Large residuals can point to interpretation errors, such as misidentified phases, which can then be corrected. If desired, waveforms can easily be visualized and processed further inside PITSA. Figure 4 displays the flowchart for this "consistent 1D localization" procedure and the corresponding quality control scheme.
RECENT AND FUTURE DEVELOPMENTS
Since we have started to work with very large datasets, the need for more automated processing has become apparent. For seismic networks containing more than 50 stations and continuous recording in regions of high seismicity the semi-automated process had to be augmented by some automated pre-processing. As a rapid solution to our particular needs, we now precede interactive analysis with a very simple automated event detection scheme. For the future we are planning to interface more closely to other existing automated systems such as the "Earthworm" system. Our goal would be to let such a system do more and more of the routine processing and use the GIANT system as the front end for interactive quality control and for more sophisticated and flexible waveform analysis.
ACKNOWLEDGMENTS
Part of this work was financed by the Deutsche Forschungsgemeinschaft and the GeoForschungsZentrum Potsdam. We would also like to thank M. Ohrnberger and J. Wassermann for their contributions to some of the tools and especially for the long debugging hours.
REFERENCES
SRL encourages guest columnists to contribute to the "Electronic Seismologist." Please contact Steve Malone with your ideas. His e-mail address is email@example.com.
Posted: 26 February 1998