No matter who you are, seismologist or regular person on the street, when you feel the Earth move you want to know what's going on. Was it an earthquake? Where was the earthquake? How big was it? As a grad student, many moons ago, when the Earth moved, the Electronic Seismologist (ES) was known to immediately turn on the "AM/FM-Automatic-Earthquake-Locator." Before the seismograms could be pulled off the photographic drums, developed, and read and an "official" hypocenter determined (using a large map and a piece of string to swing arcs), the radio would usually have reported a location. Individuals feeling the earthquake would have called radio and TV stations (not to mention the police, newspapers, and sometimes the seismograph station), reported feeling something, and described what it was like. Reporters taking these calls got pretty good at estimating roughly where the event was, and they sometimes came up with a fairly good estimate of the magnitude. This seat-of-the-pants radio-seismology is fast becoming a lost art. Reporters now race to their computers and point their Web browsers at the nearest seismic network where they can count on finding, within minutes, an automatic but "official" location and magnitude for the earthquake.
Of course, it has taken seismologists a while to get to this point for much of the U.S. Determining the location and size of the event rapidly is one thing, but communicating that information to the public in a reliable, consistent, and clear way is another. Looking around the world for Web-based catalogs and maps one can see a huge variety of types and presentation styles. Some are clever and cute or have very attractive shaded relief maps or animation or zooming/panning capabilities. Various bells and whistles are fun to play with and sometimes provide useful extra information.
Unfortunately, if many users request such pages at once, many may get nothing because the Web servers are overloaded. Even if the page is successfully delivered, the casual or new user may find the flashy information hard to understand. Not only is there the confusion of different presentation styles and information content, but additional confusion occurs when two different Web sites show different locations and magnitudes for the same event. When an earthquake occurs between two regional networks, it's likely there will be some differences in the hypocenter parameters determined. It's quite possible a curious user might find two or even three (a national or global network might report the event as well) sets of parameters. When the press finds this kind of confusion, the resulting story often has a strong component of the "silly seismologists can't figure out what is really going on."
There is hope, however. Three "packages", originally developed fairly independently, have been combined to help solve all of these problems. From simple, reliable delivery of the basic information to sorting out multiple versions of an event to a clear simple Web presentation, it's all here. The ES is happy to host a guest column written by the authors, coordinators, and guiding lights of these packages.
RAPID DISTRIBUTION OF EARTHQUAKE INFORMATION FOR EVERYBODY
On 18 April 1996, for the 90th anniversary of the Great San Francisco Earthquake, the USGS continued its attempts to provide the public with current earthquake information by unveiling a redesigned World Wide Web site. This site was the subject of several newspaper articles that morning and was then severely tested by the response to two magnitude ~4 earthquakes that were widely felt throughout the southern San Francisco Bay area. The Web server briefly served requests at a rate of 4,500 hits/hour and then crashed. This unfortunate event suggested that Mother Nature was warning us to rethink our Web server strategy. In fact, the strategy for serving up near-real-time earthquake information for much of the U.S. needed to be overhauled.
In ancient days (the 1970's and before), earthquake locations were determined by hand from analog records, an error-prone and time-consuming operation. In the 1980's, as the world went digital, software was developed that allowed much quicker, even automatic, determination of hypocenters. This development encouraged the rapid distribution of information through a variety of media such as e-mail, pagers, facsimile, and the Internet. There was also a proliferation of many different, incompatible formats and data transmission protocols as each seismic network did its own thing and much effort was spent trying to interface with another network's way of doing it.
Another problem is that this patchwork of systems was confusing to outside users of seismological information. In order to avoid confusing these users the seismological community needs to provide a single location and magnitude for each earthquake. Attempts to do this are complicated by the quilt of sometimes overlapping regional networks and the U.S. National Network.
At the 1997 annual meeting of the Council of the National Seismic System (CNSS) a resolution (CNSS Resolution No. 97-02) was passed "that all member institutions adopt the goals of (A) coordinated rapid earthquake reporting among regional seismic networks and the USNSN/NEIS and (B) the unified distribution of earthquake information via the World Wide Web by all CNSS networks." Achieving this goal required four basic steps: (1) choosing a standardized data format, (2) developing a method for transmitting the information between networks and to others, (3) dealing with redundant information when multiple networks record a single earthquake, and (4) developing a method for displaying the information on the World Wide Web.
Starting in the early 1990's, the "Caltech/USGS Broadcast of Earthquakes" (CUBE) broadcast hypocenter information within minutes of origin time to pagers attached to personal computers. Epicenters were displayed on a map along with a list of important event parameters. Due to its success, CUBE spread to other regional networks. The data formats developed by the CUBE project include earthquake summary information (origin time, location, magnitude) and text comments about an event, among many others. We adopted this format because it was already in use by a variety of regional networks and because the format included almost all of the key information we required. Each network can provide updated information for an event by submitting a new summary with a new version number. Event delete messages can also be issued to remove an event from a network's catalog. To the standard CUBE formats we have added a new message that allows World Wide Web links to be sent as additional information about an earthquake.
The goal of the data distribution system is to allow many seismic networks to contribute information and have this information reliably transmitted to an even larger number of information users. Early prototypes of the system relied on the Unix remote copy and shell programs (rcp and rsh) or on e-mail to achieve these goals. These attempts were unreliable, were restricted to a small number of operating systems, and would not scale well to a large number of users.
In the summer of 1998, a new Quake Data Distribution System (QDDS) was designed using the relatively new Java technology, which was well suited to Internet data exchange. Since QDDS is written in Java, it can be run, with no changes or recompilation, on many operating systems.
QDDS is a distributed hub-leaf system. Each seismic network operates a leaf, which sends information on earthquakes to the hubs. The hubs, in turn, distribute this information to all of the leaves. Some leaves only receive data, such as those run by information users that are not also seismic networks. To achieve reliability through redundancy, multiple independent hubs can be used, which can result in multiple copies of each message reaching the leaves.
So that QDDS can be scaled to support many leaves, it was necessary to minimize the use of computer resources. For instance, instead of using the common TCP (Transmission Control Protocol), QDDS uses UDP (User Datagram Protocol). A TCP connection can be compared to a telephone call where both the sender and recipient are connected at all times. A UDP datagram, on the other hand, is like a telegram: it is sent off with no way to know whether different messages arrive out of order or whether they arrive at all. In fact, the colloquial name for UDP is "Unreliable Data Packet." This unreliable method was turned into a reliable one by assigning an ID number to each packet. Recipients keep track of the packet numbers, and if one is missing for a few minutes, the recipient requests a resend.
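The packet-numbering idea can be illustrated with a minimal Python sketch. This is not the QDDS code (which is written in Java); the class name `GapTracker`, the 120-second threshold standing in for "a few minutes", and the assumption that each sender numbers its packets with consecutive integers are all hypothetical choices made for illustration.

```python
import time


class GapTracker:
    """Tracks packet IDs from one sender and reports IDs missing for too long.

    Hypothetical illustration only; names and thresholds are assumptions.
    """

    def __init__(self, resend_after_s=120.0):
        self.resend_after_s = resend_after_s  # assumed value for "a few minutes"
        self.highest_seen = None
        self.missing = {}  # packet ID -> time the gap was first noticed

    def receive(self, packet_id, now=None):
        """Record an arriving packet; return True if new, False if a duplicate."""
        now = time.time() if now is None else now
        if self.highest_seen is None:
            self.highest_seen = packet_id
            return True
        if packet_id > self.highest_seen:
            # Every ID that was skipped over becomes a gap to watch.
            for skipped in range(self.highest_seen + 1, packet_id):
                self.missing[skipped] = now
            self.highest_seen = packet_id
            return True
        if packet_id in self.missing:
            # A late or resent packet fills a known gap.
            del self.missing[packet_id]
            return True
        return False  # already seen, e.g. a second copy from a redundant hub

    def due_for_resend(self, now=None):
        """Return the sorted IDs that have been missing long enough to re-request."""
        now = time.time() if now is None else now
        return sorted(pid for pid, since in self.missing.items()
                      if now - since >= self.resend_after_s)
```

A recipient would call `receive` for every datagram and periodically ask `due_for_resend` which packet numbers to request again. Because state is kept only as a few integers per sender, no connection has to be held open, which is the resource saving the telegram analogy describes.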
At present, for redundancy, one hub is running at Menlo Park and a second one with the same information is running at USGS headquarters in Reston, Virginia. There are now nine permanent leaves and four transient ones, although the system can easily support many more. Hypocenter information is being provided through this system from the Southern and Northern California Seismic Networks, the Pacific Northwest Seismic Network, and the U.S. National Seismic Network. Anyone with an Internet connection (unless behind a firewall) can become a leaf by attaching to either or both of these hubs to obtain events. On average, 50 to 60 messages per day are currently being distributed, but this number may jump by a factor of 100 during large aftershock sequences. Distributed information is typically received at all leaves within a minute of being sent.
Merging Earthquake Information
With the CUBE formats and QDDS there is now a fast, reliable, common format and protocol for distributing earthquake summary and auxiliary information such as comments and World Wide Web links. However, this system makes it common to get multiple versions of earthquake information from a variety of seismic networks, providing increased opportunities to spread confusion at greatly increased speeds. The solution to this problem derives from an earlier project of the CNSS. A method was devised for generating a composite catalog from individual historical catalogs submitted by CNSS member networks. Such a composite catalog has been available since 1996 and has proven to be an effective summary of seismicity in the U.S.
The basic problem in producing a merged earthquake catalog is that an earthquake may be reported by multiple networks but we only want the most accurate set of information about each earthquake in the merged catalog. The CNSS merging process works by merging the catalogs at the summary information (location and magnitude) level. First, a set of "authoritative regions" is designated for each network. These regions show where each network will be the primary source of earthquake information. The combined earthquake catalogs are then scanned for "duplicate events", which are currently defined as those that have origin times within 16 seconds and locations within 100 km of each other. If duplicate information for an event is found, then the best information is selected by following these rules:
1. If reported, information from the authoritative network is used.
2. If none of the locations is "authoritative", the report with the largest magnitude is used.
To apply this existing process to the real-time catalogs a new rule needed to be implemented:
3. Nonauthoritative information is not used until 10 minutes after origin time. This gives the authoritative network a chance to provide the first information on an event.
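The duplicate test and the three selection rules can be sketched in Python as follows. This is an illustration under stated assumptions, not the CNSSM implementation: the `Report` fields, the haversine distance, the grouping of each report with the first earlier report it duplicates, and the tie-break by magnitude when more than one report is authoritative are all hypothetical. Only the 16-second, 100-km, and 10-minute thresholds come from the text.

```python
import math
from dataclasses import dataclass

MAX_DT_S = 16.0      # duplicate window in origin time (from the text)
MAX_DIST_KM = 100.0  # duplicate window in epicentral distance (from the text)
HOLD_S = 600.0       # rule 3: hold nonauthoritative reports for 10 minutes


@dataclass
class Report:
    network: str
    origin_time: float   # seconds since some epoch
    lat: float
    lon: float
    mag: float
    authoritative: bool  # location lies inside the reporting network's region


def distance_km(a, b):
    """Great-circle distance between two epicenters (haversine formula)."""
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dlon = math.radians(b.lon - a.lon)
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def is_duplicate(a, b):
    return (abs(a.origin_time - b.origin_time) <= MAX_DT_S
            and distance_km(a, b) <= MAX_DIST_KM)


def select(group, now):
    """Apply rules 1-3 to one group of duplicate reports; None means withhold."""
    pool = [r for r in group if r.authoritative]            # rule 1
    if not pool:
        # Rule 3: nonauthoritative reports must be at least 10 minutes old.
        pool = [r for r in group if now - r.origin_time >= HOLD_S]
    # Rule 2 (also used here as an assumed tie-break among authoritative reports).
    return max(pool, key=lambda r: r.mag) if pool else None


def merge(reports, now):
    """Group duplicate reports, then keep one report per group."""
    groups = []
    for r in sorted(reports, key=lambda r: r.origin_time):
        for g in groups:
            if is_duplicate(g[0], r):
                g.append(r)
                break
        else:
            groups.append([r])
    chosen = (select(g, now) for g in groups)
    return [r for r in chosen if r is not None]
```

Calling `merge` again a few minutes later with the same reports and a later `now` releases any nonauthoritative events that were being held, which mirrors the repeated re-merging described in the text.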
While only a small modification was required to apply this process to real-time data, many other changes were made to make the process efficient enough to work during times of high seismicity, when the catalogs may need to be merged every minute. The real-time version of the earthquake merging system (CNSSM) begins by maintaining a history file for each network containing time-stamped copies of all earthquake summary information submitted by that network. These records are also marked to show if the location is within the network's authoritative region. Next CNSSM produces a current catalog for each network with only the most recently received copy of the highest version number summary information for each earthquake. This step removes duplicate copies created by multiple QDDS hubs. The various network earthquake catalogs are then concatenated together and sorted by origin time, and duplicate events are removed by the rules discussed above to produce a merged earthquake catalog.
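The per-network "current catalog" step can be sketched in a few lines of Python. The `Summary` fields and the function name are hypothetical, and the real history files carry more information (for example, delete messages and the authoritative-region flag), so this is only an illustration of keeping the most recently received copy of the highest version for each event.

```python
from typing import NamedTuple


class Summary(NamedTuple):
    event_id: str
    version: int
    received: float  # time stamp added when the message arrived
    mag: float


def current_catalog(history):
    """Keep, per event, the latest-received copy of the highest version."""
    best = {}
    for s in history:
        kept = best.get(s.event_id)
        # Compare by version first, then by arrival time.
        if kept is None or (s.version, s.received) > (kept.version, kept.received):
            best[s.event_id] = s
    return sorted(best.values(), key=lambda s: s.event_id)
```

Identical copies of one message that arrive through two hubs share an event ID and version number, so they collapse to a single entry here without any special handling.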
CNSSM scans the QDDS output directory for new data files and places a flag file in an output directory when the merged catalog changes. Comments about earthquakes and links for more information are also placed in the output directory for use by downstream software. Because CNSSM communicates with the other packages solely by looking in one directory and placing files in another, it is very modular and can work with other event distribution systems or other downstream display packages.
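The directory-based coupling described above can be illustrated with a short Python sketch. The function name `poll_once`, the flag-file name `catalog.changed`, and the `rebuild` callback are hypothetical; the actual CNSSM file names and layout are not given here.

```python
from pathlib import Path


def poll_once(in_dir, out_dir, seen, rebuild):
    """Process unseen files in in_dir; flag out_dir if the merged catalog changed.

    `seen` is a set of file names already handled. `rebuild` is a caller-supplied
    function that takes the list of new files and returns True if the merged
    catalog changed as a result.
    """
    new_files = sorted(p for p in Path(in_dir).iterdir()
                       if p.is_file() and p.name not in seen)
    if not new_files:
        return False
    changed = rebuild(new_files)
    seen.update(p.name for p in new_files)
    if changed:
        # Downstream display software only needs to watch for this file.
        (Path(out_dir) / "catalog.changed").touch()
    return changed
```

Because the only interface is "files appear in one directory, a flag appears in another", either side can be replaced by different software without touching the merging code, which is the modularity the paragraph describes.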
The current version is written for Solaris, relies on many Unix system utilities, and includes an installation script that can usually get the software up and running in a few minutes. All user-defined variables are kept in a single settings file to make operation simple. A more portable Java version has been developed and is being tested.
World Wide Web Display
Together QDDS and CNSSM provide a rapidly updated and authoritative multinetwork catalog. Now, how best to provide that information to the public? The Web is a popular medium that allows us to present both graphic images, including maps and waveform data, and text about earthquakes, so an efficient way of producing effective Web pages was needed.
Based on the problems of serving earthquake information to a large and ever-growing audience following a felt earthquake, it was decided that the appearance of the pages was less important than efficiently getting the basic information out. Three principles guided the design of these new pages: (1) Keep pages and maps simple so that they can be updated quickly, (2) keep pages and maps small (in bytes) to reduce the load on the server after felt events, and (3) keep HTML usage simple and standard to accommodate the variety of browsers used by folks to access the pages.
A number of curious constraints also arose as experimental page designs were tried. There are only a limited number of "browser safe" colors which remain solid (without dithering) when they are displayed on computer monitors, and, of these, there are only a few that are light enough to make suitable backgrounds for maps. We also quickly discovered that earthquakes colored by age using red and green were indistinguishable to a color-blind seismologist, leaving even fewer colors with which to work.
To provide complete coverage and a zoom-in ability, the Web site starts with an overview map of the area covered on a page with links to a variety of earthquake lists and background information about the display. The user can then zoom in by clicking on the index map in order to reach one of a set of uniformly distributed, overlapping 2° maps. These pages also include a list of the larger earthquakes on the map and a link to a complete listing. Arrows at the edges of the maps allow the user to pan through the 2° maps. Clicking on an earthquake brings up a text page about the event that is generated from the earthquake summary information and presents this information in a readable format. These text pages also contain any text comments or links to auxiliary "add on" information, including focal mechanisms, waveforms, shaking maps, or aftershock warnings submitted by the network. Thus, this system provides both standardized information on each event and the ability to lead users to a wide variety of other information that may be generated for individual events.
The update software (Bourne and Perl scripts) prepares a set of drawing instructions used to plot the earthquakes on the gif base maps using the GD software package of Thomas Boutell. Text descriptions of events and other add-on information are linked by the update program to appropriate pages, which can then be sent to a list of servers by remote copy. Selected maps and earthquake lists can be e-mailed to other Web servers such as Yahoo for display on their sites.
The affected pages are updated every time the catalog changes or new information is provided about an event. Typically, pages and maps describing an event can be made available within one minute of its origin time. Larger events require some additional time (4-5 minutes) before an accurate magnitude can be reported on updated pages. Every hour, all maps and pages containing earthquakes are updated, regardless of whether or not an event has occurred, so that the timestamps demonstrate that the information is current.
For initial installation some effort is needed to customize maps and associated text for the "Recenteqs" (REQS) software, but it has been successfully installed on several servers, including ones in northern California, southern California, the Pacific Northwest, and Utah. A Perl script is available for making the two-degree base maps given digital files of coastal outlines, roads, faults, lakes, etc., and a list of place names and coordinates. Typically these maps and the data files used to make them need some cleaning and editing, which can be rather laborious. It is then relatively straightforward to get an earthquake catalog to plot on the base maps, although considerable customization remains on FAQ files, credits pages, additional links, etc.
Activity at Long Valley in December 1997 provided an extended stress test for the new system in northern California. At times there were 2,400 earthquakes on some 2° maps, and the Long Valley pages on our server received as many as 80,000 hits per day. As Internet usage continues to grow, the new Web site now routinely handles 45,000 inquiries/hour following felt events in urban areas.
Current earthquake information is now being provided through these three systems by the Southern (http://pasadena.wr.usgs.gov/recenteqs) and Northern (http://quake.wr.usgs.gov/recenteqs) California seismic networks, the Pacific Northwest Seismic Network (http://www.ess.washington.edu/recenteqs/), the University of Utah Seismograph Station (http://www.seis.utah.edu/recenteqs), and the U.S. National Seismic Network (http://wwwneic.cr.usgs.gov/current_seismicity.shtml). Other networks are expected to join the effort soon. Because the task is separated into three modular packages, it should be relatively easy to replace any one of them for customized uses or as better approaches are developed. Thus, we hope that this provides a foundation on which reliable earthquake information can be provided for years to come. Information about all three systems (QDDS, CNSSM, REQS), including how to obtain and install them, is available on the CNSS Web pages at http://www.cnss.org/eq_distribute.
Many individuals have contributed to the design and integration of these packages. Some of the most important contributions are the following:
Initial development and implementation of the historical CNSS composite catalog: Doug Neuhauser and Lind Gee
CNSS Composite Catalog, http://quake.geo.berkeley.edu/cnss.
CNSS Resolution No. 97-02, http://www.cnss.org/resolution.97-02.html.
Malone, Steve, Dave Oppenheimer, Lind Gee, and Doug Neuhauser (1996). The Council of the National Seismic System and a composite earthquake catalog for the United States, IRIS Newsletter XV(1) 6-9.
SRL encourages guest columnists to contribute to the "Electronic Seismologist." Please contact Steve Malone with your ideas. His e-mail address is firstname.lastname@example.org.
Posted: 19 May 2000