====== W-phase documentation ======

In addition to this page, you should probably have a look to [[wphase:tutorial|this tutorial page]].

===== Installation =====

==== Getting the code ====

Currently, the W-phase package is hosted as a github repository. To check out the W-phase repository:
<code>
git clone https://github.com/eost/wphase.git wphase_package
</code>

To update your W-phase repository (pull changes)
<code>
cd /to/the/wphase/directory/
git pull origin master
</code>

For more details on using git for W-phase, [[wphase:repository|you can read this page.]]

==== Dependencies ====

The w-phase package have only been tested on Unix and Linux computers. You will need the following:
  - csh shell
  - gcc and gfortran
  - python2.7 (or later)
  - You have to install numpy, matplotlib, cartopy and netCDF4 to run some python scripts which make figures.
  - rdseed

To install these dependencies on MacOs, [[wphase:macos|you can refer to this page]].
==== Building the code ====

To install the code, we must first setup a few environment variables. If you use csh or tcsh:
<code>
setenv RDSEED       /path/to/rdseed/executable
setenv GF_PATH      /path/to/greens/functions/database
setenv WPHASE_HOME  /path/to/wphase
</code>
If you use bash:
<code>
export RDSEED=/path/to/rdseed/executable
export GF_PATH=/path/to/greens/functions/database
export WPHASE_HOME=/path/to/wphase/package
</code>

These variables are necessary both at installation time and at run time. In addition to these variables, it is handy to include the wphase "bin" directory into the PATH environment variable:
<code>
setenv PATH         ${PATH}:$WPHASE_HOME/bin
</code>
or in bash:
<code>
export PATH=${PATH}:$WPHASE_HOME/bin
</code>

All theses variable assignment can be included in your .tcshrc or your .bashrc file.

Once you have downloaded the last version of the W-phase package and that the above environment variables are properly defined. Then you can proceed as follows to compile the package:
<code>
cd ${WPHASE_HOME}/src
make -B
</code>

-----

===== How to run W-phase  =====

You should also have a look to [[wphase:tutorial|the W-phase tutorial page]].

==== Preparing directories, i_master and CMTSOLUTION ====
Before each inversion, it is necessary to create a “run” directory containing two input ascii files: 
  * CMTSOLUTION: a file containing the PDE, the event centroid location and timing and optionally a “reference” moment tensor solution.
  * i_master: a file containing other parameters such as the band-pass filter parameters,  minimum and maximum distances, etc.                          

The format of these files are described in [[wphase:documentation#Data formats|the file formats section]]. 

Path to SEED file(s) must be correctly specified after the keyword 'SEED:' in the i_master file. If multiple SEED files are used for the same inversion, each of them must be referenced properly in i_master using one 'SEED:' line per file. 

-----

==== Extracting data from SEED ====
Once the i_master and CMTSOLUTION files are created, we can extract waveforms and instrument response parameters and perform a rough screening by epicentral distance. This can be done using: 
<code>${WPHASE_HOME}/bin/extract.csh</code> 
which will extract Z, N, E, 1, 2  channels in SEED file(s).

Data is extracted from SEED volumes and written in SAC format in the subdirectory 'DATA'. PZ files containing instrument responses are also located in this directory. After running an extract*.csh script, screened and windowed data files are listed in 'scr_dat_fil_list' while 'coeffs_rec_lut' contains the coefficients for the time domain deconvolution of the instrument response.

-----

==== Extracting data from a gzipped tar archive ====
As SEED volumes are no longer supported, waveforms and instrument response can also be extracted from a gzipped tar archive (hereafter referred to as "tgz") including both SAC and SAC_PZ (poles and zeros) files. 
Instead of using the "extract.csh" script to extract data from SEED, we can then extract data from a tgz archive using: 
<code>${WPHASE_HOME}/bin/tgz2DATA_org.py path_to_tgz_archive.tgz</code> 
where "path_to_tgz_archive.tgz" is the path to the tgz archive including SAC and SAC_PZ files. This script will extract SAC and SAC_PZ files a the directory named "DATA_org".

Two tgz archive formats are supported: WILBER (default) and NIED. 

Using the WILBER format, the tgz archive should include SAC files named with a ".SAC" or ".sac" extension and SAC_PZ file names starting with "SAC_PZ". If data is ordered using the [[https://ds.iris.edu/wilber3|IRIS Wilber 3 system]], this format can be obtained directly by selecting "Output format: SAC binary" (little or big-endian)  and "Bundle As: gzipped tar archive" in the final Request Data page (see below). 

{{:wilber3_screenshot.png?300|{{:wphase:wilber3_screenshot.png?300|}}

Once the tgz archive is prepared, extraction from a WILBER formatted archive is done by running:
<code>${WPHASE_HOME}/bin/tgz2DATA_org.py path_to_tgz_archive.tgz wilber</code> 
As this is the default format, the "wilber" option can be removed from the command line.

Using the NIED tgz archive format, SAC_PZ files should end with ".zp" and SAC file names can have any format (but should not end with ".zp"). Extraction from a NIED archive can be done using 
<code>${WPHASE_HOME}/bin/tgz2DATA_org.py path_to_tgz_archive.tgz nied</code> 

---

==== Calculating Synthetics, deconvolution and filtering ====
The next step is to calculate the kernel functions associated with the 6 elements of the seismic moment tensor for the stations listed in 'scr_dat_fil_list' and to convolve them with the moment rate function (MRF) specified in the CMTSOLUTION file (time shift and half duration). We must then deconvolve the instrument response from the data and band pass filter each waveform in the frequency pass band specified in i_master. This is performed using:
<code>${WPHASE_HOME}/bin/prepare_wp.csh</code> 
for Z, N, E, 1, 2 channels or
<code>${WPHASE_HOME}/bin/prepare_wp.csh Z</code> 
for Z channel only

The scripts listed above will generates i_wpinversion which is a list of sac files to be used in the inversion.  This original input file usually includes lots of noisy channels which should be removed further.  Output kernel functions will be found generally in ./GF

-----
==== Inversion ====
The inversion can then be performed using the Kernels functions in ./GF and data files listed in i_wpinversion. 

For more details on options run 
<code>${WPHASE_HOME}/bin/wpinversion -h</code>

By default, this program assumes a zero trace moment tensor (i.e. deviatoric moment tensor). This constraint can be disabled by adding the option "-nont". Another interesting option is "-dc" which enable the inversion for a pure double couple. The options "-stike", "-dip", "-rake" and "-mom" provide the possibility to run double couple inversions at fixed strike, dip, rake and/or scalar moment.

Several output files are created after running the inversion:
  * WCMTSOLUTION (ASCII) is the solution obtained after inversion.
  * p_wpinversion (postscript) which can be visualized with gv (or gs, evince ...) and printed with lp or lpr.
  * o_wpinversion (ASCII) is the output list of stations. The 1st column is the list of sac files, the 2nd column is the azimuth (degrees), the 3rd column is the epicentral distance (degrees), the 4th column is the index of the first sample of each channel in the concatenated data vector, the 5th column is the index of the first sample of the next channel, the 8th column is the rms misfit
  * fort.15* (ASCII) is a comparison between observed and predicted concatenated W-phase traces. The first column is the data vector, the second column is the predicted data vector after inversion and the (optional) third column is the data predicted from the (optional) reference solution in the CMTSOLUTION file. 
  * fort.15_LH? files (ASCII) where '?' is Z, N, E, 1 or 2 are the same concatenated traces but separated by channels. 
  * *.synth.sac (SAC files) which are the predicted data after inversion.

Since i_wpinversion includes a lot of noisy traces, the result is usually uncertain. To improve the solution, we must remove noisy data. This can be done as follows:
  - A first screening can be performed using the option -med : this will reject data with an anomalously small or large interval between maximum and minimum amplitudes.
  - After a first run of the inversion, it is possible to perform another inversion throwing out the stations with a relative mismatch exceeding 3.0 (i.e. really bad data). The file o_wpinversion which was created by the first inversion will be used an input. After inversion with the better dataset, a new o_wpinversion file will be created. For example using a threshold of 3.0 :<code>
$WPHASE_HOME/bin/wpinversion -ifil o_wpinversion -ofil o_wpinv.th3.0 -th 3.0 
</code> (in this example, the new o_wpinversion file is named o_wpinv.th3.0). This step can be repeated two or more times with different thresholds using the option -th (e.g 1.3, 0.9, etc.): <code>
$WPHASE_HOME/bin/wpinversion -ifil o_wpinv.th3.0 -ofil o_wpinv.th1.3 -th 1.3 
$WPHASE_HOME/bin/wpinversion -ifil o_wpinv.th1.3 -ofil o_wpinv.th0.9 -th 0.9 
</code>
  - Finally, it is possible to remove the channels showing a large rms amplitude ratios (observed/predicted and predicted/observed) using the option –nr (e.g. –nr 2.0): <code>
$WPHASE_HOME/bin/wpinversion -ifil o_wpinv.th0.9 -ofil o_wpinv.th1.3 -nr 2.0
</code>

At the end of the screening process, it is necessary to clean up the run directory so that o_wpinversion correspond to the optimum dataset which can be used later by grid-search and plot routines :
<code>
mv o_wpinversion o_wpinv.noth
cp o_wpinv.th0.9 o_wpinversion
</code>

-----
==== RUNA scripts ====
Data screening and inversion can all be done at once using the RUNA3_lite.csh script:
<code>
${WPHASE_HOME}/bin/RUNA3_lite.csh
</code>
This script can be used after data extraction in "DATA_org" using either "tgz2DATA_org.py" or "extract.csh" to extract data from a tgz archive or a SEED volume, respectively (see above). Once the "DATA_org" directory is populated, the "RUNA3_lite.csh" script will perform data screening and calculate a moment tensor solution after median data screening and after rejecting  traces associated with large misfit using the threshold 5.0 3.0 0.9 (i.e. -th, see 2. in the "Inversion" section above).

If data is extracted directly from SEED volume(s), data extraction/screening and inversion can be done directly by running:
<code>
${WPHASE_HOME}/bin/RUNA3.csh
</code> 
To do the inversion on "Z" components only (i.e., not using the horizontals), use:
<code>
${WPHASE_HOME}/bin/RUNA3_only_Z.csh 
</code>

These two scripts don’t perform the “ratio screening” (i.e. 3. in the "Inversion" section above). To run the same routines with an additional screening based on the rms amplitude ratio (observed/predicted and predicted/observed), the following script can be used:
<code>
${WPHASE_HOME}/bin/RUNA3r.csh 
</code>
will perform median, rms misfit and rms ratio for channels Z, N, E, 1, 2 channels,
Warning: RUNAr.csh is not yet fully tested.


-----
==== Grid searches ====
In the grid-search scheme, there is a first global rough exploration which is followed by finer samplings around minimal points. If the optimum is found near a bound, the explored space is extended.

Grid searches can be performed using the script <code>${WPHASE_HOME}/bin/wp_grid_search.py</code>. A detailed help can be found by using <code>${WPHASE_HOME}/bin/wp_grid_search.py -h</code>. The grid-search computation is parallelized using openMP (if available, it takes advantage of muliple cores).

This script run a time-shift (ts) grid search followed by a centroid position grid-search :
   * By default, the ts grid-search only shift the Green's functions to find an optimum centroid timing. At the end of the grid-search, a new inversion is performed by fixing the source half duration (hd) to the optimal value of ts. \\ Using the option -s, the grid-search is preformed while considering ts=hd which is more time consuming. \\ Grid-search results can be found in grid_search_ts_out (see note 3). A postscript file (ts_p_wpinversion) as well as a CMTSOLUTION file (ts_WCMTSOLUTION) corresponding to the optimum ts will also be generated automatically. In gereneral all the output files containing the prefix ts_are associated to the time-shift grid search.
   * Then, using the optimum value of ts, it perform a 2D centroid position grid-search (lat/lon). The centroid position grid-search generates a text file named grid_search_xy_out with a structure similar to grid_search_ts_out (see note 3). As for time-shift grid-search, you will also find the usual output files with the prefix xy_.
   * It is also possible to perform a 3D grid-search using the option –z. 

-----
==== Plot routines ====

All the plotting scripts are coded using python and the module pylab which have to be installed before using them. The module cartopy is also needed for plotting maps but it is optional (even if we recommend to install it).

There are 3 script which can be used to plot the W phase inversion results:

   * The first script is <code>${WPHASE_HOME}/bin/make_grids.py</code> which be used to display grid-searches results. Use <code>${WPHASE_HOME}/bin/make_grids.py -h</code> to have more details on available options and arguments. The 2 other scripts plot observed and synthetic seismograms after W phase inversion. 
   * In order to plot complete seismograms individually and place station on a map: <code>${WPHASE_HOME}/bin/traces.py</code> which draw complete seismograms and show station location on a map (if cartopy is available). Please use <code>${WPHASE_HOME}/bin/traces.py -h</code> for more details on available options.
   * To plot concatenated waveforms:<code>${WPHASE_HOME}/bin/make_cwp.py</code> For more details on available options and arguments see <code>${WPHASE_HOME}/bin/make_cwp.py -h</code> to see the options.

-------

===== Using ETOPO01 Global Relief Model =====

ETOPO01 is a 1 arc-minute global relief model of the Earth's surface
provided by Amante et al. (2009):
<http://www.ngdc.noaa.gov/mgg/global/global.html>

If Cartopy is installed, you can optionally draw ETOPO01 topography and
bathymetry. To do so, the path to the ETOPO01 netCDF file must be
assigned to the environment variable $ETOPOFILE which can be
done in your .tcshrc fil (or .bachrc, .cshrc, etc.). The make_grids.py
script have been tested only for grid-registered netCDF file of the
ETOPO1 bedrock file available at:
<http://www.ngdc.noaa.gov/mgg/global/relief/ETOPO1/data/bedrock/grid_registered/netcdf/ETOPO1_Bed_g_gmt4.grd.gz>


-------


===== Data formats =====

You can also have a look to [[wphase:tutorial|the W-phase tutorial page]].

=== CMTSOLUTION FILE ===

example:
<code>
 PDE 2003  9 25 19 50  6.40  41.8100  143.9100  27.0 6.9 8.1 HOKKAIDO, JP
event name:     092503C        
time shift:     19.8100
half duration:  33.5000
latitude:       42.2100
longitude:     143.8400
depth:          28.2400
Mrr:       7.770000e+27
Mtt:      -4.110000e+27
Mpp:      -3.660000e+27
Mrt:       1.320000e+28
Mrp:       2.590000e+28
Mtp:      -6.620000e+27
</code>

The first line of this file is the PDE. 
  * The first 4 characters of this line (including the first white space in the example above) generaly indicates the agency which provide the origin time, the hypocenter location and preliminary magnitude estimates. 
  * Characters from 6 to 27 correspond to the PDE origin time
  * Characters from 29 to 52 are the PDE latitude, longitude and depth. This hypocenter location will be used to define the time window in order to select the part of the waveform which will be inverted
  * Characters from 54 to 60 are PDE magnitude estimates The rest of the line provides some details on epicenter location (region, country, ...)

The second line, 'event name' correspond to the event id

Lines 3 and 4 are the parameters of the STF.

Lines 5, 6 and 7 are the 'latitude', 'longitude' and 'depth' of the centroïd wich will be used for the rotation of horizontal components and for the computation of synthetic kernel functions.

The last 6 lines are optional, they correspond to a moment tensor solution (e.g. a reference GCMT for comparision with the W phase inversion).


=== i_master file ===

This file is composed of several fields (some of them are optional):
   * EVNAME: Arbitrary event name
   * SEED:  path to SEED file. This field must be used several times if there are multiple SEED files.
   * DMAX:  Maximum distance to be used. 
   * DMIN:  Minimum distance to be used (optional).  If omitted, it is defaulted to 0.
   * CMTFILE: Path to the CMTSOLUTION file
   * filt_order: order of the butterworth filter.
   * filt_cf1:  low-frequency cut –off.
   * filt_cf2:  high-frequency cut –off.
   * filt_pass: unused parameter (let it to 1).
   * IDEC_2: the second number is the number of seconds just before P wave over which the baseline is defined (let the other parameters to 2 and 0.1) .
   * IDEC_3: Instrument response fit parameters for the deconvolution. The two first parameters define the frequency band in which the fit is measured. The third parameter is the number of samples used in this frequency range and the last parameter is a maximum misfit above which the channel is rejected.
   * WP_WIN: time window definition.  
     * If one value $B$ (eg. WP_WIN: 15) is used, then the window is defined by $[P_{tt},P_{tt}+B\times\Delta]$ where $P_{tt}$ is the P-wave travel-time.  
     * If two values $A$ $B$ are specified (e.g., WP_WIN: 0. 15.), then the window is $[P_{tt}+A\times\Delta,P_{tt}+B\times\Delta]$.  
     * If three values $A$, $B$ and $C$ are used (e.g., WP_WIN: 0. 15. 12.) the time window is defined as:
        * $[P_{tt}+A\times\Delta,P_{tt}+B\times\Delta]$ if $\Delta>C$
        * $[P_{tt}+A\times C,P_{tt}+B\times C]$ if $\Delta<C$ 
     * If Four values, $A$, $B$, $C$ and $D$ are given (e.g., WP_WIN: 0. 15. 12. 50.) then the window is:
        * $[P_{tt}+A\times \Delta,P_{tt}+B\times\Delta]$ if $C<\Delta<D$
        * $[P_{tt}+A\times C,P_{tt}+B\times C]$ if $\Delta<C$
        * $[P_{tt}+A\times D,P_{tt}+B\times D]$ if $\Delta>D$
   * GFDIR: Name of the Green's function directory (optional)

example:
<code>
EVNAME:   Tokachi-Oki_2003
SEED:    ../../WP6/SEEDS/2003_tokachi_oki_LH.SEED 
DMAX:    90. 
DMIN:    0.  
CMTFILE: CMTSOLUTION

# data deconvolution
filt_order: 4
filt_cf1  : 0.001
filt_cf2  : 0.005
filt_pass : 1
IDEC_2:  2  280   0.1
IDEC_3:  0.001   0.1  100  0.03

#inversion
WP_WIN: 15.  
GFDIR: ./GF 
</code>


=== Grid-search output files: grid_search_ts_out and grid_search_xy_out ===


grid_search_ts_out corresponds to the time-shift grid-search. 

Example:
<code>
   60.0000   0.20140377
   81.7621   0.32070120
000 000     1.0000    81.7621   -35.8500   -72.7100    44.8000   0.60186830   1.83279165
001 000     5.0000    81.7621   -35.8500   -72.7100    44.8000   0.58186860   1.60449952
002 000     9.0000    81.7621   -35.8500   -72.7100    44.8000   0.55865993   1.40553383
003 000    13.0000    81.7621   -35.8500   -72.7100    44.8000   0.53241366   1.23245105
004 000    17.0000    81.7621   -35.8500   -72.7100    44.8000   0.50339138   1.08142045
(…)
014 000    57.0000    81.7621   -35.8500   -72.7100    44.8000   0.20380139   0.31131967
015 000    61.0000    81.7621   -35.8500   -72.7100    44.8000   0.20198340   0.30827735
016 000    65.0000    81.7621   -35.8500   -72.7100    44.8000   0.21088280   0.32324637
017 000    69.0000    81.7621   -35.8500   -72.7100    44.8000   0.22888998   0.35415847
018 000    73.0000    81.7621   -35.8500   -72.7100    44.8000   0.25358477   0.39808714
(…)
039 000   157.0000    81.7621   -35.8500   -72.7100    44.8000   0.65903444   3.48513694
040 000   161.0000    81.7621   -35.8500   -72.7100    44.8000   0.65708573   3.35654281
041 000   165.0000    81.7621   -35.8500   -72.7100    44.8000   0.65410621   3.18301823
042 001    59.0000    81.7621   -35.8500   -72.7100    44.8000   0.20151331   0.30749196
043 001    63.0000    81.7621   -35.8500   -72.7100    44.8000   0.20516181   0.31360143
044 001    67.0000    81.7621   -35.8500   -72.7100    44.8000   0.21889318   0.33689015
045 002    58.0000    81.7621   -35.8500   -72.7100    44.8000   0.20231439   0.30883067
046 002    60.0000    81.7621   -35.8500   -72.7100    44.8000   0.20140377   0.30730901
047 002    62.0000    81.7621   -35.8500   -72.7100    44.8000   0.20324196   0.31038266
048 002    64.0000    81.7621   -35.8500   -72.7100    44.8000   0.20771868   0.31790207
</code>

The two first lines correspond respectively to the optimum and to the initial centroid time-shifts, the first column being the time-shift itself and the second column being the associated rms misfit.
 
The following lines provide some details on the grid-search : 
1st col. : index for this time-shift value, 
2nd col. : iteration number,
3rd col. : time-shift
4rd col. : centroid latitude
5th col. : centroid longitude
6th col. : centroid depth
7th col. : rms
8th col. : normalized rms

In the example above you can see that a global grid-search is performed from 1 to 165 sec (until computation 41) at the 0th iteration with a time step of 4sec. The 1st iteration corresponds to computations 42-44 which extend the grid-search to unexplored time-shift between 57sec and 69sec with a sampling interval of 2sec. The 2nd iteration perform an even finer sampling to guarantee a 1sec grid-search between 57sec and 65sec. Once this file is available to us (after running the time-shift grid-search), we can run
<code>
${WPHASE_HOME}/bin/make_grids.py –t 
</code>
to create the file grid_search_ts.pdf containing Fig. 1 which corresponds to the above example.


The output centroid position grid-search file grid_search_xy_out as a quite similar same structure replacing the time-shit in the 2 first lines by optimum and initial latitudes, longitudes and depth.
<code>
  -35.4500   -72.8325    25.5000   0.19118874
  -35.2500   -72.7100    25.5000   0.19314386
000 000    60.0000    60.0000   -36.4500   -74.1794    25.5000   0.24214954   0.37750789
001 000    60.0000    60.0000   -36.4500   -73.6896    25.5000   0.22150727   0.34137918
002 000    60.0000    60.0000   -36.4500   -73.1998    25.5000   0.20980594   0.32142471
003 000    60.0000    60.0000   -36.4500   -72.7100    25.5000   0.20938393   0.32071163
(…)
067 001    60.0000    60.0000   -35.6500   -72.4651    25.5000   0.19798867   0.30161961
068 001    60.0000    60.0000   -35.8500   -72.4651    25.5000   0.19830051   0.30213800
069 001    60.0000    60.0000   -35.8500   -72.7100    25.5000   0.19640545   0.29899114
070 001    60.0000    60.0000   -34.6500   -72.9549    25.5000   0.19787217   0.30142600
(…)
104 002    60.0000    60.0000   -35.3500   -72.5875    25.5000   0.19326491   0.29379393
105 002    60.0000    60.0000   -35.3500   -72.7100    25.5000   0.19296721   0.29330241
106 002    60.0000    60.0000   -35.5500   -72.7100    25.5000   0.19360157   0.29435001
</code>

The first line corresponds to the optimum centroid location and the second line indicates the a priori location specified in the CMTSOLUTION file. In these two lines, the 1st, 2nd and 3rd column correspond respectively to the centroid latitute, longitude and depth. The 4th column presents the associated rms misfits. 

The following lines provide some details on the grid-search : 
1st col. : index of the explorated centroid position, 
2nd col. : iteration number,
3rd col. : time-shift
4th col. : half duration
5th col. : centroid latitude
6th col. : centroid longitude
7th col. : centroid depth
8th col. : rms
9th col. : normalized rms

Once the grid-search is performed, the grid_search_xy_out is available to us (after running the time-shift grid-search), we can run
<code>
${WPHASE_HOME}/bin/make_grids.py –p -b 
</code>
to create the file grid_search_xy.pdf containing Fig. 1 which corresponds to the above example. The option –b is optional: it allows to use the cartopy module in order to plot coastlines and topography.