SEGYImport

A tool to scan and import a SEG-Y file to a volume data store (VDS).

Usage:

SEGYImport [OPTION...] <input file>
Option                                      Description
--header-format <arg>                       A JSON file defining the header format for the input SEG-Y file. The expected format is a dictionary of strings (field names) to pairs (byte position, field width) where field width can be "TwoByte" or "FourByte". Additionally, an "Endianness" key can be specified as "BigEndian" or "LittleEndian".
--header-field header_name=offset:width     A single definition of a header field. The expected format is "fieldname=offset:width", where the ":width" is optional. It's also possible to specify a range: "fieldname=begin-end". Multiple header fields are specified by providing multiple --header-field arguments.
-p, --primary-key <arg>                     The name of the trace header field to use as the primary key.
-s, --secondary-key <arg>                   The name of the trace header field to use as the secondary key.
--keep-original-order                       Do not reorder the data in the VDS if the primary key is sorted in descending order.
--prestack                                  Import binned prestack data (PSTM/PSDM gathers).
--scale <arg>                               If a scale override (floating point) is given, it is used to scale the coordinates in the header instead of determining the scale factor from the coordinate scale trace header field.
--sample-unit <arg>                         A sample unit of 'ms' is used for datasets in the time domain (default), while a sample unit of 'm' or 'ft' is used for datasets in the depth domain.
--sample-start <arg>                        The start time/depth/frequency (depending on the domain) of the sampling.
--sample-format <arg>                       Override the data format used when reading sample data from the SEG-Y file. Possible values are: IBMFloat, IEEEFloat, UInt32, Int32, UInt16, Int16, UInt8, Int8.
--crs-wkt <arg>                             A coordinate reference system in well-known text format can optionally be provided.
-l, --little-endian                         Force little-endian trace headers.
--scan                                      Generate a JSON file containing information about the input SEG-Y file.
-i, --file-info <arg>                       A JSON file (generated by the --scan option) containing information about an input SEG-Y file.
-b, --brick-size <arg>                      The brick size for the volume data store.
--lod-levels <arg>                          The number of LODs to generate.
--create-2d-lods                            Create 2D LODs.
--margin <arg>                              The margin size (overlap) of the bricks.
-f, --force                                 Continue on upload error.
--ignore-warnings                           Ignore warnings about import parameters.
--compression-method <arg>                  Compression method. Supported compression methods are: None, RLE, Zip.
--tolerance <arg>                           This parameter specifies the compression tolerance when using the wavelet compression method. This value is the maximum deviation from the original data value when the data is converted to 8-bit using the value range. A value of 1 means the maximum allowable loss is the same as quantizing to 8-bit (but the average loss will be much lower than quantizing to 8-bit). It is not a good idea to directly relate the tolerance to the quality of the compressed data, as the average loss will in general be an order of magnitude lower than the allowable loss.
--url <arg>                                 URL with cloud vendor scheme used for target location or file name of output VDS file.
--url-connection <arg>                      Connection string used for additional parameters to the URL connection.
--vdsfile <arg>                             File name of output VDS file.
--single-connection                         Use a single connection string. When specified, --url-connection is also used for the input connection.
--input-connection <arg>                    Connection string used for additional parameters to the input connection.
--persistentID <arg>                        A globally unique ID for the VDS, usually an 8-digit hexadecimal number.
--uniqueID                                  Generate a new globally unique ID when scanning the input SEG-Y file.
--disable-persistentID                      Disable the persistentID usage, placing the VDS directly into the URL location.
--json-output                               Enable JSON output.
--disable-print-text-header                 Disable printing the text header of the input SEG-Y file.
--attribute-name <arg>                      The name of the primary VDS channel. The name may be Amplitude (default), Attribute, Depth, Probability, Time, Vavg, Vint, or Vrms.
--attribute-unit <arg>                      The units of the primary VDS channel. The unit name may be blank (default), ft, ft/s, Hz, m, m/s, ms, or s.
--2d                                        Import 2D data.
--offset-sorted                             Import prestack data sorted by trace header Offset value.
--mute                                      Enable Mutes channel in output VDS.
--azimuth                                   Enable Azimuth channel in output VDS.
--azimuth-type <arg>                        Azimuth type. Supported azimuth types are: Azimuth (from trace header field) (default), OffsetXY (computed from OffsetX and OffsetY header fields).
--azimuth-unit <arg>                        Azimuth unit. Supported azimuth units are: Radians, Degrees (default).
--azimuth-scale <arg>                       Azimuth scale factor. Trace header field Azimuth values will be multiplied by this factor.
--respace-gathers <arg>                     Respace traces in prestack gathers by Offset trace header field. Supported options are: Off, On, Auto (default).
--segy-survey-coordinate-system-unit <arg>  Ignore the unit defined in the SEG-Y file and use the defined unit instead. Possible values: Meters, Feet.
--survey-coordinate-system-unit <arg>       Unit used in the VDS for survey coordinate system. Possible values: Meters (default), Feet, Original.
--resume                                    Resume mode.
--flush-frequency <arg>                     Flush frequency in seconds. SEGYImport can resume imports at flush checkpoints. 0 (zero) results in never flushing. Default is 60.
-q, --quiet                                 Disable info level output.
-Q, --very-quiet                            Disable warning level output.
-h, --help                                  Print this help information.
-H, --help-connection                       Print help information about the connection string.
--version                                   Print version information.

For more information about the --url and --url-connection parameters, please see: https://osdu.pages.opengroup.org/platform/domain-data-mgmt-services/seismic/open-vds/connection.html

To create a valid VDS from a SEG-Y file, SEGYImport needs to scan the file to determine the extents of the dataset (e.g. number of samples, number of crosslines, number of inlines). During the scanning process we read from a number of fields in the trace headers, most importantly the primary and secondary keys that are used as the axes of the VDS.

For inline-sorted poststack data, the inline number is the primary key and the crossline number is the secondary key (this is the default setting). If these are not in the ‘standard’ byte locations in the header, you can override the trace header format with the --header-format command line option, passing a JSON file that defines the SEG-Y header fields that are not in the standard locations. You can also specify the header field endianness in this file. This is an example of such a JSON file:

{
  "Endianness": "BigEndian",
  "InlineNumber":    [ 5, "FourByte"],
  "CrosslineNumber": [ 9, "FourByte"]
}
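
A header-format file like the one above can also be generated programmatically. The following is a minimal Python sketch, not part of SEGYImport itself; the helper name `build_header_format` and its numeric width argument are assumptions made for illustration. It only produces the dictionary shape described for --header-format.

```python
import json

# Width names accepted in the header-format file, keyed by width in bytes.
WIDTH_NAMES = {2: "TwoByte", 4: "FourByte"}


def build_header_format(fields, endianness="BigEndian"):
    """Build a header-format dictionary (hypothetical helper).

    `fields` maps a header field name to (byte position, width in bytes).
    """
    if endianness not in ("BigEndian", "LittleEndian"):
        raise ValueError(f"unsupported endianness: {endianness}")
    header_format = {"Endianness": endianness}
    for name, (position, width) in fields.items():
        if width not in WIDTH_NAMES:
            raise ValueError(f"unsupported width for {name}: {width}")
        header_format[name] = [position, WIDTH_NAMES[width]]
    return header_format


header_format = build_header_format(
    {"InlineNumber": (5, 4), "CrosslineNumber": (9, 4)}
)
text = json.dumps(header_format, indent=2)
print(text)  # save this text to a file and pass it via --header-format
```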

To import binned prestack data, pass the --prestack parameter. This allows multiple traces for each inline/crossline location and creates an extra “Trace (offset)” dimension in the VDS, assuming the traces in each gather are sorted by offset.

For other data types (or crossline-sorted data) it is possible to specify which trace header fields the file is sorted on by using the --primary-key and --secondary-key options.

The result of the scanning process is the ‘file info’ and can optionally be saved to a separate file using the --scan option. Such a file can be used later when importing the data by using the --file-info command line option.

If --scan is specified, the --file-info argument can be used to specify the output file. If no output file is given, the file info will be printed to stdout.
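
The scan-then-import workflow described above can be sketched as two command lines. The Python below only assembles the argument lists from options documented in this page; the helper names and the file names (`survey.segy`, `survey.json`, `survey.vds`) are hypothetical, and it assumes SEGYImport is available on the PATH if the commands are actually executed.

```python
def build_scan_command(segy_path, file_info_path, primary_key=None, secondary_key=None):
    """Arguments for a scan that writes the file info to file_info_path."""
    command = ["SEGYImport", "--scan", "--file-info", file_info_path]
    if primary_key:
        command += ["--primary-key", primary_key]
    if secondary_key:
        command += ["--secondary-key", secondary_key]
    command.append(segy_path)
    return command


def build_import_command(segy_path, file_info_path, vds_path):
    """Arguments for an import that reuses a previously generated file info."""
    return ["SEGYImport", "--file-info", file_info_path, "--vdsfile", vds_path, segy_path]


scan_cmd = build_scan_command("survey.segy", "survey.json")
import_cmd = build_import_command("survey.segy", "survey.json", "survey.vds")
print(" ".join(scan_cmd))
print(" ".join(import_cmd))
# To execute them, pass each list to subprocess.run(..., check=True).
```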

Once SEGYImport has either generated the file info or been supplied with a file-info file, it starts generating VDS chunks, which are uploaded to the destination VDS using the connection parameters.

During the scanning stage SEGYImport will also read the binary header of the SEG-Y file and extract some keys at certain predefined positions. These are not possible to override, since it’s not common practice to store these in a different place.

Name                            Offset  Width
TracesPerEnsemble               13      2
AuxiliaryTracesPerEnsemble      15      2
SampleInterval                  17      2
NumSamples                      21      2
DataSampleFormatCode            25      2
EnsembleFold                    27      2
TraceSortingCode                29      2
MeasurementSystem               55      2
SEGYFormatRevisionNumber        301     2
FixedLengthTraceFlag            303     2
ExtendedTextualFileHeaderCount  305     2
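
To make the table concrete, here is a minimal Python sketch that decodes a few of these fields from a 400-byte binary header. It is not SEGYImport's own code. It assumes the offsets are 1-based byte positions within the binary header (which follows the 3200-byte textual header in a SEG-Y file), that values are big-endian by default, and that the fields can be read as unsigned 16-bit integers; the `little_endian` parameter is a hypothetical convenience.

```python
import struct

TEXT_HEADER_SIZE = 3200    # textual file header that precedes the binary header
BINARY_HEADER_SIZE = 400

# (1-based offset, width in bytes), copied from the table above.
BINARY_HEADER_FIELDS = {
    "TracesPerEnsemble": (13, 2),
    "SampleInterval": (17, 2),
    "NumSamples": (21, 2),
    "DataSampleFormatCode": (25, 2),
    "TraceSortingCode": (29, 2),
    "MeasurementSystem": (55, 2),
}


def parse_binary_header(header, little_endian=False):
    """Decode the listed fields from the raw binary header bytes."""
    if len(header) < BINARY_HEADER_SIZE:
        raise ValueError("binary header must be at least 400 bytes")
    byte_order = "<" if little_endian else ">"
    values = {}
    for name, (offset, _width) in BINARY_HEADER_FIELDS.items():
        # Convert the 1-based table offset to a 0-based buffer index.
        (values[name],) = struct.unpack_from(byte_order + "H", header, offset - 1)
    return values


def read_binary_header(path):
    """Read and decode the binary header of a SEG-Y file on disk."""
    with open(path, "rb") as segy_file:
        segy_file.seek(TEXT_HEADER_SIZE)
        return parse_binary_header(segy_file.read(BINARY_HEADER_SIZE))


# Synthetic header for demonstration: 4000 us sample interval, 1501 samples, format code 1.
demo = bytearray(BINARY_HEADER_SIZE)
struct.pack_into(">H", demo, 16, 4000)
struct.pack_into(">H", demo, 20, 1501)
struct.pack_into(">H", demo, 24, 1)
parsed = parse_binary_header(bytes(demo))
print(parsed)
```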

The default trace header fields (that can be overridden with a header format JSON file) are:

Header Field Name        Aliases                                   Offset  Width
InlineNumber             Inline                                    189     4
CrosslineNumber          Crossline                                 193     4
EnergySourcePointNumber  Shot, SP                                  17      4
Receiver                                                           13      4
Offset                                                             37      4
EnsembleNumber           CDP, CMP                                  21      4
EnsembleXCoordinate      CDPXCoordinate, CDP-X, Easting            181     4
EnsembleYCoordinate      CDPYCoordinate, CDP-Y, Northing           185     4
SourceXCoordinate        Source-X                                  73      4
SourceYCoordinate        Source-Y                                  77      4
GroupXCoordinate         Group-X, ReceiverXCoordinate, Receiver-X  81      4
GroupYCoordinate         Group-Y, ReceiverYCoordinate, Receiver-Y  85      4
CoordinateScale          Scalar                                    71      2
OffsetX                                                            97      2
OffsetY                                                            95      2
Azimuth                                                            61      4
MuteStartTime                                                      111     2
MuteEndTime                                                        113     2
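
As an illustration of how these defaults and a header-format override interact, the following Python sketch decodes a handful of fields from a 240-byte trace header. It is a simplified model rather than SEGYImport's implementation: it assumes 1-based offsets, signed two's-complement integers, and big-endian byte order unless `little_endian` is set, and the `overrides` argument is a hypothetical stand-in for the contents of a header-format file.

```python
import struct

TRACE_HEADER_SIZE = 240

# A subset of the default fields from the table above: (1-based offset, width in bytes).
DEFAULT_TRACE_FIELDS = {
    "InlineNumber": (189, 4),
    "CrosslineNumber": (193, 4),
    "Offset": (37, 4),
    "EnsembleXCoordinate": (181, 4),
    "EnsembleYCoordinate": (185, 4),
    "CoordinateScale": (71, 2),
}

FORMAT_CODES = {2: "h", 4: "i"}  # signed 16-bit and signed 32-bit


def read_field(trace_header, offset, width, little_endian=False):
    """Read one signed integer field at a 1-based byte offset."""
    byte_order = "<" if little_endian else ">"
    return struct.unpack_from(byte_order + FORMAT_CODES[width], trace_header, offset - 1)[0]


def decode_trace_header(trace_header, overrides=None, little_endian=False):
    """Decode the known fields, letting `overrides` replace default locations."""
    if len(trace_header) < TRACE_HEADER_SIZE:
        raise ValueError("trace header must be at least 240 bytes")
    fields = dict(DEFAULT_TRACE_FIELDS)
    fields.update(overrides or {})
    return {
        name: read_field(trace_header, offset, width, little_endian)
        for name, (offset, width) in fields.items()
    }


# Synthetic trace header: standard inline/crossline plus a value at byte 5.
demo = bytearray(TRACE_HEADER_SIZE)
struct.pack_into(">i", demo, 188, 1200)   # InlineNumber at byte 189
struct.pack_into(">i", demo, 192, 3400)   # CrosslineNumber at byte 193
struct.pack_into(">h", demo, 70, -100)    # CoordinateScale at byte 71
struct.pack_into(">i", demo, 4, 77)       # non-standard inline at byte 5

defaults = decode_trace_header(bytes(demo))
overridden = decode_trace_header(bytes(demo), overrides={"InlineNumber": (5, 4)})
print(defaults["InlineNumber"], overridden["InlineNumber"])
```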

For 2D data and unbinned data additional trace position metadata is stored in the VDS. For 2D prestack, 2D poststack, and CDP gathers the X/Y coordinates are taken from the trace header fields EnsembleXCoordinate and EnsembleYCoordinate. For receiver gathers the X/Y coordinates are taken from GroupXCoordinate and GroupYCoordinate. For shot gathers the X/Y coordinates are taken from SourceXCoordinate and SourceYCoordinate.
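
The field selection above, combined with the coordinate scale, can be sketched in Python as follows. This is an illustrative model only. The gather-type labels (`"cdp"`, `"receiver"`, `"shot"`) are hypothetical names, with `"cdp"` standing for 2D prestack, 2D poststack, and CDP gathers. The scalar handling follows the SEG-Y convention (positive multiplies, negative divides); treating zero as 1 and applying a --scale override as a plain multiplier are assumptions.

```python
# Which trace header fields supply X/Y for each kind of gather.
COORDINATE_FIELDS = {
    "cdp": ("EnsembleXCoordinate", "EnsembleYCoordinate"),
    "receiver": ("GroupXCoordinate", "GroupYCoordinate"),
    "shot": ("SourceXCoordinate", "SourceYCoordinate"),
}


def coordinate_fields(gather_type):
    """Return the (X field, Y field) names for a gather type."""
    return COORDINATE_FIELDS[gather_type]


def apply_coordinate_scale(raw_value, scalar, scale_override=None):
    """Scale a raw header coordinate.

    A positive scalar multiplies, a negative scalar divides, and zero is
    treated as 1 (assumption). If scale_override is given, it is used as a
    plain multiplier instead of the header scalar (assumption).
    """
    if scale_override is not None:
        return raw_value * scale_override
    if scalar > 0:
        return float(raw_value * scalar)
    if scalar < 0:
        return raw_value / -scalar
    return float(raw_value)


x_field, y_field = coordinate_fields("shot")
print(x_field, y_field)
print(apply_coordinate_scale(123456, -100))
```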

A valid --url with an optional --url-connection argument, or a --vdsfile argument, must be given to specify where the output will be written. An input SEG-Y file must also be specified.

Example usage:

SEGYImport --url s3://openvds-test --header-format D:\Datasets\Australia\HeaderFormat.json D:\Datasets\Australia\shakespeare3d_pstm_Time.segy
SEGYImport --vdsfile C:\VDSdata\shakespeare3d_pstm_Time.vds --header-format D:\Datasets\Australia\HeaderFormat.json D:\Datasets\Australia\shakespeare3d_pstm_Time.segy