SEGYImport

A tool to scan and import a SEG-Y file to a volume data store (VDS).

Usage:

SEGYImport [OPTION...] <input file>

Option	Description
--header-format <file>	A JSON file defining the header format for the input SEG-Y file. The expected format is a dictionary of strings (field names) to pairs (byte position, field width) where field width can be “TwoByte” or “FourByte”. Additionally, an “Endianness” key can be specified as “BigEndian” or “LittleEndian”.
--header-field header_name=offset:width	A single definition of a header field. The expected format is “fieldname=offset:width”, where the “:width” is optional. It is also possible to specify a range: “fieldname=begin-end”. Multiple header fields are specified by providing multiple --header-field arguments.
-p, --primary-key <field>	The name of the trace header field to use as the primary key.
-s, --secondary-key <field>	The name of the trace header field to use as the secondary key.
--keep-original-order	Do not reorder the data in the VDS if the primary key is sorted in descending order.
--prestack	Import binned prestack data (PSTM/PSDM gathers).
--scale <value>	If a scale override (floating point) is given, it is used to scale the coordinates in the header instead of determining the scale factor from the coordinate scale trace header field.
--sample-unit <string>	A sample unit of ‘ms’ is used for datasets in the time domain (default), while a sample unit of ‘m’ or ‘ft’ is used for datasets in the depth domain.
--sample-start <value>	The start time/depth/frequency (depending on the domain) of the sampling.
--sample-format <string>	Override the data format used when reading sample data from the SEG-Y file. Possible values are: IBMFloat, IEEEFloat, UInt32, Int32, UInt16, Int16, UInt8, Int8.
--vds-format <string>	Override the data format used when writing sample data to the VDS. Possible values are: UInt16, UInt8. Note that this option is only allowed when the SEG-Y sample data format is IBMFloat or IEEEFloat.
--crs-wkt <string>	A coordinate reference system in well-known text format can optionally be provided.
-l, --little-endian	Force little-endian trace headers.
--scan	Generate a JSON file containing information about the input SEG-Y file.
-i, --file-info <file>	A JSON file (generated by the --scan option) containing information about an input SEG-Y file.
-b, --brick-size <value>	The brick size for the volume data store.
--lod-levels <value>	The number of LODs to generate.
--create-2d-lods	Create 2D LODs.
--margin <value>	The margin size (overlap) of the bricks.
-f, --force	Continue on upload error.
--ignore-warnings	Ignore warnings about import parameters.
--compression-method <string>	Compression method. Supported compression methods are: None, RLE, Zip.
--tolerance <value>	This parameter specifies the compression tolerance when using the wavelet compression method. This value is the maximum deviation from the original data value when the data is converted to 8-bit using the value range. A value of 1 means the maximum allowable loss is the same as quantizing to 8-bit (but the average loss will be much lower than quantizing to 8-bit). It is not a good idea to directly relate the tolerance to the quality of the compressed data, as the average loss will in general be an order of magnitude lower than the allowable loss.
--url <string>	URL with cloud vendor scheme used for target location or file name of output VDS file.
--url-connection <string>	Connection string used for additional parameters to the URL connection.
--vdsfile <string>	File name of output VDS file.
--single-connection	Use a single connection string. When specified, --url-connection is used for --input-connection as well.
--input-connection <string>	Connection string used for additional parameters to the input connection.
--persistentID <ID>	A globally unique ID for the VDS, usually an 8-digit hexadecimal number.
--uniqueID	Generate a new globally unique ID when scanning the input SEG-Y file.
--disable-persistentID	Disable the persistentID usage, placing the VDS directly into the URL location.
--json-output	Enable JSON output.
--disable-print-text-header	Disable printing the text header of the input SEG-Y file.
--attribute-name <string>	The name of the primary VDS channel. The name may be Amplitude (default), Attribute, Depth, Probability, Time, Vavg, Vint, or Vrms.
--attribute-unit <string>	The units of the primary VDS channel. The unit name may be blank (default), ft, ft/s, Hz, m, m/s, ms, or s.
--value-range	Set the sample data value range by giving minimum and maximum values as a colon-separated pair of values. By default the value range will be calculated from the SEG-Y data. Using this option will not change sample data; it will only affect the value range stored in the VDS header.
--integer-scale	Set the scale and offset values used to convert 8/16-bit data to floating point by giving a colon-separated pair of values. By default this will be calculated from the sample value range. This option is only applicable for a VDS using UInt16 or UInt8 sample data format.
--2d	Import 2D data.
--2d-index-axis	Label the primary axis using values 1..N, where N is the number of ensembles (for prestack) or traces (for poststack) in the 2D line. By default the primary axis is labelled by CDP numbers.
--offset-sorted	Import prestack data sorted by trace header Offset value.
--mute	Enable Mutes channel in output VDS.
--azimuth	Enable Azimuth channel in output VDS.
--azimuth-type <string>	Azimuth type. Supported azimuth types are: Azimuth (from trace header field) (default), OffsetXY (computed from OffsetX and OffsetY header fields).
--azimuth-unit <string>	Azimuth unit. Supported azimuth units are: Radians, Degrees (default).
--azimuth-scale <value>	Azimuth scale factor. Trace header field Azimuth values will be multiplied by this factor.
--respace-gathers <string>	Respace traces in prestack gathers by Offset trace header field. Supported options are: Off, On, Auto (default).
--segy-survey-coordinate-system-unit <string>	Ignore the unit defined in the SEG-Y file and use the defined unit instead. Possible values: Meters, Feet.
--survey-coordinate-system-unit <string>	Unit used in the VDS for survey coordinate system. Possible values: Meters (default), Feet, Original.
--resume	Resume mode.
--flush-frequency <value>	Flush frequency in seconds. SEGYImport can resume imports at flush checkpoints. 0 (zero) results in never flushing. Default is 60.
-q, --quiet	Disable info level output.
-Q, --very-quiet	Disable warning level output.
-h, --help	Print this help information.
-H, --help-connection	Print help information about the connection string.
--version	Print version information.

For more information about the --url and --url-connection parameters, see: https://osdu.pages.opengroup.org/platform/domain-data-mgmt-services/seismic/open-vds/connection.html

To create a valid VDS from a SEG-Y file, SEGYImport needs to scan the file to determine the extents of the dataset (e.g. number of samples, number of crosslines, number of inlines). During the scanning process we read from a number of fields in the trace headers, most importantly the primary and secondary keys that are used as the axes of the VDS.
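As a rough illustration of what "determining the extents" means, the Python sketch below summarizes the key values collected from trace headers into a first value, a last value, and a count per axis. The `AxisExtent` type, the `axis_extent` helper, and the sample key values are hypothetical and are not part of SEGYImport; the real scanner records more than this.

```python
from typing import Iterable, NamedTuple


class AxisExtent(NamedTuple):
    first: int  # smallest key value seen
    last: int   # largest key value seen
    count: int  # number of distinct key values


def axis_extent(values: Iterable[int]) -> AxisExtent:
    """Summarize one key (e.g. inline numbers) gathered from trace headers."""
    distinct = sorted(set(values))
    if not distinct:
        raise ValueError("no key values supplied")
    return AxisExtent(distinct[0], distinct[-1], len(distinct))


# Hypothetical (inline, crossline) pairs, one per trace.
trace_keys = [(100, 20), (100, 21), (100, 22), (101, 20), (101, 21), (101, 22)]

inline_extent = axis_extent(il for il, _ in trace_keys)
crossline_extent = axis_extent(xl for _, xl in trace_keys)
print(inline_extent)
print(crossline_extent)
```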

For inline-sorted poststack data the inline number is the primary key and the crossline number is the secondary key (this is the default setting). If these are not in the ‘standard’ byte locations in the header, you can override the trace header format with the --header-format command line option, which takes a JSON file containing definitions of the SEG-Y header fields that are not in the standard locations. You can also specify the header field endianness in this file. This is an example of such a JSON file:

{
  "Endianness": "BigEndian",
  "InlineNumber":    [ 5, "FourByte"],
  "CrosslineNumber": [ 9, "FourByte"]
}
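If the header format file is generated by a script, a small helper can check the values before writing them. The Python sketch below builds the same structure as the example above. The `build_header_format` function is hypothetical, and the check that byte positions start at 1 is an assumption based on the offsets listed in this document.

```python
import json

VALID_WIDTHS = {"TwoByte", "FourByte"}
VALID_ENDIANNESS = {"BigEndian", "LittleEndian"}


def build_header_format(fields, endianness="BigEndian"):
    """Return a dict in the layout described for --header-format.

    `fields` maps a field name to a (byte position, field width) pair.
    """
    if endianness not in VALID_ENDIANNESS:
        raise ValueError(f"unknown endianness: {endianness}")
    header_format = {"Endianness": endianness}
    for name, (position, width) in fields.items():
        if width not in VALID_WIDTHS:
            raise ValueError(f"{name}: width must be TwoByte or FourByte")
        if position < 1:  # assumption: byte positions are 1-based
            raise ValueError(f"{name}: byte position must be positive")
        header_format[name] = [position, width]
    return header_format


header_format = build_header_format(
    {"InlineNumber": (5, "FourByte"), "CrosslineNumber": (9, "FourByte")}
)
header_format_json = json.dumps(header_format, indent=2)
print(header_format_json)
```

The printed text can be saved to a file and passed to SEGYImport with --header-format.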

To import binned prestack data, pass the --prestack option. This allows multiple traces for each inline/crossline location and creates an extra “Trace (offset)” dimension in the VDS, on the assumption that the traces in each gather are sorted by offset.
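To make the extra dimension concrete, the Python sketch below counts how many traces share each inline/crossline bin. The trace values are hypothetical, and treating the largest gather as the length of the “Trace (offset)” dimension is an assumption made for illustration, not a description of SEGYImport's internals.

```python
from collections import Counter

# Hypothetical (inline, crossline, offset) values, one tuple per trace.
traces = [
    (100, 20, 150), (100, 20, 250), (100, 20, 350),
    (100, 21, 150), (100, 21, 250),
]

# Count the traces that fall into each inline/crossline bin.
fold_per_bin = Counter((il, xl) for il, xl, _ in traces)

# Assumption for this illustration: the largest gather sets the
# length of the extra trace dimension.
max_fold = max(fold_per_bin.values())
print(max_fold)
```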

For other data types (or crossline-sorted data) it is possible to specify which trace header fields the file is sorted on by using the --primary-key and --secondary-key options.

The result of the scanning process is the ‘file info’ and can optionally be saved to a separate file using the --scan option. Such a file can be used later when importing the data by using the --file-info command line option.

If --scan is specified, the --file-info argument can be used to specify the output file. If no output file is given, the file info is printed to stdout.
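The two-step workflow (scan first, import later) could be scripted along the following lines. This Python sketch only assembles argument lists from options documented here; the helper names and file names are hypothetical, and it does not run SEGYImport. Whether other options such as --header-format must be repeated on the import step is not stated in this text, so they are left out.

```python
def scan_command(segy_path, file_info_path):
    """Arguments for a scan that writes the file info to file_info_path."""
    return ["SEGYImport", "--scan", "--file-info", file_info_path, segy_path]


def import_command(segy_path, file_info_path, vds_path):
    """Arguments for an import that reuses a previously saved file info."""
    return [
        "SEGYImport",
        "--file-info", file_info_path,
        "--vdsfile", vds_path,
        segy_path,
    ]


# Hypothetical file names.
scan_args = scan_command("survey.segy", "survey.json")
import_args = import_command("survey.segy", "survey.json", "survey.vds")
print(" ".join(scan_args))
print(" ".join(import_args))
```

The lists can be handed to `subprocess.run(args, check=True)` on a machine where SEGYImport is installed.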

Once SEGYImport has either generated the file info or been supplied with a file-info file, it starts generating VDS chunks, which are uploaded to the destination VDS using the connection parameters.

During the scanning stage SEGYImport will also read the binary header of the SEG-Y file and extract some keys at certain predefined positions. These are not possible to override, since it’s not common practice to store these in a different place.

Name	Offset	Width
TracesPerEnsemble	13	2
AuxiliaryTracesPerEnsemble	15	2
SampleInterval	17	2
NumSamples	21	2
DataSampleFormatCode	25	2
EnsembleFold	27	2
TraceSortingCode	29	2
MeasurementSystem	55	2
SEGYFormatRevisionNumber	301	2
FixedLengthTraceFlag	303	2
ExtendedTextualFileHeaderCount	305	2
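To inspect these values outside SEGYImport, the fields can be read directly from the file. The Python sketch below assumes that the offsets in the table above are 1-based positions within the 400-byte binary header, that the binary header follows a 3200-byte textual header, and that the values are signed big-endian integers. The `read_binary_header` function is hypothetical, and the buffer it is applied to is synthetic.

```python
import struct

TEXT_HEADER_SIZE = 3200    # textual file header precedes the binary header
BINARY_HEADER_SIZE = 400

# 1-based offsets of the 2-byte fields, copied from the table above.
BINARY_HEADER_FIELDS = {
    "TracesPerEnsemble": 13,
    "AuxiliaryTracesPerEnsemble": 15,
    "SampleInterval": 17,
    "NumSamples": 21,
    "DataSampleFormatCode": 25,
    "EnsembleFold": 27,
    "TraceSortingCode": 29,
    "MeasurementSystem": 55,
    "SEGYFormatRevisionNumber": 301,
    "FixedLengthTraceFlag": 303,
    "ExtendedTextualFileHeaderCount": 305,
}


def read_binary_header(data: bytes, big_endian: bool = True) -> dict:
    """Extract the listed fields from the start of a SEG-Y file."""
    header = data[TEXT_HEADER_SIZE:TEXT_HEADER_SIZE + BINARY_HEADER_SIZE]
    if len(header) < BINARY_HEADER_SIZE:
        raise ValueError("data too short to contain a binary header")
    fmt = ">h" if big_endian else "<h"  # assumption: signed 16-bit values
    return {
        name: struct.unpack_from(fmt, header, offset - 1)[0]
        for name, offset in BINARY_HEADER_FIELDS.items()
    }


# Synthetic file start with three fields filled in.
sample = bytearray(TEXT_HEADER_SIZE + BINARY_HEADER_SIZE)
struct.pack_into(">h", sample, TEXT_HEADER_SIZE + 17 - 1, 4000)  # SampleInterval
struct.pack_into(">h", sample, TEXT_HEADER_SIZE + 21 - 1, 1500)  # NumSamples
struct.pack_into(">h", sample, TEXT_HEADER_SIZE + 25 - 1, 1)     # DataSampleFormatCode

binary_fields = read_binary_header(bytes(sample))
print(binary_fields["SampleInterval"], binary_fields["NumSamples"])
```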

The default trace header fields (that can be overridden with a header format JSON file) are:

Header Field Name	Aliases	Offset	Width
InlineNumber	Inline	189	4
CrosslineNumber	Crossline	193	4
EnergySourcePointNumber	Shot, SP	17	4
Receiver		13	4
Offset		37	4
EnsembleNumber	CDP, CMP	21	4
EnsembleXCoordinate	CDPXCoordinate, CDP-X, Easting	181	4
EnsembleYCoordinate	CDPYCoordinate, CDP-Y, Northing	185	4
SourceXCoordinate	Source-X	73	4
SourceYCoordinate	Source-Y	77	4
GroupXCoordinate	Group-X, ReceiverXCoordinate, Receiver-X	81	4
GroupYCoordinate	Group-Y, ReceiverYCoordinate, Receiver-Y	85	4
CoordinateScale	Scalar	71	2
OffsetX		97	2
OffsetY		95	2
Azimuth		61	4
MuteStartTime		111	2
MuteEndTime		113	2
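
To check where the keys of a particular file are stored before writing a header format file, the default fields can be read from a raw trace header. The Python sketch below assumes 1-based offsets into a 240-byte trace header and signed big-endian integers. The scaling helper follows the SEG-Y convention for the coordinate scalar (positive multiplies, negative divides), and SEGYImport's own handling may differ in details. All function names and sample values are hypothetical.

```python
import struct

# (1-based offset, width in bytes) for a few fields from the table above.
TRACE_HEADER_FIELDS = {
    "InlineNumber": (189, 4),
    "CrosslineNumber": (193, 4),
    "EnsembleXCoordinate": (181, 4),
    "EnsembleYCoordinate": (185, 4),
    "CoordinateScale": (71, 2),
}


def read_field(trace_header: bytes, name: str, big_endian: bool = True) -> int:
    """Read one integer field from a raw trace header."""
    offset, width = TRACE_HEADER_FIELDS[name]
    code = {2: "h", 4: "i"}[width]
    fmt = (">" if big_endian else "<") + code
    return struct.unpack_from(fmt, trace_header, offset - 1)[0]


def apply_coordinate_scale(raw: int, scalar: int) -> float:
    """Positive scalar multiplies, negative divides, zero leaves the value as is."""
    if scalar > 0:
        return float(raw * scalar)
    if scalar < 0:
        return raw / -scalar
    return float(raw)


# Synthetic 240-byte trace header with a few fields filled in.
trace_header = bytearray(240)
struct.pack_into(">i", trace_header, 189 - 1, 1200)      # InlineNumber
struct.pack_into(">i", trace_header, 181 - 1, 45123456)  # EnsembleXCoordinate
struct.pack_into(">h", trace_header, 71 - 1, -100)       # CoordinateScale

inline = read_field(bytes(trace_header), "InlineNumber")
x = apply_coordinate_scale(
    read_field(bytes(trace_header), "EnsembleXCoordinate"),
    read_field(bytes(trace_header), "CoordinateScale"),
)
print(inline, x)
```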

For 2D data and unbinned data additional trace position metadata is stored in the VDS. For 2D prestack, 2D poststack, and CDP gathers the X/Y coordinates are taken from the trace header fields EnsembleXCoordinate and EnsembleYCoordinate. For receiver gathers the X/Y coordinates are taken from GroupXCoordinate and GroupYCoordinate. For shot gathers the X/Y coordinates are taken from SourceXCoordinate and SourceYCoordinate.

A valid --url with an optional --url-connection argument, or a --vdsfile argument, must be given to specify where the output will be written. An input SEG-Y file must also be specified.

Example usage:

SEGYImport --url s3://openvds-test --header-format D:\Datasets\Australia\HeaderFormat.json D:\Datasets\Australia\shakespeare3d_pstm_Time.segy

SEGYImport --vdsfile C:\VDSdata\shakespeare3d_pstm_Time.vds --header-format D:\Datasets\Australia\HeaderFormat.json D:\Datasets\Australia\shakespeare3d_pstm_Time.segy