How to become a DDMS

Introduction

A Domain Data Management Service (DDMS) can be seen as any source of truth for data that manages the data life cycle, satisfies given mandatory data access concerns, and makes its data globally discoverable and retrievable through the OSDU. It could be a standalone service dedicated to a specific data type or a subcomponent of an application or platform. It simply enables its data to be retrieved outside of its regular scope.

A DDMS needs to address these common concerns:

  • Legal compliance
  • Data access authorization
  • Discovery
  • Retrieval of the data based on discovery

OSDU addresses these concerns primarily through Storage records. A Storage record holds metadata about the bulk data stored in the DDMS. Whenever a record is created, Storage enforces that ACLs are assigned, checks legal compliance, and then indexes the record into search, making it discoverable.

The following is the preferred method of using Records to enable these concerns for a DDMS.

Register as a DDMS

The first step is to register as a DDMS. This makes your DDMS discoverable to clients and gives them an API definition that tells them how to retrieve the bulk data when they discover a record from your DDMS.

The only API you must define is the one that tells clients how to retrieve the bulk data based on an id.

You can register as much of your API specification as you like. The only requirement is to mark the operation clients should use to retrieve the bulk data with the custom property x-ddms-retrieve-entity: true.

Curl Post

    curl --request POST \
    --url '/api/register/v1/ddms' \
    --header 'accept: application/json' \
    --header 'authorization: Bearer <JWT>' \
    --header 'content-type: application/json' \
    --header 'data-partition-id: common' \
    --data '{
    "id": "{DDMS-ID}",
    "name": "logDDMS",
    "description": "My test ddms.",
    "contactEmail": "test@test.com",
    "interfaces": [
        {
        "entityType": "wellbore",
        "schema": {
            "openapi": "3.0.0",
            "info": {
            "description": "This is a sample Wellbore domain DM service.",
            "version": "1.0.0",
            "title": "OSDU Wellbore Domain DM Service",
            "contact": {
                "email": "osdu-sre@opengroup.org"
            }
            },
            "servers": [
            {
                "url": "https://subsurface.data.osdu.com/v1"
            }
            ],
            "tags": [
            {
                "name": "wellbore",
                "description": "Wellbore data type services"
            }
            ],
            "paths": {
              "/ddms/v3/wellbores/{wellboreid}": {
                "get": {
                "description": "Get Wellbore Id",
                "operationId": "get_osdu_wellbore_versions",
                "x-ddms-retrieve-entity": true,
                "parameters": [
                  {
                    "in": "path",
                    "name": "wellboreid",
                    "required": true,
                    "schema": {
                      "title": "Wellboreid",
                      "type": "string"
                    }
                  }
                ],
                "responses": {
                    "200": {
                    "content": {
                        "application/json": {
                        "schema": {
                          "$ref": "#/components/schemas/RecordVersions"
                        }
                      }
                    },
                    "description": "Successful Response"
                    },
                    "400": {
                    "description": "Invalid ID supplied"
                    },
                    "401": {
                    "description": "Not authorized"
                    },
                    "404": {
                    "description": "Wellbore not found"
                    }
                    }
                    }
                  }
                }
            }
            }
        ]
    }'
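
Because the flagged operation is the one part of the specification clients depend on, it can be worth checking a registration payload locally before posting it. The following is a minimal sketch of such a check, not something the Register service is documented here to perform; the function names are hypothetical.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def find_retrieve_operations(openapi_schema):
    """Return (path, method) pairs whose operation sets x-ddms-retrieve-entity to true."""
    flagged = []
    for path, path_item in openapi_schema.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS or not isinstance(operation, dict):
                continue  # skip non-operation keys such as "parameters" or "summary"
            if operation.get("x-ddms-retrieve-entity") is True:
                flagged.append((path, method))
    return flagged


def check_registration(registration):
    """Raise ValueError if an interface has no flagged retrieval operation."""
    for interface in registration.get("interfaces", []):
        if not find_retrieve_operations(interface.get("schema", {})):
            raise ValueError(
                f"interface {interface.get('entityType')!r} has no operation "
                "marked x-ddms-retrieve-entity: true"
            )
```

Calling check_registration on the parsed JSON body before sending the request surfaces a missing flag early rather than when a client later fails to retrieve data.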

Create a DDMS schema

It is up to the DDMS to decide which properties of the bulk data to push into a Storage record and so make discoverable within OSDU.

The DDMS defines a Storage schema to represent this. The schema lists the properties that will appear on the Record and the data type of each.

When deploying your service, publish the schema as a one-time operation via the Schema API. The schema upload itself is expected to happen through CSP automation scripts. The example below publishes osdu:wks:master-data--Wellbore:1.0.0.

You can find the schema definition in shared schemas for reference.

Curl Post Schema Service

    curl --request POST \
    --url '/api/schema-service/v1/schema' \
    --header 'accept: application/json' \
    --header 'authorization: Bearer <JWT>' \
    --header 'content-type: application/json' \
    --header 'data-partition-id: common' \
    --data '{
        "kind": "osdu:wks:master-data--Wellbore:1.0.0",
        "schema": [
            {
            "path": "name",
            "kind": "string"
            },
            {
            "path": "ddmsId",
            "kind": "string"
            },
            {
            "path": "localId",
            "kind": "string"
            },
            {
            "path": "entityType",
            "kind": "string"
            }]
    }'

This allows any Record that references this schema to be indexed in the DE search. Without it, the Record is still published, but none of its data is indexed and it is hidden by default in search.

Notice that the schema also declares three properties:

ddmsId is the id used when you register as a DDMS.

entityType is the domain object type the data represents, e.g. ‘seismic’, ‘well’.

localId is the id of the bulk data as it is referenced in your DDMS. The end user should be able to use this id to retrieve the bulk data from your APIs.

These act as well-known properties that should be added to the record by your DDMS. Clients can then use this information to retrieve the bulk data after discovery using the DDMS registration APIs. Every schema created should declare these properties to use this pattern of ingestion.
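
The convention above can be captured in two small helpers: one that reports which well-known properties a schema's property list does not declare, and one that stamps them onto a record's data block. This is a sketch only; the helper names are hypothetical, and the property list is assumed to have the path/kind shape used in the schema example above.

```python
WELL_KNOWN_PROPERTIES = ("ddmsId", "entityType", "localId")


def missing_well_known(schema_properties):
    """List the well-known properties a schema's property list fails to declare."""
    declared = {prop["path"] for prop in schema_properties}
    return [name for name in WELL_KNOWN_PROPERTIES if name not in declared]


def with_well_known(data, ddms_id, entity_type, local_id):
    """Return a copy of a record's data block with the well-known properties set."""
    merged = dict(data)
    merged.update(ddmsId=ddms_id, entityType=entity_type, localId=local_id)
    return merged
```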

Unless you have a scenario where you know what legal tag and ACL should be applied to the data you are ingesting, you will need to expose the legal tag and ACL in your ingestion APIs. This allows your clients to supply the legal tag and ACL themselves.

You can expose the same interface as the Storage records API, allowing you to assign them to the record you create.

JSON

    "acl": {
      "viewers": ["data.default.viewers@{datapartition}.{domain}.com"],
      "owners": ["data.default.owners@{datapartition}.{domain}.com"]
    },
    "legal": {
      "legaltags": ["common-sample-legaltag"]
    }

Optionally expose derivative compliance through your APIs

If you expect derivative data to be stored in your DDMS, you need to expose 2 more properties through your APIs that can be appended to your Storage record.

Again, you can expose the same interface as the Storage records API, allowing you to assign them to the record directly. The 2 properties are:

  • otherRelevantDataCountries: The ISO 3166-1 alpha-2 country code of the country where the derivative was created or calculated
  • parents: The record ids and versions of the Records this derivative was created from

If a derivative is being created then a legal tag does not need to be assigned as it inherits this from its parents.

JSON

    "legal": {
      "otherRelevantDataCountries": ["US"]
    },
    "ancestry": {
      "parents": ["common:id:1:version", "common:id:2:version"]
    }
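
The rule that a derivative inherits its legal tags can be expressed as a small builder for these two blocks. This is a sketch under assumptions: the function name is hypothetical, and leaving out the legaltags key when no tag is supplied simply mirrors the fragment above rather than a documented Storage requirement.

```python
def compliance_fields(legal_tags=(), parents=(), other_countries=()):
    """Build the legal and ancestry blocks for a record.

    A legal tag is required unless parents are given, since a derivative
    inherits its legal tags from the records it was created from.
    """
    if not legal_tags and not parents:
        raise ValueError("provide a legal tag, or parents to inherit one from")
    legal = {"otherRelevantDataCountries": list(other_countries)}
    if legal_tags:
        legal["legaltags"] = list(legal_tags)
    fields = {"legal": legal}
    if parents:
        fields["ancestry"] = {"parents": list(parents)}
    return fields
```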

Ingest data and create a shadow record

Whenever bulk data is ingested, you need to create a shadow record within Storage. Some DDMSs create and ingest the record automatically, for example through api/os-wellbore-ddms/ddms/v3/wellbores. This shadow record represents the specific bulk data instance in a 1:1 relationship and makes each instance globally discoverable.

When you create the shadow record using either the DDMS or Storage API, forward the original caller's JWT.

First store the bulk data, and then create the shadow Record. This way, the record does not become globally discoverable before the bulk data is available. If creating the Record fails, e.g. because an invalid legal tag was provided, return that failure response to the client and attempt to clean up the bulk data.

Remember, you should append your DDMS Id, entityType and the bulk data’s local id to the Storage record.
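
The ordering described above (store the bulk data, create the shadow record, clean up on failure) can be sketched as follows. The three callables are hypothetical stand-ins for your own bulk store and for a Storage client that forwards the caller's JWT; the sketch assumes the Storage call raises an exception when the record is rejected.

```python
def ingest(bulk_data, record, store_bulk, create_shadow_record, delete_bulk):
    """Store bulk data first, then create its shadow record; undo on failure.

    store_bulk, create_shadow_record and delete_bulk are hypothetical callables
    supplied by the DDMS; create_shadow_record is expected to raise on rejection.
    """
    local_id = store_bulk(bulk_data)
    # Copy the record and attach the local id so clients can find the bulk data.
    shadow = dict(record, data=dict(record.get("data", {}), localId=local_id))
    try:
        return create_shadow_record(shadow)
    except Exception:
        # The record was rejected (e.g. invalid legal tag): best-effort cleanup,
        # then re-raise so the caller can return the failure to the client.
        delete_bulk(local_id)
        raise
```

Passing the operations in as callables keeps the ordering logic independent of any particular HTTP client and makes the failure path easy to exercise in tests.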

Curl Put Storage

    curl --request PUT \
    --url '/api/storage/v2/records' \
    --header 'accept: application/json' \
    --header 'authorization: Bearer <JWT>' \
    --header 'content-type: application/json' \
    --header 'data-partition-id: common' \
    --data '[
    {
        "kind": "common:welldb:wellbore:1.0.0",
        "acl": {
        "viewers": ["data.default.viewers@{datapartition}.{domain}.com"],
        "owners": ["data.default.owners@{datapartition}.{domain}.com"]
        },
        "legal": {
        "legaltags": ["common-sample-legaltag"],
        "otherRelevantDataCountries": ["FR"]
        },
        "data": {
        "name": "well1",
        "entityType": "wellbore",
        "ddmsId": "abcdef",
        "localId": "123456"
        }
    }]'

Curl Put DDMS

    curl --request PUT \
    --url '/api/os-wellbore-ddms/ddms/v3/wellbores' \
    --header 'accept: application/json' \
    --header 'authorization: Bearer <JWT>' \
    --header 'content-type: application/json' \
    --header 'data-partition-id: common' \
    --data '[
              {
                  "acl": {
                      "owners": [
                          "data.default.owners@{{data-partition-id}}.{{entitlements_domain}}"
                      ],
                      "viewers": [
                          "data.default.viewers@{{data-partition-id}}.{{entitlements_domain}}"
                      ]
                  },
                  "data": {
                      "ExtensionProperties": {},
                      "FacilityName": "Faciliity_",
                      "FacilityNameAliases": [
                          {
                              "AliasName": "33-089-00300-00-01",
                              "AliasNameTypeID": "{{data-partition-id}}:reference-data--AliasNameType:UniqueIdentifier:"
                          }
                      ],
                      "FacilityOperators": [
                          {
                              "FacilityOperatorID": "Francois Vinyes"
                          }
                      ],
                      "SpatialLocation": {
                          "Wgs84Coordinates": {
                              "features": [
                                  {
                                      "geometry": {
                                          "coordinates": [
                                              [
                                                  -103.2380248,
                                                  46.8925081,
                                                  5301
                                              ],
                                              [
                                                  -103.2380248,
                                                  46.8925081,
                                                  2801
                                              ],
                                              [
                                                  -103.2378748,
                                                  46.892608100000004,
                                                  301
                                              ],
                                              [
                                                  -103.23742477750001,
                                                  46.89270811,
                                                  -2199
                                              ],
                                              [
                                                  -103.23667470999663,
                                                  46.892808120001,
                                                  -4699
                                              ],
                                              [
                                                  -103.2356245974865,
                                                  46.892908130002,
                                                  -7199
                                              ]
                                          ],
                                          "type": "LineString"
                                      },
                                      "properties": {
                                          "name": "Newton 2-31-Lat-1"
                                      },
                                      "type": "Feature"
                                  }
                              ],
                              "type": "FeatureCollection"
                          }
                      },
                      "WellID": "opendes:master-data--Well:9245"
                  },
                  "id": "{{data-partition-id}}:master-data--Wellbore:arthurtest",
                  "kind": "osdu:wks:master-data--Wellbore:1.0.0",
                  "legal": {
                      "legaltags": [
                          "opendes-public-usa-dataset-epam"
                      ],
                      "otherRelevantDataCountries": [
                          "FR",
                          "US"
                      ]
                  },
                  "meta": [
                      {
                          "kind": "Unit",
                          "name": "Measure depth default unit",
                          "persistableReference": "persistableReference",
                          "propertyNames": [
                              "symbol"
                          ],
                          "propertyValues": [
                              "ft"
                          ]
                      }
                  ]
              }
          ]'

Perform compliance and ACL checks using shadow records

As mentioned, a DDMS should create a shadow record for every instance of bulk data ingested into its data store. This has advantages beyond global discoverability. Whenever a Storage record is requested, both compliance and entitlements are checked before the data is returned. A DDMS can use this to its advantage.

When a client asks for bulk data, you can first request the matching record from Storage on the client's behalf and so delegate these responsibilities to the Storage service. If OSDU returns the Record, the client is allowed to access both the Record and the bulk data, so you can return both to the client, or only the Record.

Curl Post Storage Query Records

    curl --request POST \
    --url '/api/storage/v2/query/records:batch' \
    --header 'authorization: Bearer <JWT>' \
    --header 'content-type: application/json' \
    --header 'data-partition-id: common' \
    --header 'frame-of-reference: NONE' \
    --data '{
        "records": [
            "common:test:fetchtest-1",
            "common:test:fetchtest-2",
            "common:test:fetchtest-4",
            "common:test:fetchtest-5",
            "common:test:fetchtest-6"
        ]
    }'

In this scenario, you also don’t need to store the ACL or legal tag information in your DDMS because those are being retrieved directly from the OSDU in this request. However, you need to either store or be able to generate the Storage record ID needed to retrieve the record for the bulk data requested.
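
A retrieval handler that delegates both checks to Storage might look like the sketch below. All three callables are hypothetical: record_id_for maps a local id to its Storage record id (stored or generated), fetch_record queries Storage with the caller's token and is assumed to return None when Storage does not release the record, and read_bulk reads from your own store.

```python
def retrieve_bulk(local_id, record_id_for, fetch_record, read_bulk):
    """Return (record, bulk data) only if Storage releases the shadow record.

    The DDMS keeps no ACL or legal tag state of its own; whether the caller
    may see the data is decided entirely by the Storage lookup.
    """
    record = fetch_record(record_id_for(local_id))
    if record is None:
        raise PermissionError(f"Storage did not return a record for {local_id!r}")
    return record, read_bulk(local_id)
```

The bulk data is read only after Storage has returned the record, so a caller who fails the compliance or entitlement check never triggers a read of the bulk store.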

Client retrieves the bulk data

Imagine the client discovered a record with the following data:

JSON

    "data": {
      "name": "NPD-3180",
      "entityType": "wellbore",
      "ddmsId": "wellbores",
      "localId": "opendes:master-data--Wellbore:NPD-3180"
    }

They can use the ddmsId property of the data object to retrieve the API definition of the DDMS you registered at the start.

curl

    curl --request GET \
    --url '/api/register/v1/ddms/wellbore' \
    --header 'authorization: Bearer <JWT>' \
    --header 'content-type: application/json' \
    --header 'data-partition-id: common' 

This returns the registered DDMS together with its API specification. For a wellbore DDMS like the one registered at the start, the response would look like:

    {
        "entityType": "wellbore",
        "schema": {
            "openapi": "3.0.0",
            "info": {
            "description": "This is a sample Wellbore domain DM service.",
            "version": "1.0.0",
            "title": "OSDU Wellbore Domain DM Service",
            "contact": {
                "email": "osdu-sre@opengroup.org"
            }
            },
            "servers": [
            {
                "url": "https://subsurface.data.osdu.com/v1"
            }
            ],
            "tags": [
            {
                "name": "wellbore",
                "description": "Wellbore data type services"
            }
            ],
            "paths": {
            "/wellbore/{wellboreId}": {
                "get": {
                "tags": [
                    "wellbore"
                ],
                "summary": "Find wellbore by ID",
                "description": "Returns a single wellbore",
                "operationId": "getWellboreById",
                "x-ddms-retrieve-entity": true,
                "parameters": [
                    {
                    "name": "wellboreId",
                    "in": "path",
                    "description": "ID of wellbore to return",
                    "required": true,
                    "schema": {
                        "type": "string"
                    }
                    }
                ],
                "responses": {
                    "200": {
                    "description": "successful operation",
                    "content": {
                        "application/json": {
                        "schema": {
                            "$ref": "#/components/schemas/wellbore"
                        }
                        }
                    }
                    },
                    "400": {
                    "description": "Invalid ID supplied"
                    },
                    "401": {
                    "description": "Not authorized"
                    },
                    "404": {
                    "description": "Wellbore not found"
                    }
                }
                }
            }
            }
        }
    }

They can then use the entityType property to work out which API definition applies, and the localId property to build the API call that retrieves the bulk data from the DDMS.

Currently, this is a manual step for the user, so they have to understand your API definition. However, once they do, they can apply the same pattern to any Record discovered for your DDMS.

Using the returned specification and the data we discovered, the resulting API call the user would be expected to make to retrieve the bulk data would be:

Curl

    curl --request GET \
    --url 'https://subsurface.data.osdu.com/v1/wellbore/opendes:master-data--Wellbore:NPD-3180' \
    --header 'authorization: Bearer <JWT>' \
    --header 'content-type: application/json' \
    --header 'data-partition-id: common'

Here opendes:master-data--Wellbore:NPD-3180 is the localId stored in the record's data, and the URL is built from the servers and paths sections of the API specification, using the operation that carries the property x-ddms-retrieve-entity.
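
Although the step is manual today, the pattern is mechanical enough to script. The sketch below joins the first server URL with the flagged path and substitutes the localId for the single path parameter. It assumes the retrieval operation is a GET with exactly one path parameter, as in the specifications shown in this guide; the function name is hypothetical.

```python
from urllib.parse import quote


def retrieval_url(interface, local_id):
    """Build the bulk-data URL from a registered interface and a record's localId."""
    schema = interface["schema"]
    base = schema["servers"][0]["url"].rstrip("/")
    for path, path_item in schema["paths"].items():
        operation = path_item.get("get", {})
        if operation.get("x-ddms-retrieve-entity") is not True:
            continue
        names = [p["name"] for p in operation.get("parameters", []) if p.get("in") == "path"]
        if len(names) != 1:
            raise ValueError("expected exactly one path parameter for the entity id")
        # Keep ':' readable, as in the ids used in this guide; escape the rest.
        return base + path.replace("{" + names[0] + "}", quote(local_id, safe=":"))
    raise LookupError("no GET operation is marked x-ddms-retrieve-entity")
```

A client would pass the interface whose entityType matches the discovered record, together with that record's localId.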

Client retrieves a single entity by id

The client should already have a data/record id for the DDMS, e.g. opendes:master-data--Wellbore:NPD-3180.

It can then retrieve the entity directly through the Register service's single entity retrieval.

Curl Get Register

    curl --request GET \
    --url '/api/register/v1/ddms/wellbore/wellbores/opendes:master-data--Wellbore:NPD-3180' \
    --header 'authorization: Bearer <JWT>' \
    --header 'content-type: application/json' \
    --header 'data-partition-id: common' 

The Register service responds with a 307 redirect to the proper DDMS URL.

Verbose curl response

    HTTP/2 307
    x-frame-options: DENY
    strict-transport-security: max-age=31536000; includeSubDomains
    cache-control: no-cache, no-store, must-revalidate
    access-control-allow-origin: *
    access-control-allow-credentials: true
    access-control-allow-methods: GET, POST, PUT, DELETE, OPTIONS, HEAD, PATCH
    x-content-type-options: nosniff
    content-security-policy: default-src 'self'
    expires: 0
    x-xss-protection: 1; mode=block
    access-control-max-age: 3600
    access-control-allow-headers: access-control-allow-origin, origin, content-type, accept, authorization, data-partition-id, correlation-id, appkey
    location: https://subsurface.data.osdu.com/api/os-wellbore-ddms/ddms/v3/wellbores/opendes:master-data--Wellbore:NPD-3180
    content-length: 0
    date: Mon, 20 Feb 2023 15:45:08 GMT
    x-envoy-upstream-service-time: 252
    server: istio-envoy

    HTTP/2 200
    date: Mon, 20 Feb 2023 15:45:08 GMT
    server: istio-envoy
    content-length: 31
    content-type: application/json
    x-envoy-upstream-service-time: 16

    {
        "id": "opendes:master-data--Wellbore:NPD-3180",
        "kind": "osdu:wks:master-data--Wellbore:1.0.0",
        "version": 1664908939570203,
        "acl": {
            "owners": [
                "data.default.owners@opendes.contoso.com"
            ],
            "viewers": [
                "data.default.viewers@opendes.contoso.com"
            ]
        },
        "legal": {
            "legaltags": [
                "opendes-public-usa-dataset-open-test-data"
            ],
            "otherRelevantDataCountries": [
                "US"
            ]
        },
        "createTime": "2022-10-04T18:42:19.631000+00:00",
        "createUser": "c8472f20-b407-4fb8-9609-f4672b6aa610",
        "meta": null,
        "data": {
            "FacilityID": "10173123-005",
            "FacilityTypeID": "opendes:reference-data--FacilityType:WELLBORE:",
            "InitialOperatorID": "opendes:master-data--Organisation:Den%20norske%20stats%20olj:",
            "DataSourceOrganisationID": "opendes:master-data--Organisation:BLENDED:",
            "FacilityName": "15/9-19 SR2",
            "NameAliases": [
                {
                    "AliasName": "NPD-3180",
                    "AliasNameTypeID": "opendes:reference-data--AliasNameType:UWBI:"
                }
            ],
            "GeoContexts": [],
            "ResourceSecurityClassification": "opendes:reference-data--ResourceSecurityClassification:Public:",
            "Source": "BLENDED",
            "WellID": "opendes:master-data--Well:15%2F9-19:",
            "SequenceNumber": 1,
            "VerticalMeasurements": [
                {
                    "VerticalMeasurementPathID": "opendes:reference-data--VerticalMeasurementPath:DEPTH_DATUM_ELEV:",
                    "VerticalCRSID": "opendes:reference-data--CoordinateReferenceSystem:MSL:"
                },
                {
                    "VerticalMeasurementPathID": "opendes:reference-data--VerticalMeasurementPath:MD:",
                    "VerticalCRSID": "opendes:reference-data--CoordinateReferenceSystem:MSL:",
                    "VerticalMeasurementID": "MD"
                },
                {
                    "VerticalMeasurement": 3132.0,
                    "VerticalMeasurementPathID": "opendes:reference-data--VerticalMeasurementPath:TVD:",
                    "VerticalCRSID": "opendes:reference-data--CoordinateReferenceSystem:MSL:",
                    "VerticalMeasurementID": "TVD"
                }
            ],
            "GeographicBottomHoleLocation": {
                "AsIngestedCoordinates": {
                    "type": "AnyCrsFeatureCollection",
                    "CoordinateReferenceSystemID": "opendes:reference-data--CoordinateReferenceSystem:4326:",
                    "persistableReferenceCrs": "",
                    "features": []
                }
            }
        }
    }