An improved C# application for loading TNO (Netherlands Organisation for Applied Scientific Research) data into the OSDU platform.
- Simple CLI Interface - Three intuitive commands to get you started
- Automatic Processing - Handles all TNO data types in the correct dependency order
- File Upload Support - Complete 4-step OSDU file upload workflow
- Secure Authentication - Uses Azure Identity for passwordless authentication
- Progress Tracking - Real-time progress updates and detailed logging
- Error Resilience - Comprehensive retry policies and error handling
- Clean Architecture - CQRS pattern with proper separation of concerns
The application follows a comprehensive 6-step process to load TNO data into OSDU:
- Downloads TNO Dataset Files - Retrieves official TNO test data from GitLab repository
- Creates Legal Tag - Establishes required legal compliance tags for data governance
- Uploads Files to OSDU - Executes the 4-step file upload workflow (see the sketch after this list):
- Requests file upload URL from File API
- Uploads file content to storage
- Submits metadata to File Service
- Maintains registry of uploaded files with IDs and versions
- Generates Non-Work Product Manifests - Creates manifests for master data:
- Uses CSV templates to generate individual manifests for each data row
- Processes reference data, wells, wellbores, and related entities
- Generates Work Product Manifests - Creates work product metadata:
- Iterates through uploaded files registry
- Retrieves JSON metadata from work product folders
- Updates manifests with legal tags, ACL permissions, and data partition IDs
- Uploads Manifests - Submits all manifests to OSDU in correct dependency order
For detailed information about each step, see Data Load Process Documentation.
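For orientation, here is a rough sketch of those four file upload calls using curl against the standard OSDU File and Storage APIs. The application performs these requests internally; the base URL, client ID, file names, and payloads below are placeholders, and the exact endpoint paths may differ on your instance.
# Illustrative sketch of the 4-step file upload workflow (the application does
# this for you). Endpoints and payloads are typical OSDU values; verify them
# against your instance.
OSDU="https://your-osdu-instance.com"
PARTITION="your-data-partition"
TOKEN="$(az account get-access-token --scope "your-client-id/.default" --query accessToken -o tsv)"
# Step 1: request an upload URL (response contains a FileID and a signed URL)
curl -s -H "Authorization: Bearer $TOKEN" -H "data-partition-id: $PARTITION" \
  "$OSDU/api/file/v2/files/uploadURL"
# Step 2: upload the file content to the signed URL from step 1
curl -s -X PUT -H "x-ms-blob-type: BlockBlob" --data-binary @my-file.csv "<SignedURL>"
# Step 3: submit file metadata (file-metadata.json references the FileSource from step 1)
curl -s -X POST -H "Authorization: Bearer $TOKEN" -H "data-partition-id: $PARTITION" \
  -H "Content-Type: application/json" -d @file-metadata.json \
  "$OSDU/api/file/v2/files/metadata"
# Step 4: read back the stored record to capture its ID and version for the registry
curl -s -H "Authorization: Bearer $TOKEN" -H "data-partition-id: $PARTITION" \
  "$OSDU/api/storage/v2/records/<file-record-id>"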
Before you begin, ensure you have:
- .NET 9.0 or later installed
- Azure CLI for authentication:
az login --tenant your-tenant-id
- Azure Developer CLI (azd) for deployments
- OSDU Platform Access with the users.datalake.ops and users@<data partition>.dataservices.energy roles
- Visual Studio or VS Code (optional, for development)
Update appsettings.json in the src/OSDU.DataLoad.Console/ directory with your OSDU instance details:
{
"Osdu": {
"BaseUrl": "https://your-osdu-instance.com",
"TenantId": "your-tenant-id",
"ClientId": "your-client-id",
"DataPartition": "your-data-partition",
"LegalTag": "{DataPartition}-your-legal-tag",
"AclViewer": "data.default.viewers@{DataPartition}.dataservices.energy",
"AclOwner": "data.default.owners@{DataPartition}.dataservices.energy"
}
}
Note: You can provide these settings as environment variables instead; see the Configuration Guide.
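For example, with the default .NET environment-variable configuration provider, nested keys map to double-underscore names. The sketch below assumes that convention; confirm the exact variable names the application reads in the Configuration Guide.
# Assumes the standard Section__Key mapping of the .NET configuration system;
# verify the supported names in the Configuration Guide.
export Osdu__BaseUrl="https://your-osdu-instance.com"
export Osdu__TenantId="your-tenant-id"
export Osdu__ClientId="your-client-id"
export Osdu__DataPartition="your-data-partition"
export Osdu__LegalTag="your-data-partition-your-legal-tag"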
# Navigate to the console project
cd src/OSDU.DataLoad.Console
# Build the solution
dotnet build
# Run commands directly
dotnet run -- help
dotnet run -- download --destination "~/osdu-data/tno"
dotnet run -- load --source "~/osdu-data/tno"
# Run without any arguments - downloads data if needed, then loads it
dotnet run
When run without arguments, the application will:
- Check for TNO data in ~/osdu-data/tno/ (in the user home directory)
- Download the test data if not present (~2.2 GB)
- Load all data types into the OSDU platform automatically
This is the easiest way to get started - just configure your OSDU settings and run!
# From console project directory (recommended)
dotnet run -- help
# Or from src directory
dotnet run --project OSDU.DataLoad.Console --working-directory OSDU.DataLoad.Console -- help
Shows available commands, usage examples, and current configuration status.
# Download ~2.2GB of official test data (from console project directory)
dotnet run -- download --destination "~/osdu-data/tno"
# Overwrite existing data
dotnet run -- download --destination "~/osdu-data/tno" --overwrite
# Load all TNO data types in dependency order (from console project directory)
dotnet run -- load --source "~/osdu-data/tno"
- Create an azd environment:
# Navigate to the project root
azd init -e dev
- Configure the environment variables:
azd env set OSDU_TenantId $(az account show --query tenantId -o tsv)
azd env set AZURE_SUBSCRIPTION_ID <Azure subscription id>
azd env set AZURE_LOCATION <Azure Region>
azd env set OSDU_BaseUrl <https://your-osdu-instance.com>
azd env set OSDU_ClientId <your-client-id>
azd env set OSDU_DataPartition <your-data-partition>
azd env set OSDU_LegalTag <{DataPartition}-your-legal-tag>
azd env set OSDU_AclViewer <data.default.viewers@{DataPartition}.dataservices.energy>
azd env set OSDU_AclOwner <data.default.owners@{DataPartition}.dataservices.energy>
- Provision the Azure resources:
azd provision
Important: Get the object ID of the managed identity and assign it the users.datalake.ops and users@<data partition>.dataservices.energy roles on your data partition.
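One way to do this is sketched below, assuming the standard OSDU Entitlements v2 API; the identity name, resource group, data partition, base URL, and client ID are placeholders for your deployment.
# Look up the managed identity's object (principal) ID -- names are placeholders
PRINCIPAL_ID="$(az identity show --name <identity-name> --resource-group <resource-group> --query principalId -o tsv)"
# Add it to the required entitlement groups via the OSDU Entitlements v2 API
TOKEN="$(az account get-access-token --scope "your-client-id/.default" --query accessToken -o tsv)"
for GROUP in users.datalake.ops@<data-partition>.dataservices.energy \
             users@<data-partition>.dataservices.energy; do
  curl -s -X POST \
    -H "Authorization: Bearer $TOKEN" \
    -H "data-partition-id: <data-partition>" \
    -H "Content-Type: application/json" \
    -d "{\"email\": \"$PRINCIPAL_ID\", \"role\": \"MEMBER\"}" \
    "https://your-osdu-instance.com/api/entitlements/v2/groups/$GROUP/members"
done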
- Deploy the application:
azd deploy
For detailed information on specific topics, see our documentation:
- Data Loading Process - Detailed workflow and processing order
- Configuration Guide - Advanced configuration options and environment variables
Symptoms: HTTP 401 errors, "Failed to authenticate" messages
Solutions:
- Azure CLI: Ensure you're logged in:
az login --tenant your-tenant-id
- Permissions: Verify you have the users.datalake.ops and users@<data partition>.dataservices.energy roles in OSDU
- Configuration: Check TenantId and ClientId in configuration
- Managed Identity: Verify Managed Identity is configured (when running on Azure)
- Scope: Ensure the scope is correctly set to {ClientId}/.default
- Environment Variables: Verify AZURE_CLIENT_ID and AZURE_TENANT_ID are set correctly
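As a quick diagnostic (the client ID below stands in for the ClientId from your configuration), confirm the signed-in identity and that a token can be issued for that scope:
# Confirm the signed-in tenant and account
az account show --query "{tenant:tenantId, user:user.name}" -o table
# Confirm a token can be issued for the configured client scope
az account get-access-token --scope "your-client-id/.default" --query expiresOn -o tsv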
Symptoms: Slow upload speeds, timeouts
Solutions:
- Run upload in Azure: See Azure Deployments
- Batch size: Increase the MasterDataManifestSubmissionBatchSize value to submit more manifests in a single workflow request.
Symptoms: The file is uploaded and metadata is created, but /v2/records/{id} returns 404
fail: OSDU.DataLoad.Infrastructure.Services.OsduHttpClient[0]
[2e82ab6a] GET https://pm44a0805b33bc4.oep.ppe.azure-int.net/api/storage/v2/records/opendes:dataset--File.Generic:e4f2b1ee-2732-4259-ab47-d30ff4c2a095 failed with status NotFound
fail: OSDU.DataLoad.Infrastructure.Services.OsduHttpClient[0]
[2e82ab6a] Step 4 Failed: Could not retrieve record version for FileID: opendes:dataset--File.Generic:e4f2b1ee-2732-4259-ab47-d30ff4c2a095
Solutions:
- Restart the OSDU-Storage pods
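A rough sketch of that restart, assuming you have kubectl access to the cluster running the OSDU services; the namespace and deployment names are placeholders that vary by installation.
# Placeholder names: find the actual Storage deployment and namespace first
kubectl get deployments --all-namespaces | grep -i storage
kubectl rollout restart deployment/<storage-deployment> -n <osdu-namespace>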
Symptoms: No logs appear in the container app. You may see a Kubernetes error.
Solutions:
- Redeploy: Redeploy the container with azd deploy
This solution follows Clean Architecture and CQRS principles. For detailed information on contributing:
- Review the existing code patterns and structure
- Follow established naming conventions
- Add appropriate unit tests for new features
- Update documentation as needed
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
OSDU is a trademark of The Open Group.