Export API

The general approach is:

  • The consumer specifies the scope of data to be exported (details below).
  • If the call succeeds, the API returns a stream in the format specified.
  • If the call fails, an error is returned.

See here for details on exporting hdfs_path entities.

Title: Export API
Example: See the Examples section below.
URL: api/atlas/admin/export
Method: POST
URL Parameters: None
Data Parameters: The class AtlasExportRequest specifies the items to export. Its list of AtlasObjectId(s) allows multiple items to be exported in one session. Each AtlasObjectId is a tuple of entity type, name of the unique attribute, and value of the unique attribute. See the examples below.
Success Response: File stream as application/zip.
Error Response: Errors that are handled within the system are returned as AtlasBaseException.
Notes: The consumer can process the output of the API programmatically, for example with java.io.ByteArrayOutputStream, or manually, by saving the contents of the stream to a file on disk.
Method Signature:
@POST
@Path("/export")
@Consumes("application/json;charset=UTF-8")
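The request body described under Data Parameters can be assembled programmatically. The following is a minimal Python sketch, assuming the JSON field names shown in the Examples section (itemsToExport, typeName, uniqueAttributes, options). The helper name build_export_request is hypothetical and is not part of Atlas.

```python
import json

def build_export_request(items, fetch_type=None, match_type=None):
    """Build an AtlasExportRequest-style payload (hypothetical helper).

    items: iterable of (type_name, attribute_name, attribute_value) tuples,
    mirroring the AtlasObjectId tuple described above.
    """
    request = {
        "itemsToExport": [
            {"typeName": type_name, "uniqueAttributes": {attr: value}}
            for type_name, attr, value in items
        ]
    }
    # Both options are optional, so include only the ones that were given.
    options = {}
    if fetch_type is not None:
        options["fetchType"] = fetch_type
    if match_type is not None:
        options["matchType"] = match_type
    if options:
        request["options"] = options
    return request

payload = build_export_request(
    [("hive_db", "qualifiedName", "accounts@cl1"),
     ("hive_db", "qualifiedName", "hr@cl1")]
)
body = json.dumps(payload).encode("utf-8")  # bytes suitable for a POST body
```

Keeping the payload construction separate from the HTTP call makes it easy to inspect or log the request before sending it.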

Additional Options

It is possible to specify additional parameters for the Export operation.

The current implementation has two options, both of which are optional:

  • matchType: This option configures the approach used for fetching the starting entity. It has the following values:
    • startsWith: Search for an entity that is prefixed with the specified criteria.
    • endsWith: Search for an entity that is suffixed with the specified criteria.
    • contains: Search for an entity that has the specified criteria as a sub-string.
    • matches: Search for an entity that is a regular expression match with the specified criteria.

  • fetchType: This option configures the approach used for fetching entities. It has the following values:
    • FULL: This fetches all the entities that are connected directly and indirectly to the starting entity. E.g. if the starting entity specified is a table, then this option will fetch the table, the database and all the other tables within the database.
    • CONNECTED: This fetches all the entities that are connected directly to the starting entity. E.g. if the starting entity specified is a table, then this option will fetch the table and the database entity only.

If no matchType is specified, an exact match is used. This means that the entire string is used as the search criteria.

Searching using matchType applies to all types of entities. It is particularly useful for matching entities of type hdfs_path (see here).
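The matchType semantics described above can be illustrated with a small Python sketch. This is not Atlas's implementation; the function name matches_criteria is hypothetical, and the assumption that a regular expression must match the whole value is made here for illustration only.

```python
import re

def matches_criteria(value, criteria, match_type=None):
    """Illustrative only: return True if value satisfies criteria under match_type."""
    if match_type is None:
        return value == criteria            # exact match is the default
    if match_type == "startsWith":
        return value.startswith(criteria)
    if match_type == "endsWith":
        return value.endswith(criteria)
    if match_type == "contains":
        return criteria in value
    if match_type == "matches":
        # Assumption: the regular expression must match the entire value.
        return re.fullmatch(criteria, value) is not None
    raise ValueError(f"unknown matchType: {match_type}")

# With startsWith, the criteria "accounts@" selects "accounts@cl1".
selected = matches_criteria("accounts@cl1", "accounts@", "startsWith")
```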

The fetchType option defaults to FULL.
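The difference between FULL and CONNECTED can be pictured as a graph traversal. The sketch below uses a made-up in-memory graph of one database and two tables; the graph, the entity names and the function fetch_entities are hypothetical and only mirror the table/database example given above, not how Atlas stores or traverses entities.

```python
from collections import deque

# Hypothetical toy graph: each entity maps to the entities directly connected to it.
GRAPH = {
    "table_a": {"db"},
    "table_b": {"db"},
    "db": {"table_a", "table_b"},
}

def fetch_entities(start, fetch_type="FULL"):
    """Return the set of entities an export starting at `start` would include."""
    if fetch_type == "CONNECTED":
        # Only the starting entity and its direct neighbours.
        return {start} | GRAPH[start]
    if fetch_type != "FULL":
        raise ValueError(f"unknown fetchType: {fetch_type}")
    # FULL: breadth-first walk over everything reachable from the start.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in GRAPH[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

connected = fetch_entities("table_a", "CONNECTED")  # the table and its database
full = fetch_entities("table_a")                    # also the other table in the database
```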

For complete examples, see the Examples section below.

Contents of Exported ZIP File

The exported ZIP file has the following entries within it:

  • atlas-export-result.json:
    • Input filters: The scope of export.
    • File format: The format chosen for the export operation.
    • Metrics: The number of entity definitions, classifications and entities exported.
  • atlas-typesdef.json: Type definitions for the entities exported.
  • atlas-export-order.json: Order in which entities should be exported.
  • {guid}.json: Individual entities are exported with file names that correspond to their id.
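An exported archive with this layout can be inspected using Python's standard zipfile module. The sketch below is illustrative: the function summarize_export is hypothetical, and the archive it reads is a stand-in built in memory with made-up contents rather than real Atlas output.

```python
import io
import json
import zipfile

WELL_KNOWN = {
    "atlas-export-result.json",
    "atlas-typesdef.json",
    "atlas-export-order.json",
}

def summarize_export(zip_bytes):
    """Return (export result as dict, list of per-entity file names)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
        result = json.loads(archive.read("atlas-export-result.json"))
    # Everything that is not a well-known entry is treated as a {guid}.json file.
    entity_files = [name for name in names if name not in WELL_KNOWN]
    return result, entity_files

# Build a stand-in archive with placeholder contents for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("atlas-export-result.json", json.dumps({"operationStatus": "SUCCESS"}))
    archive.writestr("atlas-typesdef.json", "{}")
    archive.writestr("atlas-export-order.json", "[]")
    archive.writestr("0000-example-guid.json", "{}")

result, entity_files = summarize_export(buffer.getvalue())
```

For a real export, the bytes would come from the saved ZIP file instead of the in-memory buffer.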

Examples

The AtlasExportRequest below shows filters that attempt to export 2 databases in cluster cl1:

{
    "itemsToExport": [
       { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@cl1" } },
       { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "hr@cl1" } }
    ]
}

The AtlasExportRequest below specifies the fetchType as FULL. With matchType set to startsWith, the criteria accounts@ will fetch accounts@cl1.

{
    "itemsToExport": [
       { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@" } }
    ],
    "options": {
        "fetchType": "FULL",
        "matchType": "startsWith"
    }
}

The AtlasExportRequest below specifies the fetchType as CONNECTED. With matchType set to startsWith, the criteria accounts will fetch accountsReceivable, accountsPayable, etc. present in the database.

{
    "itemsToExport": [
       { "typeName": "hive_db", "uniqueAttributes": { "name": "accounts" } }
    ],
    "options": {
        "fetchType": "CONNECTED",
        "matchType": "startsWith"
    }
}

Below is the AtlasExportResult JSON for the export of the Sales DB present in the QuickStart.

The metrics section contains the number of type definitions and entities exported as part of the operation.

{
    "clientIpAddress": "10.0.2.15",
    "hostName": "10.0.2.2",
    "metrics": {
        "duration": 1415,
        "entitiesWithExtInfo": 12,
        "entity:DB_v1": 2,
        "entity:LoadProcess_v1": 2,
        "entity:Table_v1": 6,
        "entity:View_v1": 2,
        "typedef:Column_v1": 1,
        "typedef:DB_v1": 1,
        "typedef:LoadProcess_v1": 1,
        "typedef:StorageDesc_v1": 1,
        "typedef:Table_v1": 1,
        "typedef:View_v1": 1,
        "typedef:classification": 6
    },
    "operationStatus": "SUCCESS",
    "request": {
        "itemsToExport": [
            {
                "typeName": "DB_v1",
                "uniqueAttributes": {
                    "name": "Sales"
                }
            }
        ],
        "options": {
            "fetchType": "full"
        }
    },
    "userName": "admin"
}
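The metrics map can be summarized by grouping its keys on their prefix. The following Python sketch assumes the "entity:" and "typedef:" key prefixes seen in the sample above; the function summarize_metrics is a hypothetical helper, not part of Atlas.

```python
def summarize_metrics(metrics):
    """Sum the 'entity:' and 'typedef:' counters in an export metrics map."""
    totals = {"entity": 0, "typedef": 0}
    for key, count in metrics.items():
        prefix, separator, _ = key.partition(":")
        # Keys without a prefix (duration, entitiesWithExtInfo) are skipped.
        if separator and prefix in totals:
            totals[prefix] += count
    return totals

# Metrics copied from the sample AtlasExportResult above.
metrics = {
    "duration": 1415,
    "entitiesWithExtInfo": 12,
    "entity:DB_v1": 2,
    "entity:LoadProcess_v1": 2,
    "entity:Table_v1": 6,
    "entity:View_v1": 2,
    "typedef:Column_v1": 1,
    "typedef:DB_v1": 1,
    "typedef:LoadProcess_v1": 1,
    "typedef:StorageDesc_v1": 1,
    "typedef:Table_v1": 1,
    "typedef:View_v1": 1,
    "typedef:classification": 6,
}

totals = summarize_metrics(metrics)
```

In this sample the per-type entity counts add up to the entitiesWithExtInfo value of 12; whether that holds for every export is not stated here.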

CURL Calls

Below is a sample curl call that demonstrates the export of the QuickStart databases.

curl -X POST -u adminuser:password -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "itemsToExport": [
        { "typeName": "DB", "uniqueAttributes": { "name": "Sales" } },
        { "typeName": "DB", "uniqueAttributes": { "name": "Reporting" } },
        { "typeName": "DB", "uniqueAttributes": { "name": "Logging" } }
    ],
    "options": { "fetchType": "FULL" }
}' "http://localhost:21000/api/atlas/admin/export" > quickStartDB.zip
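The same call can be made from Python using only the standard library. This is a sketch under assumptions: the host, credentials and output file name are taken from the curl sample, HTTP basic authentication is assumed because curl's -u flag is used, and the functions make_export_request and export_to_file are hypothetical helpers.

```python
import base64
import json
import shutil
import urllib.request

def make_export_request(base_url, user, password, payload):
    """Build (but do not send) a POST request for the export endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/api/atlas/admin/export",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",  # assumes HTTP basic auth
        },
        method="POST",
    )

def export_to_file(request, path):
    """Send the request and stream the ZIP response to a file on disk."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)

payload = {
    "itemsToExport": [
        {"typeName": "DB", "uniqueAttributes": {"name": "Sales"}},
    ],
    "options": {"fetchType": "FULL"},
}
request = make_export_request("http://localhost:21000", "adminuser", "password", payload)
# export_to_file(request, "quickStartDB.zip")  # run only when an Atlas server is reachable
```

Streaming the response with shutil.copyfileobj avoids holding the whole archive in memory, which matters for large exports.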