Export & Import APIs for HDFS Path

Introduction

The general approach for using the Import-Export APIs for HDFS paths remains the same. There are minor variations caused by how HDFS paths are handled within Atlas.

Unlike HIVE entities, HDFS entities within Atlas are created manually using the Create Entity link within the Atlas Web UI.

HDFS paths also tend to be hierarchical: users typically model in Atlas the same directory structure that exists in HDFS storage.

Sample HDFS Setup

HDFS Path                                      Atlas Entity

/apps/warehouse/finance                        Entity type: hdfs_path
                                               Name: Finance
                                               QualifiedName: FinanceAll

/apps/warehouse/finance/accounts-receivable    Entity type: hdfs_path
                                               Name: FinanceReceivable
                                               QualifiedName: FinanceReceivable
                                               Path: /apps/warehouse/finance

/apps/warehouse/finance/accounts-payable       Entity type: hdfs_path
                                               Name: Finance-Payable
                                               QualifiedName: FinancePayable
                                               Path: /apps/warehouse/finance/accounts-payable

/apps/warehouse/finance/billing                Entity type: hdfs_path
                                               Name: FinanceBilling
                                               QualifiedName: FinanceBilling
                                               Path: /apps/warehouse/finance/billing

Export API Using matchType

To export entities that represent HDFS paths, use the Export API with the matchType option. Details can be found here.

Example Using CURL Calls

Below is a sample curl call that performs an export operation on the Sample HDFS Setup shown above.

curl -X POST -u adminuser:password -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "itemsToExport": [
        { "typeName": "hdfs_path", "uniqueAttributes": { "name": "FinanceAll" } }
    ],
    "options": {
        "fetchType": "full",
        "matchType": "startsWith"
    }
}' "http://localhost:21000/api/atlas/admin/export" > financeAll.zip
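
When the same export is run for several prefixes, it may help to wrap the call above in a small script. The following is a minimal sketch, not part of Atlas: the function names build_export_payload and export_hdfs_paths and the ATLAS_URL variable are hypothetical helpers introduced here for illustration. The endpoint, credentials, and options are copied from the call above, and the prefix is assumed to contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical and not provided by Atlas.
# Endpoint, credentials, and options mirror the sample curl call above.

ATLAS_URL="${ATLAS_URL:-http://localhost:21000}"

# Print the export request body for hdfs_path entities whose "name"
# attribute starts with the given prefix.
# Assumption: the prefix needs no JSON escaping (no quotes or backslashes).
build_export_payload() {
    local prefix="$1"
    cat <<EOF
{
    "itemsToExport": [
        { "typeName": "hdfs_path", "uniqueAttributes": { "name": "${prefix}" } }
    ],
    "options": {
        "fetchType": "full",
        "matchType": "startsWith"
    }
}
EOF
}

# Send the payload to the export endpoint and save the response to a file.
# "-d @-" tells curl to read the request body from standard input.
export_hdfs_paths() {
    local prefix="$1"
    local out_file="$2"
    build_export_payload "$prefix" |
        curl -X POST -u adminuser:password \
            -H "Content-Type: application/json" \
            -H "Cache-Control: no-cache" \
            -d @- \
            "${ATLAS_URL}/api/atlas/admin/export" > "$out_file"
}

# Print the payload so it can be inspected before anything is sent.
build_export_payload "FinanceAll"
```

Running the script as written only prints the request body. To perform the export against a running Atlas instance, call export_hdfs_paths "FinanceAll" financeAll.zip.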

Automatic Creation of HDFS entities

Because HDFS entity creation is a manual process, the Export API offers a mechanism for creating the requested HDFS entities.