The general approach for using the Import-Export APIs for HDFS paths remains the same. There are minor variations caused by how HDFS paths are handled within Atlas.
Unlike Hive entities, HDFS entities within Atlas are created manually using the Create Entity link within the Atlas Web UI.
HDFS paths also tend to be hierarchical, because users typically model the same HDFS storage structure within Atlas.
Sample HDFS Setup
HDFS Path | Atlas Entity |
---|---|
/apps/warehouse/finance | Entity type: hdfs_path Name: Finance QualifiedName: FinanceAll |
/apps/warehouse/finance/accounts-receivable | Entity type: hdfs_path Name: FinanceReceivable QualifiedName: FinanceReceivable Path: /apps/warehouse/finance |
/apps/warehouse/finance/accounts-payable | Entity type: hdfs_path Name: Finance-Payable QualifiedName: FinancePayable Path: /apps/warehouse/finance/accounts-payable |
/apps/warehouse/finance/billing | Entity type: hdfs_path Name: FinanceBilling QualifiedName: FinanceBilling Path: /apps/warehouse/finance/billing |
To export entities that represent HDFS paths, use the Export API with the matchType option. Details can be found in the Export API documentation.
Below are sample curl calls that perform export operations on the Sample HDFS Setup shown above.
curl -X POST -u adminuser:password -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{ "itemsToExport": [ { "typeName": "hdfs_path", "uniqueAttributes": { "name": "FinanceAll" } } ], "options": { "fetchType": "full", "matchType": "startsWith" } }' "http://localhost:21000/api/atlas/admin/export" > financeAll.zip
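The sample call above can be wrapped in a small shell helper so that the attribute, value, and matchType are passed as arguments. This is only a minimal sketch: `build_export_request` and `export_hdfs_paths` are hypothetical helper names, not part of Atlas. The host, credentials, and endpoint are the placeholders from the sample call, and the sketch assumes the values contain no characters that need JSON escaping.

```shell
# Hypothetical helper (not part of Atlas): prints an export request body
# for hdfs_path entities on a single line.
# Usage: build_export_request <attribute-name> <attribute-value> [matchType]
build_export_request() {
  local attr_name="$1"
  local attr_value="$2"
  local match_type="${3:-startsWith}"   # default mirrors the sample call

  # Assumes the arguments need no JSON escaping (no quotes or backslashes).
  printf '{"itemsToExport":[{"typeName":"hdfs_path","uniqueAttributes":{"%s":"%s"}}],"options":{"fetchType":"full","matchType":"%s"}}\n' \
    "$attr_name" "$attr_value" "$match_type"
}

# Hypothetical helper: sends the request and saves the response as a zip file.
# Usage: export_hdfs_paths <attribute-name> <attribute-value> <output-zip> [matchType]
# Host and credentials are the placeholders used in the sample call.
export_hdfs_paths() {
  build_export_request "$1" "$2" "$4" | curl -X POST -u adminuser:password \
    -H "Content-Type: application/json" \
    -H "Cache-Control: no-cache" \
    -d @- \
    "http://localhost:21000/api/atlas/admin/export" > "$3"
}
```

With these definitions loaded, `export_hdfs_paths name FinanceAll financeAll.zip` should send the same request as the sample call. Keeping the body builder separate from the curl invocation makes it possible to inspect the JSON before anything is sent to the server.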