Apache Atlas 1.0 uses the JanusGraph graph database to store its type and entity details. Prior versions of Apache Atlas use the Titan 0.5.4 graph database. The two databases use different storage formats. For deployments upgrading from an earlier version of Apache Atlas, the data in the Titan 0.5.4 graph database must be migrated to the JanusGraph graph database.
In addition to the change of graph database, Apache Atlas 1.0 introduces a few optimizations that require a different internal representation from previous versions. The migration steps detailed below transform the data to comply with the new internal representation.
Migration of data is done in the following steps:
The duration of migration of data from Apache Atlas 0.8 to Apache Atlas 1.0 can be significant, depending upon the amount of data present in Apache Atlas. This section helps you to estimate the time to migrate, so that you can plan the upgrade process better.
To estimate the time needed to export data, first you need to find the number of entities in Apache Atlas 0.8. This can be done by running the following DSL query:
Referenceable select count()
Assuming Apache Atlas is deployed on a machine with a quad-core CPU and 4 GB of RAM allocated:
The Atlas Migration Export Utility from Apache Atlas branch-0.8 should be used to export data from Apache Atlas 0.8 deployments. Its implementation is available in that branch.
To build this utility:
Move the Atlas Migration Utility directory to the Apache Atlas 0.8 cluster.
Follow these steps to export the data:
atlas_migration_export.py -d <output directory>
Example:
/home/atlas-migration-utility/atlas_migration_export.py -d /home/atlas-0.8-data
On successful execution, the Atlas Migration Utility displays messages like these:
atlas-migration-export: starting migration export. Log file location /var/log/atlas/atlas-migration-exporter.log
atlas-migration-export: initializing
atlas-migration-export: initialized
atlas-migration-export: exporting typesDef to file /home/atlas-0.8-data/atlas-migration-typesdef.json
atlas-migration-export: exported typesDef to file /home/atlas-0.8-data/atlas-migration-typesdef.json
atlas-migration-export: exporting data to file /home/atlas-0.8-data/atlas-migration-data.json
atlas-migration-export: exported data to file /home/atlas-0.8-data/atlas-migration-data.json
atlas-migration-export: completed migration export!
More details on the progress of export can be found in a log file named atlas-migration-exporter.log, in the log directory specified in atlas-log4j.xml.
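Before moving on to the import, it can be useful to confirm that the export actually produced its output files. The following is a minimal sketch, not part of the Atlas tooling: the helper name `missing_export_files` is hypothetical, and the two file names are taken from the sample export output rather than from a documented contract.

```python
from pathlib import Path

# File names as they appear in the sample export output; treat them as an
# assumption if your version of the utility names its files differently.
EXPECTED_EXPORT_FILES = (
    "atlas-migration-typesdef.json",
    "atlas-migration-data.json",
)


def missing_export_files(output_dir):
    """Return the expected export files that are absent or empty in output_dir."""
    base = Path(output_dir)
    missing = []
    for name in EXPECTED_EXPORT_FILES:
        path = base / name
        # A zero-byte file is treated the same as a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Directory from the example command; adjust to your own output directory.
    problems = missing_export_files("/home/atlas-0.8-data")
    print("missing or empty:", problems if problems else "none")
```

An empty result only shows that both files exist and are non-empty; the exporter log remains the authoritative record of whether the export completed.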
Apache Atlas-specific Solr collections can be deleted using the curl commands shown below:
curl 'http://<solrHost:port>/solr/admin/collections?action=DELETE&name=vertex_index'
curl 'http://<solrHost:port>/solr/admin/collections?action=DELETE&name=edge_index'
curl 'http://<solrHost:port>/solr/admin/collections?action=DELETE&name=fulltext_index'
Apache Atlas-specific Solr collections can be created using the curl commands shown below:
curl 'http://<solrHost:port>/solr/admin/collections?action=CREATE&name=vertex_index&numShards=1&replicationFactor=1&collection.configName=atlas_configs'
curl 'http://<solrHost:port>/solr/admin/collections?action=CREATE&name=edge_index&numShards=1&replicationFactor=1&collection.configName=atlas_configs'
curl 'http://<solrHost:port>/solr/admin/collections?action=CREATE&name=fulltext_index&numShards=1&replicationFactor=1&collection.configName=atlas_configs'
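Since the same request is repeated for each of the three collections, the URLs can be generated rather than typed by hand. The sketch below only builds the URL strings used in the curl commands above; it does not call Solr. The function name `solr_collection_urls` is hypothetical, and `localhost:8983` in the usage lines is a placeholder for `<solrHost:port>`.

```python
from urllib.parse import urlencode

# Collection names as used in the curl commands above.
ATLAS_COLLECTIONS = ("vertex_index", "edge_index", "fulltext_index")


def solr_collection_urls(solr_host_port, action, config_name="atlas_configs"):
    """Build one Solr Collections API URL per Atlas collection.

    action is "DELETE" or "CREATE"; the CREATE parameters mirror the
    values shown in the curl commands (1 shard, replication factor 1).
    """
    urls = []
    for name in ATLAS_COLLECTIONS:
        params = {"action": action, "name": name}
        if action == "CREATE":
            params["numShards"] = 1
            params["replicationFactor"] = 1
            params["collection.configName"] = config_name
        urls.append(
            f"http://{solr_host_port}/solr/admin/collections?{urlencode(params)}"
        )
    return urls


if __name__ == "__main__":
    # Placeholder host and port; substitute your own Solr endpoint.
    for url in solr_collection_urls("localhost:8983", "DELETE"):
        print(f"curl '{url}'")
    for url in solr_collection_urls("localhost:8983", "CREATE"):
        print(f"curl '{url}'")
```

Printing the commands instead of executing them keeps the destructive DELETE step under manual control.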
Please follow the steps below to import the data exported above into Apache Atlas 1.0:
atlas.migration.data.filename=<location of the directory containing exported data>
curl -X GET -u admin:<password> -H "Content-Type: application/json" -H "Cache-Control: no-cache" http://<atlasHost>:<port>/api/atlas/admin/status
Progress of import will be indicated by a message like this:
{"Status":"MIGRATING","MigrationStatus":{"operationStatus":"IN_PROGRESS","startTime":1526512275110,"endTime":1526512302750,"currentIndex":10,"currentCounter":101,"totalCount":0}}
Successful completion of the operation will show a message like this:
{"Status":"MIGRATING","MigrationStatus":{"operationStatus":"SUCCESS","startTime":1526512275110,"endTime":1526512302750,"currentIndex":0,"currentCounter":0,"totalCount":371}}
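When the status call is scripted, the JSON response can be reduced to the few fields that matter for deciding whether the import has finished. The following is a minimal sketch based only on the two sample responses above: the function name `summarize_migration_status` is hypothetical, and the field names (`Status`, `MigrationStatus`, `operationStatus`, `totalCount`) are assumed to match those samples.

```python
import json


def summarize_migration_status(response_body):
    """Extract the migration state from an admin/status JSON response."""
    payload = json.loads(response_body)
    migration = payload.get("MigrationStatus", {})
    operation = migration.get("operationStatus")
    return {
        "status": payload.get("Status"),
        "operation": operation,
        # The import is treated as finished only when operationStatus is SUCCESS.
        "complete": operation == "SUCCESS",
        "current_counter": migration.get("currentCounter"),
        "total_count": migration.get("totalCount"),
    }


if __name__ == "__main__":
    # Sample response copied from the in-progress message above.
    sample = (
        '{"Status":"MIGRATING","MigrationStatus":{"operationStatus":"IN_PROGRESS",'
        '"startTime":1526512275110,"endTime":1526512302750,"currentIndex":10,'
        '"currentCounter":101,"totalCount":0}}'
    )
    print(summarize_migration_status(sample))
```

Note that in both samples `Status` remains `MIGRATING`, so the sketch keys off `operationStatus` rather than `Status` to detect completion.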
Once migration import is complete, i.e. operationStatus is SUCCESS, follow the steps given below to restart Apache Atlas in ACTIVE mode for regular use:
Apache Atlas 1.0 introduces a number of new features. For data that is migrated, the following defaults are set:
Using classifications as types in attribute definitions is no longer supported. Classifications that are used as types in attribute definitions (AttributeDefs) are converted into new types whose names have a legacy prefix. These are then handled like any other type. Because creation of such types was prevented in an earlier release, only type definitions are likely to exist. Care has been taken to handle entities of these types as well.