The default Falcon model is available in org.apache.atlas.falcon.model.FalconDataModelGenerator. It defines the following types:
falcon_cluster(ClassType) - super types [Infrastructure] - attributes [timestamp, colo, owner, tags]
falcon_feed(ClassType) - super types [DataSet] - attributes [timestamp, stored-in, owner, groups, tags]
falcon_feed_creation(ClassType) - super types [Process] - attributes [timestamp, stored-in, owner]
falcon_feed_replication(ClassType) - super types [Process] - attributes [timestamp, owner]
falcon_process(ClassType) - super types [Process] - attributes [timestamp, runs-on, owner, tags, pipelines, workflow-properties]
One falcon_process entity is created for every cluster that the falcon process is defined for.
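The per-cluster fan-out can be pictured with a minimal sketch. This is an illustration only, not the Atlas implementation: the class ProcessFanOutSketch, its methods, and in particular the "name@cluster" naming scheme are assumptions made for the example, and the real unique attribute format is whatever the Atlas model defines.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these names belong to the Atlas or Falcon APIs.
class ProcessFanOutSketch {

    // Hypothetical naming scheme chosen for this example; the actual
    // unique attribute format is defined by the Atlas Falcon model.
    static String qualifiedName(String processName, String clusterName) {
        return processName + "@" + clusterName;
    }

    // Builds one entity description per cluster, keyed by its unique name.
    static Map<String, String> entitiesFor(String processName, List<String> clusters) {
        Map<String, String> entities = new LinkedHashMap<>();
        for (String cluster : clusters) {
            // putIfAbsent keeps the first entry for a name, so a repeated
            // cluster does not produce a second entity.
            entities.putIfAbsent(qualifiedName(processName, cluster),
                    "falcon_process(runs-on=" + cluster + ")");
        }
        return entities;
    }

    public static void main(String[] args) {
        Map<String, String> entities =
                entitiesFor("ingest", Arrays.asList("primary", "backup", "primary"));
        entities.forEach((name, entity) -> System.out.println(name + " -> " + entity));
    }
}
```

A process defined on two distinct clusters yields two entries here, and listing the same cluster twice does not add a third, which is the effect that keying on a unique name is meant to have.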
The entities are created and de-duplicated using the unique qualifiedName attribute. These names provide a namespace and can also be used for querying and lineage. The unique attributes are:
Falcon supports listeners on Falcon entity submission. The Atlas hook uses such a listener to add entities to Atlas using the model defined in org.apache.atlas.falcon.model.FalconDataModelGenerator. The hook submits each request to a thread pool executor so that it does not block the command execution. The worker thread sends the entities as messages to the notification server, and the Atlas server reads these messages and registers the entities.
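The hand-off described above follows a common pattern: the listener callback only enqueues work, and a pool thread does the sending. The sketch below shows that pattern with standard java.util.concurrent classes. It is not the Atlas hook itself: AsyncHookSketch, onEntitySubmitted, and the in-memory queue standing in for the notification server are all hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative only: a stand-in for a hook that must not block its caller.
class AsyncHookSketch {
    private final ExecutorService pool;
    // In-memory stand-in for the notification server (an assumption).
    private final BlockingQueue<String> notifications = new LinkedBlockingQueue<>();

    AsyncHookSketch(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Runs on the submitting thread and returns as soon as the task is queued;
    // the message itself is published later by a pool thread.
    void onEntitySubmitted(String typeName, String entityName) {
        pool.execute(() -> notifications.add(typeName + ":" + entityName));
    }

    // Stops accepting new work and waits for queued tasks to finish.
    // Returns false if the wait timed out or was interrupted.
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Stand-in for the server side: reads every message published so far.
    List<String> drain() {
        List<String> out = new ArrayList<>();
        notifications.drainTo(out);
        return out;
    }

    public static void main(String[] args) {
        AsyncHookSketch hook = new AsyncHookSketch(2);
        hook.onEntitySubmitted("falcon_cluster", "primary");
        hook.onEntitySubmitted("falcon_feed", "clicks");
        hook.shutdownAndWait(5);
        System.out.println(hook.drain()); // message order may vary between runs
    }
}
```

The trade-off in this design is that a submission can return before its message has been published, so the consumer sees entities with some delay and delivery failures have to be handled on the pool thread rather than reported to the caller.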
export FALCON_SERVER_OPTS="$FALCON_SERVER_OPTS -Datlas.conf=<atlas-conf>"
The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
Refer to Configuration for notification-related configuration.