Content migration between any two Adobe Experience Manager (AEM) instances presents its own challenges, and in some cases the migration method compounds them. In this post, I present a way to tackle content migration in offline mode using oak-upgrade. The oak-upgrade module has traditionally been used to upgrade from JCR 2.0 repositories to the Oak node store. With the newest AEM platforms shipping with the Oak repository, it has found its calling in migrating content between Oak repositories, also known as a sidegrade.
oak-upgrade is a Swiss army knife for copying content between virtually any repositories.
Below is a minimal rundown of the steps involved in migrating content between two repositories using oak-upgrade. It assumes the following:
- Both source and destination Oak repositories are the same version. This is typically the case when both source and destination AEM instances are the same version. The Oak repository version can be looked up in a few places:
/crx/explorer/config/index.jsp. Look for
/system/console/jmx/com.adobe.granite%3Atype%3DRepository. Look for
- The repository is owned by a restricted user. This is generally a best practice to secure the repository at the file system (FS) layer. In this case our user is crx, which owns the crx-quickstart/repository path on the FS.
- sudo is available so the process can be started as that user.
- An FS snapshot of the crx-quickstart/repository folder from the source AEM instance is available.
- Content is being migrated from author to author and publish to publish.
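The ownership and sudo assumptions above can be checked before touching anything. The following is a minimal sketch, not part of oak-upgrade: the function names `owner_of`, `check_owner` and `check_sudo` are hypothetical helpers, and the user and path passed to them are whatever applies to your setup.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; none of these names come from oak-upgrade.

# Print the owning user of a path, as reported in the third column of `ls -ld`.
owner_of() {
  ls -ld -- "$1" | awk '{print $3}'
}

# Succeed only if the path exists and is owned by the expected user.
check_owner() {
  if [ ! -e "$1" ]; then
    echo "missing: $1" >&2
    return 1
  fi
  actual_owner="$(owner_of "$1")"
  if [ "$actual_owner" != "$2" ]; then
    echo "owner of $1 is $actual_owner, expected $2" >&2
    return 1
  fi
  return 0
}

# Succeed only if sudo can run a command as the given user without prompting.
check_sudo() {
  sudo -n -u "$1" true 2>/dev/null
}
```

For the setup described in this post, usage might look like `check_owner /dest/crx-quickstart/repository crx && check_sudo crx`. Note that `check_owner` only inspects the top-level directory, so files further down that are owned by someone else would go unnoticed.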
- Remove all custom indexes on the repository. Depending on their type (sync, async, nrt), these indexes could prolong the migration. oak-upgrade opens the tar files in the repository to copy them, so a reindex will occur when AEM starts up after the migration. For large sync or nrt indexes, this could add a few hours to the startup time.
- Stop the destination AEM instance. Verify once, twice, several times that it has stopped. It also helps to verify that the shutdown was clean: ensure there are no repository or tar errors during the shutdown.
- Mount the FS snapshot of the source AEM instance. At this juncture, ensure, ensure, ensure that no AEM Java processes are running.
- Run the oak-upgrade:
nohup sudo -u crx java -Xmx20000m -jar oak-upgrade-1.6.16.jar /source/crx-quickstart/repository /dest/crx-quickstart/repository --copy-binaries --src-datastore=/source/crx-quickstart/repository/datastore --datastore=/dest/crx-quickstart/repository/datastore --copy-versions=true --copy-orphaned-versions=false --include-paths=/content,/etc/cloudservices,/etc/cloudsettings,/etc/designs,/etc/segmentation,/etc/tags,/etc/workflow,/var/audit,/jcr:system/rep:namespaces &
- Use absolute paths wherever possible. The content paths in the above command do not reflect the new content structure of AEM 6.4+. Modify them as needed.
- Scan nohup.out for any errors.
- When finished, the output in nohup.out looks something like this:
15.01.2019 23:16:49.356 [main] *INFO* org.apache.jackrabbit.oak.upgrade.RepositoryUpgrade - Updating indexes ____
15.01.2019 23:16:49.745 [main] *INFO* org.apache.jackrabbit.oak.upgrade.RepositoryUpgrade - Checking node types: Traversed #30000 /jcr:system/jcr:versionStorage/50/1f/78/501f7852-307a-4230-a201-d74f8be71b86
15.01.2019 23:16:51.008 [main] *INFO* org.apache.jackrabbit.oak.plugins.index.IndexUpdate - /oak:index/uuid => Indexed 540000 nodes in 2.669 s ...
15.01.2019 23:16:51.606 [main] *INFO* org.apache.jackrabbit.oak.upgrade.RepositoryUpgrade - Updating indexes / ___|
15.01.2019 23:16:52.072 [main] *INFO* org.apache.jackrabbit.oak.upgrade.RepositoryUpgrade - Checking node types: Traversed #40000 /jcr:system/jcr:versionStorage/f9/e1
15.01.2019 23:16:52.461 [main] *INFO* org.apache.jackrabbit.oak.upgrade.RepositoryUpgrade - Checking node types: Traversed #50000 /jcr:system/jcr:versionStorage/ba/d4/7f
15.01.2019 23:16:52.926 [main] *INFO* org.apache.jackrabbit.oak.plugins.index.IndexUpdate - /oak:index/uuid => Indexed 550000 nodes in 1.918 s ...
15.01.2019 23:17:15.032 [main] *INFO* org.apache.jackrabbit.oak.upgrade.RepositoryUpgrade - Commit hook EditorHook : (CompositeEditorProvider : ([TypeEditorProvider, IndexEditorProvider])) processed commit in 4.266 min
15.01.2019 23:18:09.231 [main] *INFO* org.apache.jackrabbit.oak.segment.file.FileStore - TarMK closed: /dest/crx-quickstart/repository/segmentstore
15.01.2019 23:18:11.195 [main] *INFO* org.apache.jackrabbit.oak.segment.file.ReadOnlyFileStore - TarMK closed: /source/crx-quickstart/repository/segmentstore
- Start up the destination AEM instance. Remember that the startup may take longer due to indexing. Ensure the async indexing lanes are running and not failing by monitoring their MBeans. It is a good idea to let the indexer finish before restarting AEM. The time it takes depends on the size of the repository.
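Two of the checks in the steps above lend themselves to scripting: confirming that no AEM process is running before oak-upgrade starts, and scanning nohup.out once it ends. The sketch below is one possible way to do this, under stated assumptions. The function names are hypothetical, the default process pattern `crx-quickstart` is a guess at what an AEM command line contains, and the `*ERROR*` marker is inferred by analogy with the `*INFO*` marker in the sample output rather than taken from oak-upgrade documentation.

```shell
#!/bin/sh
# Hypothetical helpers; names, patterns and markers are assumptions, see the text above.

# Succeed if any process command line matches the pattern (default: crx-quickstart).
# This is deliberately broad: it also matches a running oak-upgrade, or a `tail`
# on an AEM log, because their command lines contain the same path.
aem_running() {
  pgrep -f "${1:-crx-quickstart}" >/dev/null
}

# Succeed only if the log contains no ERROR-level or exception lines and shows
# both the destination FileStore and the source ReadOnlyFileStore being closed,
# as in the sample output in this post.
upgrade_finished_cleanly() {
  if grep -E -q '\*ERROR\*|Exception' "$1"; then
    echo "errors found in $1" >&2
    return 1
  fi
  # The leading ".file." keeps this from matching the ReadOnlyFileStore line.
  grep -F -q '.file.FileStore - TarMK closed' "$1" || return 1
  grep -F -q '.file.ReadOnlyFileStore - TarMK closed' "$1" || return 1
  return 0
}
```

Typical usage would be `aem_running && echo "AEM is still running, do not start oak-upgrade"` before the run and `upgrade_finished_cleanly nohup.out && echo "looks clean"` after it. A passing check is a hint, not proof, so it is still worth reading through nohup.out by hand.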
In my opinion, content migration has traditionally been a DevOps/AEM administrator task, but I encourage folks in developer roles to get their hands dirty too. This is an opportunity to learn the innards of Adobe Experience Manager (AEM).
Special thanks to:
@Adobe — Matt Vesely, Josh Hamer, Tom Blackford