Jason, firstly brilliant read. But any thoughts on the hashed composite keys?
Type 1
/Drive A/Folder A/File A - pk : Path sk : DriveA#FolderA#FileA
/Drive A/Folder A/File B - pk : Path sk : DriveA#FolderA#FileB
We can query with pk = "Path" and sk begins_with "DriveA#"; this would get all the files under Drive A (or use "DriveA#FolderA#" to get only the files under the folder).
Type 2
/Drive A/Folder A/File A - pk : DriveA#FolderA sk : FileA
/Drive A/Folder A/File B - pk : DriveA#FolderA sk : FileB
We can query with pk = "DriveA#FolderA"; this would get all the files directly under Folder A.
Any thoughts around this?
There are certainly many variations on how you might do the path-based pattern. I chose to put the root in the partition key to avoid a hot partition, which you would get with a hard-coded "Path". That said, most often I'm in a multi-tenant environment where there is an identifier for the customer, so I'd likely put the customer's ID in the partition key and put the entire path in the sort key, as you showed in your first example.
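As a rough illustration of that multi-tenant variation, here is a minimal Python sketch that only builds the parameters for a DynamoDB Query call; it does not talk to AWS. The table name "Files", the customer ID format, the attribute names pk and sk, and both helper functions are assumptions made for the example, not something from the article.

```python
def path_to_sort_key(path: str) -> str:
    """Turn '/Drive A/Folder A/File A' into 'DriveA#FolderA#FileA'."""
    parts = [p.replace(" ", "") for p in path.strip("/").split("/") if p]
    return "#".join(parts)


def build_subtree_query(table_name: str, customer_id: str, node_path: str) -> dict:
    """Build Query parameters for everything below node_path for one customer.

    Assumes pk holds the customer ID and sk holds the full '#'-joined path.
    """
    # The trailing '#' keeps 'DriveA' from also matching a sibling like 'DriveAB'.
    prefix = path_to_sort_key(node_path) + "#"
    return {
        "TableName": table_name,
        "KeyConditionExpression": "pk = :pk AND begins_with(sk, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": customer_id},
            ":prefix": {"S": prefix},
        },
    }


# Hypothetical table name and customer ID, purely for illustration.
params = build_subtree_query("Files", "CUSTOMER#42", "/Drive A/Folder A")
```

The resulting dictionary has the shape the low-level Query operation expects (for example, it could be passed as keyword arguments to a boto3 client's query method). The partition key is matched exactly and only the sort key gets the begins_with condition.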
On the second example, you'd have to duplicate data to use the full path in the partition key, unless you only wanted to get data directly below a node. You have to provide the entire partition key on all query operations, so you can't do something like begins_with(pk, 'DriveA#') if you wanted everything below 'Drive A'.
Understood. Yeah, a query that uses begins_with on the partition key is not possible.
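To make the difference between the two key layouts concrete, here is a small in-memory toy model in Python. It is not DynamoDB; it only mimics the one rule discussed above, that a query needs an exact partition key and can apply a prefix condition to the sort key only. The ToyTable class and the nested "Sub" folder with "File C" are hypothetical, added just for the example.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def put(self, pk: str, sk: str, item: str) -> None:
        self._partitions[pk][sk] = item

    def query(self, pk: str, sk_prefix: str = "") -> list:
        # pk must match exactly; only sk supports a prefix condition.
        partition = self._partitions.get(pk, {})
        return [partition[sk] for sk in sorted(partition) if sk.startswith(sk_prefix)]


# Type 1: constant pk, full path in sk.
type1 = ToyTable()
type1.put("Path", "DriveA#FolderA#FileA", "/Drive A/Folder A/File A")
type1.put("Path", "DriveA#FolderA#FileB", "/Drive A/Folder A/File B")
type1.put("Path", "DriveA#FolderA#Sub#FileC", "/Drive A/Folder A/Sub/File C")

# Type 2: parent path in pk, file name in sk.
type2 = ToyTable()
type2.put("DriveA#FolderA", "FileA", "/Drive A/Folder A/File A")
type2.put("DriveA#FolderA", "FileB", "/Drive A/Folder A/File B")
type2.put("DriveA#FolderA#Sub", "FileC", "/Drive A/Folder A/Sub/File C")

everything_below_drive = type1.query("Path", "DriveA#")  # all three files
direct_children = type2.query("DriveA#FolderA")          # File A and File B only
no_prefix_on_pk = type2.query("DriveA#")                 # empty: no such partition
```

In this toy model, Type 1 returns the whole subtree in one query, while Type 2 returns only the direct children of a node. Reaching the nested file with Type 2 would need a second query against its own partition key or a duplicated item.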