Just to follow up on the last post and the line “it isn’t reasonable to scan, just like a real database”.
So Dynamo is a pay-per-access-pattern model. I don’t think I’d encountered that before, or seen anyone explain it that way. But throughput is priced per table and per index, and a scan consumes capacity in proportion to the data it reads, so you need an index to access any reasonably sized dataset.
Pay-per-access-pattern is pretty much the opposite of what you want when indexing personal data for later random access.
I’m still chewing on it, but given the design goal of cheap-at-rest, and given Dynamo’s requirement to specify all data access patterns up front, I’m considering hand-rolling indexes, storing them on S3, and bypassing Dynamo entirely.
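To make the idea concrete, here is a rough sketch of what a hand-rolled index stored as objects could look like. It is an illustration under assumptions, not a settled design: the `InMemoryBucket` class, the `index/<field>/<value>.json` key layout, and the sample records are all made up for the example, and the in-memory bucket only stands in for S3 so the code runs without credentials.

```python
import json
from collections import defaultdict


class InMemoryBucket:
    """Hypothetical stand-in for an S3 bucket: maps object keys to bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body):
        self._objects[key] = body

    def get(self, key):
        # Raises KeyError when the object does not exist.
        return self._objects[key]


def build_index(records, field):
    """Map each value of `field` to the sorted ids of records that carry it."""
    index = defaultdict(list)
    for record_id, record in records.items():
        if field in record:
            index[str(record[field])].append(record_id)
    return {value: sorted(ids) for value, ids in index.items()}


def write_index(bucket, field, index):
    # One small object per indexed value (an assumed layout), so a lookup
    # fetches a single object instead of scanning the whole dataset.
    for value, ids in index.items():
        key = f"index/{field}/{value}.json"
        bucket.put(key, json.dumps(ids).encode("utf-8"))


def lookup(bucket, field, value):
    """Return the record ids for `value`, or an empty list if none exist."""
    try:
        return json.loads(bucket.get(f"index/{field}/{value}.json"))
    except KeyError:
        return []


# Made-up sample data, purely for illustration.
records = {
    "r1": {"tag": "receipt", "year": 2019},
    "r2": {"tag": "photo", "year": 2019},
    "r3": {"tag": "receipt", "year": 2020},
}

bucket = InMemoryBucket()
for field in ("tag", "year"):
    write_index(bucket, field, build_index(records, field))

print(lookup(bucket, "tag", "receipt"))
print(lookup(bucket, "year", "2019"))
```

The appeal is that adding a new access pattern just means writing another set of index objects, which cost storage at rest rather than provisioned throughput. Against real S3 the `put` and `get` methods would presumably map onto something like boto3's `put_object` and `get_object`, and updates to an index object would need more care than this sketch takes.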