Comments on Just a little Python: MongoDB Schema Design at Scale
Rick Copeland
Updated 2023-05-21T09:53:50.042-04:00 · 23 comments

2016-10-06T05:51:58.418-04:00
nice
Anonymous

2013-09-11T10:22:41.450-04:00
If you only want a single metric, the query is straightforward. If you need multiple metrics, you can do a regex query on _id for '^20101010/', which will be reasonably fast. If you're *always* getting the same set of metrics, however, I'd recommend storing them alongside one another in the document instead.

The question of whether to use multiple *collections* or a single collection is pretty much a wash performance-wise. Separating your documents into different collections is only really *necessary* when you have different query patterns (and therefore a need for different indexes, sharding approaches, etc.).

Hope this helps!
Rick Copeland

2013-09-11T10:19:16.137-04:00
Well, you can store the three counters in three collections.
I don't think there's a good way to make it much more compact, though; you don't want to "pack" the arrays since that means MongoDB can't do an in-place update, and BSON's not particularly efficient as a storage format (compared with relational DBs, that is).
Rick Copeland

2013-09-10T17:34:40.382-04:00
Great post. I have some questions for you. You are using "20101010/metric-1" as an _id, so you are using a single collection for multiple metrics. How do you query that?
Also, given that I have multiple devices with multiple metrics, which is better: one collection per device that stores multiple metrics, or one collection for every single metric?
Anonymous

2013-09-09T05:45:31.114-04:00
I want to structure my data this way, but I need a counter for each value (daily, hourly, minute). Any idea how to store that efficiently? I first tried this solution, but I want my data to be more compact.
Anonymous

2013-01-15T20:51:26.456-05:00
Hi Anon,

Monary looks pretty cool; I hadn't heard of it before. I don't think it's really applicable to this case, however, as it's focused on reading a "stripe" of values from many documents into a single NumPy array. What I'm trying to do here is to repeatedly update a single document.

Thanks for the comment, though. Monary looks very interesting; I'm going to have to find a place to use it. :-)
Rick Copeland

2013-01-15T17:48:29.149-05:00
What about using an alternative connector like Monary, which relies on NumPy arrays instead of wrapping results in dictionaries?

https://bitbucket.org/djcbeach/monary/wiki/Home
Anonymous

2012-11-07T15:21:02.218-05:00
This comment has been removed by the author.
Unknown

2012-10-28T19:56:11.855-04:00
Thanks for the comment, Vivek! Glad you found it useful :)
Rick Copeland

2012-10-27T12:58:16.058-04:00
Simple and insightful article, thank you Rick.
I'm making schema changes right away :)
Vivek (http://vyadav.in)

2012-10-13T10:21:34.087-04:00
Hi Sergey,

Thanks for the comment! It turns out that arrays in BSON are actually stored as dicts where the keys are "0", "1", "2", etc. (surprising but true!), so you wouldn't actually gain any speed by using them. I've brought up this interesting design decision with 10gen folks multiple times, so someday we may have better-performing arrays, but for now they're just documents with a different type code.
Rick Copeland

2012-10-05T03:45:19.622-04:00
I am just curious: if accessing 'dict' keys in a MongoDB document is slow, is accessing 'list' indexes slow too?

Could we replace
'minute': { '0000': N0, '0001': N1, ... '1439': N1439 }
with
'minute': [N0...N1439]
Anonymous

2012-10-02T11:59:02.248-04:00
In this case, I simply kept the minute of the day (numbered 0-1439) as the key of the embedded document. You could also use

"23": { "00":..., "59": ... }

... which is probably what I would do in the future.
Rick Copeland

2012-10-02T10:31:49.005-04:00
"23": { ..., "1439": 2819 }
Is that really correct? Shouldn't it be something like:
"23": { ..., "2339": 2819 }
Or have I missed something?
Anonymous

2012-09-28T10:57:01.566-04:00
Thanks for the comment! I have to admit, the insight on O(N) object traversal is not mine; it was discovered by the 10gen MMS team when optimizing the monitoring service. It was a fascinating and weird enough second-order effect that I thought others would be interested as well.

Thanks again!
Rick Copeland

2012-09-27T18:46:58.824-04:00
Great insights, especially with regard to the O(N) object traversal on updates.
Jon

2012-09-27T18:44:17.155-04:00
This comment has been removed by the author.
Jon

2012-09-27T11:53:12.609-04:00
Thanks for the comment! I'm glad you liked it.
Rick Copeland

2012-09-27T09:16:41.682-04:00
Wonderful!
Thank you, we need more posts of this kind :)
Anonymous

2012-09-26T10:46:00.074-04:00
Thanks for the comment!
Rick Copeland

2012-09-26T00:11:12.679-04:00
Great post! Implement, measure, adjust.
Caleb

2012-09-25T13:31:29.943-04:00
Leo,

Thanks so much for the comment! I'm glad you found the post useful. It's always nice to see real numbers, I think (especially in scatter plots ;-) )
Rick Copeland

2012-09-25T13:16:30.625-04:00
Rick,

Thank you very much for putting this together. I have to say that this is one of the best blog posts I have ever read on MongoDB design and scalability. And it has just convinced me to get rid of growing documents in my models.

Leo
Anonymous
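
Two points from the thread lend themselves to a small sketch: the choice between flat minute-of-day keys ('0000' to '1439') and nested hour/minute keys, and the regex query on _id for a day prefix such as '^20101010/'. The Python below only builds the update and filter documents; it does not connect to a database. The function names are hypothetical, the `minute` field name is taken from the question in the thread, and placing the nested hour keys under `minute` with zero padding is an assumption rather than something the comments confirm. The resulting dicts are shaped so they could be passed to a driver call such as PyMongo's `update_one` or `find`.

```python
import re


def flat_minute_path(minute_of_day: int) -> str:
    """Dotted path for the flat layout: 'minute.0000' ... 'minute.1439'."""
    if not 0 <= minute_of_day < 1440:
        raise ValueError("minute_of_day must be in 0..1439")
    return f"minute.{minute_of_day:04d}"


def nested_minute_path(minute_of_day: int) -> str:
    """Dotted path for the hour/minute layout, e.g. 'minute.23.59'.

    Assumption: hour and minute keys are zero-padded to two digits.
    """
    if not 0 <= minute_of_day < 1440:
        raise ValueError("minute_of_day must be in 0..1439")
    hour, minute = divmod(minute_of_day, 60)
    return f"minute.{hour:02d}.{minute:02d}"


def increment_update(path: str, amount: int = 1) -> dict:
    """Build a $inc update document for one counter."""
    return {"$inc": {path: amount}}


def day_prefix_filter(day: str) -> dict:
    """Build a filter matching every _id that starts with '<day>/'."""
    return {"_id": {"$regex": "^" + re.escape(day) + "/"}}


if __name__ == "__main__":
    # Emulate the prefix match locally on a few sample _id values.
    ids = ["20101010/metric-1", "20101010/metric-2", "20101011/metric-1"]
    pattern = day_prefix_filter("20101010")["_id"]["$regex"]
    print([i for i in ids if re.search(pattern, i)])

    # Update documents for the last minute of the day in each layout.
    print(increment_update(flat_minute_path(1439)))
    print(increment_update(nested_minute_path(1439)))
```

With the nested layout, the server walks at most 24 hour keys plus 60 minute keys to reach a counter, rather than up to 1440 keys in the flat layout, which is the O(N) traversal effect discussed in the comments.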