But 'the best choice' is not a thing! I mean, not in isolation! Oh, you want to use a bloom filter for that? That's nice, and super quick for existence-checks on large data. But I just gave you internet-scale data and you need to map-reduce this thing, so your bloom filter either breaks or requires an extra reduce step for collation. Would you like to rethink that?
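To make the 'extra reduce step' concrete, here's a rough sketch of what collating per-shard bloom filters could look like. The BloomFilter class, its parameters and the salted-SHA-256 hashing are all made up for illustration (not any particular library's API), and it assumes every shard builds its filter with the same size and hash count, since otherwise the merge is meaningless:

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter; a Python int serves as the bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def merge(self, other):
        # The collation step: OR the bit arrays of two compatible filters.
        if (self.num_bits, self.num_hashes) != (other.num_bits, other.num_hashes):
            raise ValueError("filters must share parameters to be merged")
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


# Each "mapper" builds a filter over its own shard of the data...
shard_a = BloomFilter()
shard_a.add("alice")
shard_b = BloomFilter()
shard_b.add("bob")

# ...and a "reducer" has to combine them before anyone can query the whole set.
combined = shard_a.merge(shard_b)
```

The merge itself is cheap, but it is still a step you have to plan for, which is the point: the filter that is great on one machine comes with extra plumbing once the data is sharded.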
I mean, this is something I have Opinions about (so, sorry to jump into someone else's thread ^.^;) because I'm doing a bunch of coding interviews these days, and for example I ask a question which involves checking if two strings have a (roman-alphabet, lowercase-only) character in common. The 'best choice' is, objectively, forming & storing a 26-bit bitmap for each word, and comparing them with a bitwise AND.
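For the curious, a minimal sketch of that bitmap approach in Python. The function names letter_mask and share_letter are just mine for illustration, and silently skipping anything outside a-z is an assumption, since the question only promises lowercase roman letters:

```python
def letter_mask(word):
    """Return a 26-bit int with bit i set if chr(ord('a') + i) occurs in word."""
    mask = 0
    for ch in word:
        if "a" <= ch <= "z":  # assumption: ignore anything outside a-z
            mask |= 1 << (ord(ch) - ord("a"))
    return mask


def share_letter(word1, word2):
    """True if the two words have at least one lowercase letter in common."""
    return (letter_mask(word1) & letter_mask(word2)) != 0


# The 'storing' part: compute each word's mask once, then every comparison
# is a single bitwise AND on small integers.
words = ["hello", "world", "xyz"]
masks = {w: letter_mask(w) for w in words}
```

The payoff only shows up when the same words get compared many times, because the per-word mask is computed once and reused.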
If we're coding in C and you don't get to this by the second go-around, I may have concerns. But if we're in Python, and you're good, you probably gave me:
if set(word1).intersection(word2):
and then moved on, because that's /plenty/ pythonic and elegant (compared to the fair handful of lines you'd need for the bitmap, if you can even remember bit-twiddling in Python) (and can I just mention how that one line of code is part of why I love Python so much? so elegant! don't even need to convert the words into char-lists first, set does that for you!). Sure, I'd hope you'd come up with the 'best choice' bitmap if I pressed you for large-data-set optimisations, but...
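Wrapped up as a function, the set version might look like this (share_letter_set is just a name I picked for the sketch):

```python
def share_letter_set(word1, word2):
    """True if the two strings have at least one character in common."""
    # set(word1) iterates the string, giving one element per distinct character.
    # intersection() accepts any iterable, so word2 can stay a plain string.
    # An empty set is falsy, which is why the bare `if` works in the one-liner.
    return bool(set(word1).intersection(word2))


print(share_letter_set("hello", "world"))
print(share_letter_set("abc", "xyz"))
```

Note that it compares any characters at all, not just a-z, which is fine for the question as posed.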
The idea that there is ever a 'best choice' or a 'Just The Right Thing To Do' makes me froth at the mouth a little, as you may have noticed. ^.^;
Date: 2013-12-25 05:47 am (UTC)