pair_counts[(a, b)] = count
It turns out that in my situation, I can save memory by switching to:
pair_counts[a][b] = count
Naturally, the normal rules of premature optimization apply: I wrote for readability, waited until I ran out of memory, did lots of profiling, and then optimized as little as possible.
In my small test case, this dropped my memory usage from 84 MB to 61 MB.
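For readers who want to try both layouts side by side, here is a minimal sketch. The sample data and the iter_pairs helper are hypothetical and not from the original code; whether nesting actually saves memory depends on your data, so profile before switching.

```python
from collections import defaultdict

# Hypothetical sample data: (a, b, count) triples, made up for illustration.
observations = [("a", "b", 3), ("a", "c", 1), ("a", "d", 7), ("b", "c", 2)]

# Layout 1: one flat dict keyed by (a, b) tuples.
# Every entry holds its own tuple object as the key.
flat_counts = {}
for a, b, count in observations:
    flat_counts[(a, b)] = count

# Layout 2: a dict of dicts. No tuple is allocated per pair;
# each distinct a gets one inner dict keyed by b.
nested_counts = defaultdict(dict)
for a, b, count in observations:
    nested_counts[a][b] = count


def iter_pairs(nested):
    """Yield ((a, b), count) from the nested layout (hypothetical helper)."""
    for a, inner in nested.items():
        for b, count in inner.items():
            yield (a, b), count
```

Both layouts hold the same information, so dict(iter_pairs(nested_counts)) rebuilds the flat version whenever tuple keys are more convenient.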


3 comments:
Considering that a and b are strings, could pair[a+"\t"+b] = count be of any use? I haven't profiled it, but was just wondering.
Compared to nesting the dicts, a+"\t"+b duplicates a for everything that it is related to:
a\tb
a\tc
a\td
b\tc
Nesting the dicts gets rid of the duplication. Of course, you never know for certain what will happen until you try it ;)
Assuming you don't care about edge direction, what about:
def normalize(a, b):
    if a > b:
        return (b, a)
    return (a, b)
pair_counts[normalize(a,b)] = count
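A small self-contained sketch of that idea, assuming the counts are accumulated one observation at a time (the sample edges and the use of += are assumptions for illustration, not something stated in the post):

```python
from collections import defaultdict


def normalize(a, b):
    # Order the pair so that (a, b) and (b, a) map to the same key.
    return (b, a) if a > b else (a, b)


# Hypothetical undirected edges; ("y", "x") is the same edge as ("x", "y").
edges = [("x", "y"), ("y", "x"), ("x", "z")]

pair_counts = defaultdict(int)
for a, b in edges:
    pair_counts[normalize(a, b)] += 1
```

With direction ignored, each unordered pair is stored only once, so the dict holds at most half as many keys as it would with both orderings present.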