  • Tim Burns

Some Math: Log2 and Factorial Functions


My ongoing side project involves using data techniques to match people across data sources. It is a useful and surprisingly subtle operation: given data sets of sizes n and m, comparing every record in one against every record in the other, pair by pair, is an (n*m) operation. In a very good world, for example when one of the sets is sorted or indexed, a comparison of the sets can be an (n*C*log2(m)) operation.


The log2(m) function is very nice because the logarithm increases very slowly.

  • Log2( one million ) ~= 20

  • Log2( one billion ) ~= 30

  • Log2( one trillion ) ~= 40
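
These round numbers are easy to check. Here is a small Python sketch (the helper name log2_table is my own, just for illustration) that prints the base-2 logarithm of each size:

```python
import math

def log2_table(sizes):
    """Return (size, log2(size)) pairs, rounded to two decimals."""
    return [(n, round(math.log2(n), 2)) for n in sizes]

# One million, one billion, one trillion
for n, lg in log2_table([10**6, 10**9, 10**12]):
    print(f"log2({n:,}) ~= {lg}")
```

Each value lands just under the round figure in the list, which is why 20, 30, and 40 are handy rules of thumb.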

There will always be a constant C because of overhead, but while a (million*million) operation will break our computers, a (million*20) operation will not.
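
One way to get from (n*m) to roughly (n*log2(m)) is to sort one data set once and then binary-search it for each record of the other. The sketch below is only an illustration under simplifying assumptions: it matches on exact string keys, the function names and sample names are made up, and real people matching usually needs fuzzier comparisons than equality.

```python
from bisect import bisect_left

def match_sorted(source, target):
    """Return keys of `source` that also appear in `target`.

    Sorting `target` costs about m*log2(m) once; each of the n lookups
    is then a binary search of about log2(m) comparisons.
    """
    ordered = sorted(target)
    matches = []
    for key in source:
        i = bisect_left(ordered, key)
        if i < len(ordered) and ordered[i] == key:
            matches.append(key)
    return matches

def match_pairwise(source, target):
    """Naive n*m baseline: compare each source key with every target key."""
    return [s for s in source if any(s == t for t in target)]

# Hypothetical sample keys, already normalized to lower case
people_a = ["ann lee", "bob ray", "cy young"]
people_b = ["bob ray", "dee dee", "ann lee"]

print(match_sorted(people_a, people_b))
print(match_pairwise(people_a, people_b))
```

Both functions return the same matches; the difference is only in how many comparisons they need as the data sets grow.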

Factorials are also important in data matching. Factorials come up when we are trying to match various combinations of data points. Here is an excellent tutorial to understand the use of factorials in matching.
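
As a small illustration of where the factorials show up, here is a Python sketch that counts how many ways a set of fields can be ordered or paired up. The field names are hypothetical and the helper combinations_count is my own; it just spells out the standard formula n! / (k! * (n-k)!).

```python
import math

def combinations_count(n, k):
    """Number of ways to choose k items from n: n! / (k! * (n-k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Hypothetical fields we might compare when matching people
fields = ["first_name", "last_name", "birth_date", "zip_code"]

orderings = math.factorial(len(fields))        # ways to order all the fields
pairs = combinations_count(len(fields), 2)     # ways to pick 2 fields to compare

print(f"{len(fields)} fields: {orderings} orderings, {pairs} pairs")
```

Unlike the logarithm, the factorial grows very quickly, so the number of combinations to try is something to watch as more data points are added.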



