Tim Burns

Some Math: Log2 and Factorial Functions


My ongoing side project involves using data techniques to match people across data sources. It is a useful and surprisingly subtle operation: given data sets of sizes n and m, comparing every record in one set against every record in the other is an (n*m) operation. In a very good world, a comparison of the two sets can be an (n*C*log2(m)) operation.


The log2(m) function is very nice because the logarithm increases very slowly.

  • Log2( one million ) ~= 20

  • Log2( one billion ) ~= 30

  • Log2( one trillion ) ~= 40
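A quick way to check these approximations is with Python's standard `math.log2`. This is only a small sketch; the function name `log2_table` is made up for illustration.

```python
import math

def log2_table():
    """Return log2 for the three sizes listed above."""
    sizes = {
        "one million": 10**6,
        "one billion": 10**9,
        "one trillion": 10**12,
    }
    return {name: math.log2(value) for name, value in sizes.items()}

if __name__ == "__main__":
    for name, value in log2_table().items():
        # Each result lands just under 20, 30, and 40 respectively.
        print(f"log2({name}) = {value:.2f}")
```

Each thousandfold increase in size adds only about 10 to the logarithm, because 2^10 = 1024.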

There will always be a constant C because of overhead. Even so, a (million*million) operation will break our computers, while a (million*20) operation will not.
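To make the difference concrete, here is a minimal sketch that contrasts the two approaches. It assumes one data set can be sorted so that each lookup is a binary search; that is one common way to get the log2(m) factor, not necessarily the technique used in my project. The function names and the sample keys are hypothetical.

```python
import math
from bisect import bisect_left

def match_pairwise(left, right):
    """Brute force: compare every left key with every right key, about n*m steps."""
    return [x for x in left if any(x == y for y in right)]

def match_sorted(left, right_sorted):
    """Binary search each left key in a sorted list, about n*log2(m) steps."""
    matches = []
    for key in left:
        i = bisect_left(right_sorted, key)
        if i < len(right_sorted) and right_sorted[i] == key:
            matches.append(key)
    return matches

def estimated_steps(n, m):
    """Rough step counts for both approaches, ignoring the constant C."""
    return n * m, n * math.log2(m)

if __name__ == "__main__":
    left = ["bob", "zed", "carol"]
    right = sorted(["alice", "bob", "carol", "dave"])
    print(match_pairwise(left, right))
    print(match_sorted(left, right))
    brute, fast = estimated_steps(10**6, 10**6)
    print(f"pairwise: {brute:.0f} steps, sorted lookup: {fast:.0f} steps")
```

Both functions return the same matches; only the amount of work differs. For two sets of a million records, the estimate is a trillion steps for the pairwise version against roughly 20 million for the sorted lookup.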

Factorials are also important in data matching. Factorials come up when we are trying to match various combinations of data points. Here is an excellent tutorial to understand the use of factorials in matching.
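As a small illustration of where factorials show up, the sketch below counts how many ways we could pick a subset of fields to compare between two records. The field names are invented for the example, and the counting formula k! / (r! * (k - r)!) is the standard "k choose r" formula, not anything specific to my project.

```python
import math
from itertools import combinations

# Hypothetical fields we might use to match a person across sources.
FIELDS = ["first_name", "last_name", "birth_date", "zip_code"]

def count_field_subsets(k, r):
    """Number of ways to choose r fields out of k, written with factorials."""
    return math.factorial(k) // (math.factorial(r) * math.factorial(k - r))

def field_subsets(fields, r):
    """List every combination of r fields."""
    return list(combinations(fields, r))

if __name__ == "__main__":
    print(count_field_subsets(len(FIELDS), 2))
    for pair in field_subsets(FIELDS, 2):
        print(pair)
    # The number of possible orderings of all fields grows as k!.
    print(math.factorial(len(FIELDS)))
```

Unlike the logarithm, the factorial grows very quickly, so the number of combinations to try can get out of hand as more data points are added.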



