• Tim Burns

Some Math: Log2 and Factorial Functions

My ongoing side project involves using data techniques to match people across data sources. It is a useful and surprisingly subtle operation: given data sets of sizes n and m, comparing every record in one set against every record in the other is an O(n*m) operation. In a very good world, where one set can be sorted or indexed, the comparison can be an O(n*C*log2(m)) operation instead.
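As a rough illustration of where the log2(m) factor can come from, here is a minimal sketch that sorts one data set once and then binary-searches it for each record in the other. It assumes records match on a single exact, sortable key; the function names and sample keys are hypothetical, and real person matching is usually fuzzier than this.

```python
from bisect import bisect_left

def match_naive(left, right):
    """Compare every key in left against every key in right: about n*m comparisons."""
    return [a for a in left if any(a == b for b in right)]

def match_sorted(left, right):
    """Sort right once, then binary-search it for each key in left."""
    ordered = sorted(right)              # one-time cost of about m*log2(m)
    matches = []
    for key in left:                     # n lookups
        i = bisect_left(ordered, key)    # about log2(m) comparisons per lookup
        if i < len(ordered) and ordered[i] == key:
            matches.append(key)
    return matches

# Hypothetical sample keys, already normalized to lowercase.
left = ["ann lee", "bob ray", "cy dunn"]
right = ["dee fox", "bob ray", "ann lee", "eli kim"]
print(match_sorted(left, right))  # ['ann lee', 'bob ray']
```

Both functions return the same matches; only the number of comparisons differs as the data sets grow.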

The log2(m) function is very nice because the logarithm increases very slowly: it grows by only about 10 each time m grows a thousandfold.

  • log2(one million) ~= 20

  • log2(one billion) ~= 30

  • log2(one trillion) ~= 40
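These approximations are easy to check with Python's standard `math.log2`; the dictionary name `sizes` is just for illustration.

```python
import math

# Data set sizes from the list above.
sizes = {"one million": 10**6, "one billion": 10**9, "one trillion": 10**12}

for name, value in sizes.items():
    print(f"log2({name}) = {math.log2(value):.2f}")
# log2(one million) = 19.93
# log2(one billion) = 29.90
# log2(one trillion) = 39.86
```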

There will always be a constant C because of overhead, but the difference is still decisive: a (million*million) operation will overwhelm our computers, while a (million*20) operation will not.
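To put numbers on that gap, here is a back-of-the-envelope calculation. The rate of ten million comparisons per second is an assumption chosen only for illustration, not a measurement, and the constant C is ignored.

```python
n = m = 1_000_000

pairwise = n * m      # 1,000,000,000,000 comparisons
indexed = n * 20      # 20,000,000 comparisons, using log2(m) ~= 20

# Assumed throughput (hypothetical): ten million comparisons per second.
rate = 10_000_000

print(pairwise // indexed)                # 50000 times more work for the pairwise approach
print(pairwise / rate / 3600, "hours")    # pairwise: roughly 28 hours at the assumed rate
print(indexed / rate, "seconds")          # indexed: 2.0 seconds at the assumed rate
```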

Factorials are also important in data matching. They come up when we count the possible combinations of data points to match on. Here is an excellent tutorial to understand the use of factorials in matching.
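As a small sketch of how factorials enter the picture, the code below counts the ways to pick a pair of fields to match on, using the standard formula n! / (r! * (n - r)!), and checks it against an explicit enumeration. The field names are hypothetical and stand in for whatever attributes the data sources share.

```python
from itertools import combinations
from math import factorial

# Hypothetical fields that two data sources might have in common.
fields = ["first_name", "last_name", "birth_date", "zip_code"]

def n_choose_r(n, r):
    """Number of ways to choose r items from n, ignoring order."""
    return factorial(n) // (factorial(r) * factorial(n - r))

pairs = list(combinations(fields, 2))
print(len(pairs), n_choose_r(len(fields), 2))  # 6 6
print(factorial(len(fields)))                  # 24 possible orderings of the four fields
```

The counts grow quickly: with 10 fields there are already 45 possible pairs and 3,628,800 orderings.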
