top of page
Search
  • Writer's pictureTim Burns

Incoming for 2024: Clickhouse OLAP

Updated: Jan 3



I maintain a project that tracks playlist data on KEXP Radio playlists and publishes the latest trending music to my Tableau public dashboard. It's based on the Snowflake data platform. In the last year, the Snowflake cost me $219.51. Considering that most of the cost comes from optimized Snowpipe code that runs in less than a minute, Snowflake's warehouse costs are way too much.


Incoming for 2024: Clickhouse OLAP. I heard about Clickhouse while visiting my wonderful family in Cape Cod. I am lucky to be related to some brilliant developers - some by blood, and some by marriage, Christmas and Easter are always great times to trade ideas. Chris Burchardt introduced me to Clickhouse over some Swedish meatballs and Boomerang Fu. He and his wife Katie are the founders of the Resonant Coaching app.


So why not save $200 in the coming year by running my data transforms on Clickhouse locally? Developing on Snowflake is a white-knuckle operation: I'm always a stray query away from a $100 or more compute cost bill. It's a great model for them, but I don't think it serves their customers. I want to make 2024 the year I write about how to migrate off the Snowflake platform.


Here's the Clickhouse jumpstart on Kubernetes.

Local Installation Notes


Check to see if the pods are running.

(venv) timburns@NeverSummer clickhouse % make get-pods
kubectl get pods -n test-clickhouse-operator
NAME                                   READY   STATUS    RESTARTS       AGE
chi-simple-01-simple-0-0-0             1/1     Running   1 (6d1h ago)   6d4h
clickhouse-operator-5cfb45b958-ch8wl   2/2     Running   0              16m

Get the environment variables of the local cluster. You can check the values by rebuilding with small changes and check the values.

(venv) timburns@NeverSummer clickhouse % make get-env-vars
kubectl -n test-clickhouse-operator exec -it chi-simple-01-simple-0-0-0 -- env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=chi-simple-01-simple-0-0-0
 ...

Forward the ports and connect to the local instance on port 9000.

make forward-ports &
(venv) timburns@NeverSummer clickhouse % make client
clickhouse client -h localhost

Run a query to see the engine do its thing.


chi-simple-01-simple-0-0-0.chi-simple-01-simple-0-0.test-clickhouse-operator.svc.cluster.local :) SELECT *
FROM s3(
   'https://datasets-documentation.s3.eu-west-3.amazonaws.com/aapl_stock.csv',
   'CSVWithNames'
)
LIMIT 5;

SELECT *
FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/aapl_stock.csv', 'CSVWithNames')
LIMIT 5

Query id: 08b06e7f-4d47-4d7a-b84e-7786a9b56ae1

┌───────Date─┬────Open─┬────High─┬─────Low─┬───Close─┬───Volume─┬─OpenInt─┐
│ 1984-09-07 │ 0.42388 │ 0.42902 │ 0.41874 │ 0.42388 │ 23220030 │       0 │
│ 1984-09-10 │ 0.42388 │ 0.42516 │ 0.41366 │ 0.42134 │ 18022532 │       0 │
│ 1984-09-11 │ 0.42516 │ 0.43668 │ 0.42516 │ 0.42902 │ 42498199 │       0 │
│ 1984-09-12 │ 0.42902 │ 0.43157 │ 0.41618 │ 0.41618 │ 37125801 │       0 │
└────────────┴─────────┴─────────┴─────────┴─────────┴──────────┴─────────┘
┌───────Date─┬────Open─┬────High─┬─────Low─┬───Close─┬───Volume─┬─OpenInt─┐
│ 1984-09-13 │ 0.43927 │ 0.44052 │ 0.43927 │ 0.43927 │ 57822062 │       0 │
└────────────┴─────────┴─────────┴─────────┴─────────┴──────────┴─────────┘

5 rows in set. Elapsed: 6.925 sec. 

Joining the Development Team

Here is the link for joining the development team.


DB Compare

Here's a cool tool I learned about from a colleague comparing ClickHouse to some of the top column-oriented OLAP data engines.




29 views0 comments

Recent Posts

See All

Commentaires


bottom of page