Using LLM to Compress Columns
- Tim Burns
- May 28

Building categories and segments is a good task for an LLM. While looking for interesting forecasting trends in the Iowa Liquor Store data, I found that the categories and item descriptions leave much to be desired.
item_description | category_name |
MALIBU COCONUT RUM | FLAVORED RUM |
SMIRNOFF STRAWBERRY | AMERICAN FLAVORED VODKA |
PARAMOUNT WHITE RUM PET | WHITE RUM |
The LLM's task is to examine the item description and category and classify the item into segments.
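As a rough illustration, here is a minimal sketch of what such a classification prompt and its parsing could look like. The segment list, the JSON reply format, and the helper names `build_prompt` and `parse_reply` are assumptions made for this example; they are not the prompt or code used in this project.

```python
import json

# Hypothetical segment list; the real segments are not given in this post.
SEGMENTS = ["RUM", "VODKA", "WHISKEY", "TEQUILA", "GIN", "OTHER"]


def build_prompt(items):
    """Build one prompt asking for a segment per (description, category) pair."""
    lines = [
        "Classify each liquor item into exactly one of these segments: "
        + ", ".join(SEGMENTS) + ".",
        'Reply with a JSON list of objects with keys "item_description" and "segment".',
        "",
        "item_description | category_name",
    ]
    lines += [f"{desc} | {cat}" for desc, cat in items]
    return "\n".join(lines)


def parse_reply(reply_text):
    """Parse the JSON reply; any segment outside SEGMENTS is mapped to OTHER."""
    result = {}
    for row in json.loads(reply_text):
        segment = row["segment"].upper()
        result[row["item_description"]] = segment if segment in SEGMENTS else "OTHER"
    return result


items = [
    ("MALIBU COCONUT RUM", "FLAVORED RUM"),
    ("SMIRNOFF STRAWBERRY", "AMERICAN FLAVORED VODKA"),
    ("PARAMOUNT WHITE RUM PET", "WHITE RUM"),
]
prompt = build_prompt(items)
```

Constraining the reply to a fixed segment list and a JSON shape keeps the output easy to validate before it goes anywhere near the table.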
I have code that updates the ITEM_DIM table, augmenting it with segment columns that condense item_description and category_name. It works in four steps:
1. Query the original raw data for rows that have not been added yet or that have null values. As our LLM gets better, we will get better matches.
2. Create a prompt asking the LLM to classify the data into segments, using the selected columns.
3. Merge the new segments returned by the LLM back into a dataframe.
4. Merge the dataframe back into the ITEM_DIM table so we have a new column to augment the data.
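The loop described above can be sketched end to end. This is only an illustration under several assumptions: an in-memory SQLite table stands in for the real ITEM_DIM table, a plain dictionary stands in for the dataframe, and `classify_with_llm` is a hypothetical placeholder that uses a keyword rule so the example runs without calling any model.

```python
import sqlite3


def classify_with_llm(rows):
    """Placeholder for the LLM call: a keyword rule so the sketch runs offline."""
    segments = {}
    for item_description, category_name in rows:
        text = f"{item_description} {category_name}"
        if "RUM" in text:
            segments[item_description] = "RUM"
        elif "VODKA" in text:
            segments[item_description] = "VODKA"
        else:
            segments[item_description] = None  # leave unmatched for a later pass
    return segments


# Stand-in for the warehouse table (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_dim (item_description TEXT, category_name TEXT, segment TEXT)"
)
conn.executemany(
    "INSERT INTO item_dim VALUES (?, ?, ?)",
    [
        ("MALIBU COCONUT RUM", "FLAVORED RUM", None),
        ("SMIRNOFF STRAWBERRY", "AMERICAN FLAVORED VODKA", None),
        ("PARAMOUNT WHITE RUM PET", "WHITE RUM", "RUM"),  # already classified
    ],
)

# Select only the rows that still have no segment.
pending = conn.execute(
    "SELECT item_description, category_name FROM item_dim WHERE segment IS NULL"
).fetchall()

# Classify the pending rows and collect the results.
segments = classify_with_llm(pending)

# Write the non-null segments back to the table.
conn.executemany(
    "UPDATE item_dim SET segment = ? WHERE item_description = ?",
    [(seg, desc) for desc, seg in segments.items() if seg is not None],
)
conn.commit()

remaining = conn.execute(
    "SELECT COUNT(*) FROM item_dim WHERE segment IS NULL"
).fetchone()[0]
```

Because only rows with a null segment are selected, the job can be rerun safely: rows the classifier could not match stay null and are picked up again on the next pass.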
This is a good way to densify the data in the ITEM_DIM table for analysis. After merging the segments, we can drop sparse columns and create more meaningful ones.