
Using an LLM to Compress Columns

  • Writer: Tim Burns
  • May 28
  • 1 min read

A Family of Geese

Building categories and segments is a good task for an LLM. While looking for interesting forecasting trends in the Iowa Liquor Store data, I found that the categories and item descriptions leave much to be desired: they are free text, and the same kind of product shows up under many different spellings.

item_description          category_name
MALIBU COCONUT RUM        FLAVORED RUM
SMIRNOFF STRAWBERRY       AMERICAN FLAVORED VODKA
PARAMOUNT WHITE RUM PET   WHITE RUM

The LLM's task is to examine the item description and category and classify the item into segments.
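A minimal sketch of what such a classification prompt could look like, using only the Python standard library. The segment names (`spirit_type`, `flavor`), the prompt wording, and the JSON reply format are my assumptions for illustration; the actual segments and the LLM client are not shown here.

```python
import json

# Hypothetical segment names; the real segments may differ.
SEGMENT_FIELDS = ("spirit_type", "flavor")


def build_prompt(rows):
    """Build one classification prompt from (item_description, category_name) pairs."""
    lines = [
        "Classify each liquor item into segments.",
        "Return a JSON list with one object per item, in the same order,",
        "with the keys: " + ", ".join(SEGMENT_FIELDS) + ".",
        "Use null when a segment does not apply.",
        "",
        "Items:",
    ]
    for i, (description, category) in enumerate(rows, start=1):
        lines.append(
            f"{i}. item_description={description!r}; category_name={category!r}"
        )
    return "\n".join(lines)


def parse_segments(response_text, expected_count):
    """Parse the model's JSON reply and reject replies of the wrong shape."""
    data = json.loads(response_text)
    if not isinstance(data, list) or len(data) != expected_count:
        raise ValueError("expected a JSON list with one object per item")
    if not all(isinstance(item, dict) for item in data):
        raise ValueError("every list entry must be a JSON object")
    # Keep only the known segment keys; missing keys become None.
    return [{field: item.get(field) for field in SEGMENT_FIELDS} for item in data]


rows = [
    ("MALIBU COCONUT RUM", "FLAVORED RUM"),
    ("SMIRNOFF STRAWBERRY", "AMERICAN FLAVORED VODKA"),
    ("PARAMOUNT WHITE RUM PET", "WHITE RUM"),
]
prompt = build_prompt(rows)
print(prompt)
```

Asking for a fixed set of keys and validating the reply keeps a malformed answer from silently ending up in the table.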



I have code that updates the ITEM_DIM table: it augments the data by adding segment columns and reduces the item_description and category_name columns.


  1. Query the original raw data and look for rows that have not been added yet or that still have null values. As our LLM gets better, re-running this step will give us better matches.

  2. Create a prompt that asks the LLM to classify the data into segments, using the selected columns.

  3. Merge the new segments returned by the LLM back into a dataframe.

  4. Merge the dataframe back into the ITEM_DIM table so that the table gains the new segment column.
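The four steps above can be sketched roughly as follows. This is not the original code: it assumes pandas and an in-memory SQLite database as a stand-in for the real warehouse, a hypothetical `raw_items` source table, a single hypothetical `segment` column, and a keyword lookup (`classify_stub`) in place of the actual LLM call.

```python
import sqlite3

import pandas as pd


def classify_stub(description, category):
    """Hypothetical stand-in for the LLM call: a keyword lookup, not a model."""
    words = set(f"{description} {category}".upper().split())
    for segment in ("RUM", "VODKA", "WHISKEY"):
        if segment in words:
            return segment
    return None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_items (item_id INTEGER PRIMARY KEY, "
    "item_description TEXT, category_name TEXT)"
)
conn.execute(
    "CREATE TABLE item_dim (item_id INTEGER PRIMARY KEY, "
    "item_description TEXT, category_name TEXT, segment TEXT)"
)
conn.executemany(
    "INSERT INTO raw_items VALUES (?, ?, ?)",
    [
        (1, "MALIBU COCONUT RUM", "FLAVORED RUM"),
        (2, "SMIRNOFF STRAWBERRY", "AMERICAN FLAVORED VODKA"),
        (3, "PARAMOUNT WHITE RUM PET", "WHITE RUM"),
    ],
)
# Item 1 exists but has no segment, item 2 is missing, item 3 is already done.
conn.executemany(
    "INSERT INTO item_dim VALUES (?, ?, ?, ?)",
    [
        (1, "MALIBU COCONUT RUM", "FLAVORED RUM", None),
        (3, "PARAMOUNT WHITE RUM PET", "WHITE RUM", "RUM"),
    ],
)

# Step 1: raw rows that are not in item_dim yet, or that have a null segment.
pending = pd.read_sql_query(
    "SELECT r.item_id, r.item_description, r.category_name "
    "FROM raw_items r LEFT JOIN item_dim d ON d.item_id = r.item_id "
    "WHERE d.item_id IS NULL OR d.segment IS NULL",
    conn,
)

# Step 2: classify each pending row (the LLM call would go here).
segments = pd.DataFrame(
    {
        "item_id": pending["item_id"],
        "segment": [
            classify_stub(d, c)
            for d, c in zip(pending["item_description"], pending["category_name"])
        ],
    }
)

# Step 3: merge the new segments back into the dataframe.
merged = pending.merge(segments, on="item_id", how="left")

# Step 4: write the classified rows back into item_dim.
updates = [
    (int(row.item_id), str(row.item_description), str(row.category_name), str(row.segment))
    for row in merged.itertuples(index=False)
    if pd.notna(row.segment)
]
conn.executemany("INSERT OR REPLACE INTO item_dim VALUES (?, ?, ?, ?)", updates)
conn.commit()

result = conn.execute(
    "SELECT item_id, segment FROM item_dim ORDER BY item_id"
).fetchall()
print(result)
```

Because step 1 only selects missing or null rows, the job can be re-run safely: rows that were classified earlier are left alone, and rows the classifier could not handle stay null for the next pass.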


This is a good way to densify the data in the ITEM_DIM table for analysis. After merging the segments in, we can drop sparse columns and replace them with more meaningful ones.
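One way to find the sparse columns is to measure the fraction of nulls per column. The sketch below assumes pandas; the 50% threshold and the `vendor_note` column are made up for illustration.

```python
import pandas as pd


def drop_sparse_columns(df, max_null_fraction=0.5):
    """Return a copy of df without columns whose null share exceeds the threshold."""
    null_fraction = df.isna().mean()  # share of nulls per column, between 0 and 1
    keep = null_fraction[null_fraction <= max_null_fraction].index
    return df[keep].copy()


items = pd.DataFrame(
    {
        "item_description": ["MALIBU COCONUT RUM", "SMIRNOFF STRAWBERRY", "PARAMOUNT WHITE RUM PET"],
        "segment": ["RUM", "VODKA", "RUM"],
        "vendor_note": [None, None, "PET bottle"],  # hypothetical sparse column
    }
)
dense = drop_sparse_columns(items)
print(list(dense.columns))
```

The threshold is a judgment call; a column that is mostly null can still be worth keeping if the few values it holds matter for the analysis.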

 
 
 


©2019 by Owl Mountain Software, LLC. Proudly created with Wix.com
