Amazon Bedrock Part 4: AWS Cost and Usage Report (CUR) Summary from S3, using Bedrock, LangChain and Anthropic Claude

Diptiman Raichaudhuri
4 min read · Oct 16, 2023

--

Disclosure: All opinions expressed in this article are my own, and represent no one but myself and not those of my current or any previous employers.

This article is the 4th part of my Amazon Bedrock series.

Here are the links to the previous articles:

Amazon Bedrock Part 1 : RAG with Pinecone Vector Store, Anthropic Claude LLM and LangChain — Income Tax FAQ Q&A bot

Amazon Bedrock Part 2: Hindi Text Summarization using Amazon Bedrock and Anthropic Claude

Amazon Bedrock Part 3: Amazon Neptune Graph database Q&A LangChain Agent with Amazon Bedrock and Anthropic Claude

This part focuses on using Amazon Bedrock and LangChain to summarise AWS Cost and Usage Reports (CUR) stored in an S3 bucket.

I took a sample CUR dataset from the fantastic aws-cost-miner GitHub repo maintained by robsonbittencourt, here.

If you are not too familiar with AWS CUR, as per the AWS doc:

“AWS Cost and Usage Reports tracks your AWS usage and provides estimated charges associated with your account”

Essentially, the use-case is to get a financial summary of the CUR report.

Steps are:

  1. Upload the CUR file(s) to an S3 bucket.
  2. Create a SageMaker Studio notebook.
  3. Install the boto3, botocore, langchain and unstructured libraries.
  4. Use Bedrock to access the Anthropic Claude-v2 model.
  5. Create a prompt and get a financial summary.

First, I created an S3 bucket and uploaded the CSV file:

CUR Report in S3 bucket
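
If you would rather script the upload than use the console, a minimal sketch with boto3 could look like the following. The helper name `upload_cur_file` and the local file path are illustrative assumptions; the bucket and key simply mirror the ones used later in this article.

```python
def upload_cur_file(s3_client, local_path, bucket, key):
    """Upload one local CUR file to S3 and return its s3:// URI."""
    # upload_file(Filename, Bucket, Key) is the standard boto3 S3 client call.
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


# Example usage (assumes AWS credentials are configured and the bucket exists):
#   import boto3
#   s3 = boto3.client("s3")
#   upload_cur_file(s3, "cur_data.csv", "cur-data-diptiman", "cur_data.csv")
```

Passing the client in as a parameter keeps the helper easy to try out with a stub before pointing it at a real bucket.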

Then I created a SageMaker Studio notebook:

SageMaker Studio Console

I selected the Data Science kernel on an ml.t3.medium instance, which should be good enough for the prototype.

Then I installed the required libraries:

%pip install --upgrade --quiet boto3 botocore langchain tiktoken unstructured python-dotenv

Imported libraries :

import boto3
import os
import json
from dotenv import load_dotenv

You also need to set up your AWS credentials in a .env file and call load_dotenv() to load them. Add the following to the .env file:

AWS_ACCESS_KEY_ID = "<Your AWS account access key>"
AWS_SECRET_ACCESS_KEY = "<Your AWS account secret key>"

Then call load_dotenv():

load_dotenv()
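
Before creating any AWS client, it can help to confirm that the two variables were actually picked up. The check below is a small sketch using only the standard library; the helper name `missing_credentials` is my own and not part of python-dotenv or boto3.

```python
import os

REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=None):
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Run this after load_dotenv(); it prints nothing when both variables are set.
missing = missing_credentials()
if missing:
    print("Missing AWS credentials:", ", ".join(missing))
```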

Created a Bedrock runtime client to invoke the Claude model:

bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name="us-east-1"
)

Chose Claude-v2 as the model for the summarisation task :

modelId = 'anthropic.claude-v2'
accept = 'application/json'
contentType = 'application/json'

Selected langchain S3FileLoader to load the AWS CUR csv :

from langchain.document_loaders import S3FileLoader
loader = S3FileLoader("cur-data-diptiman", "cur_data.csv")

Here’s the langchain doc. To load more than one file, you could use the langchain S3DirectoryLoader instead of the S3FileLoader.

For S3FileLoader, the first parameter is the bucket name, followed by the file name (the object key).

Then loaded the CUR file :

doc = loader.load()
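
The loader returns a list of LangChain Document objects, each carrying its text in `page_content`. A small sketch for pulling out just the text is shown below; the helper name `combine_page_content` is hypothetical, and it relies only on that `page_content` attribute.

```python
def combine_page_content(docs):
    """Join the text of every loaded document, separated by blank lines."""
    return "\n\n".join(d.page_content for d in docs)


# Example usage with the list returned by loader.load():
#   cur_text = combine_page_content(doc)
#   print(len(doc), "document(s),", len(cur_text), "characters")
```

Checking the character count at this point gives an early hint of whether the report is small enough to send to the model in a single prompt.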

Created a prompt to generate a financial summary of the CUR report:

prompt_cur = f"""
\n\nHuman: Write a financial summary of the report.
{doc}

Assistant:
"""

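Interpolating `{doc}` directly places the Python representation of the Document list, metadata included, into the prompt. As an alternative sketch, the prompt can be built from the report text alone. The helper name `build_summary_prompt` is an assumption for illustration, not part of LangChain or Bedrock.

```python
def build_summary_prompt(report_text, instruction="Write a financial summary of the report."):
    """Build a Claude-v2 text-completion prompt around the raw report text."""
    # Claude-v2 text completions use the "\n\nHuman:" ... "\n\nAssistant:" turn format.
    return f"\n\nHuman: {instruction}\n\n{report_text.strip()}\n\nAssistant:"


# Example usage with the loaded document list:
#   prompt_cur = build_summary_prompt(doc[0].page_content)
```
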
Prepared the request payload for bedrock invoke_model() call :

body = json.dumps({
    "prompt": prompt_cur,
    "max_tokens_to_sample": 4096,
    "temperature": 0.5,
    "top_k": 250,
    "top_p": 0.5,
    "stop_sequences": ["\n\nHuman:"]
})

And invoked the Claude-v2 model:

response = bedrock_runtime.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())

print(response_body.get('completion'))
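
The payload and invocation steps can also be wrapped in one reusable function, which is handy once there are several reports to summarise. This is a sketch under the same parameters used above; the function name `summarise_with_claude` is hypothetical, and the client is passed in so that the Bedrock runtime client created earlier can be reused.

```python
import json


def summarise_with_claude(client, prompt, model_id="anthropic.claude-v2", max_tokens=4096):
    """Send a prompt to Claude-v2 via a Bedrock runtime client and return the completion text."""
    body = json.dumps({
        "prompt": prompt,
        "max_tokens_to_sample": max_tokens,
        "temperature": 0.5,
        "top_k": 250,
        "top_p": 0.5,
        "stop_sequences": ["\n\nHuman:"],
    })
    response = client.invoke_model(
        body=body,
        modelId=model_id,
        accept="application/json",
        contentType="application/json",
    )
    # response["body"] is a streaming object; read() returns the JSON bytes.
    payload = json.loads(response["body"].read())
    return payload.get("completion", "").strip()


# Example usage:
#   print(summarise_with_claude(bedrock_runtime, prompt_cur))
```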

A quick summary of the CUR report was generated by Claude-v2 as output response :

Here is a summary of the key financial information from the report:

- Total AWS costs: $1,347.4436

- Largest cost item:
- Amazon EC2 m4.10xlarge instance usage: $1060.1256

- Other major cost items:
- Amazon EC2 m3.xlarge on-demand instance usage: $5.3200
- Amazon EC2 m3.xlarge reserved instance usage: $0.3000

- Costs broken down by application:
- App A: $5.3200
- App B: $0.3000

- Time period covered:
- August 1, 2019 00:00:00 to August 31, 2019 23:59:59

The report covers Amazon EC2 instance usage across m3.xlarge and m4.10xlarge instance types, both on-demand and reserved. Total costs are $1,347.44 over the month of August 2019. The largest cost item is m4.10xlarge on-demand instance usage at $1060.13. Costs are broken down by two applications, with App A accounting for $5.32 and App B accounting for $0.30.

Impressive!

Developers can easily build a frontend that lets FinOps practitioners and account billing teams quickly summarise CUR reports, which are normally very granular and detailed.

But we should also keep checks and balances in place to ensure that all summary information is accurate when compared with the original report.
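
One simple check is to recompute the total directly from the CSV and compare it with the figure the model reports. The sketch below uses only the standard library. It assumes the cost lives in a column named `lineItem/UnblendedCost`, which is the usual CUR column name but may differ in the sample file, and both helper names are hypothetical.

```python
import csv
import io
from decimal import Decimal


def total_cost(csv_text, cost_column="lineItem/UnblendedCost"):
    """Sum one cost column of a CUR CSV, skipping blank cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total = Decimal("0")
    for row in reader:
        value = (row.get(cost_column) or "").strip()
        if value:
            total += Decimal(value)
    return total


def matches_reported_total(csv_text, reported, tolerance=Decimal("0.01"),
                           cost_column="lineItem/UnblendedCost"):
    """Return True when the recomputed total is within tolerance of the reported one."""
    difference = abs(total_cost(csv_text, cost_column) - Decimal(str(reported)))
    return difference <= tolerance


# Example usage (cur_csv_text is assumed to hold the raw CSV contents):
#   matches_reported_total(cur_csv_text, "1347.4436")
```

Decimal is used instead of float so that the comparison is not affected by binary rounding of currency values.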

Here’s the notebook attached.

Happy coding !
