Track LLM Cost with AWS Inference Profiles

You can break down your Bedrock costs by application, team, or environment using AWS Inference Profiles, without tagging each request.

Mavvrik automatically ingests cost allocation tags from your AWS Inference Profiles. When you tag a profile (e.g., team=mlops, app=chatbot), any GenAI spend flowing through it is visible in the Mavvrik GenAI Dashboard.

This makes it simple to segment and report on GenAI costs by application, team, use case, or environment—all without rewriting pipelines or manually tagging every call.

Why Use Application Inference Profiles?

Amazon Bedrock does not let you tag on-demand models directly, which makes it hard to attribute their cost. The workaround is to create an Application Inference Profile, assign tags to it, and link it to an on-demand model. Mavvrik then uses these tags to attribute spend in the GenAI Dashboard.

Step 1: Create an Application Inference Profile

Use the AWS CLI to create an inference profile and link it to a specific model. In this example, we link it to the Claude Opus 4 model and tag it with "app": "llm-data-generator-2".

# Replace account-id with your 12-digit AWS account ID.
aws bedrock create-inference-profile --region 'us-west-2' \
    --inference-profile-name 'profile-genaimavvrik' \
    --description 'profile-genaimavvrik' \
    --model-source '{"copyFrom": "arn:aws:bedrock:us-west-2:account-id:inference-profile/us.anthropic.claude-opus-4-20250514-v1:0"}' \
    --tags '[{"key": "app","value": "llm-data-generator-2"}]'

Expected Output:

{
    "inferenceProfileArn": "arn:aws:bedrock:us-west-2:AccountId:application-inference-profile/t0bnu66rse28",
    "status": "ACTIVE"
}

This creates the inference profile in the us-west-2 region, links it to the us.anthropic.claude-opus-4-20250514-v1:0 model, and applies the tag "app": "llm-data-generator-2". Note the inferenceProfileArn value in the output; you need it to invoke the model in Step 3.
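Before moving on, it can help to confirm that the profile exists and carries the tag. The following is a minimal sketch, not part of the original setup: verify_profile is a hypothetical helper name, and it assumes the AWS CLI is configured with permission to read Bedrock inference profiles.

```shell
# verify_profile is a hypothetical helper (an assumption, not an AWS command).
#   $1 = ARN of the application inference profile returned in Step 1
#   $2 = region (defaults to us-west-2, as in the example above)
verify_profile() {
    # Show the profile status and the model ARNs it routes to.
    aws bedrock get-inference-profile --region "${2:-us-west-2}" \
        --inference-profile-identifier "$1" \
        --query '{status: status, models: models[].modelArn}' || return 1

    # List the tags attached to the profile; the app tag should appear here.
    aws bedrock list-tags-for-resource --region "${2:-us-west-2}" \
        --resource-arn "$1"
}
```

Call it as verify_profile 'arn:aws:bedrock:us-west-2:account-id:application-inference-profile/t0bnu66rse28', substituting the ARN from your own output.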

Step 2: Activate the Cost Allocation Tag

Activate the tag in AWS Billing and Cost Management:

Open the AWS Billing and Cost Management Console.
Navigate to Cost Allocation Tags.
Locate the app tag and activate it.
Wait up to 24 hours for the tag to appear in the Mavvrik GenAI Dashboard.
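The console steps above can also be sketched with the AWS CLI. This is an illustrative alternative under a few assumptions: activate_cost_tag is a hypothetical helper name, the caller has Billing and Cost Management permissions (in an AWS Organization this is typically the management account), and the tag key has already shown up in billing data, which can itself take some time after the tagged profile is first used.

```shell
# activate_cost_tag is a hypothetical helper (an assumption, not an AWS command).
#   $1 = cost allocation tag key (defaults to "app", as in Step 1)
activate_cost_tag() {
    # Mark the tag key as an active cost allocation tag.
    aws ce update-cost-allocation-tags-status \
        --cost-allocation-tags-status "TagKey=${1:-app},Status=Active" || return 1

    # Read the status back to confirm the change was accepted.
    aws ce list-cost-allocation-tags --tag-keys "${1:-app}"
}
```

Running activate_cost_tag with no argument targets the app tag used in this guide; pass another key (for example, team) to activate a different tag.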

Step 3: Invoke the Model Using the Inference Profile

To invoke the model, use the ARN of the inference profile instead of the model ID:

aws bedrock-runtime converse --region 'us-west-2' \
    --model-id 'arn:aws:bedrock:us-west-2:account-id:application-inference-profile/t0bnu66rse28' \
    --messages '[{"role": "user", "content": [{"text": "text-input"}]}]'
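To see only the reply and the token counts for a call, the same command can be wrapped with a --query filter. This is a rough sketch: converse_via_profile is a hypothetical helper name, and the prompt is interpolated directly into the JSON, so it only suits simple prompts without double quotes or backslashes.

```shell
# converse_via_profile is a hypothetical helper (an assumption, not an AWS command).
#   $1 = ARN of the application inference profile
#   $2 = prompt text (must not contain double quotes or backslashes)
#   $3 = region (defaults to us-west-2)
converse_via_profile() {
    # Passing the profile ARN as --model-id routes the request through the
    # tagged profile; --query keeps only the reply text and token usage.
    aws bedrock-runtime converse --region "${3:-us-west-2}" \
        --model-id "$1" \
        --messages "[{\"role\": \"user\", \"content\": [{\"text\": \"$2\"}]}]" \
        --query '{reply: output.message.content[0].text, usage: usage}'
}
```

For example: converse_via_profile 'arn:aws:bedrock:us-west-2:account-id:application-inference-profile/t0bnu66rse28' 'text-input'.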

Step 4: Track Costs in Mavvrik Portal

After you invoke the model through the profile, its tags start appearing in the Mavvrik GenAI Dashboard. To track model usage, filter by tag in Mavvrik:

Navigate to GenAI > Cost.
Click the page-level GenAI cost filters.
Select the tags you want and apply them.
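If the tagged spend does not show up as expected, one optional cross-check on the AWS side is to ask Cost Explorer for costs grouped by the same tag key. The sketch below is an assumption-laden illustration rather than part of the Mavvrik workflow: cost_by_tag is a hypothetical helper name, and it assumes Cost Explorer is enabled and the tag was activated in Step 2.

```shell
# cost_by_tag is a hypothetical helper (an assumption, not an AWS command).
#   $1 = start date, inclusive (YYYY-MM-DD)
#   $2 = end date, exclusive (YYYY-MM-DD)
#   $3 = cost allocation tag key (defaults to "app")
cost_by_tag() {
    # Return daily unblended cost, with one group per value of the tag key.
    aws ce get-cost-and-usage \
        --time-period "Start=$1,End=$2" \
        --granularity DAILY \
        --metrics UnblendedCost \
        --group-by "Type=TAG,Key=${3:-app}"
}
```

For example, cost_by_tag 2025-06-01 2025-06-08 covers the first week of June; the dates here are placeholders.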