I am trying to invoke the Anthropic Claude 3.5 Sonnet v2 model in AWS Bedrock. Here is my code (in Python, using Boto3):

import boto3
import json

bedrock_runtime = boto3.client(service_name="bedrock-runtime")

body = json.dumps({
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello, world"}],
    "anthropic_version": "bedrock-2023-05-31"
})

response = bedrock_runtime.invoke_model(body=body, modelId="anthropic.claude-3-5-sonnet-20241022-v2:0")

print(response.get("body").read())

I get the error:

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Invocation of model ID anthropic.claude-3-5-sonnet-20241022-v2:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.

What does this mean, and how do I invoke the model through an inference profile?


2 Answers


As the error says, for this particular model you must provide the ID or ARN of an inference profile rather than the bare model ID. The easiest way to do this is to use the system-defined inference profile for this model. You can find it by running this AWS CLI command with credentials configured in the environment (or set via the standard flags):

aws bedrock list-inference-profiles

You will see this one in the JSON list:

{
  "inferenceProfileName": "US Anthropic Claude 3.5 Sonnet v2",
  "description": "Routes requests to Anthropic Claude 3.5 Sonnet v2 in us-west-2, us-east-1 and us-east-2.",
  "inferenceProfileArn": "arn:aws:bedrock:us-east-1:381492273274:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
  "models": [
    {
      "modelArn": "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
    },
    {
      "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
    },
    {
      "modelArn": "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
    }
  ],
  "inferenceProfileId": "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
  "status": "ACTIVE",
  "type": "SYSTEM_DEFINED"
}
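
If you would rather do the same lookup from Python, Boto3 exposes the equivalent call on the Bedrock control-plane client (a minimal sketch; it assumes a Boto3 version recent enough to include list_inference_profiles):

import boto3

# Note: the control-plane client is "bedrock", not "bedrock-runtime"
bedrock = boto3.client(service_name="bedrock")

# List the system-defined inference profiles and print their IDs and names
response = bedrock.list_inference_profiles(typeEquals="SYSTEM_DEFINED")
for profile in response["inferenceProfileSummaries"]:
    print(profile["inferenceProfileId"], "-", profile["inferenceProfileName"])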

Modify the invoke_model line in your code to specify the ID or ARN of the inference profile instead:

response = bedrock_runtime.invoke_model(body=body, modelId="us.anthropic.claude-3-5-sonnet-20241022-v2:0")
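
For completeness, here is the question's script with only the modelId changed (a minimal sketch; the final lines assume the standard Anthropic Messages response shape, with the generated text under content[0].text):

import boto3
import json

bedrock_runtime = boto3.client(service_name="bedrock-runtime")

body = json.dumps({
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello, world"}],
    "anthropic_version": "bedrock-2023-05-31"
})

# Pass the inference profile ID where the bare model ID used to be
response = bedrock_runtime.invoke_model(
    body=body,
    modelId="us.anthropic.claude-3-5-sonnet-20241022-v2:0"
)

# The response body is a stream; read it and decode the JSON payload
result = json.loads(response.get("body").read())
print(result["content"][0]["text"])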

You can also pass the ARN of an inference profile as the modelId itself while invoking the model. Note that invoke_model does not accept a prompt parameter; the prompt belongs in the JSON request body, as in the question's code:

response = bedrock_client.invoke_model(
    modelId="arn:aws:bedrock:us-east-1:<your-account-id>:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    body=body
)
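
Either form works as the modelId: the bare inference profile ID (us.anthropic.claude-3-5-sonnet-20241022-v2:0) or the full ARN, which is the inferenceProfileArn field shown in the list-inference-profiles output above.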
