Amazon EC2 Inf2 Instances

aws.amazon.com
Summary

Amazon EC2 Inf2 Instances provide high-performance, cost-effective compute capacity for deep learning inference, particularly for generative AI models. They are powered by AWS Inferentia2 chips and support AI applications including large language models, vision transformers, and content generation. The instances integrate with existing ML frameworks such as PyTorch and TensorFlow through the AWS Neuron SDK, and the larger sizes link multiple Inferentia2 chips with a direct inter-chip interconnect so that large-scale models can be deployed efficiently.
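
A minimal sketch of that framework integration, assuming the AWS Neuron SDK's PyTorch interface (torch_neuronx) and an illustrative Hugging Face classification model; the model name, input text, sequence length, and output path are assumptions for the example, not details from the listing.

```python
# Minimal sketch: compiling a model for Inferentia2 with the AWS Neuron SDK's
# PyTorch interface (torch_neuronx). The model name, input text, and sequence
# length below are illustrative assumptions, not details from the listing.
import torch
import torch_neuronx
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, torchscript=True)
model.eval()

# Trace against a fixed-shape example input; the compiled artifact executes on
# the instance's NeuronCores rather than the host CPUs.
example = tokenizer(
    "Inf2 instances target deep learning inference workloads.",
    return_tensors="pt",
    padding="max_length",
    max_length=128,
)
neuron_model = torch_neuronx.trace(
    model, (example["input_ids"], example["attention_mask"])
)

# Save the compiled model; it can be reloaded with torch.jit.load at serving time.
torch.jit.save(neuron_model, "model_neuron.pt")
```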

Features (7 of 13)

Must Have (4 of 5)

Conversational AI

API Access

Fine-Tuning & Custom Models

Enterprise Solutions

Safety & Alignment Framework

Other (3 of 8)

Image Generation

Code Generation

Multimodal AI

Research & Publications

Security & Red Teaming

Synthetic Media Provenance

Threat Intelligence Reporting

Global Affairs & Policy

Pricing (usage-based)

Inf2.xlarge

$0.76 per hour
  • 1 Inferentia2 Chip
  • 32 GB Accelerator Memory
  • 4 vCPU
  • 16 GiB Memory
  • EBS Only Storage
  • Up to 15 Gbps Network Bandwidth
  • Up to 10 Gbps EBS Bandwidth

Inf2.8xlarge

$1.97 per hour
  • 1 Inferentia2 Chip
  • 32 GB Accelerator Memory
  • 32 vCPU
  • 128 GiB Memory
  • EBS Only Storage
  • Up to 25 Gbps Network Bandwidth
  • 10 Gbps EBS Bandwidth

Inf2.24xlarge

$6.49 per hour
  • 6 Inferentia2 Chips
  • 192 GB Accelerator Memory
  • 96 vCPU
  • 384 GiB Memory
  • EBS Only Storage
  • Inter-Chip Interconnect: Yes
  • 50 Gbps Network Bandwidth
  • 30 Gbps EBS Bandwidth

Inf2.48xlarge

$12.98 per hour
  • 12 Inferentia2 Chips
  • 384 GB Accelerator Memory
  • 192 vCPU
  • 768 GiB Memory
  • EBS Only Storage
  • Inter-Chip Interconnect: Yes
  • 100 Gbps Network Bandwidth
  • 60 Gbps EBS Bandwidth
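
As a rough way to compare the sizes above, the sketch below estimates monthly On-Demand cost per instance from the listed hourly rates, assuming roughly 730 hours in a month and that the rates are per instance-hour; the rate table and helper function are hypothetical, not an AWS API.

```python
# Illustrative cost comparison for the Inf2 sizes listed above, assuming the
# prices are On-Demand per-instance-hour rates and ~730 hours in a month.
# The rate table and helper function are hypothetical, not an AWS API.
HOURLY_RATE_USD = {
    "inf2.xlarge": 0.76,
    "inf2.8xlarge": 1.97,
    "inf2.24xlarge": 6.49,
    "inf2.48xlarge": 12.98,
}
ACCELERATOR_MEMORY_GB = {
    "inf2.xlarge": 32,
    "inf2.8xlarge": 32,
    "inf2.24xlarge": 192,
    "inf2.48xlarge": 384,
}

def monthly_cost(instance_type: str, hours: float = 730.0) -> float:
    """Estimated monthly On-Demand cost for one instance of the given size."""
    return HOURLY_RATE_USD[instance_type] * hours

for itype in HOURLY_RATE_USD:
    cost = monthly_cost(itype)
    per_gb = cost / ACCELERATOR_MEMORY_GB[itype]
    print(f"{itype:>14}: ~${cost:,.0f}/month  (~${per_gb:.2f} per GB of accelerator memory)")
```
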
Rationale

Amazon EC2 Inf2 Instances are purpose-built for deep learning inference, specifically for generative AI models such as large language models (LLMs) and vision transformers. The product page explicitly cites use cases such as text summarization (conversational AI), code generation, and video and image generation, which align with several of the 'Must Have' and 'Other' features above. However, Inf2 provides the infrastructure for these capabilities rather than the capabilities themselves: it does not offer a safety and alignment framework or research publications as core product features, only the underlying compute for such applications. Enterprise-grade solutions are available through the broader EC2 offering.