Generative AI Architecture Patterns on AWS
- Oct 16, 2023
- 3 min read
In today's digital landscape, we are seeing Generative AI-enabled applications more and more frequently. Their use cases are virtually limitless, and their capabilities seem to grow by the day.
We may already find ourselves using Generative AI-enabled applications in our everyday lives, and I believe they will soon become a staple in every industry. But why is this? Why is Generative AI so useful?
Here are a few reasons I believe LLMs, and Generative AI in general, are so useful:
- Large Language Models (LLMs) and Foundation Models are highly capable, adaptable, and can be trained or fine-tuned to perform better at specific tasks.
- Interacting with a high-performing LLM is, in general, much like interacting with a skilled expert in a given industry.
- Beyond the typical chatbot use cases, LLMs can perform a wide variety of generative tasks, including code generation, document editing, entity redaction, and report writing. They can even serve as trusted advisors to an organization, provided they have enough context about the organization and the challenges you present them with.
That last bullet point is a very important one. It is, in fact, the premise of this blog post!
When we look at building Generative AI-enabled applications on AWS (and beyond), it's important to understand the high-level workflow we want to achieve, and frankly need to have, in order for the application to perform well.
Retrieval Augmented Generation (RAG)
What is it?
Earlier in this post I mentioned context, and how an LLM needs it to serve as that trusted advisor to your organization. The importance of having the right data, and enough of it, can't be overstated: the better the context you provide alongside your actual question or prompt, the better the result you are going to see.
For example, imagine you are working with a business advisor who knows nothing about your business or situation beyond what you tell him. He's an expert who has been advising businesses for 20+ years, so you know he knows a great deal about helping businesses solve their problems. It's now up to you to provide him with as much detail as possible so he can solve yours.
It's no different for an LLM. We must provide it with the information and context that are crucial to solving a given problem or producing meaningful output.
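At its core, this is what Retrieval Augmented Generation does: retrieved context is stitched into the prompt before it ever reaches the LLM. The sketch below shows only the prompt-assembly half of the pattern (the prompt wording and example snippets are illustrative; retrieval itself is covered later):

```python
# Minimal RAG prompt assembly: prepend retrieved context chunks to the
# user's question. Retrieval (vector or keyword search) happens upstream.

def build_rag_prompt(context_chunks: list[str], question: str) -> str:
    """Combine retrieved context with the user's question into one prompt."""
    context = "\n\n".join(context_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Hypothetical retrieved snippets about a fictional company.
prompt = build_rag_prompt(
    ["Acme Corp sells industrial sensors.", "Q3 revenue grew 12%."],
    "How did Acme perform last quarter?",
)
```

The LLM now answers from the supplied snippets rather than from its training data alone, which is exactly the "informed advisor" behavior described above.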
Below is an AWS Architecture Diagram I've created to show how you can effectively achieve this:

The High-Level Workflow
Identify Data Types, Structure, Format & Access Patterns:
Begin by understanding the nature of your data, its structure, and how users will access it. This information will guide your AWS architecture choices.
Data Collection and Ingestion:
Use AWS services like Amazon S3 for data storage and Amazon Textract for extracting text from documents. Implement AWS Lambda functions for automating data ingestion and transformation.
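As a sketch of this ingestion step, the hypothetical Lambda handler below is triggered by an S3 upload, runs Textract's `detect_document_text` on the object, and writes the extracted lines back to S3 (the `extracted/` key prefix and event shape assumptions are illustrative):

```python
def extract_lines(textract_response: dict) -> list[str]:
    """Pull the LINE blocks out of a Textract response."""
    return [
        block["Text"]
        for block in textract_response.get("Blocks", [])
        if block["BlockType"] == "LINE"
    ]

def handler(event, context):
    """Hypothetical Lambda handler for an S3 ObjectCreated trigger."""
    import boto3  # lazy import; the real call needs AWS credentials

    textract = boto3.client("textract")
    s3 = boto3.client("s3")

    record = event["Records"][0]["s3"]
    bucket, key = record["bucket"]["name"], record["object"]["key"]

    # Synchronous text detection (single-page docs; use the async
    # StartDocumentTextDetection API for large/multi-page files).
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    text = "\n".join(extract_lines(response))
    s3.put_object(Bucket=bucket, Key=f"extracted/{key}.txt", Body=text.encode())
    return {"statusCode": 200}
```

Keeping `extract_lines` as a pure function makes the parsing logic easy to unit-test without touching AWS.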
Data Transformation and Preparation:
Leverage Lambda functions to preprocess and clean data as it's ingested. For text data, consider using Hugging Face models such as BERT for text classification or PII redaction, and leveraging LLMs on Amazon Bedrock for summarization or generative tasks.
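For the Bedrock summarization piece, a minimal sketch might look like the following, using the Anthropic text-completion request format as it stood in late 2023 (`prompt` / `max_tokens_to_sample` / `completion`); the model ID is an assumption, and availability depends on your account and region:

```python
import json

MODEL_ID = "anthropic.claude-v2"  # assumed model; check your Bedrock console

def build_summarization_body(text: str, max_tokens: int = 512) -> str:
    """Build a Bedrock request body asking the model to summarize text."""
    return json.dumps({
        "prompt": f"\n\nHuman: Summarize the following document:\n{text}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
    })

def summarize(text: str) -> str:
    """Invoke the model via the bedrock-runtime client."""
    import boto3  # lazy import; the real call needs AWS credentials

    bedrock = boto3.client("bedrock-runtime")
    response = bedrock.invoke_model(
        modelId=MODEL_ID,
        body=build_summarization_body(text),
    )
    return json.loads(response["body"].read())["completion"]
```

A Lambda function running this kind of call fits naturally at the end of the ingestion pipeline, summarizing or classifying each document as it lands.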
Data Storage and Management:
Utilize Amazon S3 for scalable storage. If needed, consider Amazon RDS or Amazon DynamoDB for structured data. Implement Amazon ElastiCache for Redis to cache and optimize data access.
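The caching layer typically follows a cache-aside pattern: check the cache first, and only fall through to the slower datastore on a miss. In this sketch a plain dict stands in for Redis, and `fetch_document` is a hypothetical stand-in for an S3 or DynamoDB read; in production you would use redis-py's `get`/`set` against your ElastiCache endpoint:

```python
# Cache-aside sketch. A dict stands in for Redis (Amazon ElastiCache);
# swap in redis-py calls against your cluster endpoint in production.

cache: dict[str, str] = {}

def fetch_document(key: str) -> str:
    """Hypothetical slow read from S3 or DynamoDB."""
    return f"document-body-for-{key}"

def get_document(key: str) -> str:
    if key in cache:                 # cache hit: skip the datastore
        return cache[key]
    value = fetch_document(key)      # cache miss: read through
    cache[key] = value               # populate for subsequent reads
    return value
```

For frequently re-retrieved context chunks, this can cut both latency and datastore cost noticeably.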
Analytics and Processing:
Employ Amazon Bedrock for generating embeddings from text data as well as running inference against large textual datasets using various LLMs that are available through Bedrock.
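Generating embeddings usually means chunking documents first, then embedding each chunk. The sketch below uses Bedrock's Titan embeddings model (the model ID is an assumption; check what is enabled in your region), with the chunk sizes and overlap chosen purely for illustration:

```python
import json

def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks before embedding."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # overlap preserves context across boundaries
    return chunks

def embed(text: str) -> list[float]:
    """Embed one chunk via Bedrock (assumed model ID)."""
    import boto3  # lazy import; the real call needs AWS credentials

    bedrock = boto3.client("bedrock-runtime")
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]
```

The resulting vectors would then be written to whichever vector store the architecture uses, ready for similarity search at query time.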
Monitoring and Optimization:
Set up Amazon CloudWatch for monitoring system metrics and application logs. Use AWS Trusted Advisor to optimize resource usage and cost efficiency.
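Beyond the default service metrics, it's worth publishing custom metrics such as LLM inference latency. A sketch using CloudWatch's `put_metric_data` (the namespace and dimension names here are illustrative, not a convention):

```python
def build_latency_metric(model_id: str, latency_ms: float) -> dict:
    """Shape a custom CloudWatch metric datum for inference latency."""
    return {
        "MetricName": "InferenceLatency",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Unit": "Milliseconds",
        "Value": latency_ms,
    }

def publish_latency(model_id: str, latency_ms: float) -> None:
    """Publish the datum under an illustrative custom namespace."""
    import boto3  # lazy import; the real call needs AWS credentials

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace="GenAI/App",
        MetricData=[build_latency_metric(model_id, latency_ms)],
    )
```

Alarming on this metric gives you early warning when a model or prompt change degrades response times.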
Data Governance and Security:
Use AWS Identity and Access Management (IAM) to control access to resources. Enable encryption at rest and in transit using AWS Key Management Service (KMS). Track compliance with AWS Config and enforce organization-wide guardrails with AWS Organizations.
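For encryption at rest, writing objects with SSE-KMS is a one-parameter change on the S3 put. A sketch (bucket, key, and KMS alias are placeholders; `put_object`'s `ServerSideEncryption` and `SSEKMSKeyId` parameters are standard):

```python
def build_put_kwargs(bucket: str, key: str, body: bytes, kms_key_id: str) -> dict:
    """Kwargs for an S3 put encrypted with a customer-managed KMS key."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }

def put_encrypted(bucket: str, key: str, body: bytes, kms_key_id: str) -> None:
    import boto3  # lazy import; the real call needs AWS credentials

    boto3.client("s3").put_object(**build_put_kwargs(bucket, key, body, kms_key_id))
```

Setting a bucket-level default encryption policy achieves the same effect without per-call parameters, which is usually the simpler choice.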
Feedback and Iteration:
Gather feedback from users and stakeholders, and iterate on your architecture based on that feedback and evolving requirements.

