A detailed look at how I created my AI-powered digital clone
Example Chat Interaction
Here's an example of how the AI clone responds to various questions:
Chat with J. Sheets Clone
This example shows how the AI clone draws on the documents it has ingested to give detailed, context-aware answers about my professional experience, technical skills, and background.
Architecture Overview
graph TD
%% Data Ingestion Pipeline
subgraph DataIngestion["Data Ingestion Pipeline"]
Raw[Raw Data Sources]
Parser[Document Parser]
Chunker[Text Chunker]
Cleaner[Data Cleaner]
Embedder[Text Embedder]
end
style DataIngestion fill:#2E8B57,stroke:#2E8B57,color:#ffffff
%% Data Sources
subgraph Sources["Source Documents"]
Resume[Resume/CV]
Projects[Project Details]
Skills[Skills & Experience]
Background[Background Info]
end
style Sources fill:#4682B4,stroke:#4682B4,color:#ffffff
%% Data Flow for Ingestion
Sources --> Raw
Raw --> Parser
Parser --> Cleaner
Cleaner --> Chunker
Chunker --> Embedder
Embedder --> VectorDB
%% Frontend Components
subgraph Frontend["Frontend - React"]
UI[EnhancedAIChat Component]
Chat[Chat Interface]
Auth[Authentication]
ModelSelect[Model Selection]
FileUpload[File Upload]
end
style Frontend fill:#8B4513,stroke:#6fc3df,color:#ffffff
%% Backend Components
subgraph Backend["Backend - FastAPI"]
API[API Routes]
RAG[RAG Service]
VectorDB[Vector Store]
DB[Database]
end
style Backend fill:#8d82c4,stroke:#8d82c4,color:#ffffff
%% External Services
subgraph External["External Services"]
OpenAI[OpenAI API]
ChromaDB[ChromaDB]
end
style External fill:#2a2f4a,stroke:#ec8d81,color:#ffffff
%% Data Flow
UI --> |User Input| Chat
Chat --> |Send Message| API
Auth --> |Validate| Chat
ModelSelect --> |Select Model| Chat
FileUpload --> |Attach Files| Chat
API --> |Process Request| RAG
RAG --> |Generate Embeddings| VectorDB
RAG --> |Query| OpenAI
VectorDB --> |Store/Retrieve| ChromaDB
RAG --> |Store History| DB
OpenAI --> |Response| RAG
RAG --> |Format Response| API
API --> |Return Response| Chat
Chat --> |Display| UI
%% Component Details
subgraph Models["Models"]
O3[O3-Mini]
GPT35[GPT-3.5-Turbo]
GPT4[GPT-4]
end
style Models fill:#2a2f4a,stroke:#e7b788,color:#ffffff
ModelSelect --> Models
%% Data Storage
subgraph Storage["Storage"]
Messages[Chat Messages]
Sessions[Chat Sessions]
ResponseSources[Response Sources]
end
style Storage fill:#2a2f4a,stroke:#8ea9e8,color:#ffffff
DB --> Storage
%% RAG Implementation
subgraph RAGSystem["RAG Implementation"]
Embeddings[OpenAI Embeddings]
Retrieval[Document Retrieval]
QAChain[QA Chain]
PromptTemplate[Prompt Templates]
end
style RAGSystem fill:#4A0000,stroke:#4A0000,color:#ffffff
RAG --> RAGSystem
Components Breakdown
Data Ingestion Pipeline
The process begins with collecting and processing various documents about my background, experience, and knowledge:
- Raw Data Sources: Initial input of various document types
- Document Parser: Converts different file formats into plain text
- Data Cleaner: Removes irrelevant information and normalizes text
- Text Chunker: Splits documents into manageable chunks
- Text Embedder: Converts text chunks into vector embeddings
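The clean, chunk, and embed stages above can be sketched in a few lines. This is a minimal illustration, not the code used in the project: it assumes the parser has already produced plain text, the function names (`clean`, `chunk`, `embed`, `ingest`) and the chunk sizes are hypothetical, and `embed` is a toy hashed bag-of-words stand-in for a real embedding model such as the OpenAI embeddings the system actually uses.

```python
import hashlib
import math
import re


def clean(text: str) -> str:
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into character windows of `size` that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text: str, dim: int = 64) -> list[float]:
    """Toy embedding (assumption): hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for word in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def ingest(raw: str) -> list[tuple[str, list[float]]]:
    """Run clean -> chunk -> embed and return (chunk, vector) pairs."""
    return [(piece, embed(piece)) for piece in chunk(clean(raw))]
```

Overlapping windows are a common choice because a sentence cut at a chunk boundary still appears whole in one of the two neighboring chunks. A real pipeline would more likely split on sentence or token boundaries than on raw characters.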
Frontend Components
The user interface is built with React and includes:
- Chat Interface: Main interaction point for users
- Authentication: Ensures secure access
- Model Selection: Switches between the available models (O3-Mini, GPT-3.5-Turbo, GPT-4)
- File Upload: Support for document attachments
Backend Components
Powered by FastAPI, the backend handles:
- API Routes: RESTful endpoints for chat functionality
- RAG Service: Core retrieval-augmented generation
- Vector Store: Manages document embeddings
- Database: Stores chat history and sessions
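To make the database's role concrete, here is a small sketch of storing sessions and messages. The write-up does not say which database or schema the project uses, so everything here is an assumption: SQLite via Python's standard `sqlite3` module, and the table, column, and function names are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per chat session, one row per message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id INTEGER NOT NULL REFERENCES sessions(id),
    role TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
    content TEXT NOT NULL
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database (in-memory by default) and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def start_session(conn: sqlite3.Connection, title: str) -> int:
    """Create a chat session and return its id."""
    cur = conn.execute("INSERT INTO sessions (title) VALUES (?)", (title,))
    conn.commit()
    return cur.lastrowid


def add_message(conn: sqlite3.Connection, session_id: int, role: str, content: str) -> None:
    """Append one message to a session."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def history(conn: sqlite3.Connection, session_id: int) -> list[tuple[str, str]]:
    """Return (role, content) pairs for a session in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return rows.fetchall()
```

Keeping messages in their own table, keyed by session, lets the backend replay a conversation in order when it needs earlier turns as context.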
External Services
The system integrates with two external services:
- OpenAI API: Powers the language models and embeddings
- ChromaDB: Vector database for efficient similarity search
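Similarity search, the job delegated to the vector database, boils down to ranking stored vectors by their closeness to a query vector. The sketch below is a brute-force in-memory illustration of that idea using cosine similarity. It is not ChromaDB's API, and the names `cosine` and `top_k` are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: list[float], store: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Example: three two-dimensional vectors standing in for document embeddings.
store = {
    "resume": [1.0, 0.0],
    "projects": [0.7, 0.7],
    "hobbies": [0.0, 1.0],
}
nearest = top_k([1.0, 0.1], store, k=2)
```

A dedicated vector database replaces this linear scan with an index, which is what keeps lookups fast as the number of stored chunks grows.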
RAG Implementation
The retrieval-augmented generation (RAG) system includes:
- OpenAI Embeddings: Creates vector representations of text
- Document Retrieval: Finds relevant context for queries
- QA Chain: Orchestrates the question-answering process
- Prompt Templates: Structures the interaction with the AI
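The way these pieces fit together can be sketched as a single function: retrieve context, fill a prompt template, call the model, and return the answer with its sources. This is an illustrative outline under stated assumptions, not the project's implementation. The template wording and the names `build_prompt` and `answer` are hypothetical, and the retriever and the model are passed in as plain callables so the sketch does not depend on any particular library.

```python
from typing import Callable

# Hypothetical template: the wording is an assumption, not the project's prompt.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)


def build_prompt(question: str, chunks: list[str]) -> str:
    """Fill the template with the retrieved chunks and the user's question."""
    context = "\n---\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


def answer(
    question: str,
    retrieve: Callable[[str], list[str]],
    llm: Callable[[str], str],
) -> dict:
    """Minimal QA chain: retrieve -> build prompt -> generate."""
    chunks = retrieve(question)
    prompt = build_prompt(question, chunks)
    return {"answer": llm(prompt), "sources": chunks}


# Usage with stand-ins for the vector store lookup and the language model.
def fake_retrieve(question: str) -> list[str]:
    return ["Worked on RAG systems."]


def fake_llm(prompt: str) -> str:
    return "stub answer"


result = answer("What do you work on?", fake_retrieve, fake_llm)
```

Returning the retrieved chunks alongside the answer mirrors the "Response Sources" stored with each reply, and injecting the retriever and model makes the chain easy to test without network calls.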