AI Cloning Process - Jonathan Sheets

Example Chat Interaction

Here's an example of how the AI clone responds to various questions:

Chat with J. Sheets Clone

JS Clone [o3-mini]

I am Jonathan Sheets' AI clone. I run on OpenAI's o3-mini model with retrieval-augmented generation (RAG) so that I can imitate him and answer any questions you may have.

Tell me about your past leadership experiences and how they relate to this role you applied for?

JS Clone [o3-mini]

Without sounding like a try-hard... I've naturally found myself in leadership positions throughout my life. Hockey and Lacrosse Captain, Senior Class President, and Undergrad President of the UMich Equity Endowment fund are some past notables. In terms of this role, I've been in situations where I need to manage up (anticipating my Direct's needs), manage down (managing my team's performance), and manage across (collaborating with other teams/pods/co-workers). To lead effectively in the [REDACTED] position at [REDACTED COMPANY], I'd draw upon my past experiences to lead by example in order to hit the productivity benchmarks for my group and the company as a whole.

Can you tell me about your experience with AI / LLMs / Agent Creation?

JS Clone [o3-mini]

Right now, on a per-contract basis, I develop and implement custom AI agents and LLMs for healthcare and government clients through VERN AI out of MI. I was the 25th contributor to the open-source Auto-GPT project that was the most downloaded repo at the time. I did a graduate course on AI development from the University of Texas at Austin in 2024. I've worked extensively with PyTorch and TensorFlow, implementing various deep learning models for both computer vision and natural language processing tasks. At Jane Street, I developed ML models for predicting market movements and optimizing trading strategies. I'm particularly interested in reinforcement learning and its applications in algorithmic trading.

What programming languages are you most proficient in?

JS Clone [o3-mini]

I'd say my core programming expertise lies in Python and JavaScript. I can hammer out C++ if the end-goal calls for it. Obviously, I can build websites with HTML5 and CSS given you're on this site. I've built everything from data analysis pipelines to machine learning models using frameworks like PyTorch and TensorFlow. I'm also well-versed in JavaScript/TypeScript, particularly with React for frontend development. Additionally, I have experience with some Rust for systems programming and SQL for database management.

This example demonstrates how the AI clone draws on the source documents it retrieves through RAG to provide detailed, context-aware responses about my professional experience, technical skills, and background.

Architecture Overview

graph TD
    %% Data Ingestion Pipeline
    subgraph DataIngestion["Data Ingestion Pipeline"]
        Raw[Raw Data Sources]
        Parser[Document Parser]
        Chunker[Text Chunker]
        Cleaner[Data Cleaner]
        Embedder[Text Embedder]
    end
    style DataIngestion fill:#2E8B57,stroke:#2E8B57,color:#ffffff

    %% Data Sources
    subgraph Sources["Source Documents"]
        Resume[Resume/CV]
        Projects[Project Details]
        Skills[Skills & Experience]
        Background[Background Info]
    end
    style Sources fill:#4682B4,stroke:#4682B4,color:#ffffff

    %% Data Flow for Ingestion
    Sources --> Raw
    Raw --> Parser
    Parser --> Cleaner
    Cleaner --> Chunker
    Chunker --> Embedder
    Embedder --> VectorDB

    %% Frontend Components
    subgraph Frontend["Frontend - React"]
        UI[EnhancedAIChat Component]
        Chat[Chat Interface]
        Auth[Authentication]
        ModelSelect[Model Selection]
        FileUpload[File Upload]
    end
    style Frontend fill:#8B4513,stroke:#6fc3df,color:#ffffff

    %% Backend Components
    subgraph Backend["Backend - FastAPI"]
        API[API Routes]
        RAG[RAG Service]
        VectorDB[Vector Store]
        DB[Database]
    end
    style Backend fill:#8d82c4,stroke:#8d82c4,color:#ffffff

    %% External Services
    subgraph External["External Services"]
        OpenAI[OpenAI API]
        ChromaDB[ChromaDB]
    end
    style External fill:#2a2f4a,stroke:#ec8d81,color:#ffffff

    %% Data Flow
    UI --> |User Input| Chat
    Chat --> |Send Message| API
    Auth --> |Validate| Chat
    ModelSelect --> |Select Model| Chat
    FileUpload --> |Attach Files| Chat
    API --> |Process Request| RAG
    RAG --> |Generate Embeddings| VectorDB
    RAG --> |Query| OpenAI
    VectorDB --> |Store/Retrieve| ChromaDB
    RAG --> |Store History| DB
    OpenAI --> |Response| RAG
    RAG --> |Format Response| API
    API --> |Return Response| Chat
    Chat --> |Display| UI

    %% Component Details
    subgraph Models["Models"]
        O3[O3-Mini]
        GPT35[GPT-3.5-Turbo]
        GPT4[GPT-4]
    end
    style Models fill:#2a2f4a,stroke:#e7b788,color:#ffffff
    ModelSelect --> Models

    %% Data Storage
    subgraph Storage["Storage"]
        Messages[Chat Messages]
        Sessions[Chat Sessions]
        ResponseSources[Response Sources]
    end
    style Storage fill:#2a2f4a,stroke:#8ea9e8,color:#ffffff
    DB --> Storage

    %% RAG Implementation
    subgraph RAGSystem["RAG Implementation"]
        Embeddings[OpenAI Embeddings]
        Retrieval[Document Retrieval]
        QAChain[QA Chain]
        PromptTemplate[Prompt Templates]
    end
    style RAGSystem fill:#4A0000,stroke:#4A0000,color:#ffffff
    RAG --> RAGSystem

Components Breakdown

Data Ingestion Pipeline

The process begins with collecting and processing various documents about my background, experience, and knowledge:

Raw Data Sources: Initial input of various document types
Document Parser: Converts different file formats into plain text
Data Cleaner: Removes irrelevant information and normalizes text
Text Chunker: Splits documents into manageable chunks
Text Embedder: Converts text chunks into vector embeddings
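
The clean, chunk, and embed steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the clone: the function names, the word-based chunk sizes, and the hash-based toy_embed function are all hypothetical, and toy_embed is a stand-in for a real embedding model so that the sketch runs without any API key.

```python
import hashlib
import math
import re


def clean_text(raw: str) -> str:
    """Drop non-printable characters and collapse runs of whitespace."""
    kept = "".join(ch for ch in raw if ch.isprintable() or ch.isspace())
    return re.sub(r"\s+", " ", kept).strip()


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(chunk: str, dims: int = 16) -> list[float]:
    """Hypothetical stand-in embedder: hash words into buckets, then L2-normalize.

    A real pipeline would call an embedding model here instead.
    """
    vec = [0.0] * dims
    for word in chunk.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def ingest(raw: str) -> list[tuple[str, list[float]]]:
    """Clean -> chunk -> embed, in the same order as the steps listed above."""
    cleaned = clean_text(raw)
    return [(chunk, toy_embed(chunk)) for chunk in chunk_text(cleaned)]
```

Overlapping chunks are a common choice because a sentence that straddles a chunk boundary still appears whole in at least one chunk; the sizes used here are arbitrary.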

Frontend Components

The user interface is built with React and includes:

Chat Interface: Main interaction point for users
Authentication: Ensures secure access
Model Selection: Choose between different AI models
File Upload: Support for document attachments

Backend Components

The backend is built with FastAPI and includes:

API Routes: RESTful endpoints for chat functionality
RAG Service: Core retrieval-augmented generation logic
Vector Store: Manages document embeddings
Database: Stores chat history and sessions
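
To make the database's role concrete, here is a minimal sketch of storing chat sessions and messages. It assumes SQLite through Python's standard sqlite3 module purely for illustration; the page does not say which database the backend uses, and the table layout and function names below are hypothetical.

```python
import sqlite3
import uuid

# Hypothetical schema: one row per chat session, one row per message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id    TEXT PRIMARY KEY,
    model TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(id),
    role       TEXT NOT NULL,
    content    TEXT NOT NULL
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def start_session(conn: sqlite3.Connection, model: str) -> str:
    """Create a new chat session and return its id."""
    session_id = uuid.uuid4().hex
    conn.execute("INSERT INTO sessions (id, model) VALUES (?, ?)", (session_id, model))
    conn.commit()
    return session_id


def add_message(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    """Append one message ("user" or "assistant") to a session."""
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def get_history(conn: sqlite3.Connection, session_id: str) -> list[tuple[str, str]]:
    """Return (role, content) pairs for a session in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return rows.fetchall()
```

A session could be started with start_session(conn, "o3-mini"), with each user question and clone reply then recorded through add_message. The response sources shown in the diagram's Storage group are left out of this sketch.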

External Services

The system integrates with several external services:

OpenAI API: Powers the language models and embeddings
ChromaDB: Vector database for efficient similarity search
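
The similarity search that a vector database performs can be illustrated with a small in-memory stand-in. This sketch deliberately does not use the ChromaDB client; the InMemoryVectorStore class and its methods are hypothetical, and it ranks by brute-force cosine similarity, whereas a real vector database would typically use an index to avoid scanning every stored vector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database: stores (text, embedding) pairs."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, text: str, embedding: list[float]) -> None:
        self._items.append((text, embedding))

    def query(self, embedding: list[float], k: int = 2) -> list[tuple[str, float]]:
        """Return the k stored texts most similar to the query embedding."""
        scored = [(cosine_similarity(embedding, vec), text) for text, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(text, score) for score, text in scored[:k]]
```

For example, after adding "resume" at [1.0, 0.0], "projects" at [0.0, 1.0], and "skills" at [0.7, 0.7], a query with [1.0, 0.1] ranks "resume" first and "skills" second.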

RAG Implementation

The retrieval-augmented generation (RAG) system includes:

OpenAI Embeddings: Creates vector representations of text
Document Retrieval: Finds relevant context for queries
QA Chain: Orchestrates the question-answering process
Prompt Templates: Structures the interaction with the AI
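
How retrieval, the prompt template, and the model call fit together can be sketched as one small function. This is a rough outline under stated assumptions, not the clone's actual QA chain: the template wording, the build_prompt and answer functions, and the retrieve and generate callables are all hypothetical. In a real system, retrieve would query the vector store and generate would call the language model.

```python
from typing import Callable

# Hypothetical prompt template; the wording is an assumption for illustration.
PROMPT_TEMPLATE = (
    "You are answering as the person described in the context.\n"
    "Use only the context below; say so if it does not cover the question.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question: str, passages: list[str]) -> str:
    """Fill the template with numbered context passages and the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return PROMPT_TEMPLATE.format(context=context, question=question)


def answer(
    question: str,
    retrieve: Callable[[str, int], list[str]],
    generate: Callable[[str], str],
    k: int = 3,
) -> dict:
    """Retrieve context, build the prompt, and ask the model.

    `retrieve` and `generate` are injected so the chain can run with stubs.
    """
    passages = retrieve(question, k)
    prompt = build_prompt(question, passages)
    return {"answer": generate(prompt), "sources": passages}
```

Passing the retriever and the model in as plain callables keeps the chain testable with stubs and makes it straightforward to swap models, which matches the model selection option in the frontend. Returning the passages alongside the answer gives the caller what it needs to record response sources.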