{"id":239,"date":"2026-06-19T19:09:59","date_gmt":"2026-06-19T11:09:59","guid":{"rendered":"https:\/\/erishen.cn\/?page_id=239"},"modified":"2026-06-19T19:28:28","modified_gmt":"2026-06-19T11:28:28","slug":"building-rag-smart-chat-app","status":"publish","type":"page","link":"https:\/\/erishen.cn\/?page_id=239","title":{"rendered":"From Zero to One: Building a RAG-Powered Smart Chat Application"},"content":{"rendered":"<p><\/p>\n<p>With the rapid development of large AI models, building a practical chat application with advanced features has become a hot topic. This article shares the complete process of building a modern AI chat app from scratch, integrating <strong>RAG (Retrieval-Augmented Generation)<\/strong>, <strong>multi-turn conversation management<\/strong>, and <strong>local vector storage<\/strong>.<\/p>\n<p><\/p>\n<p><strong>Live Demo<\/strong>: <a href=\"https:\/\/chat.erishen.cn\">https:\/\/chat.erishen.cn<\/a><\/p>\n<p><\/p>\n<h2>Core Features<\/h2>\n<p><\/p>\n<h3>Intelligent Conversation<\/h3>\n<p>&#8211; <strong>Streaming responses<\/strong>: Real-time streaming via Vercel AI SDK<\/p>\n<p>&#8211; <strong>Multi-turn dialogue<\/strong>: Complete conversation history and context retention<\/p>\n<p>&#8211; <strong>Auto-titling<\/strong>: AI-generated conversation titles based on content<\/p>\n<p>&#8211; <strong>Persistence<\/strong>: Conversation history survives page refreshes<\/p>\n<p><\/p>\n<h3>RAG Document Retrieval<\/h3>\n<p>&#8211; <strong>Document upload<\/strong>: Supports TXT, MD formats<\/p>\n<p>&#8211; <strong>Smart chunking<\/strong>: Auto-splits long documents into semantic chunks<\/p>\n<p>&#8211; <strong>Client-side vectorization<\/strong>: Using @xenova\/transformers for local embedding generation<\/p>\n<p>&#8211; <strong>Semantic search<\/strong>: Cosine similarity-based intelligent document retrieval<\/p>\n<p>&#8211; <strong>Context enhancement<\/strong>: Combines retrieved documents for more accurate 
responses<\/p>\n<p><\/p>\n<h3>Modern UI\/UX<\/h3>\n<p>&#8211; <strong>Responsive design<\/strong>: Adapts to desktop and mobile screens<\/p>\n<p>&#8211; <strong>Theme switching<\/strong>: Light\/dark\/system themes<\/p>\n<p>&#8211; <strong>Component architecture<\/strong>: Tailwind CSS-based design system<\/p>\n<p>&#8211; <strong>Fluid animations<\/strong>: Polished interaction experience<\/p>\n<p><\/p>\n<h2>Technical Architecture<\/h2>\n<p><\/p>\n<h3>Frontend Stack<\/h3>\n<p>Next.js 15 + React 19 + TypeScript, Tailwind CSS 4, Vercel AI SDK<\/p>\n<p><\/p>\n<h3>RAG Architecture (Hybrid)<\/h3>\n<p>&#8211; <strong>Client side<\/strong>: Document processing + vectorization + local storage<\/p>\n<p>&#8211; <strong>Server side<\/strong>: Semantic search + context generation<\/p>\n<p><\/p>\n<h3>Core Libraries<\/h3>\n<p>@xenova\/transformers (client ML), localStorage (vector DB), cosine similarity, React Hooks<\/p>\n<p><\/p>\n<h2>Key Implementations<\/h2>\n<p><\/p>\n<h3>Multi-Turn Conversation Management<\/h3>\n<p><\/p>\n<pre><code>\nimport { useEffect, useState } from 'react'\n\nexport function useMultiTurnChat() {\n  const [currentConversationId, setCurrentConversationId] = useState&lt;string | null&gt;(null)\n  const [conversations, setConversations] = useState&lt;Conversation[]&gt;([])\n  \/\/ `messages` comes from the chat hook (omitted from this excerpt)\n\n  \/\/ Persist the active conversation whenever its messages change\n  useEffect(() =&gt; {\n    if (!currentConversationId || messages.length === 0) return\n    conversationManager.updateConversation(currentConversationId, {\n      messages: messages.map(msg =&gt; ({\n        id: msg.id,\n        role: msg.role as 'user' | 'assistant',\n        content: msg.content,\n        timestamp: new Date(),\n      }))\n    })\n  }, [messages, currentConversationId])\n\n  \/\/ The conversation helpers returned below are also omitted from this excerpt\n  return { messages, conversations, createNewConversation, switchConversation, deleteConversation }\n}\n<\/code><\/pre>\n<p><\/p>\n<h3>RAG Document Processing Pipeline<\/h3>\n<p><\/p>\n<pre><code>\nclass DocumentProcessor {\n  async processDocument(file: File): Promise&lt;ProcessedDocument&gt; {\n    const content = await 
this.readFileContent(file)\n    const chunks = await this.chunkText(content, { chunkSize: 500, chunkOverlap: 50 })\n    const embeddings = await this.generateEmbeddings(chunks)\n    \/\/ Build the document once so the same object is stored and returned\n    const processedDocument: ProcessedDocument = {\n      id: generateId(), title: file.name, content,\n      chunks: chunks.map((chunk, index) =&gt; ({\n        id: generateId(), content: chunk, embedding: embeddings[index]\n      }))\n    }\n    await vectorStore.addDocument(processedDocument)\n    return processedDocument\n  }\n}\n<\/code><\/pre>\n<p><\/p>\n<h3>Semantic Search with Cosine Similarity<\/h3>\n<p><\/p>\n<pre><code>\nclass LocalStorageVectorStore {\n  async search(queryEmbedding: number[], topK: number): Promise&lt;SearchResult[]&gt; {\n    const allChunks = this.getAllChunks()\n    \/\/ Score every stored chunk against the query embedding\n    const similarities = allChunks.map(chunk =&gt; ({\n      ...chunk,\n      similarity: this.cosineSimilarity(queryEmbedding, chunk.embedding)\n    }))\n    \/\/ Highest similarity first; keep the top K, then drop weak matches\n    return similarities\n      .sort((a, b) =&gt; b.similarity - a.similarity)\n      .slice(0, topK)\n      .filter(result =&gt; result.similarity &gt; 0.5)\n  }\n\n  private cosineSimilarity(a: number[], b: number[]): number {\n    const dotProduct = a.reduce((sum, val, i) =&gt; sum + val * b[i], 0)\n    const magnitudeA = Math.sqrt(a.reduce((sum, val) =&gt; sum + val * val, 0))\n    const magnitudeB = Math.sqrt(b.reduce((sum, val) =&gt; sum + val * val, 0))\n    const denominator = magnitudeA * magnitudeB\n    \/\/ A zero vector has no direction; return 0 instead of NaN\n    return denominator === 0 ? 0 : dotProduct \/ denominator\n  }\n}\n<\/code><\/pre>\n<p><\/p>\n<h2>Performance Optimizations<\/h2>\n<p><\/p>\n<p>&#8211; <strong>Client-side vectorization<\/strong>: Reduces server load; Web Workers keep the UI thread unblocked, and a local cache avoids recomputation<\/p>\n<p>&#8211; <strong>Smart chunking<\/strong>: 500 characters per chunk with a 50-character overlap, preserving document structure<\/p>\n<p>&#8211; <strong>Memory management<\/strong>: Lazy loading, LRU cache, proactive garbage collection<\/p>\n<p><\/p>\n<h2>Problems Solved<\/h2>\n<p><\/p>\n<p>&#8211; <strong>Page refresh data loss<\/strong>: Auto-save messages to localStorage on every change 
via useEffect<\/p>\n<p>&#8211; <strong>Vector similarity precision<\/strong>: Normalized embeddings, dynamic thresholds, multi-tier ranking<\/p>\n<p>&#8211; <strong>Large document performance<\/strong>: Web Workers process documents asynchronously so the UI thread is not blocked<\/p>\n<p><\/p>\n<h2>Performance Metrics<\/h2>\n<p>&#8211; <strong>First load<\/strong>: &lt; 2s<\/p>\n<p>&#8211; <strong>Response time<\/strong>: &lt; 500ms<\/p>\n<p>&#8211; <strong>Document processing<\/strong>: &lt; 3s for 1MB docs<\/p>\n<p>&#8211; <strong>Search latency<\/strong>: &lt; 100ms<\/p>\n<p><\/p>\n<h2>Key Takeaways<\/h2>\n<p><\/p>\n<p>Building this project gave me a deep understanding of the complete AI application development pipeline, from UI\/UX design and RAG system implementation to performance optimization and user experience. Implementing the RAG system in particular deepened my understanding of vector databases, semantic search, and context generation.<\/p>\n<p><\/p>\n<p><strong>Links<\/strong>: <a href=\"https:\/\/github.com\/erishen\/ai-chat\">GitHub<\/a> | <a href=\"https:\/\/chat.erishen.cn\">Live Demo<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>With the rapid development of large AI models, building 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"english_url":"","chinese_url":"https:\/\/erishen.cn\/?p=127","footnotes":""},"class_list":["post-239","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/pages\/239","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/erishen.cn\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=239"}],"version-history":[{"count":3,"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/pages\/239\/revisions"}],"predecessor-version":[{"id":255,"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/pages\/239\/revisions\/255"}],"wp:attachment":[{"href":"https:\/\/erishen.cn\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=239"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}