{"id":236,"date":"2026-06-19T19:09:36","date_gmt":"2026-06-19T11:09:36","guid":{"rendered":"https:\/\/erishen.cn\/?page_id=236"},"modified":"2026-06-19T19:26:47","modified_gmt":"2026-06-19T11:26:47","slug":"ai-analyze-five-design-decisions","status":"publish","type":"page","link":"https:\/\/erishen.cn\/?page_id=236","title":{"rendered":"AI-Analyze: Five Non-Obvious Design Decisions"},"content":{"rendered":"<p><\/p>\n<p>Most articles about code analysis tools talk about &#8220;how many languages they support, how many rules they have, how many vulnerabilities they can detect.&#8221; ai-analyze supports 7 languages, 12 security rules, and 4-dimensional quality scoring \u2014 these feature numbers are right there in the README. This article doesn&#8217;t rehash the feature list. Instead, it explores five easily overlooked yet deeply worthwhile design decisions \u2014 each with a &#8220;why not&#8221; story behind it.<\/p>\n<p><\/p>\n<h2>Decision 1: MCP Dual-Strategy Is Not About Performance \u2014 It&#8217;s About Security Boundaries<\/h2>\n<p><\/p>\n<p>ai-analyze implements two MCP integration strategies. On the surface, calling Serena&#8217;s Python API directly (in-process import) is faster and simpler, while the protocol-compliant stdio client (JSON-RPC 2.0 over subprocess communication) is slower and more cumbersome. It seems like a performance decision.<\/p>\n<p><\/p>\n<p>But the real key difference isn&#8217;t performance \u2014 it&#8217;s the <strong>security boundary<\/strong>. The direct-call strategy exposes all of Serena&#8217;s tools \u2014 including <code>rename_symbol<\/code> and <code>replace_symbol_body<\/code>, which can modify source code. Your code analyzer can rename symbols and replace function bodies, meaning a single misjudged analysis result could directly corrupt your code. The stdio client <strong>deliberately does not implement<\/strong> these two modification operations. 
The reason: the subprocess boundary naturally isolates write operations. The only thing that crosses the stdin\/stdout pipes is the set of requests the client chooses to send, so a write operation the client never implements can never reach the server. This is architecture constraining capability \u2014 not &#8220;I don&#8217;t want to implement this,&#8221; but &#8220;I shouldn&#8217;t let you do this.&#8221;<\/p>\n<p><\/p>\n<pre><code>\nimport asyncio\nimport json\n\n\nclass StdioMCPClient:\n    \"\"\"MCP JSON-RPC 2.0 over stdio transport\"\"\"\n\n    # Excerpt: self.server_command and self._next_id are assumed to be\n    # set up elsewhere in the class (not shown here).\n\n    async def connect(self):\n        # Spawn the MCP server; the two pipes are the only channel to it.\n        self.process = await asyncio.create_subprocess_exec(\n            *self.server_command,\n            stdin=asyncio.subprocess.PIPE,\n            stdout=asyncio.subprocess.PIPE,\n        )\n\n    async def send_request(self, method, params=None):\n        # Build a JSON-RPC 2.0 request envelope.\n        request = {\n            \"jsonrpc\": \"2.0\",\n            \"id\": self._next_id,\n            \"method\": method,\n            \"params\": params or {}\n        }\n        # One JSON message per line: write the request, then read one reply line.\n        self.process.stdin.write(json.dumps(request).encode() + b\"\\n\")\n        await self.process.stdin.drain()\n        response = await self.process.stdout.readline()\n        return json.loads(response.decode())\n<\/code><\/pre>\n<p><\/p>\n<p>The core judgment behind the two coexisting strategies: is your MCP service for yourself, or for others? For yourself, direct calls are more efficient, and you can trust yourself to avoid dangerous operations. For others, protocol compliance is mandatory \u2014 you can&#8217;t assume other people&#8217;s agents will correctly use modification capabilities. Constraining them through architectural boundaries is more reliable than constraining them through documentation.<\/p>\n<p><\/p>\n<h2>Decision 2: The Merge Layer Uses AST as the Backbone \u2014 Serena Is Just Supplementary Fields<\/h2>\n<p><\/p>\n<p>Serena provides symbol structure \u2014 class inheritance, function call chains, cross-file reference relationships. 
The AST analyzer provides complexity scoring, code smell detection, parameter analysis, and async\/static annotations. When merging the two data sources, which one takes the lead?<\/p>\n<p><\/p>\n<p>Intuition might pick Serena \u2014 symbol structure is the skeleton of code, while complexity and smells are attributes. But ai-analyze&#8217;s <code>UnifiedAnalyzer<\/code> chose AST as the backbone. The reason: AST produces <strong>actionable metrics<\/strong> \u2014 &#8220;this function&#8217;s cognitive complexity is 15, exceeding the threshold of 10&#8221; is a finding you can act on immediately. &#8220;This class is referenced by 3 modules&#8221; is structural information, but it can&#8217;t tell you what to do. The merge logic loops primarily through AST&#8217;s file list, attaching Serena&#8217;s symbol data as supplementary attributes.<\/p>\n<p><\/p>\n<p>Even more interesting: the <code>UnifiedSymbol<\/code> dataclass has a <code>serena_data<\/code> field, but the merge code <strong>never fills it<\/strong>. This isn&#8217;t a bug \u2014 it&#8217;s an architectural placeholder. The current pipeline can already produce useful analysis reports using AST data alone; Serena data is a future enhancement point that doesn&#8217;t block current functionality. Zero runtime overhead for architectural foresight.<\/p>\n<p><\/p>\n<p>The design philosophy this decision conveys: the main loop of data merging should revolve around actionable dimensions, not structural ones. Structural information provides context; metric information drives decisions.<\/p>\n<p><\/p>\n<h2>Decision 3: The AI Prompt Includes an &#8220;Anti-Hype&#8221; Instruction<\/h2>\n<p><\/p>\n<p>ai-analyze has three DeepSeek AI calls: code quality assessment, Docker strategy suggestions, and framework upgrade analysis. Every call&#8217;s prompt embeds structured facts from AST analysis \u2014 Top 5 complex functions, Top 10 code smells, Top 20 dependencies. 
The AI isn&#8217;t asked to discover complexity problems; it&#8217;s handed confirmed facts to interpret. This &#8220;rules-first, AI-interprets&#8221; pattern reduces hallucination risk: the AI is far less likely to claim &#8220;this project has no complexity issues&#8221; when the prompt explicitly lists the five most complex functions.<\/p>\n<p><\/p>\n<p>But the least obvious design decision lies in the framework upgrade analysis prompt. There&#8217;s one explicit instruction:<\/p>\n<p><\/p>\n<p>&#8220;If the current version is already the latest stable release, state that clearly \u2014 don&#8217;t push for an upgrade.&#8221;<\/p>\n<p><\/p>\n<p>This instruction counters a fundamental LLM tendency: <strong>always suggesting changes<\/strong>. LLM training data is full of &#8220;upgrade to the latest version&#8221; and &#8220;migrate to the new framework&#8221; suggestions because technical articles and Stack Overflow answers naturally lean toward recommending change. But in real-world scenarios, a React 18 project doesn&#8217;t need to upgrade to React 19 RC, and a Python 3.11 project doesn&#8217;t need to upgrade to 3.12. Without this instruction, the AI tends to recommend an upgrade anyway, even when the upgrade has no tangible benefit and even when it introduces compatibility risks.<\/p>\n<p><\/p>\n<p>This follows the same thinking as the temperature gradient across the three AI calls: code quality assessment uses <code>temperature=0.3<\/code> (suggestions can have some creativity), Docker strategy and framework upgrades use <code>temperature=0.2<\/code> (output must be conservative). The temperature gradient maps to consequence severity \u2014 a bad quality suggestion is merely annoying, a bad Docker configuration causes deployment failures, and a bad upgrade recommendation causes production incidents. 
The more severe the consequences, the lower the randomness.<\/p>\n<p><\/p>\n<h2>Decision 4: Security Dimension Weight Is Only 0.15 \u2014 Because It&#8217;s a &#8220;Weak Metric&#8221;<\/h2>\n<p><\/p>\n<p>The quality scoring formula: <code>overall = complexity \u00d7 0.25 + maintainability \u00d7 0.35 + reliability \u00d7 0.25 + security \u00d7 0.15<\/code><\/p>\n<p><\/p>\n<p>Why is maintainability weighted highest (0.35)? Because it&#8217;s the long-term cost driver: unmaintainable code gets harder to modify with every change, and the technical debt compounds. Complexity is a &#8220;current state&#8221; metric, reliability is a &#8220;risk&#8221; metric, and maintainability is a &#8220;trend&#8221; metric. Trends predict future costs, so they deserve the highest weight.<\/p>\n<p><\/p>\n<p>But the more interesting question is why security is weighted lowest (0.15). The security score is calculated as <code>max(0, 100 - code_smells \/\/ 2 * 10)<\/code>: every two code smells cost 10 points, so the smell count serves as a rough proxy for security issues. The code author acknowledged this in a comment: &#8220;should actually use a dedicated security analysis tool.&#8221; The security score is an <strong>immature metric<\/strong>. Weighting it at 0.25, on par with complexity and reliability, would give this rough proxy too much influence over the overall score. <strong>Giving a weak metric a low weight is more honest than giving it a high weight<\/strong>: you don&#8217;t want a smell-count-divided-by-two approximation deciding whether a project can be deployed.<\/p>\n<p><\/p>\n<p>The security scanner itself uses 12 regex-matched rules, with risk scoring using severity-weighted geometric weighting (INFO=0.1, LOW=0.25, MEDIUM=0.5, HIGH=0.75, CRITICAL=1.0) plus a saturation curve (10 CRITICAL findings = max score of 100, more than that doesn&#8217;t increase). 
Geometric weighting prevents the gaming strategy of &#8220;fixing a bunch of LOW-level issues to inflate the score,&#8221; while the saturation curve prevents large legacy projects from getting inflated scores just because they have many findings. These designs make the security scanner itself quite reliable, but the security dimension&#8217;s scoring formula is a rough proxy \u2014 so the weight has to be conservative.<\/p>\n<p><\/p>\n<h2>Decision 5: Cache Isn&#8217;t &#8220;Read From the Fastest Layer&#8221; \u2014 It&#8217;s &#8220;Auto-Migrate to Faster Layers on Read&#8221;<\/h2>\n<p><\/p>\n<p>The standard pattern for a three-level cache system (memory \u2192 file \u2192 Redis) is &#8220;write to all layers on write, read from the fastest layer on read.&#8221; ai-analyze adds an extra behavior: <strong>read-backfill<\/strong>. When a read hits L2 (file cache), the system copies that data to L1 (memory). When it hits L3 (Redis), it backfills to both L1 and L2.<\/p>\n<p><\/p>\n<p>This means the cache <strong>auto-warms up according to usage patterns<\/strong>. A read checks L1, then L2, then L3. If only L3 holds the entry, that hit backfills L1 and L2, so the next read in the same process is served from memory. If L1 is empty but the file cache still holds the entry (for example, after a restart), the read hits L2 and the entry is copied back into L1. No warmup steps needed, no manual migration \u2014 the cache naturally flows toward the faster layers with use.<\/p>\n<p><\/p>\n<p>Redis also has an &#8220;absorptive&#8221; fault-tolerance design: when the connection fails, <code>_client<\/code> is set to <code>None<\/code>, and every subsequent operation checks <code>if not self._client: return None<\/code>. If Redis completely goes down, no exceptions are thrown \u2014 the system silently degrades to L1+L2 and continues running. 
This doesn&#8217;t mean &#8220;Redis is optional&#8221; \u2014 it means &#8220;Redis failure should not block the analysis pipeline.&#8221;<\/p>\n<p><\/p>\n<p>The incremental analyzer has an additional layer of optimization: when merging cached results with new analysis results, it keeps deserialized Python objects in a memory-based <code>_file_result_cache<\/code> dictionary. The MultiLevelCache stores serialized JSON, but the merge phase needs objects. Deserialization is an expensive operation \u2014 keeping object state in memory avoids the overhead of repeatedly reading from cache and deserializing. This is a third, unlabeled cache layer, existing specifically for the merge phase.<\/p>\n<p><\/p>\n<hr\/>\n<p><\/p>\n<p>The common thread across these five decisions: <strong>every &#8220;why not&#8221; is more worth telling than the &#8220;how.&#8221;<\/strong> Don&#8217;t expose modification operations (architectural constraint); don&#8217;t center on structure (actionability first); don&#8217;t let AI push for upgrades (counteracting innate tendencies); don&#8217;t give weak metrics high weight (honesty over precision); don&#8217;t let caches wait passively (proactive warming). These decisions aren&#8217;t code implementation details \u2014 they&#8217;re expressions of design philosophy. 
They determine that ai-analyze isn&#8217;t just another analysis tool, but a system with genuine engineering judgment.<\/p>\n<p><\/p>\n<h2>Source Code Navigation<\/h2>\n<p><\/p>\n<table>\n<thead>\n<tr>\n<td>Module<\/td>\n<td>Source File<\/td>\n<td>Description<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>MCP Protocol Client<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/serena_stdio_client.py\">serena_stdio_client.py<\/a><\/td>\n<td>Deliberately excludes modification operations for security boundary<\/td>\n<\/tr>\n<tr>\n<td>MCP Direct Call Client<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/serena_client.py\">serena_client.py<\/a><\/td>\n<td>In-process full-feature calls, performance-first<\/td>\n<\/tr>\n<tr>\n<td>Unified Merge Layer<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/unified_analyzer.py\">unified_analyzer.py<\/a><\/td>\n<td>AST as backbone, Serena as supplement, serena_data placeholder<\/td>\n<\/tr>\n<tr>\n<td>DeepSeek AI Integration<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/tools\/ai_enhanced_analyzer.py\">ai_enhanced_analyzer.py<\/a><\/td>\n<td>Rules-first prompts, anti-hype instructions, temperature gradient<\/td>\n<\/tr>\n<tr>\n<td>Quality Scoring<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/quality_score.py\">quality_score.py<\/a><\/td>\n<td>Maintainability 0.35 highest, Security 0.15 low-weight honest proxy<\/td>\n<\/tr>\n<tr>\n<td>Security Scanner<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/security_scanner.py\">security_scanner.py<\/a><\/td>\n<td>Severity-weighted geometric scoring + saturation curve<\/td>\n<\/tr>\n<tr>\n<td>Three-Layer Cache<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/multi_level_cache.py\">multi_level_cache.py<\/a><\/td>\n<td>Read-backfill auto-warming, Redis absorptive fault 
tolerance<\/td>\n<\/tr>\n<tr>\n<td>Incremental Analysis<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/incremental_analyzer.py\">incremental_analyzer.py<\/a><\/td>\n<td>MD5 change detection + file cache progressive migration + memory object layer<\/td>\n<\/tr>\n<tr>\n<td>AST Analyzer<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/ast_analyzer.py\">ast_analyzer.py<\/a><\/td>\n<td>Single tree traversal replacing 3 traversals, longest-task-first scheduling<\/td>\n<\/tr>\n<tr>\n<td>Pipeline Orchestration<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/tools\/full_analyzer.py\">full_analyzer.py<\/a><\/td>\n<td>Skip flags as cost-control grid<\/td>\n<\/tr>\n<tr>\n<td>Docker Generation<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/tools\/docker_generator.py\">docker_generator.py<\/a><\/td>\n<td>Content detection over structure detection, Husky removal practical details<\/td>\n<\/tr>\n<tr>\n<td>Plugin System<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/plugin_system.py\">plugin_system.py<\/a><\/td>\n<td>shared_data pipeline, namespace collision prevention<\/td>\n<\/tr>\n<tr>\n<td>Exception System<\/td>\n<td><a href=\"https:\/\/github.com\/erishen\/ai-analyze\/blob\/main\/src\/exceptions.py\">exceptions.py<\/a><\/td>\n<td>error_code + context dict, parseable log format<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><\/p>\n<p>Project repository: <a href=\"https:\/\/github.com\/erishen\/ai-analyze\">https:\/\/github.com\/erishen\/ai-analyze<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Most articles about code analysis tools talk about &#038;#82 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"english_url":"","chinese_url":"https:\/\/erishen.cn\/?p=234","footnotes":""},"class_list":["post-236","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/pages\/236","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/erishen.cn\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=236"}],"version-history":[{"count":3,"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/pages\/236\/revisions"}],"predecessor-version":[{"id":252,"href":"https:\/\/erishen.cn\/index.php?rest_route=\/wp\/v2\/pages\/236\/revisions\/252"}],"wp:attachment":[{"href":"https:\/\/erishen.cn\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=236"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}