1 comment

  • IndieDev-Will 1 hour ago
    OP here. Just a quick breakdown of the technical architecture for those interested.

    I realized early on that using "One Giant System Prompt" was like hiring one chef to make Sushi, Pizza, and Pastries simultaneously. The result was always mediocre.

    So, I tore down the v1 architecture and rebuilt the backend using Next.js with what I call a "Kitchen Brigade" Agentic Workflow:

    1. The "Maître D'" (The Router Layer): Before any generation happens, a lightweight model intercepts the request. It doesn't generate the answer. It classifies the intent: Is this a Logic problem? A Creative writing task? Or an emotional complaint? This routing step is crucial for latency vs. quality trade-offs.

    2. Dynamic Prompt Assembly (The Assembler): Instead of a static system prompt, I assemble the "Order Ticket" dynamically based on the User Context.

    Input: User's selected "Depth Level" (ELI5 vs PhD) + Intent Class.

    Output: A constructed prompt that instructs the model on how to think, not just what to say.
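
    The assembler itself is boring code; the real iteration went into the instruction wording. A simplified sketch (the instruction strings below are illustrative, not what ships):

    ```typescript
    // Sketch of the "Order Ticket" assembler: build the system prompt
    // from the user's Depth Level and the router's Intent Class.
    type Depth = "eli5" | "phd";
    type Intent = "logic" | "creative" | "emotional";

    const DEPTH_INSTRUCTIONS: Record<Depth, string> = {
      eli5: "Explain with everyday analogies. Define any jargon before using it.",
      phd: "Assume graduate-level background. Be precise and name the underlying principles.",
    };

    const INTENT_INSTRUCTIONS: Record<Intent, string> = {
      logic: "Reason step by step from first principles before stating the answer.",
      creative: "Prioritize tone and narrative flow over exhaustive coverage.",
      emotional: "Acknowledge the user's feelings before offering any analysis.",
    };

    export function assembleSystemPrompt(depth: Depth, intent: Intent): string {
      // The point is telling the model HOW to think, not just what to say
      return [
        "You are a specialist explainer.",
        INTENT_INSTRUCTIONS[intent],
        DEPTH_INSTRUCTIONS[depth],
      ].join("\n");
    }
    ```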

    3. The Model Matrix (The Specialists): The backend routes to different models based on the Maître D's classification:

    DeepSeek/o1: For heavy logic, math, and "First Principles" derivations.

    Claude 3.5/Gemini: For high-EQ responses and nuanced creative explanations.

    GPT-4o-mini: For the quick, routine routing/classification work, to keep latency low.
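
    Under the hood the matrix is just a lookup table keyed on the Maître D's intent class. Simplified sketch (the model IDs are illustrative, and the production version wraps this in fallback/retry logic):

    ```typescript
    // Sketch of the model matrix: intent class -> specialist model.
    type Intent = "logic" | "creative" | "emotional";

    interface ModelChoice {
      provider: "deepseek" | "anthropic";
      model: string;
    }

    const MODEL_MATRIX: Record<Intent, ModelChoice> = {
      logic: { provider: "deepseek", model: "deepseek-reasoner" },
      creative: { provider: "anthropic", model: "claude-3-5-sonnet-latest" },
      emotional: { provider: "anthropic", model: "claude-3-5-sonnet-latest" },
    };

    export function pickModel(intent: Intent): ModelChoice {
      return MODEL_MATRIX[intent];
    }
    ```

    The route handler then just chains the three steps: classify the intent, assemble the prompt, look up the specialist model, and call the provider.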

    It’s all deployed on Vercel.

    I'm curious: For those of you building Agents, do you prefer this kind of "Hard Routing" (explicit logic) or do you trust the newer models to "Self-Route" via tool use? Would love to hear your thoughts on the architecture.