Evolving with AI: From Basic Chat to Custom Multi-Agent Orchestration

👤 DevOps Engineer & AI Practitioner

AI AgentsLLMGitHub CopilotPrompt EngineeringVS CodeMulti-Agent

When the generative AI wave hit in late 2022, the tech industry was flooded with hype, disruption, and — frankly — a bit of anxiety. Like many engineers, I initially wondered how AI would impact my role as a DevOps Engineer. But instead of shying away from it, I leaned in.

What follows is my journey of AI adoption: a progression from using AI as a glorified search engine to independently orchestrating autonomous, multi-agent workflows in enterprise codebases. My hope is that sharing this evolution will help you bypass the early hurdles and start utilizing AI as a true, predictable engineering partner.

Phase 1: The Upgraded Search Engine

My journey started where most do: basic chatbots. Initially, ChatGPT served as a replacement for standard web searches. Instead of spending hours trawling through forums and documentation to solve a problem or research a topic, I could ask a direct question and get an immediate, contextualized answer.

It was a huge time-saver. By skipping the tedious manual filtering of search results, I accelerated my daily research and information-gathering tasks significantly.

Phase 2: Embracing the Coding Assistant

As a DevOps engineer, setting up infrastructure was my primary domain, but occasionally I needed to write scripts or small applications. When tools like GitHub Copilot gained popularity, I originally resisted relying on auto-completion. If you are just learning to program, it is crucial to write code yourself to grasp the fundamentals.

However, I eventually discovered AI's true value as an unblocker. Whenever I hit a wall where my current programming knowledge wasn't enough, I turned to AI for code snippets and examples. It acted as an on-demand mentor. As I integrated working solutions into my VS Code environment, I learned by doing — exponentially speeding up my transition from pure DevOps to system-level programming.

Phase 3: The Epiphany of Boundaries (Next-Token Prediction)

As I moved from writing quick scripts to tackling enterprise-grade (brownfield) projects, I hit a bottleneck. Chatbots are excellent for greenfield projects (starting from scratch), but when I asked an AI agent to implement a feature in a mature codebase, the results were unpredictable. Depending on the model, or even the time of day, the AI would hallucinate or suggest code that violated our architecture.

The Key Insight

Once I fundamentally understood that Large Language Models (LLMs) are essentially advanced next-token predictors, everything clicked. An LLM predicts the next word based on the context you provide. If you give it a vague prompt, it operates in an infinitely wide boundary, leading to generic or highly variable results. By providing context, constraints, and specific goals, you narrow that boundary.

I realized: To get production-ready, predictable outputs, I needed to strictly define the AI's operational boundaries.

Phase 4: Mastering Reusable Prompts

Writing highly descriptive, context-rich prompts manually every time is tedious. The turning point was discovering reusable workspace prompts.

In VS Code, by creating a .github/prompts/ directory, I could template my instructions. For example, I created a python.prompts.md file detailing that the AI should act as a Principal Python Engineer, strictly follow PEP8, prefer pure functions for testability, and favor composition over inheritance.

From then on, all I had to do was type /python in Copilot Chat, followed by a simple instruction like "Create a JSON parser." The AI automatically inherited my exact engineering standards.

Taking It Further: Templating

I created parameterized prompts for repetitive refactoring, such as changing code casing. I created a prompt file where the source and target cases were variables. In chat, I could simply type: /change_case convert snake_case to camelCase in config.yaml. Writing one generic prompt with dynamic inputs made my workflow infinitely scalable.

Phase 5: Global Guardrails via `copilot-instructions.md`

Not every question requires a specialized prompt, but you always want the AI to understand your project's baseline context.

To solve this, I adopted the .github/copilot-instructions.md file. In this single file, you define the global guardrails for your repository: the primary language, the architecture patterns, the location of documentation, and specific folder structures.

Now, every single interaction you have with Copilot — whether using a specialized prompt, generating an inline edit, or just asking a quick chat question — carries this file's context. By simply having this file in your repository, you guarantee the AI acts consistently across your entire team.

Phase 6: Multi-Agent Orchestration

The absolute pinnacle of my AI journey has been implementing Custom Agents. Instead of executing tasks linearly and waiting for the AI to finish one step before asking it to do another, I structured my workflow using specialized AI personas defined via .github/agents/ files.

Here is the pipeline I orchestrated:

1. Planner Agent

Uses an advanced reasoning model (like Gemini 1.5 Pro). Given a feature request, its sole job is to write a highly detailed implementation plan and save it as a Markdown file in a /plans directory.

2. Coder Agent

Configured to run on a fast coding model. It is instructed to read the plan generated by the Planner Agent, execute the code changes exactly as written, and ask clarifying questions instead of making assumptions.

3. Reviewer Agent

Operates on a fresh context window to avoid bias. It reviews the committed code against best practices and generates a review report.

4. The Orchestrator

The "Manager" agent. When triggered, it takes a feature request, autonomously triggers the Planner, hands the output to the Coder, and finally calls the Reviewer — handling the end-to-end development cycle.

Pro-Tips for AI Power Users

Meta-Prompting (Make the AI Write the Prompts): Don't write your prompt and agent files entirely from scratch. Write a quick, messy draft, feed it to an LLM, and ask: "Refine this prompt to be more specific, precise, and token-efficient." Let the AI optimize its own instructions.
Leverage Audio (Wispr Flow): Building intricate prompts means a lot of writing. I use Wispr Flow (a highly accurate dictation tool) to speak my naturally complex thoughts, which drastically accelerates prompt generation.
Utilize "Gems" / Custom Personas: If you use web interfaces like Gemini, heavily lean into creating "Gems" (custom, pre-prompted personas). Let the tool auto-refine your instruction set, ensuring your web chats are just as bounded and predictable as your IDE workflows.

Conclusion

Over the last few years, I've gone from asking simple questions to managing a localized team of AI agents directly within my codebase. By understanding LLM boundaries and leaning into automated workflows, AI transformed from a slightly unpredictable text generator into a precise, reliable extension of my engineering capabilities.

If you take one thing away: Stop treating AI like a magic 8-ball. Treat it like a junior engineer. Give it explicit boundaries, detailed instructions, and review processes. When you do, the results will speak for themselves.

(Next on my radar: Exploring Claude Code to see how its agentic CLI features compare to my current setup!)