Jeff's blog

Observability part 2: AI assistance

Builds on Part 1 by creating an AI agent that reviews workflow logs and summarizes failures automatically. Defines tools for querying Loki, retrieving workflow metadata, and feeding the results into an agent with clear troubleshooting instructions. Shows example runs where the agent highlights root causes and suggests next steps after a Nextflow job error. Reflects on how automation can accelerate incident response and where the approach could evolve next.
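As a rough illustration of the Loki querying step, the sketch below builds a request URL for Loki's `query_range` HTTP endpoint. It is not the tool from the post: the function name, the `workflow` label, and the local base URL are assumptions made for the example.

```python
from urllib.parse import urlencode


def build_loki_query_url(base_url, logql, start_ns, end_ns, limit=100):
    """Build a URL for Loki's /loki/api/v1/query_range endpoint.

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = {
        "query": logql,
        "start": str(start_ns),
        "end": str(end_ns),
        "limit": str(limit),
    }
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


# Hypothetical label and filter: error lines from a workflow named "demo".
url = build_loki_query_url(
    "http://localhost:3100",
    '{workflow="demo"} |= "error"',
    start_ns=1_700_000_000_000_000_000,
    end_ns=1_700_000_600_000_000_000,
)
```

An agent tool would fetch this URL and pass the returned log lines to the model. Keeping URL construction separate from the HTTP call makes it easy to test.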

Introducing remoclip

Introduces remoclip, a Python package that provides clipboard access even when working on remote servers over SSH. Explains the client-server model behind the `remoclip copy` and `remoclip paste` commands and how they integrate with standard Unix piping. Shows how to secure the setup with tokens and use port forwarding so remote sessions can talk to the local clipboard safely. Highlights convenience features, cross-platform support, and future enhancements for heavy terminal users.
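The token check in a setup like this can be sketched in a few lines. This is a minimal illustration under stated assumptions, not remoclip's actual code: `generate_token` and `is_authorized` are hypothetical names, and how the real package stores or transmits its token is not shown here.

```python
import hmac
import secrets


def generate_token() -> str:
    """Create a random shared secret for the client and server (hypothetical helper)."""
    return secrets.token_urlsafe(32)


def is_authorized(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


token = generate_token()
```

The server would reject any copy or paste request whose token fails `is_authorized`. Port forwarding keeps the traffic inside the SSH session.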

Implementing the Variant Agent: Part 2

Continues the Variant Agent build by defining the variant query tool and the agent instructions that govern its use. Lays out clinical rules for rarity, inheritance patterns, and chromosome context so the agent reports only plausible variants returned by the tool. Demonstrates querying annotated variants by gene, consequence, ClinVar status, and frequency, with examples of the agent reasoning over results. Notes limitations of the prototype and opportunities to broaden the search strategies in future iterations.
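A query tool of this kind might look roughly like the sketch below, which filters an in-memory list by gene, consequence, ClinVar status, and frequency. The `Variant` fields, the function name, and its parameters are assumptions for illustration. The post's actual tool and data source may differ, and the clinical rules live in the agent instructions rather than in this filter.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Variant:
    # Hypothetical, simplified record of an annotated variant.
    gene: str
    consequence: str
    clinvar: str
    allele_frequency: float


def query_variants(
    variants: Iterable[Variant],
    gene: Optional[str] = None,
    consequence: Optional[str] = None,
    clinvar: Optional[str] = None,
    max_frequency: Optional[float] = None,
) -> List[Variant]:
    """Return the variants that match every filter that was supplied."""
    results = []
    for v in variants:
        if gene is not None and v.gene != gene:
            continue
        if consequence is not None and v.consequence != consequence:
            continue
        if clinvar is not None and v.clinvar != clinvar:
            continue
        if max_frequency is not None and v.allele_frequency > max_frequency:
            continue
        results.append(v)
    return results
```

Leaving every filter optional lets the agent start with a narrow query and widen it when nothing comes back.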

Implementing the Gene Agent

Focuses on the Gene Agent that turns HPO term IDs into a prioritized list of candidate genes. Implements a straightforward `ranked_genes_for_hpo_terms` tool using phenotype-to-gene annotations and pandas to surface associations. Explains the agent instructions that require grounding answers in tool output and providing concise reasoning. Notes the simplicity of the current approach while outlining ways the agent could become more sophisticated over time.
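The idea behind `ranked_genes_for_hpo_terms` can be sketched without pandas. The version below is an assumption-laden stand-in, not the post's implementation: it uses placeholder annotations and ranks genes by how many of the requested terms they are annotated with, which may not match the post's scoring.

```python
from collections import Counter

# Placeholder phenotype-to-gene annotations as (HPO term ID, gene symbol) pairs.
# These IDs and gene names are made up for illustration.
ANNOTATIONS = [
    ("HP:EXAMPLE1", "GENE_A"),
    ("HP:EXAMPLE1", "GENE_B"),
    ("HP:EXAMPLE2", "GENE_A"),
    ("HP:EXAMPLE3", "GENE_C"),
]


def ranked_genes_for_hpo_terms(hpo_ids, annotations=ANNOTATIONS):
    """Rank genes by the number of requested HPO terms they are annotated with."""
    wanted = set(hpo_ids)
    # De-duplicate so a repeated annotation row cannot inflate a gene's count.
    matched = {(term, gene) for term, gene in annotations if term in wanted}
    counts = Counter(gene for _, gene in matched)
    # Sort by descending count, then gene symbol for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = ranked_genes_for_hpo_terms(["HP:EXAMPLE1", "HP:EXAMPLE2"])
```

With the placeholder data, `GENE_A` ranks first because it matches both requested terms.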

Can an AI agent help diagnose genetic diseases?

Examines whether LLM agents can handle a first-pass genomic analysis for pediatric rare disease cases. Lays out a three-agent workflow that maps symptoms to HPO terms, ranks candidate genes, and searches variants with purpose-built tools and prompts. Tests the system on Rett syndrome and Maple Syrup Urine Disease examples, where the agents successfully surface the causal variants. Emphasizes keeping human analysts in the loop and envisions agents revisiting unsolved cases as genomic knowledge grows. Describes the architectural choices that make the workflow extensible over time.

Building a chat interface for your inbox

Extends the MCP concept with a Fastmail-specific server that requires both a bearer token and a user-provided API token for mailbox access. Details how to run the server, expose it via Cloudflare, and test it with OpenAI's remote MCP support. Introduces a small chat web app that uses the MCP tools so a model can search messages, retrieve content, and answer inbox questions. Covers security choices, demo flows, and ideas for future iterations of the email assistant.

The Model Context Protocol: extending LLMs with tools

Explains the Model Context Protocol and how tool access complements structured outputs for LLM-powered apps. Walks through building a simple FastMCP email server that can list messages and return content from a sample mailbox. Shows how to expose the server through a Cloudflare tunnel and connect it to OpenAI's remote MCP support so models can call the tools. Discusses security considerations and the flexibility MCP provides for richer, multi-step workflows.

Using the structured output feature of LLMs

Introduces the structured output feature for LLMs and how JSON schemas make it easier to pass model results into downstream systems. Builds a book recommendation demo with Pydantic models and the OpenAI Python SDK to enforce predictable responses. Expands to an Azure DevOps scenario that turns a user story into epics, features, and tasks, then creates the work items via the SDK. Shares practical lessons about validation, schema design, and chaining LLM outputs into existing workflows.
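The validation step is the part that makes structured output safe to hand to downstream code. The sketch below shows that step using only the standard library in place of Pydantic and the OpenAI SDK. The `BookRecommendation` fields and the sample reply are assumptions for illustration, not the post's schema.

```python
import json
from dataclasses import dataclass


@dataclass
class BookRecommendation:
    # Hypothetical schema for one recommendation.
    title: str
    author: str
    reason: str


def parse_recommendation(raw: str) -> BookRecommendation:
    """Parse a model's JSON reply and reject anything that does not fit the schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {"title": str, "author": str, "reason": str}
    missing = expected.keys() - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for name, expected_type in expected.items():
        if not isinstance(data[name], expected_type):
            raise ValueError(f"field {name!r} must be {expected_type.__name__}")
    return BookRecommendation(**{name: data[name] for name in expected})


# Placeholder reply standing in for a model response.
reply = '{"title": "Example Title", "author": "Example Author", "reason": "Matches the request."}'
book = parse_recommendation(reply)
```

Pydantic does the same checks declaratively and can also emit the JSON schema that is sent to the model, so one definition covers both the request and the validation.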