Posts

  • Observability part 2: AI assistance

    Builds on Part 1 by creating an AI agent that reviews workflow logs and summarizes failures automatically. Defines tools for querying Loki, retrieving workflow metadata, and feeding the results into an agent with clear troubleshooting instructions. Shows example runs where the agent highlights root causes and suggests next steps after a Nextflow job error. Reflects on how automation can accelerate incident response and where the approach could evolve next.
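
    As a rough illustration of the kind of tool the post describes, here is a minimal sketch of a log-query helper against Loki's `query_range` endpoint; the URL, label names, and function name are assumptions rather than the post's code:

    ```python
    import time

    import requests

    LOKI_URL = "http://localhost:3100"  # assumed local Loki instance from the Part 1 setup

    def query_workflow_logs(run_name: str, minutes: int = 60, limit: int = 200) -> list[str]:
        """Fetch recent log lines for one Nextflow run from Loki; label names are assumptions."""
        end = time.time_ns()
        start = end - minutes * 60 * 1_000_000_000
        resp = requests.get(
            f"{LOKI_URL}/loki/api/v1/query_range",
            params={
                "query": f'{{job="nextflow", run_name="{run_name}"}}',  # hypothetical labels
                "start": start,
                "end": end,
                "limit": limit,
            },
            timeout=30,
        )
        resp.raise_for_status()
        lines: list[str] = []
        for stream in resp.json()["data"]["result"]:
            lines.extend(line for _, line in stream["values"])
        return lines
    ```

    An agent given a tool like this can pull the relevant log lines itself before summarizing a failure.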

  • Observability in genomic workflows: Part 1

    Looks at how to improve observability for genomic workflows running Nextflow on a Slurm cluster. Spins up a dockerized Slurm environment with Nextflow, then adds centralized, metadata-rich logging with Loki and Grafana. Demonstrates how consolidated logs make it easier to troubleshoot without SSHing into cluster nodes. Lays the foundation for AI-assisted troubleshooting covered in the follow-up post.
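
    A minimal sketch of the labeled-stream model Loki uses, pushing one log line with metadata labels over its push API; the URL and label names are placeholders, and in practice a log collector ships the lines rather than hand-written pushes:

    ```python
    import time

    import requests

    LOKI_URL = "http://localhost:3100"  # assumed local Loki instance

    def push_log_line(line: str, labels: dict[str, str]) -> None:
        """Push a single log line to Loki with metadata labels attached to its stream."""
        payload = {
            "streams": [
                {
                    "stream": labels,  # e.g. {"job": "nextflow", "node": "slurm-node-1"}
                    "values": [[str(time.time_ns()), line]],
                }
            ]
        }
        requests.post(f"{LOKI_URL}/loki/api/v1/push", json=payload, timeout=10).raise_for_status()

    # Labels here are illustrative; they are what later make the logs searchable in Grafana.
    push_log_line("task hello (1) completed", {"job": "nextflow", "node": "slurm-node-1"})
    ```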

  • Introducing remoclip

    Introduces remoclip, a Python package that provides clipboard access even when working on remote servers over SSH. Explains the client-server model behind the `remoclip copy` and `remoclip paste` commands and how they integrate with standard Unix piping. Shows how to secure the setup with tokens and use port forwarding so remote sessions can talk to the local clipboard safely. Highlights convenience features, cross-platform support, and future enhancements for heavy terminal users.
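
    A toy sketch of the general client-server clipboard pattern, not remoclip's actual code: a tiny HTTP server that stores whatever is copied and returns it on paste. The port is made up, and the real package adds system-clipboard integration and token auth as described above.

    ```python
    # Toy illustration of the client-server clipboard idea -- not remoclip's implementation.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    CLIPBOARD = {"text": ""}

    class ClipboardHandler(BaseHTTPRequestHandler):
        def do_POST(self):  # "copy": store the request body
            length = int(self.headers.get("Content-Length", 0))
            CLIPBOARD["text"] = self.rfile.read(length).decode()
            self.send_response(204)
            self.end_headers()

        def do_GET(self):  # "paste": return the stored text
            body = CLIPBOARD["text"].encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        # Bind to localhost; an SSH port forward is what lets a remote session reach this server.
        HTTPServer(("127.0.0.1", 8377), ClipboardHandler).serve_forever()
    ```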

  • Getting started with AI

    Offers a beginner-friendly primer on the OpenAI API, structured outputs, and custom tool use. Recommends Python and covers environment variable configuration, SDK setup, and best practices for experimentation. Walks through text and multimodal examples that demonstrate core API patterns and how to ground responses in schemas. Aims to equip readers to move beyond the ChatGPT UI and start building their own AI-powered tools.
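
    A minimal example of the first step the post covers: calling the OpenAI Python SDK with the API key taken from the `OPENAI_API_KEY` environment variable (the model name is illustrative):

    ```python
    from openai import OpenAI

    # The SDK reads OPENAI_API_KEY from the environment by default.
    client = OpenAI()

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any current chat model works; the name is illustrative
        messages=[
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": "Explain what an API key is in one sentence."},
        ],
    )
    print(response.choices[0].message.content)
    ```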

  • Genomics AI Agent Workflow release

    Announces the release of the open-source genomics rare disease agent workflow on GitHub. Notes that an OpenAI API key and Nirvana-annotated variants are all that's needed to experiment with the agents. Points to instructions for annotating a VCF and links back to the posts that explain each workflow component in depth.

  • Implementing the Variant Agent: Part 2

    Continues the Variant Agent build by defining the variant query tool and the agent instructions that govern its use. Lays out clinical rules for rarity, inheritance patterns, and chromosome context so the agent reports only plausible variants returned by the tool. Demonstrates querying annotated variants by gene, consequence, ClinVar status, and frequency, with examples of the agent reasoning over results. Notes limitations of the prototype and opportunities to broaden the search strategies in future iterations.
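
    A hedged sketch of the sort of query such a tool might run against the DuckDB database built in Part 1; the table and column names are assumptions, not the post's schema:

    ```python
    import duckdb

    con = duckdb.connect("variants.duckdb")  # database file name is an assumption

    def query_rare_variants(gene: str, max_af: float = 0.01):
        """Return rare variants in one gene; table and column names are illustrative."""
        return con.execute(
            """
            SELECT chrom, pos, ref, alt, consequence, clinvar_significance, gnomad_af
            FROM variants
            WHERE gene_symbol = ?
              AND (gnomad_af IS NULL OR gnomad_af < ?)
            ORDER BY gnomad_af NULLS FIRST
            """,
            [gene, max_af],
        ).fetchdf()

    print(query_rare_variants("MECP2"))  # MECP2 is associated with Rett syndrome
    ```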

  • Implementing the Variant Agent: Part 1

    Kicks off the Variant Agent implementation by preparing real sequencing data the agent can query. Walks through pulling a public Colombian family trio VCF, filtering to exome targets, and annotating variants with Illumina's Nirvana. Loads the annotated variants into DuckDB via SQLAlchemy to enable fast, flexible searches. Sets up the data foundation needed for the Variant Agent's search tool covered in the next post.
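
    A rough sketch of the loading step, assuming the Nirvana output has already been flattened to a tabular file; the file names, columns, and the use of the duckdb-engine dialect are assumptions:

    ```python
    import pandas as pd
    from sqlalchemy import create_engine, text

    # Requires the duckdb-engine package, which provides the "duckdb://" SQLAlchemy dialect.
    engine = create_engine("duckdb:///variants.duckdb")

    # Hypothetical intermediate file with one row per variant annotation.
    variants = pd.read_parquet("annotated_variants.parquet")

    variants.to_sql("variants", engine, if_exists="replace", index=False)

    with engine.connect() as conn:
        print(conn.execute(text("SELECT COUNT(*) FROM variants")).scalar())
    ```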

  • Implementing the Gene Agent

    Focuses on the Gene Agent that turns HPO term IDs into a prioritized list of candidate genes. Implements a straightforward `ranked_genes_for_hpo_terms` tool using phenotype-to-gene annotations and pandas to surface associations. Explains the agent instructions that require grounding answers in tool output and providing concise reasoning. Notes the simplicity of the current approach while outlining ways the agent could become more sophisticated over time.
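
    One plausible shape for such a tool, using HPO's phenotype-to-gene annotation file and pandas; the file layout and column names vary by HPO release and are assumptions, not the post's exact implementation:

    ```python
    import pandas as pd

    # HPO's phenotype_to_genes.txt; column names are assumptions and vary by release.
    annotations = pd.read_csv("phenotype_to_genes.txt", sep="\t")

    def ranked_genes_for_hpo_terms(hpo_ids: list[str]) -> pd.DataFrame:
        """Rank genes by how many of the query HPO terms they are annotated to."""
        hits = annotations[annotations["hpo_id"].isin(hpo_ids)]
        return (
            hits.groupby("gene_symbol")["hpo_id"]
            .nunique()
            .sort_values(ascending=False)
            .rename("matched_terms")
            .reset_index()
        )

    # Example HPO term IDs: HP:0001250 (Seizure), HP:0001263 (Global developmental delay).
    print(ranked_genes_for_hpo_terms(["HP:0001250", "HP:0001263"]).head())
    ```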

  • Implementing the HPO Agent

    Dives into the HPO Agent that converts free-text symptom descriptions into precise HPO terms for the genomics workflow. Breaks down the instructions that force tool use, forbid guessing, and require reasoning tied to search results. Builds a vector database of HPO terms and an `hpo_search` tool so the agent can reliably match phenotypes. Shares example outputs, pitfalls, and ideas for improving recall without sacrificing specificity.
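
    A simplified sketch of the matching idea, using OpenAI embeddings and cosine similarity over a handful of terms; the post's actual vector store and index are not reproduced here:

    ```python
    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    # A few HPO terms for illustration; the real index would cover the whole ontology.
    HPO_TERMS = {
        "HP:0001250": "Seizure",
        "HP:0001263": "Global developmental delay",
        "HP:0004322": "Short stature",
    }

    def embed(texts: list[str]) -> np.ndarray:
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([item.embedding for item in resp.data])

    term_ids = list(HPO_TERMS)
    term_vectors = embed([HPO_TERMS[t] for t in term_ids])

    def hpo_search(symptom: str, top_k: int = 3) -> list[tuple[str, str, float]]:
        """Return the HPO terms whose names are most similar to the free-text symptom."""
        query = embed([symptom])[0]
        scores = term_vectors @ query / (
            np.linalg.norm(term_vectors, axis=1) * np.linalg.norm(query)
        )
        best = np.argsort(scores)[::-1][:top_k]
        return [(term_ids[i], HPO_TERMS[term_ids[i]], float(scores[i])) for i in best]

    print(hpo_search("the child has frequent convulsions"))
    ```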

  • Can an AI agent help diagnose genetic diseases?

    Examines whether LLM agents can handle a first-pass genomic analysis for pediatric rare disease cases. Lays out a three-agent workflow that maps symptoms to HPO terms, ranks candidate genes, and searches variants with purpose-built tools and prompts. Tests the system on Rett syndrome and Maple Syrup Urine Disease examples, where the agents successfully surface the causal variants. Emphasizes keeping human analysts in the loop and envisions agents revisiting unsolved cases as genomic knowledge grows. Describes the architectural choices that make the workflow extensible over time.
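
    A structural sketch of how the three agents chain together; the function names and signatures below are placeholders for the agents detailed in the per-agent posts, not the project's actual interfaces:

    ```python
    # Placeholder functions stand in for the three agents; signatures are illustrative only.

    def hpo_agent(symptom_text: str) -> list[str]:
        """Map free-text symptoms to HPO term IDs (see the HPO Agent post)."""
        raise NotImplementedError

    def gene_agent(hpo_ids: list[str]) -> list[str]:
        """Rank candidate genes for the HPO terms (see the Gene Agent post)."""
        raise NotImplementedError

    def variant_agent(genes: list[str]) -> list[dict]:
        """Search annotated variants in the candidate genes (see the Variant Agent posts)."""
        raise NotImplementedError

    def first_pass_analysis(symptom_text: str) -> list[dict]:
        """Chain the agents: symptoms -> HPO terms -> candidate genes -> candidate variants."""
        hpo_ids = hpo_agent(symptom_text)
        genes = gene_agent(hpo_ids)
        return variant_agent(genes)
    ```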

  • Building a chat interface for your inbox

    Extends the MCP concept with a Fastmail-specific server that requires both a bearer token and a user-provided API token for mailbox access. Details how to run the server, expose it via Cloudflare, and test it with OpenAI's remote MCP support. Introduces a small chat web app that uses the MCP tools so a model can search messages, retrieve content, and answer inbox questions. Covers security choices, demo flows, and ideas for future iterations of the email assistant.
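
    A hedged sketch of wiring a model to such a server through the remote MCP tool type in OpenAI's Responses API; the server URL, label, and header values are placeholders:

    ```python
    from openai import OpenAI

    client = OpenAI()

    # The real server sits behind a Cloudflare tunnel; these values are placeholders.
    response = client.responses.create(
        model="gpt-4o-mini",
        tools=[
            {
                "type": "mcp",
                "server_label": "fastmail",
                "server_url": "https://example-tunnel.example.com/mcp",
                "headers": {"Authorization": "Bearer <server-bearer-token>"},
                "require_approval": "never",
            }
        ],
        input="Do I have any unread messages about invoices?",
    )
    print(response.output_text)
    ```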

  • The Model Context Protocol: extending LLMs with tools

    Explains the Model Context Protocol and how tool access complements structured outputs for LLM-powered apps. Walks through building a simple FastMCP email server that can list messages and return content from a sample mailbox. Shows how to expose the server through a Cloudflare tunnel and connect it to OpenAI's remote MCP support so models can call the tools. Discusses security considerations and the flexibility MCP provides for richer, multi-step workflows.
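
    A minimal FastMCP server in the same spirit, with two tools over an in-memory stand-in mailbox; the tool names, transport, and port are assumptions rather than the post's exact code:

    ```python
    from fastmcp import FastMCP

    mcp = FastMCP("email")

    # A tiny in-memory stand-in for the sample mailbox described in the post.
    MESSAGES = {
        "1": {"subject": "Welcome", "body": "Hello and welcome to the demo mailbox."},
        "2": {"subject": "Invoice", "body": "Your invoice for May is attached."},
    }

    @mcp.tool()
    def list_messages() -> list[dict]:
        """List message IDs and subjects."""
        return [{"id": mid, "subject": m["subject"]} for mid, m in MESSAGES.items()]

    @mcp.tool()
    def get_message(message_id: str) -> dict:
        """Return the full content of one message."""
        return MESSAGES[message_id]

    if __name__ == "__main__":
        # Transport and port are assumptions; an HTTP transport is what a tunnel would front.
        mcp.run(transport="http", port=8000)
    ```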

  • Using the structured output feature of LLMs

    Introduces the structured output feature for LLMs and how JSON schemas make it easier to pass model results into downstream systems. Builds a book recommendation demo with Pydantic models and the OpenAI Python SDK to enforce predictable responses. Expands to an Azure DevOps scenario that turns a user story into epics, features, and tasks, then creates the work items via the SDK. Shares practical lessons about validation, schema design, and chaining LLM outputs into existing workflows.
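
    A minimal version of the book-recommendation pattern, using a Pydantic model with the SDK's parse helper; the model and field names are illustrative:

    ```python
    from openai import OpenAI
    from pydantic import BaseModel

    class BookRecommendation(BaseModel):
        title: str
        author: str
        reason: str

    client = OpenAI()

    completion = client.beta.chat.completions.parse(
        model="gpt-4o-mini",  # model name is illustrative
        messages=[
            {"role": "system", "content": "Recommend one book based on the user's interests."},
            {"role": "user", "content": "I enjoyed popular-science books about genomics."},
        ],
        response_format=BookRecommendation,
    )

    book = completion.choices[0].message.parsed
    print(book.title, "by", book.author)
    ```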

  • Explore your Apple Watch heart rate data in R

    Walks through exporting Apple Health data from the iPhone Health app and loading the XML into R with tidy tools. Demonstrates filtering and counting record types, isolating heart rate measurements, and converting units for analysis. Shows how to visualize trends, build a simple Shiny interface, and experiment with moving averages to explore your own heart rate patterns.

  • Faster rendering in RStudio

    Explains the slowdown caused by RStudio's default Knit behavior spawning a fresh R session for every render. Describes building a console-based RMarkdown rendering addin that mirrors a TextMate workflow and keeps loaded packages in memory. Highlights the speed benefits along with the risk of hidden dependencies on the console environment and encourages periodic clean renders with Knit.
