Automating My Busywork With Scheduled AI Agents

For a while my week had three leaks I could never quite plug. Too many Slack channels to keep up with, a review queue that grew faster than I could drain it, and zero time to write down what actually shipped. None of these are hard problems. They are just constant, and they eat the hours where I would rather be building.

So I spent a weekend wiring up a set of scheduled AI agents to handle the boring parts. Not chatbots I poke at, but jobs that run on a cron in the cloud, do a specific task, and get out of the way. Here is what I built and, more usefully, what I learned doing it.

AI is not just for writing code

Almost all of the AI conversation right now is about coding. Autocomplete, agents that open pull requests, tools that refactor a file while you watch. That is great, and I lean on it. But it quietly ignores the other half of the job: the reading, the summarizing, and the keeping everyone in the loop. That part never ships a feature, so it gets no hype, and it is exactly where a model that is good at distilling text earns its keep. Summarization is not the consolation prize. For me it has been the highest-leverage use of AI this year, and none of the three jobs below writes a single line of production code.

The three jobs

Slack digests. A few times a day an agent reads the channels I care about, keeps only what happened in that window, and posts one tidy summary to a private channel of mine. Each bullet that came from a thread I was part of gets a small link back to the thread, so I can jump in if I need to. I stopped scrolling and started skimming.

A review drafter. Every hour during the workday, an agent looks for pull requests waiting on my review, reads the diff, and leaves a pending (draft) review with inline comments. It never submits anything. I open the PR, the comments are already there waiting, I edit what I disagree with and hit submit. Reviews that used to sit for a day now start in minutes.

A weekly write-up. Every Monday an agent looks back over the last seven days across our repositories, figures out the major changes, possible issues, and things worth watching, and publishes a post to a small internal site. It opens with genuine credit to the people who shipped, includes a little contributor leaderboard, and ties code back to the tickets it closed. The team gets a readable record of the week without anyone writing it by hand.

How it is glued together

All of this rides on Claude Code’s scheduled remote agents, which it calls routines. A routine is just a saved Claude Code setup, a prompt plus the repositories and connectors it needs, packaged once and run automatically. The trigger can be a schedule, an on-demand API call, or a GitHub event, and it all runs on Anthropic’s cloud rather than my laptop. I set each one up through a short scheduling flow: give it a name, a cron expression, a prompt, and the model to use. The cron is in UTC with a one-hour minimum interval, so I do the timezone math from where I live, and for the weekday jobs I add a small guard at the top of the prompt that checks the local day and bails out on weekends.
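The timezone math and the weekend guard can be sketched in a few lines of Python. This is only an illustration: the UTC+9 offset is an assumption picked for the example, and the function names are hypothetical. It also shows why the guard checks the local day instead of relying on the cron's day-of-week field: near midnight, the UTC weekday and the local weekday disagree.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo

# Assumed offset for illustration only. For a zone with daylight saving,
# pass zoneinfo.ZoneInfo("<your zone>") instead of a fixed offset.
EXAMPLE_TZ = timezone(timedelta(hours=9))


def utc_cron_hours(first_local_hour: int, last_local_hour: int,
                   tz: tzinfo, day: date) -> list[int]:
    """Return the UTC hours that cover the given local hours (inclusive) on `day`."""
    hours = set()
    for hour in range(first_local_hour, last_local_hour + 1):
        local = datetime.combine(day, time(hour=hour), tzinfo=tz)
        hours.add(local.astimezone(timezone.utc).hour)
    return sorted(hours)


def is_local_weekend(now: datetime, tz: tzinfo) -> bool:
    """True if `now` (a timezone-aware datetime) falls on Saturday or Sunday in `tz`.

    isoweekday() uses the same numbering as `date +%u`: Monday is 1, Sunday is 7.
    """
    return now.astimezone(tz).isoweekday() >= 6


if __name__ == "__main__":
    print(utc_cron_hours(8, 18, EXAMPLE_TZ, date(2024, 1, 8)))
    # Friday 23:00 UTC is already Saturday morning at UTC+9.
    print(is_local_weekend(datetime(2024, 1, 5, 23, 0, tzinfo=timezone.utc), EXAMPLE_TZ))
```

With the assumed UTC+9 offset, local hours 8 through 18 map to UTC hours 0 through 9 plus 23, so the hour field of the cron has to wrap around midnight.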

Routines are the path of least resistance for me because I already live in Claude Code, but there is nothing magic about them. The same idea works with a plain cron job on a server you own, or a scheduled GitHub Actions workflow that calls a model on a timer. What you are actually building is an agent with a good prompt and something that fires it on schedule. The routine just spares me from babysitting a box or a YAML file, and it runs in the cloud so nothing breaks when my machine is asleep.
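For the plain cron route, the shape of the job is roughly the sketch below. Everything in it is hypothetical: `gather`, `call_model`, and `deliver` are placeholders for whatever reads your sources, calls your model provider, and posts the result, and the crontab line in the comment uses a made-up path.

```python
from typing import Callable, Optional

# Hypothetical crontab entry that would fire this script at the top of every hour:
#   0 * * * * /usr/bin/python3 /opt/jobs/digest.py


def run_job(prompt: str,
            gather: Callable[[], str],
            call_model: Callable[[str], str],
            deliver: Callable[[str], None]) -> Optional[str]:
    """Run one scheduled pass: collect input, ask the model, deliver the output.

    Returns the delivered text, or None when there was nothing to process.
    """
    context = gather()
    if not context.strip():
        # Nothing happened in this window, so stay quiet instead of posting noise.
        return None
    output = call_model(f"{prompt}\n\n{context}")
    deliver(output)
    return output


if __name__ == "__main__":
    # Stubs stand in for the real integrations so the sketch runs on its own.
    run_job(
        prompt="Summarize these messages in three bullets.",
        gather=lambda: "alice: deploy finished\nbob: rollback not needed",
        call_model=lambda text: f"(model output for {len(text)} characters of input)",
        deliver=print,
    )
```

Keeping the three pieces as plain callables means the scheduler, the model, and the destination can each be swapped without touching the others.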

When a routine fires, it spins up a fresh, isolated session in Anthropic’s cloud. The session gets its own git checkout of whatever repositories I point it at, plus the usual tools: a shell, file read and search, git, and web fetch. It starts with zero memory of previous runs, which is exactly why the prompt has to be completely self-contained. The prompt is the real product here. Everything the agent knows about the task lives inside it, so I spend most of my time wording the steps precisely: what to gather, what to skip, what “done” looks like, and what never to do.

External services come in through MCP connectors. Slack, GitHub, and the issue tracker each connect once, and then any routine can use them. GitHub comes through an MCP server authenticated as me, so the agents can search pull requests, read diffs, and leave draft reviews. The weekly write-up goes further and uses plain git in the session to clone a repo, commit a new post, and push it, with a rebase first so it never trips over commits that landed in the meantime. No tokens pasted into prompts, no secrets in the repo.
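The commit, rebase, push sequence for the weekly post might look something like this sketch. The branch and remote names are assumed defaults and the function names are hypothetical; the git subcommands themselves are standard.

```python
import subprocess


def publish_commands(post_path: str, message: str,
                     branch: str = "main", remote: str = "origin") -> list[list[str]]:
    """Build the git commands that publish one new post, in order."""
    return [
        ["git", "add", post_path],
        ["git", "commit", "-m", message],
        # Replay the new commit on top of anything that landed since the clone,
        # so the push below is not rejected as a non-fast-forward.
        ["git", "pull", "--rebase", remote, branch],
        ["git", "push", remote, branch],
    ]


def publish(repo_dir: str, post_path: str, message: str) -> None:
    """Run the commands inside repo_dir, stopping at the first failure."""
    for command in publish_commands(post_path, message):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Because `check=True` raises on a non-zero exit, a rebase conflict stops the run before the push instead of publishing something half-merged.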

The decision that mattered most was picking the model per routine. Pure summarizing runs on the cheapest fast model. Code review runs on the strongest one, because that is where quality actually shows up. The weekly analysis sits in the middle, on a mid-tier model with a medium effort setting I spell out in the prompt. Matching the model to the job kept the cost sane without making the output worse where it counts.

What a routine actually looks like

There is not much to one. It is a schedule plus a prompt. Here is a trimmed, genericized version of my review drafter, the routine that taught me the most:

# schedule: 0 0-9,23 * * *   (hourly, ~8am to 6pm my time; fires every day, STEP 0 skips weekends)
# model:    the strongest one, since this is code review

You review pull requests for my GitHub account. You run hourly on
weekdays. Use the GitHub tools (you are authenticated as me). There is
no gh CLI; use the GitHub MCP tools.

STEP 0  Run `TZ=<my zone> date +%u` (1 = Monday, 7 = Sunday). If it
        prints 6 or 7, it is the weekend: stop and do nothing.

STEP 1  Find open PRs that request my review across my org. If there
        are none, stop.

STEP 2  For each PR, list its existing reviews.
        - If I already left a real review, skip the PR.
        - If a previous run left an EMPTY draft, delete it and redo.
        Also read my recent Slack posts and skip any PR already
        announced, so the hourly run never posts the same one twice.

STEP 3  Read the diff. Draft concrete inline comments (path, line,
        severity label, body). Create a PENDING review and attach the
        comments. Do NOT submit it.
        VERIFY: re-read the pending review and confirm it really has
        comments. If it came out empty, delete it and mark the PR as
        failed. Never report a draft that is not actually there.

STEP 4  Post one short Slack summary, listing only PRs that got a real,
        non-empty draft. Link each author to their GitHub profile.
        No em dashes. Write like a person.

The cron is the easy half. The prompt is where all the behavior lives, including every mistake I had to learn the hard way. That empty-draft check in step 3 exists because of a solid week of empty drafts.

What I actually learned

The plumbing was the easy part. The lessons were in the failure modes.

Make the agent verify its own work. My proudest moment was also my most embarrassing one. The review drafter happily reported “drafts ready” for a week while leaving completely empty reviews behind. It had created the review shell and never attached the comments. Now every agent that claims it did something has to re-read the result and prove it, and it is only allowed to report success when the thing is actually there.
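In my setup this check lives in the prompt, but the same rule is easy to express in code. Here is a minimal sketch, assuming each run records the state of the review and the comment bodies it read back after creating the draft. The `DraftResult` type and its field names are hypothetical; `PENDING` is the state GitHub reports for an unsubmitted review.

```python
from dataclasses import dataclass


@dataclass
class DraftResult:
    pr_url: str
    review_state: str          # state read back from GitHub after creating the draft
    comment_bodies: list[str]  # comment bodies read back from that same draft


def is_real_draft(result: DraftResult) -> bool:
    """A draft counts only if it is still pending and has at least one non-blank comment."""
    has_comment = any(body.strip() for body in result.comment_bodies)
    return result.review_state == "PENDING" and has_comment


def split_results(results: list[DraftResult]) -> tuple[list[str], list[str]]:
    """Separate PRs that may be reported as ready from PRs to mark as failed."""
    ready = [r.pr_url for r in results if is_real_draft(r)]
    failed = [r.pr_url for r in results if not is_real_draft(r)]
    return ready, failed
```

Only the `ready` list goes into the summary. An empty shell lands in `failed`, which is the case that went unnoticed for a week.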

Ground everything in real data. An early version pulled ticket IDs out of PR titles with a regex and cheerfully linked to issues that did not exist, because some titles carried leftover placeholder text. The fix was to stop trusting text that looks like an ID and only reference things the issue tracker confirms are real. An accurate post with fewer links beats a confident one full of dead ones.
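A sketch of that fix, under two assumptions: the ticket ID format (an uppercase key, a hyphen, a number) and the `exists` lookup are both stand-ins. In practice `exists` would be a real query against your issue tracker.

```python
import re
from typing import Callable

# Assumed ID shape for illustration, e.g. "PROJ-123". Adjust to your tracker.
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def candidate_ids(title: str) -> list[str]:
    """Everything in the title that merely looks like a ticket ID."""
    return TICKET_PATTERN.findall(title)


def confirmed_ids(title: str, exists: Callable[[str], bool]) -> list[str]:
    """Keep only candidates the tracker confirms, in order, without duplicates."""
    confirmed: list[str] = []
    for ticket_id in candidate_ids(title):
        if ticket_id not in confirmed and exists(ticket_id):
            confirmed.append(ticket_id)
    return confirmed


if __name__ == "__main__":
    known = {"PROJ-123"}  # pretend tracker contents
    title = "PROJ-123 Fix login redirect (template: ABC-000)"
    print(confirmed_ids(title, lambda ticket_id: ticket_id in known))
```

The regex only nominates candidates. The tracker lookup decides what gets linked, so leftover placeholder text never turns into a dead link.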

Write like a person. Left alone, these tools reach for em dashes, “in summary”, and the same three transition words. I bake the opposite into every prompt. I also scale the praise in the weekly post to how much actually shipped, so a quiet week reads honest instead of inflated.

Respect idempotency and timezones. An hourly job that does not remember what it already did will spam you. A daily one that ignores the clock will skip a post because a static site generator quietly refuses to publish anything dated in the future. Both cost me a confused hour.
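Both problems reduce to small checks. The sketch below makes two assumptions: that "already announced" can be detected by finding the PR URL in recent messages, and that the site build compares post dates against UTC. The function names are hypothetical.

```python
import re
from datetime import date, datetime, timezone, tzinfo

# Matches a URL up to whitespace or link punctuation such as "<url|label>".
URL_PATTERN = re.compile(r"https?://[^\s<>|)]+")


def unannounced(pr_urls: list[str], recent_messages: list[str]) -> list[str]:
    """Return the PR URLs that do not appear in any recent message.

    URLs are compared whole, so ".../pull/1" is not mistaken for ".../pull/12".
    """
    seen = {
        url.rstrip(".,")
        for message in recent_messages
        for url in URL_PATTERN.findall(message)
    }
    return [url for url in pr_urls if url not in seen]


def post_date(now: datetime, build_tz: tzinfo = timezone.utc) -> date:
    """Date to stamp on a post so it is never in the future for the site build.

    `now` must be timezone-aware. The date is taken in the build's timezone,
    not the author's.
    """
    return now.astimezone(build_tz).date()
```

A run at 8am Monday in a zone nine hours ahead of UTC is still Sunday evening in UTC, so stamping the post with the local date would put it one day in the future for a UTC build.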

Was it worth it

Yes, and not because it is clever. It is worth it because the busywork is gone and the judgment stayed with me. I still decide what a review means and whether the week went well. The agents just remove the friction between me and that decision. That is the whole trick: automate the gathering and the drafting, keep the thinking.

If you try this, start with the most repetitive, lowest-stakes task you have, make it verify itself, and only then let it touch anything that other people will see.

This post was written with the help of AI (Claude by Anthropic).