Every automation starts with a trigger. For me, it was a spreadsheet of contacts that grew over time. Each person had their own history: what they clicked, what they ignored, what they asked about in a reply months ago. I wanted the system to remember those details without me having to write individual emails.
OpenAI handles the writing part. I feed it a few notes about the person and a loose structure for the message. It returns something that reads like a draft from a quiet assistant. Not perfect, but close enough to edit. The key was narrowing the prompt. Too much freedom and the tone wandered. Too little and it became robotic. Finding the middle ground took a few weeks.
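The narrowing happened mostly in how the prompt is assembled: a fixed structure plus a few per-contact notes, so the model has little room to wander. A minimal sketch of that idea, with hypothetical field names and template text rather than my actual prompts:

```python
def build_draft_prompt(contact: dict, structure: str) -> list[dict]:
    """Assemble chat messages for a draft request: fixed structure,
    per-contact notes, stored tone preference."""
    notes = "; ".join(contact.get("notes", []))
    tone = contact.get("tone", "plain, direct")
    system = (
        "You draft short emails. "
        f"Tone: {tone}. "
        "Follow the given structure exactly. No greetings beyond one line."
    )
    user = f"Recipient context: {notes}\nStructure:\n{structure}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_draft_prompt(
    {"notes": ["asked about pricing in March", "clicked the API guide"],
     "tone": "technical, concise"},
    "1. One-line opener\n2. The update\n3. A single question",
)
```

The messages list then goes to the chat completion endpoint; everything variable lives in the contact record, and everything fixed lives in the structure string.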
SendGrid takes over once the draft is ready. I chose it because the API was straightforward and the delivery logs were detailed. The system schedules emails based on time zones and past open times. I learned quickly that sending everyone at the same 8 AM, regardless of their time zone, was a mistake. The logs showed me that.
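The time-zone fix amounts to converting each recipient's local target hour into a UTC send time before handing it to the delivery API. A small sketch of that conversion using the standard library (the function name is mine, not SendGrid's):

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

def local_send_time(target: time, tz_name: str, on_date: date) -> datetime:
    """Return the UTC moment that lands at `target` local time
    for a recipient in the given time zone."""
    local = datetime.combine(on_date, target, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(ZoneInfo("UTC"))

# The same "8 AM" is a different UTC moment per recipient.
nyc = local_send_time(time(8), "America/New_York", date(2024, 6, 3))
tokyo = local_send_time(time(8), "Asia/Tokyo", date(2024, 6, 3))
```

Scheduling in UTC per recipient is what stopped the one-size-fits-all 8 AM blasts.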
What the Analytics Actually Showed
The analytics piece came last. I added a small dashboard that pulls data from SendGrid and compares it against the content variations. At first I was looking for big patterns. But the patterns were small. Subject lines with a specific verb tended to perform better on Tuesday mornings. Links placed in the third paragraph saw more clicks than links in the first.
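The comparison itself is simple bookkeeping: group sends by one content variable at a time and compute the open rate per value. A stripped-down sketch of what the Python layer does (field names like `link_para` are illustrative):

```python
from collections import defaultdict

def rates_by_variable(events: list[dict], variable: str) -> dict:
    """Open rate per value of one content variable,
    e.g. which paragraph held the link."""
    sent, opened = defaultdict(int), defaultdict(int)
    for e in events:
        sent[e[variable]] += 1
        opened[e[variable]] += e["opened"]
    return {k: opened[k] / sent[k] for k in sent}

events = [
    {"link_para": 3, "opened": 1},
    {"link_para": 1, "opened": 0},
    {"link_para": 3, "opened": 1},
    {"link_para": 1, "opened": 1},
]
rates = rates_by_variable(events, "link_para")
```

Running the same function over subject-line verbs, send days, and link positions is how the small patterns surfaced.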
I do not use the word "optimization" anymore because it implies a final state. There is no final state. The data just suggests what to try next.
One thing that stood out was how much timing mattered for different segments. People who engaged with technical content preferred mid-morning. Those who responded to more conversational emails opened later in the day. The system now stores those preferences and adjusts send times per recipient. It is not perfect, but it gets closer each cycle.
The Stack
Three services handle content, delivery, and measurement.
- OpenAI API: Generates drafts from brief prompts. Stores tone preferences per contact.
- SendGrid: Handles delivery, bounce tracking, and engagement logs.
- Custom Analytics Layer: Python script that compares content variables against open and click rates.
The Feedback Loop I Did Not Expect
What surprised me was how the system started to influence my own writing. After seeing what performed well, I began adjusting the prompts to lean into those patterns. Not to game the metrics, but because the data pointed to something genuine. People responded to shorter paragraphs and questions at the end. They liked when the subject line matched the first sentence.
I started using the same structure in my own emails. The ones I wrote by hand. It felt strange at first, borrowing from a machine I had built. But the machine was only reflecting what people had already shown me. That made it easier to trust.
The loop works in three steps now. The AI drafts. The delivery service sends. The analytics report back. Then I review the report and adjust the prompts. Each cycle takes less time than the one before. What used to take a full day now takes an hour of reviewing and tweaking.
Last 30 Days
Across 1,247 automated sends
- Open rate: 47%
- Click rate: 12%
- Unsubscribe rate: 3.2%
Where the Complexity Lives
The hardest part was not the code. It was managing the thresholds. I had to decide when to let the system send automatically and when to hold it back. Too much automation and the emails felt cold. Too much manual review and I lost the time savings.
I settled on a hybrid model. The AI writes three variations. I pick one. Then the system handles timing and follow ups. That balance feels right for now. It might shift as the models get better. But I suspect the human part will always need to be there.
Another layer I did not anticipate was content freshness. The AI would sometimes pull from older patterns if I did not update the prompt library. I started keeping a small document with recent examples of what worked. That document feeds into the prompt each time. It keeps the writing from feeling stale.
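The freshness document boils down to appending only the newest worked examples to the base prompt each time. A minimal sketch of that injection, assuming the examples are kept as a simple list with the most recent last:

```python
def with_recent_examples(base_prompt: str, examples: list[str],
                         limit: int = 5) -> str:
    """Append only the newest worked examples so drafts do not go stale."""
    block = "\n".join(f"- {e}" for e in examples[-limit:])
    return f"{base_prompt}\n\nRecent examples that performed well:\n{block}"

prompt = with_recent_examples(
    "Draft the monthly update.",
    ["short intro, question at the end",
     "subject matched the first sentence",
     "single link in the third paragraph"],
    limit=2,
)
```

Capping the list is the point: older patterns age out automatically instead of accumulating in the prompt.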
One Send, From Start to Finish
1. Trigger: Contact reaches a milestone or inactivity period.
2. Draft: AI generates three options using stored preferences.
3. Review: Quick edit and selection of the final version.
4. Schedule: System picks the optimal time based on past behavior.
Small Changes That Made a Difference
The subject line testing was the first thing I automated. I used to write one subject line and hope. Now the system tests three variations on a small segment before sending to the rest. The difference was immediate. Open rates went up by a noticeable margin without any change to the actual email content.
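Picking the winner from the test segment is a one-liner once the opens are tallied. A sketch, assuming the test results arrive as subject-to-(sends, opens) pairs:

```python
def pick_subject(results: dict[str, tuple[int, int]]) -> str:
    """Winner of a small-segment test.
    results maps subject line -> (sends, opens)."""
    return max(results, key=lambda s: results[s][1] / results[s][0])

winner = pick_subject({
    "Quick update on the API": (50, 11),
    "See what changed this week": (50, 19),
    "Your March summary": (50, 14),
})
```

With only 50 sends per variant the margins are noisy, which is another reason the results feed the next cycle rather than getting treated as settled truths.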
Timing was the second adjustment. I built a simple lookup that checks when each recipient last opened an email from me. The system defaults to that hour plus or minus thirty minutes. It sounds like a small detail, but it cut my early-morning sends to people who never opened before 10 AM. That mattered.
I also stopped trying to personalize everything. The AI wanted to insert names and details constantly. But some emails worked better when they were more general. The system learned that over time. Now it only personalizes when the data suggests it will help.
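"Learned that over time" can be as plain as comparing open rates of personalized versus generic sends in a contact's history and staying generic until the data clears a bar. A hedged sketch of that decision rule, with an arbitrary minimum-sample threshold:

```python
def should_personalize(history: list[dict], min_samples: int = 20) -> bool:
    """Personalize only when this contact's history says it helps."""
    if len(history) < min_samples:
        return False                      # not enough data, stay general

    def rate(flag: bool) -> float:
        rows = [h for h in history if h["personalized"] == flag]
        return sum(h["opened"] for h in rows) / len(rows) if rows else 0.0

    return rate(True) > rate(False)
```

Defaulting to the general version when the sample is thin matches the lesson above: personalization is earned by evidence, not assumed.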
Resources I Use
Not sponsored. Just what works.
- SendGrid API Docs: Clear examples. Good error messages. Free tier for testing.
- OpenAI Playground: Where I test prompts before moving them to code.
- Retool: Built a quick dashboard to review drafts and logs in one place.
What I Would Do Differently
If I started again, I would build the analytics layer earlier. I waited until I had the sending and content pieces in place. That meant I was sending for weeks without knowing what was actually working. The first month of data was noisy because I had no baseline.
I also underestimated how much storage the prompt history would need. Every variation, every subject line test, every edit I made got saved. That folder grew fast. But having it has been useful for reviewing what changed over time.
The thing I would keep the same is the pace. I added features slowly. One API at a time. One test group at a time. That kept the system stable and let me understand each part before moving to the next.