Did you know Aspiration was at the forefront of AI-powered content testing? They didn’t either.
Context and Development
In early 2020, I began developing a custom content testing system at Aspiration, not long after OpenAI released GPT-2. At the time, AI was still a novelty, viewed as a curiosity, even a boondoggle, rather than an existential threat. A few trusted leaders found it intriguing when I broached the topic but struggled to envision applications, and there were concerns about executive reactions, particularly from our CEO. This was before ChatGPT and the current wave of tools transformed workflows.
With limited institutional support but no explicit rejection, I pursued the project in stealth mode. AndreiBot emerged as a skunkworks project built on public GPT-2 models, evolving from initial experiments on Hugging Face to Google Colab for better control over training data. The system's success depended on careful guidance: each generation run needed hand-selected source material, parsed for themes, keywords, and structure. Those elements formed the outlines that seeded new content variations.
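For flavor, here is a minimal sketch of what that generation loop looked like with the public GPT-2 checkpoints on Hugging Face; the outline format, sample copy, and sampling settings are illustrative, not the original configuration:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")

# An outline distilled from a source email: theme, keywords, structure.
# (This specific outline is invented for the example.)
outline = (
    "Theme: climate impact of everyday banking\n"
    "Keywords: fossil fuels, deposits, impact\n"
    "Structure: hook / problem / product tie-in / call to action\n"
    "Email:\n"
)

inputs = tokenizer(outline, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=220,        # keep drafts short; they are raw material, not finals
    do_sample=True,        # sampling yields different candidates each run
    top_p=0.92,
    temperature=0.8,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
    print("---")
```

Most raw outputs were unusable; the point was cheap volume, with a human picking the few drafts worth polishing.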
System Architecture
The training data set my approach apart: a dataset of over 1,000 high-performing, compliance-approved marketing emails. Building on pre-approved content accelerated compliance review and kept generated copy consistent with our established voice and regulatory requirements. Though the increased testing volume added review time, the system learned from our most effective communications, from climate impact messaging to financial product launches.
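For the curious, the fine-tuning step looked roughly like the standard causal-LM recipe of that era. This is a sketch under assumptions; the file path and hyperparameters are hypothetical, not the exact setup I ran:

```python
from transformers import (
    GPT2LMHeadModel, GPT2Tokenizer, TextDataset,
    DataCollatorForLanguageModeling, Trainer, TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")

# Approved emails concatenated into one plain-text file (hypothetical path).
dataset = TextDataset(
    tokenizer=tokenizer,
    file_path="approved_emails.txt",
    block_size=512,
)
# mlm=False gives plain next-token (causal) language modeling.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="andreibot-gpt2",
        num_train_epochs=3,
        per_device_train_batch_size=2,
        save_steps=500,
    ),
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()
```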
AndreiBot's role was strictly that of a content testing accelerator: it never replaced existing content, only expanded our testing capacity. Each AI-generated variant became an additional option in our A/B tests, alongside traditionally crafted content. In the early stages, many generated variants didn't make it through review, but as I refined the instructions and training approach, the success rate steadily improved. Every piece followed our standard review pipeline, discreetly bundled into Google Docs as just another A/B test variant.
My workflow was straightforward and effective: analyze successful emails, break them into core components, then experiment with different emphasis patterns. The value came from rapidly generating additional testable variations of proven content, expanding our capacity without displacing human-created work. As the system matured, these AI-generated variants began performing comparably to traditional content, though they always remained part of a broader testing mix.
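To make the decompose-and-recombine step concrete, here is a toy illustration; every field name and line of copy below is invented for the example, not pulled from the original system:

```python
from itertools import permutations

# Core components extracted from a proven email (hypothetical copy).
components = {
    "hook": "Your money has a carbon footprint.",
    "proof": "Deposits at big banks fund fossil fuel projects.",
    "offer": "Aspiration deposits stay fossil fuel free.",
    "cta": "Make the switch in five minutes.",
}

# Vary which components appear and in what order; each ordering becomes
# one candidate outline to seed a generation run.
for order in permutations(components, 3):
    outline = " | ".join(components[key] for key in order)
    print(outline)
```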
As we gathered performance data, the system's effectiveness improved markedly. By analyzing which messages resonated with which segments, I refined targeting across all email flows: our sequences, winback campaigns, and promotional content. Every customer touchpoint became an opportunity for data-driven optimization, with AI augmenting rather than replacing our testing capabilities.
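The analysis itself was unglamorous. A minimal sketch, assuming a flat CSV export of test results; the file name and column names are hypothetical:

```python
import pandas as pd

# Columns assumed: variant, segment, sends, opens, clicks.
results = pd.read_csv("ab_test_results.csv")

summary = (
    results.groupby(["segment", "variant"])[["sends", "opens", "clicks"]]
    .sum()
    .assign(
        open_rate=lambda d: d["opens"] / d["sends"],
        click_rate=lambda d: d["clicks"] / d["sends"],
    )
    .sort_values("click_rate", ascending=False)
)
print(summary.head(10))  # top variant-segment pairs guide the next round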
Results and Evolution
From late 2020 to June 2021, AndreiBot transformed our content testing capabilities, though its impact remained largely unknown within the organization. Skepticism toward AI made it difficult to discuss these successes openly. AndreiBot was shut down when I departed, and testing reverted to manual processes. Ironically, broader interest in AI-assisted content testing emerged just months later, too late for this project to receive recognition.
The landscape has changed dramatically. AI now impacts every aspect of product development. Design teams use DALL-E and Midjourney, developers embrace Copilot and Claude, and copy teams collaborate with GPT-4. Compliance reviews benefit from AI assistance in identifying regulatory concerns.
The real evolution isn’t in the tools—it’s in their use. Early GPT-2 prompting demanded technical precision with clear structure, explicit parameters, and crafted examples, while modern LLMs handle natural language fluently. Nevertheless, the core principles remain unchanged: clear context, specific requirements, and well-defined constraints produce better results. The key insight? AI works best as an accelerator, not a replacement. Teams that embrace it as a collaborative tool gain significant advantages in speed and scale.
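To illustrate the shift, compare an invented GPT-2-era outline prompt with the plain-English request a modern model handles; both prompts are made up for contrast:

```python
# GPT-2 era: rigid structure, explicit fields, a seeded continuation point.
gpt2_prompt = (
    "Theme: fee-free banking\n"
    "Keywords: hidden fees, transparency, switch\n"
    "Structure: hook / problem / offer / CTA\n"
    "Tone: direct, consumer-friendly\n"
    "Email:\n"
)

# Modern LLM: the same context and constraints, stated conversationally.
modern_prompt = (
    "Write a short marketing email about fee-free banking. Lead with the "
    "pain of hidden fees, keep the tone direct and consumer-friendly, and "
    "end with a call to action to switch."
)
```

The structured version and the conversational one carry the same information; only the packaging changed.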