EPISODE #1
BY THE ORDERs OF EXCELLENCE AND CONTINUITY
Need for speed - the parallel awakening
MISSION BRIEF
They called it “mature.” Java. Maven. JUnit 5. Selenium.
An independent UI test repo with nearly 2,000 fragile end-to-end tests. Each test class? A house of cards. One fails, all fall.
Test runs? 26 hours.
Pass rate? Somewhere between 85–90%, depending on caffeine intake and wind direction.
Execution method? One laptop. One human. Overnight. Plugged in, lid open, mouse taped to prevent sleep mode.
A ritual. Not a system.
When they brought us in, they said proudly:
“We have a mature automation framework, a top-tier QA team, and an experienced test architect.”
Naturally, we checked LinkedIn.
There he was – the architect. Posts about the strategic value of "div[1]/span[2]" locators. A buffet of certifications – O’Reilly, Udemy, a forest of badges.
His featured post:
“How to use Thread.sleep correctly.”
We took a moment to process that. Then we got to work.
OP 01: SYSTEM UPGRADE
First order of business: retire the ThinkPad night watchman.
We migrated the execution to GitHub Actions. Nightly CI jobs were introduced. Logs and history followed.
We threw out the archaic Surefire HTML reports, which looked like they were built in MS Paint using HTML tables, and replaced them with Allure, deployed live to GitHub Pages.
Then came the real monster: their WebDriver architecture.
They were using a singleton WebDriver, a beautiful way to force single-thread execution and guarantee that nothing, ever, could run in parallel without a ritual sacrifice.
We scrapped it. We introduced a BaseTest class to control lifecycle management. Every test class received its own isolated WebDriver instance.
Screenshots, logging, exception hooks — all handled inside BaseTest.
But because of their religious devotion to the Page Object Model, this meant we had to inject the driver into the constructor of every single page class and then pass it manually into every single test class.
They had a lot of pages.
By the 30th constructor, we weren’t refactoring anymore. We were participating in a spiritual endurance test.
At one point, we gently suggested:
“We could redesign this from scratch. Clean, modern, parallel-ready.”
But no. They were too proud of their masterpiece.
Too emotionally attached to their Frankenstein-style Page Object Model framework.
“It’s proven,” they said.
We said nothing. And kept injecting.
Result: legacy stack neutralized, CI stability established.
We moved the suite to 5‑thread parallel, cut runtime to 7–8 hours, published Allure to GitHub Pages, and scheduled nightly runs.
The testers were thrilled.
The engineers were impressed.
The architect posted a celebratory update on LinkedIn.
We just nodded.
OP 02: GRID UNLEASHED
They were satisfied.
Each morning, they had fresh Allure reports. Testers reviewed them. Bugs were filed.
Confidence was high.
But we weren’t satisfied.
8 hours was still glacial.
So, we moved to phase two. We upgraded their GitHub runner to a beefier instance. Inside
it, we installed a Selenium Grid with 3 nodes, each running 5 parallel threads.
The Java code stayed on the main runner — logic, orchestration, reporting. All UI
interactions were offloaded to grid nodes.
Clean. Modular. Fast.
Still one report. Still one pipeline.
But now?
Execution time: ~3 hours.
That’s when their Head of QA started walking proudly through the office, handing out
Allure links like gospel scrolls and offering “on-demand regression in 3 hours” to product
managers like a QA messiah granting blessings.
OP 03: Maximum Overdrive
The client said, “This is amazing. You’ve done enough.”
But we hadn’t.
We were just warming up.
We flipped the switch on phase three:
True parallelism. Dynamic scaling. Full chaos mode.
We built a system that:
Used GitHub’s matrix strategy to launch up to 20 runners in parallel
Installed Selenium Grid inside each runner
Spawned 3 nodes per grid, each running 5 threads
That’s 15 threads per runner × 20 runners = 300 concurrent test threads.
But to pull it off, we needed orchestration. So we built a custom Maven plugin that:
Counted test classes
Calculated optimal sharding
Uploaded metadata to GitHub as an artifact
Informed the matrix config to dynamically determine how many runners to launch
Each runner generated its own Allure results, which were then collected and merged by a final job into a single polished dashboard.
To the user, it looked like one clean test run.
Behind the scenes?
A swarm of parallel machines eating through test classes like a digital locust storm.
Execution time: 20 minutes.
We were aiming for 5, but one test class — one — took 20 minutes alone.
We called it “A Short Documentary.”
>
Test-Allocator Maven Plugin
>
GitHub Acton Setup
OP 04: COLLATERAL EXPOSURE
Then it broke. Pass rate dropped from 90% to 40%. Slack lit up.
"Regression’s failing!"
"Who touched the grid?"
"Is GitHub down?"
No. The grid was fine. The tests were fine. The app? Wasn’t. Endpoints assumed one user at a time. Shared states. Race conditions. Tech debt – now fully visible. The tests didn't break the system. They revealed it.
QA lead? Promoted. Automation Architect? Celebrated. The Software Architect? Furious. Years of assumptions, gone in one clean run. The Project Manager? Pissed. His roadmap replaced by a Critical Refactor Initiative(TM) no one wanted, but now couldn't ignore.
And us? We archived the logs. Left the system fully exposed. And collected what was ours.
No slides. No approvals. No applause. Just the reward – wired fast.
The engineers walked. The system staggered. And somewhere, someone tried to schedule a follow-up meeting.
MISSION REPORT
✅ From 26 hours to 20 minutes
✅ From manual execution to CI orchestration
✅ From terminal reports to interactive dashboards
✅ From "mature framework" to "refactor or die"
They got velocity. We got the reward. And another mission added to the ledger.
Now? They want it everywhere. They want us back.
“Can we trigger it on every commit?”
“Can we gate pull requests with it?”
“Twenty minutes? We could put it on a leash and run it after every push.”
They’re calling it “the perfect pipeline”
We’re calling it…
Episode 2.