Semaphore Uncut

Flaky Test API Now GA, New Auto-Fix Skill, Skill Quality Improvements



Flaky test data is now accessible through the sem-ai API, and this week also brings a skill that uses that data to fix the tests automatically.

Flaky test data is now generally available through the API. You can query your flakiest tests programmatically, sorted by number of disruptions, with failure timestamps and logs included. Find it on GitHub.
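To make the "sorted by number of disruptions" idea concrete, here is a minimal Python sketch. It does not call the sem-ai API: the record shape, field names, and sample values are all hypothetical, chosen only to illustrate ranking tests by how often they disrupted a pipeline.

```python
from dataclasses import dataclass


@dataclass
class FlakyTest:
    # Hypothetical record shape; the real API response may differ.
    name: str
    disruptions: int   # how many times the test disrupted a pipeline
    last_failure: str  # ISO 8601 timestamp of the most recent failure


def top_flaky(tests, limit=3):
    """Return the `limit` tests with the most disruptions, highest first."""
    return sorted(tests, key=lambda t: t.disruptions, reverse=True)[:limit]


# Made-up sample data, for illustration only.
SAMPLE = [
    FlakyTest("test_login_timeout", 4, "2024-01-03T10:15:00Z"),
    FlakyTest("test_cache_eviction", 17, "2024-01-05T08:02:00Z"),
    FlakyTest("test_upload_retry", 9, "2024-01-04T21:40:00Z"),
]

if __name__ == "__main__":
    for test in top_flaky(SAMPLE, limit=2):
        print(f"{test.name}: {test.disruptions} disruptions, last failed {test.last_failure}")
```

Starting from the top of a list like this is what lets you, or an agent, spend effort on the tests that cost the team the most reruns.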

A new sem-ai skill can identify and fix flaky tests end-to-end. The agent pulls the highest-disruption tests, gathers failure context, determines the root cause, and writes a fix. It then verifies the result by running the tests locally or, if that’s not possible, by spinning up Semaphore test boxes to run the test across multiple machines in parallel. Running the test repeatedly across machines is especially important for flaky tests, since a single passing run isn’t enough to confirm a fix. In a benchmark on a real open source project, using Claude Opus on high effort, each fix cost between $1 and $1.50.
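Why a single passing run proves so little can be shown with a bit of arithmetic. The sketch below is not part of sem-ai; it assumes, for simplicity, that each run of a still-broken test fails independently with a fixed probability, which real flaky tests only approximate.

```python
import math


def false_pass_probability(failure_rate, runs):
    """Chance that a still-flaky test passes `runs` times in a row.

    Assumes every run fails independently with probability `failure_rate`.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if runs < 1:
        raise ValueError("runs must be at least 1")
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate, confidence=0.95):
    """Smallest number of consecutive passes that would be seen with
    probability at most 1 - confidence if the test were still flaky."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    rate = 0.10  # hypothetical: the test fails one run in ten
    print(f"Single run passes anyway: {false_pass_probability(rate, 1):.0%}")
    print(f"Runs needed for 95% confidence: {runs_needed(rate)}")
```

Under these assumptions, a test that fails 10% of the time still passes a single run 90% of the time, and it takes 29 consecutive passes to be 95% confident the flakiness is gone. Spreading those runs across machines in parallel keeps that check fast.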

Four existing skills were improved with additional examples. Agents were occasionally not following skill instructions due to a lack of examples. Adding concrete examples improves adherence and makes sem-ai’s output more consistent across runs.

What’s Coming

User and organization management will be added to the API in an upcoming release. The team is also continuing to refine skills and commands based on usage patterns.

* Try sem-ai

* Try Semaphore Cloud

* All product news

Till the next time,

Pete Miloravac https://semaphore.io



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit semaphoreio.substack.com