# What benchmarks or case studies exist?

**Cosine has demonstrated measurable results across real-world enterprise deployments and industry benchmarks.** Customers consistently report significant gains in productivity, backlog reduction, and engineering throughput.

***

### Key performance benchmarks

#### Internal productivity benchmarks

Cosine’s own engineering team uses the platform extensively, providing real-world validation of its capabilities.

* **1,900+ pull requests** merged using Cosine since June.
* Average **PR completion time cut by 40%** compared with manual workflows.
* Backlog items resolved autonomously with minimal human intervention.

#### SWE-bench and code intelligence performance

Cosine’s underlying model, **Genie**, has demonstrated strong results on **SWE-bench** and related code-reasoning tasks, outperforming comparable open-weight and closed-source models in end-to-end code comprehension and bug-resolution accuracy.

> Note: Cosine’s benchmarks focus on real-world task outcomes (validated pull requests and test success rates) rather than static code-completion scores.

***

### Enterprise case studies

#### Global investment bank — On-premise deployment

A leading global bank deployed Cosine on-premise to automate maintenance and feature work across its internal trading systems.

* **30% of backlog cleared** in the first month.
* Average **time-to-merge reduced by 45%**.
* Deployment passed stringent internal InfoSec reviews with zero exceptions.

#### Defence technology company — Secure code refactoring

A defence contractor integrated Cosine in a fully **air-gapped environment**, using it for large-scale code refactors and documentation generation.

* Reduced manual refactoring effort by **60%**.
* Improved test coverage by **20 percentage points**.
* Enabled continuous updates without exposing code externally.

#### SaaS provider — Developer velocity boost

A mid-size SaaS company connected Cosine to Jira and Slack for automated PR creation and backlog cleanup.

* Resolved **hundreds of small issues in under an hour**.
* Increased engineering throughput by **50%** in the first quarter.
* Expanded adoption to multiple teams within weeks.

***

### Outcomes across pilots

| Metric                      | Average Improvement |
| --------------------------- | ------------------- |
| **Cycle time reduction**    | 20–40%              |
| **PR throughput**           | +60%                |
| **Backlog reduction**       | 30–40%              |
| **Test coverage**           | +15–25 pts          |
| **Deployment time (cloud)** | <10 minutes         |

These metrics are consistent across Cosine’s internal use and customer pilots in financial services, SaaS, and defence.

***

### Why this matters

Benchmarks are only meaningful when they reflect real production outcomes. Cosine’s results are validated not by synthetic tests, but by **merged pull requests, reduced cycle times, and improved developer velocity** in real engineering environments.

***

### Related pages

* How do we contact sales / request a demo / start a trial?
* ROI — what outcomes should we expect?
* How does Cosine work?

→ Next: How does Cosine support enterprise security and compliance?
