# Company Discovery API
> Discover companies from the open web using natural language queries with structured matching, custom extractions, and source citations

**Base URL:** `https://api.nyne.ai`

**Endpoint:** `https://api.nyne.ai/company/discovery`

---

## Overview
The Company Discovery API lets you find companies from the open web using natural language queries. Describe the companies you are looking for, define match conditions, and let the API search, evaluate, and enrich candidates automatically. Each result includes match evaluations, custom extractions, and source citations with confidence scores.

### What You Get

- **Natural Language Search:** Describe the companies you want to find in plain English
- **Match Evaluation:** Define requirements and get per-company match status with evidence
- **Custom Extractions:** Specify enrichment fields to extract for each discovered company
- **Source Citations:** Every evaluation backed by reasoning, confidence scores, and source citations

## Authentication

All requests require header authentication:

```
X-API-Key: YOUR_API_KEY
X-API-Secret: YOUR_API_SECRET
```

## Rate Limits

| Limit | Value |
|-------|-------|
| Per Minute | 60 requests |
| Per Hour | 1000 requests |
| Monthly | Varies by plan |

### Response Headers

Responses from endpoints that perform rate-limit checks include both legacy and standard rate-limit headers:

| Header | Description |
|--------|-------------|
| `X-RateLimit-Limit` | Active per-minute or per-hour limit |
| `X-RateLimit-Remaining` | Remaining requests in the active window |
| `X-RateLimit-Reset` | Unix timestamp when the active window resets |
| `RateLimit-Limit` | Active per-minute or per-hour limit |
| `RateLimit-Remaining` | Remaining requests in the active window |
| `RateLimit-Reset` | Seconds until the active window resets |
| `Retry-After` | Seconds to wait before retrying; present on HTTP `429` rate-limit responses |
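
The header table above can be turned into a small wait-time helper. The sketch below is illustrative only: the function name `seconds_until_retry` is hypothetical, and it assumes the header values arrive as plain numeric strings, as the descriptions suggest.

```python
import time


def seconds_until_retry(status_code, headers, now=None):
    """Return how long to pause before the next request (0.0 if no pause is needed).

    `headers` is any mapping of response header names to values.
    Assumption: all values are numeric strings.
    """
    # Normalize names so lookups work regardless of header casing.
    h = {k.lower(): str(v).strip() for k, v in headers.items()}
    now = time.time() if now is None else now

    # On HTTP 429, Retry-After is the most direct signal.
    if status_code == 429 and "retry-after" in h:
        return max(0.0, float(h["retry-after"]))

    # Standard header: RateLimit-Reset is already "seconds until reset".
    if h.get("ratelimit-remaining") == "0" and "ratelimit-reset" in h:
        return max(0.0, float(h["ratelimit-reset"]))

    # Legacy header: X-RateLimit-Reset is a Unix timestamp.
    if h.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in h:
        return max(0.0, float(h["x-ratelimit-reset"]) - now)

    return 0.0
```

With the `requests` library, this could be called as `time.sleep(seconds_until_retry(response.status_code, response.headers))` before the next call.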

Monthly quota responses include quota-specific headers when available:

| Header | Description |
|--------|-------------|
| `X-Quota-Limit` | Monthly request or credit limit |
| `X-Quota-Used` | Amount used in the current billing cycle |
| `X-Quota-Remaining` | Amount remaining in the current billing cycle |
| `X-Quota-Reset` | Unix timestamp when the current billing cycle resets |

## Credit Usage

- **Discovery Request:** 10 credits per request

> Credits are charged per discovery request regardless of the number of results returned.

---

## POST /company/discovery

Discover companies from the open web using natural language queries with structured matching and source citations.

### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string | Yes | Natural language query describing the companies to find (max 2000 characters) |
| `requirements` | array | No | Array of match conditions, each with name and description (max 20 items) |
| `extract` | array | No | Array of enrichment fields to extract, each with name and description (max 10 items) |
| `limit` | integer | No | Maximum number of results to return. Range: 5-100. Default: 10 |
| `quality` | string | No | Quality tier for discovery. One of: basic, standard, premium. Default: standard |
| `exclude` | array | No | Array of entities to exclude from results, each with name and url (max 100 items) |
| `metadata` | object | No | Pass-through metadata returned with results. Values must be string, number, or boolean. |
| `callback_url` | string | No | HTTPS URL to receive webhook notification when processing completes |

> **Required:** The `query` parameter is required. All other parameters are optional.

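
Checking a payload against the limits in the parameter table before sending it can avoid a wasted `400` round trip. The following is a minimal sketch, not the API's own validation: `validate_discovery_payload` is a hypothetical helper, and treating an empty `query` as invalid is an assumption.

```python
ALLOWED_QUALITY = ("basic", "standard", "premium")
ARRAY_LIMITS = (("requirements", 20), ("extract", 10), ("exclude", 100))


def validate_discovery_payload(payload):
    """Return a list of problems found in `payload`; an empty list means it looks valid."""
    errors = []

    query = payload.get("query")
    if not isinstance(query, str) or not query.strip():
        errors.append("query is required and must be a non-empty string")
    elif len(query) > 2000:
        errors.append("query exceeds 2000 characters")

    # Array parameters share the same shape check with different size caps.
    for field, max_items in ARRAY_LIMITS:
        items = payload.get(field)
        if items is None:
            continue
        if not isinstance(items, list):
            errors.append(f"{field} must be an array")
        elif len(items) > max_items:
            errors.append(f"{field} exceeds {max_items} items")

    limit = payload.get("limit")
    if limit is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(limit, bool) or not isinstance(limit, int) or not 5 <= limit <= 100:
            errors.append("limit must be an integer from 5 to 100")

    quality = payload.get("quality")
    if quality is not None and quality not in ALLOWED_QUALITY:
        errors.append("quality must be one of: basic, standard, premium")

    metadata = payload.get("metadata")
    if metadata is not None:
        if not isinstance(metadata, dict):
            errors.append("metadata must be an object")
        elif any(not isinstance(v, (str, int, float, bool)) for v in metadata.values()):
            errors.append("metadata values must be string, number, or boolean")

    callback_url = payload.get("callback_url")
    if callback_url is not None:
        if not isinstance(callback_url, str) or not callback_url.startswith("https://"):
            errors.append("callback_url must be an HTTPS URL")

    return errors
```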
### Request Examples

**With Advanced Features:**

```json
{
  "query": "AI startups in healthcare that have raised Series A funding",
  "requirements": [
    {
      "name": "series_a_funded",
      "description": "Has raised a Series A funding round"
    },
    {
      "name": "healthcare_focus",
      "description": "Primary business focus is in healthcare or medical technology"
    }
  ],
  "extract": [
    {
      "name": "ceo_name",
      "description": "Name of the CEO or founder"
    },
    {
      "name": "funding_amount",
      "description": "Total funding amount raised"
    }
  ],
  "limit": 10,
  "quality": "standard",
  "callback_url": "https://yourapp.com/webhook/discovery"
}
```

**Basic Request:**

```json
{
  "query": "AI startups in healthcare that have raised Series A funding"
}
```

### Code Examples

**cURL:**

```bash
# Submit a discovery request
curl -X POST "https://api.nyne.ai/company/discovery" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "X-API-Secret: YOUR_API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "AI startups in healthcare that have raised Series A funding",
    "requirements": [
      {"name": "series_a_funded", "description": "Has raised a Series A funding round"},
      {"name": "healthcare_focus", "description": "Primary business focus is in healthcare"}
    ],
    "extract": [
      {"name": "ceo_name", "description": "Name of the CEO or founder"}
    ],
    "limit": 10,
    "quality": "standard"
  }'

# Check status (replace REQUEST_ID with the returned request_id)
curl -X GET "https://api.nyne.ai/company/discovery?request_id=REQUEST_ID" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "X-API-Secret: YOUR_API_SECRET"
```

**Python:**

```python
import requests
import time

API_KEY = "YOUR_API_KEY"
API_SECRET = "YOUR_API_SECRET"
BASE_URL = "https://api.nyne.ai"

headers = {
    "X-API-Key": API_KEY,
    "X-API-Secret": API_SECRET,
    "Content-Type": "application/json"
}

# Step 1: Submit discovery request
payload = {
    "query": "AI startups in healthcare that have raised Series A funding",
    "requirements": [
        {"name": "series_a_funded", "description": "Has raised a Series A funding round"},
        {"name": "healthcare_focus", "description": "Primary business focus is in healthcare"}
    ],
    "extract": [
        {"name": "ceo_name", "description": "Name of the CEO or founder"},
        {"name": "funding_amount", "description": "Total funding amount raised"}
    ],
    "limit": 10,
    "quality": "standard"
}

response = requests.post(f"{BASE_URL}/company/discovery", json=payload, headers=headers)
data = response.json()

if not data.get("success"):
    print(f"Error: {data}")
    raise SystemExit(1)

request_id = data["data"]["request_id"]
print(f"Request submitted: {request_id}")

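# Step 2: Poll for results (bounded so the script cannot loop forever)
MAX_POLLS = 120  # arbitrary cap: 120 polls x 5 s = 10 minutes; adjust as needed
for _ in range(MAX_POLLS):
    time.sleep(5)
    response = requests.get(
        f"{BASE_URL}/company/discovery",
        params={"request_id": request_id},
        headers=headers
    )
    result = response.json()
    # Error responses (e.g. 429) carry no status, so stop instead of raising KeyError
    if not result.get("success"):
        print(f"Error: {result}")
        break
    status = result["data"]["status"]
    print(f"Status: {status}")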

    if status == "completed":
        discovery = result["data"]["result"]
        print(f"Found {discovery['results_count']} companies")
        for company in discovery["results"]:
            print(f"  - {company['name']}: {company['description']}")
            if company.get("extractions"):
                for key, value in company["extractions"].items():
                    print(f"    {key}: {value}")
        break
    elif status == "failed":
        print(f"Discovery failed: {result['data'].get('error', 'Unknown error')}")
        break
```

**JavaScript:**

```javascript
const API_KEY = "YOUR_API_KEY";
const API_SECRET = "YOUR_API_SECRET";
const BASE_URL = "https://api.nyne.ai";

const headers = {
  "X-API-Key": API_KEY,
  "X-API-Secret": API_SECRET,
  "Content-Type": "application/json"
};

async function discoverCompanies() {
  // Step 1: Submit discovery request
  const payload = {
    query: "AI startups in healthcare that have raised Series A funding",
    requirements: [
      { name: "series_a_funded", description: "Has raised a Series A funding round" },
      { name: "healthcare_focus", description: "Primary business focus is in healthcare" }
    ],
    extract: [
      { name: "ceo_name", description: "Name of the CEO or founder" },
      { name: "funding_amount", description: "Total funding amount raised" }
    ],
    limit: 10,
    quality: "standard"
  };

  const submitResponse = await fetch(`${BASE_URL}/company/discovery`, {
    method: "POST",
    headers,
    body: JSON.stringify(payload)
  });
  const submitData = await submitResponse.json();

  if (!submitData.success) {
    console.error("Error:", submitData);
    return;
  }

  const requestId = submitData.data.request_id;
  console.log(`Request submitted: ${requestId}`);

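  // Step 2: Poll for results (bounded so the loop cannot run forever)
  const MAX_POLLS = 120; // arbitrary cap: 120 polls x 5 s = 10 minutes; adjust as needed
  for (let attempt = 0; attempt < MAX_POLLS; attempt++) {
    await new Promise(resolve => setTimeout(resolve, 5000));

    const statusResponse = await fetch(
      `${BASE_URL}/company/discovery?request_id=${requestId}`,
      { headers }
    );
    const statusData = await statusResponse.json();
    // Error responses (e.g. 429) carry no status, so stop instead of throwing
    if (!statusData.success) {
      console.error("Error:", statusData);
      return statusData;
    }
    const status = statusData.data.status;
    console.log(`Status: ${status}`);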

    if (status === "completed") {
      const discovery = statusData.data.result;
      console.log(`Found ${discovery.results_count} companies`);
      discovery.results.forEach(company => {
        console.log(`  - ${company.name}: ${company.description}`);
        if (company.extractions) {
          Object.entries(company.extractions).forEach(([key, value]) => {
            console.log(`    ${key}: ${value}`);
          });
        }
      });
      return statusData;
    } else if (status === "failed") {
      console.error("Discovery failed:", statusData.data.error || "Unknown error");
      return statusData;
    }
  }
}

discoverCompanies();
```

**PHP:**

```php
<?php

$apiKey = "YOUR_API_KEY";
$apiSecret = "YOUR_API_SECRET";
$baseUrl = "https://api.nyne.ai";

// Step 1: Submit discovery request
$payload = [
    "query" => "AI startups in healthcare that have raised Series A funding",
    "requirements" => [
        ["name" => "series_a_funded", "description" => "Has raised a Series A funding round"],
        ["name" => "healthcare_focus", "description" => "Primary business focus is in healthcare"]
    ],
    "extract" => [
        ["name" => "ceo_name", "description" => "Name of the CEO or founder"],
        ["name" => "funding_amount", "description" => "Total funding amount raised"]
    ],
    "limit" => 10,
    "quality" => "standard"
];

$ch = curl_init("$baseUrl/company/discovery");
curl_setopt_array($ch, [
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => json_encode($payload),
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER => [
        "X-API-Key: $apiKey",
        "X-API-Secret: $apiSecret",
        "Content-Type: application/json"
    ]
]);

$response = curl_exec($ch);
curl_close($ch);

$data = json_decode($response, true);

if (!$data["success"]) {
    echo "Error: " . print_r($data, true);
    exit(1);
}

$requestId = $data["data"]["request_id"];
echo "Request submitted: $requestId\n";

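// Step 2: Poll for results (bounded so the script cannot loop forever)
$maxPolls = 120; // arbitrary cap: 120 polls x 5 s = 10 minutes; adjust as needed

for ($attempt = 0; $attempt < $maxPolls; $attempt++) {
    sleep(5);

    $ch = curl_init("$baseUrl/company/discovery?request_id=" . urlencode($requestId));
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => [
            "X-API-Key: $apiKey",
            "X-API-Secret: $apiSecret"
        ]
    ]);

    $response = curl_exec($ch);
    curl_close($ch);

    $result = json_decode($response, true);
    // Error responses (e.g. 429) carry no status, so stop before reading it
    if (empty($result["success"])) {
        echo "Error: " . print_r($result, true);
        break;
    }
    $status = $result["data"]["status"];
    echo "Status: $status\n";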

    if ($status === "completed") {
        $discovery = $result["data"]["result"];
        echo "Found " . $discovery["results_count"] . " companies\n";
        foreach ($discovery["results"] as $company) {
            echo "  - " . $company["name"] . ": " . $company["description"] . "\n";
            if (isset($company["extractions"])) {
                foreach ($company["extractions"] as $key => $value) {
                    echo "    $key: $value\n";
                }
            }
        }
        break;
    } elseif ($status === "failed") {
        echo "Discovery failed: " . ($result["data"]["error"] ?? "Unknown error") . "\n";
        break;
    }
}

?>
```

### Response Codes

| Code | Description |
|------|-------------|
| 202 | Discovery request accepted and queued for processing |
| 400 | Invalid request parameters |
| 401 | Invalid API credentials |
| 402 | Insufficient credits |
| 429 | Rate limit exceeded |

### Response Example

```json
{
  "success": true,
  "data": {
    "request_id": "a1b2c3d4e5f6...",
    "status": "pending",
    "message": "Discovery request queued for processing. Use GET with request_id to check status.",
    "created_on": "2026-02-19T10:30:00"
  },
  "timestamp": "2026-02-19T10:30:00Z"
}
```

---

## GET /company/discovery

Check the status of a discovery request. The response includes the results once the status is `completed`.

### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `request_id` | string | Yes | The request ID returned from the POST discovery request |

### Request Example

```
GET /company/discovery?request_id=a1b2c3d4e5f6...
```

### Response Codes

| Code | Description |
|------|-------------|
| 200 | Status retrieved successfully (includes results when completed) |
| 400 | Invalid request parameters |
| 401 | Invalid API credentials |
| 404 | Request ID not found |

### Response Example

```json
{
  "success": true,
  "data": {
    "request_id": "a1b2c3d4e5f6...",
    "status": "completed",
    "completed": true,
    "completed_on": "2026-02-19T10:33:00",
    "created_on": "2026-02-19T10:30:00",
    "result": {
      "entity_type": "companies",
      "query": "AI startups in healthcare that have raised Series A funding",
      "results_count": 10,
      "results": [
        {
          "name": "MedAI Corp",
          "url": "medai.com",
          "description": "AI-powered medical diagnostics startup specializing in early disease detection using machine learning",
          "match_status": "matched",
          "evaluations": {
            "series_a_funded": {
              "value": "yes",
              "matched": true
            },
            "healthcare_focus": {
              "value": "yes",
              "matched": true
            }
          },
          "extractions": {
            "ceo_name": "John Smith",
            "funding_amount": "$15M Series A"
          },
          "sources": [
            {
              "field": "series_a_funded",
              "reasoning": "Found Series A announcement from 2025 with confirmed funding round",
              "confidence": "high",
              "citations": [
                {
                  "title": "TechCrunch - MedAI Corp Raises Series A",
                  "url": "https://techcrunch.com/2025/medai-series-a",
                  "excerpts": ["MedAI Corp announced a $15M Series A round led by Andreessen Horowitz..."]
                }
              ]
            },
            {
              "field": "healthcare_focus",
              "reasoning": "Company website and press coverage confirm healthcare AI as primary focus",
              "confidence": "high",
              "citations": [
                {
                  "title": "MedAI Corp - About Us",
                  "url": "https://medai.com/about",
                  "excerpts": ["MedAI Corp is building AI-powered diagnostic tools for healthcare providers"]
                }
              ]
            }
          ]
        }
      ],
      "metrics": {
        "candidates_evaluated": 50,
        "candidates_matched": 10
      }
    }
  },
  "timestamp": "2026-02-19T10:33:00Z"
}
```

---

## Response Format

All API responses follow a consistent JSON format. All fields in enrichment results are **optional** and only included when data is available:

- Results are ranked by match quality; the best matches appear first
- The `evaluations` object contains one entry per requirement, keyed by the requirement name
- The `extractions` object contains one entry per extract field, keyed by the extract field name
- The `sources` array provides per-field reasoning, confidence levels (`high`, `medium`, `low`), and citation URLs with excerpts
- The `metrics` object shows how many candidates were evaluated vs. matched
- If a field has no data, it may be omitted from the response
- Processing stages: `pending` → `searching` → `completed` (or `failed`)

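
Because every result field may be absent, downstream code should read results defensively. The sketch below filters a `result` object down to matched companies whose sources all meet a minimum confidence level. The helper name `confident_matches` is hypothetical, `"matched"` is the only `match_status` value shown in this document, and keeping companies that have no `sources` at all is an assumption.

```python
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def confident_matches(result, min_confidence="medium"):
    """Return matched companies whose every source meets `min_confidence`."""
    threshold = CONFIDENCE_RANK[min_confidence]
    kept = []
    # .get() with defaults tolerates fields omitted from the response.
    for company in result.get("results", []):
        if company.get("match_status") != "matched":
            continue
        sources = company.get("sources", [])
        # Unknown or missing confidence values rank below "low" and fail the check.
        if all(CONFIDENCE_RANK.get(s.get("confidence"), -1) >= threshold for s in sources):
            kept.append(company)
    return kept
```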
---

## Error Codes

| HTTP Code | Error Code | Description |
|-----------|------------|-------------|
| 400 | `INVALID_PARAMETERS` | Missing or invalid request parameters |
| 400 | `MISSING_PARAMETER` | Required parameter not provided (e.g., query) |
| 401 | `AUTHENTICATION_FAILED` | Invalid or missing API credentials |
| 402 | `INSUFFICIENT_CREDITS` | Not enough credits (requires 10 credits per request) |
| 403 | `NO_ACTIVE_SUBSCRIPTION` | No active subscription found |
| 403 | `PRODUCT_NOT_AVAILABLE` | Discovery not available on your plan |
| 403 | `ACCESS_DENIED` | Not authorized to access this resource |
| 404 | `NOT_FOUND` | Request not found (invalid request_id) |
| 429 | `RATE_LIMIT_EXCEEDED` | Too many requests, please slow down |
| 500 | `QUEUE_ERROR` | Failed to queue request for processing |
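
One way to act on these codes is to group them by the response they call for. The grouping below is an assumption rather than documented behavior (in particular, treating `QUEUE_ERROR` as retryable), and `next_action` is a hypothetical helper that takes the error code string however your client extracts it.

```python
# Assumed grouping: transient failures that may succeed on a later attempt.
RETRY = {"RATE_LIMIT_EXCEEDED", "QUEUE_ERROR"}
# The request itself must change before it can succeed.
FIX_REQUEST = {"INVALID_PARAMETERS", "MISSING_PARAMETER", "NOT_FOUND"}
# Credentials, credits, or plan need attention.
CHECK_ACCOUNT = {
    "AUTHENTICATION_FAILED",
    "INSUFFICIENT_CREDITS",
    "NO_ACTIVE_SUBSCRIPTION",
    "PRODUCT_NOT_AVAILABLE",
    "ACCESS_DENIED",
}


def next_action(error_code):
    """Map an error code from the table to a coarse handling strategy."""
    if error_code in RETRY:
        return "retry"
    if error_code in FIX_REQUEST:
        return "fix_request"
    if error_code in CHECK_ACCOUNT:
        return "check_account"
    return "unknown"
```

Only the `retry` group is worth wrapping in an automatic backoff loop; the other two need a change to the request or the account first.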

---

## Callbacks

When you provide a `callback_url`, the API sends results to your endpoint via HTTP POST.

### Retry Policy

- Maximum 5 retry attempts
- Exponential backoff: 1s, 5s, 15s, 1m, 5m
- 30-second timeout per request
- Your endpoint should return HTTP 2xx (200-299) for success; any other status code triggers a retry

### Callback Payload Example

```json
{
  "request_id": "a1b2c3d4e5f6...",
  "status": "completed",
  "completed": true,
  "result": {
    "entity_type": "companies",
    "query": "AI startups in healthcare that have raised Series A funding",
    "results_count": 10,
    "results": [...],
    "metrics": {
      "candidates_evaluated": 50,
      "candidates_matched": 10
    }
  }
}
```

---

## Best Practices

Prioritize inputs in this order for best match rates:
1. **Write Specific Queries** (Best) - Be as specific as possible in your natural language query. Instead of "tech companies", try "B2B SaaS companies in cybersecurity with 50-200 employees based in the US". Specific queries yield higher-quality, more relevant results.
2. **Use Requirements for Filtering** (Best) - Define match requirements to filter candidates against specific criteria. Each requirement gets an independent evaluation with evidence, making it easy to validate matches programmatically.
3. **Set Appropriate Limits** (Good) - Start with smaller limits (10-20) for exploratory queries and increase when you need broader coverage. Larger limits increase processing time.
4. **Use Exclusion Lists** (Good) - Exclude known companies or competitors from results using the `exclude` parameter. This prevents duplicates when running iterative discovery campaigns.
5. **Leverage Callbacks** (Good) - Use `callback_url` instead of polling for production workflows. Callbacks deliver results as soon as processing completes without the overhead of repeated status checks.

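
For iterative campaigns, the `exclude` array can be built from the `name` and `url` fields of earlier results. The sketch below is one possible approach: `build_exclude_list` is a hypothetical helper, and de-duplicating on the lowercased URL is an assumption about what counts as the same company.

```python
def build_exclude_list(previous_results, max_items=100):
    """Turn earlier result entries into an `exclude` array (capped at the documented 100 items)."""
    exclude = []
    seen_urls = set()
    for company in previous_results:
        name = company.get("name")
        url = company.get("url")
        # Each exclude entry needs both name and url, so skip incomplete results.
        if not name or not url:
            continue
        key = url.strip().lower()
        if key in seen_urls:
            continue
        seen_urls.add(key)
        exclude.append({"name": name, "url": url})
        if len(exclude) == max_items:
            break
    return exclude
```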
---

## Note for AI Agents

The Company Discovery API is well-suited for AI agent workflows. Submit discovery requests with structured requirements and extract fields, then process results programmatically using the `match_status` and `evaluations` fields. Use the `confidence` field on each `sources` entry to assess data reliability. For automated pipelines, always use `callback_url` rather than polling.

---

## Related APIs
- **Company Search API** - Structured company search -> `https://api.nyne.ai/documentation/company/search`
- **Company Enrichment API** - Enrich discovered companies -> `https://api.nyne.ai/documentation/company/enrichment`