DeepSeek API Guide: How to Use DeepSeek's Reasoning Model in Your App

📅 June 15, 2025 ⏱️ 18 min read 👤 Haotokai Team

The rise of reasoning-capable large language models has opened up entirely new possibilities for application development. Among these, DeepSeek's R1 model stands out as one of the most capable reasoning models available today, often matching or exceeding the performance of much more expensive alternatives. In this comprehensive guide, we'll explore how to integrate DeepSeek's API into your applications, with a special focus on the powerful R1 reasoning model.


What is DeepSeek?

DeepSeek is a Chinese AI research company that has rapidly gained recognition in the global AI community for producing high-performance large language models at competitive prices. Founded in 2023, DeepSeek has released several model families, including general-purpose chat models, code-specialized models, and the R1 reasoning model.

What makes DeepSeek particularly attractive to developers is its aggressive pricing strategy combined with impressive performance. For many use cases, DeepSeek models deliver comparable or better results than industry leaders at a fraction of the cost.

💡 Developer Insight

DeepSeek's list price for the R1 reasoning model at launch was $0.55 per million input tokens ($0.14 on a cache hit) and $2.19 per million output tokens, significantly cheaper than comparable reasoning models from other providers. Prices change, so check DeepSeek's pricing page before budgeting. This makes R1 a good fit for cost-conscious developers building reasoning-heavy applications.

Understanding the DeepSeek R1 Reasoning Model

What Makes R1 Special?

DeepSeek-R1 is a reasoning-enhanced model that employs chain-of-thought reasoning natively. Unlike standard LLMs that generate answers directly, R1 produces intermediate reasoning steps before arriving at a final answer. This approach improves performance on math, coding, and science problems that require several dependent steps.

How Reasoning Models Work

Standard LLMs predict the next token based on patterns learned during training. While this works well for many tasks, it struggles with problems that require sequential reasoning because the model doesn't "think through" intermediate steps.

Reasoning models like R1 use a different approach — they generate a reasoning trace (or "chain of thought") before producing the final answer. This allows the model to:

  1. Break down complex problems into smaller sub-problems
  2. Verify each step before moving forward
  3. Self-correct when errors are detected
  4. Handle multi-step dependencies more effectively

R1 Performance Benchmarks

DeepSeek-R1 has demonstrated impressive performance across multiple reasoning benchmarks:

| Benchmark | DeepSeek-R1 | GPT-4o | Claude 3.5 Sonnet |
| --- | --- | --- | --- |
| GSM8K (Math) | 92.4% | 90.2% | 92.0% |
| MATH | 78.5% | 76.6% | 71.1% |
| HumanEval (Code) | 90.2% | 90.2% | 92.0% |
| GPQA (Science) | 71.5% | 71.8% | 65.0% |

These numbers show that R1 competes favorably with top-tier models, especially in mathematical reasoning — and it does so at a much lower price point.

Getting Started with DeepSeek API

Prerequisites

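Before you can start using the DeepSeek API, you'll need a DeepSeek account with an API key, plus a Python environment with the requests library installed to run the examples in this guide.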

API Key Setup

Once you have your API key, it's best practice to store it as an environment variable rather than hardcoding it:

# Set environment variable (Linux/macOS)
export DEEPSEEK_API_KEY="your-api-key-here"

# Windows PowerShell
$env:DEEPSEEK_API_KEY = "your-api-key-here"
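
On the Python side, the key can then be read from the environment at startup. The helper below is a minimal sketch: load_api_key is a hypothetical name, not part of any SDK, and it simply fails early with a readable message when the variable is missing.

```python
import os


def load_api_key(var_name: str = "DEEPSEEK_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message when the variable is
    unset or empty, so a missing key is caught before any request is sent.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key
```

Calling load_api_key() once at startup keeps the failure close to its cause instead of surfacing later as an authentication error from the API.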

Alternative: Use Haotokai for Easier Access

If you want to avoid managing multiple API keys and benefit from unified billing, you can access DeepSeek models through Haotokai. Haotokai provides a single API endpoint for multiple AI models (including DeepSeek R1) and supports PayPal payments — making it especially convenient for international developers.

DeepSeek API Reference & Key Endpoints

Base URL

The DeepSeek API base URL is:

https://api.deepseek.com/v1

If using Haotokai, the base URL is:

https://www.haotokai.com/v1

Available Models

| Model Name | Description | Context Window |
| --- | --- | --- |
| deepseek-reasoner | R1 reasoning model | 64K tokens |
| deepseek-chat | V3 general-purpose model | 64K tokens |
| deepseek-coder | Code-specialized model | 128K tokens |

Chat Completions Endpoint

The primary endpoint for interacting with DeepSeek models is the chat completions endpoint:

POST /chat/completions

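Key request parameters include model, messages, temperature, max_tokens, and stream, plus tools and tool_choice for function calling. All of them appear in the code examples in this guide.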
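
As a rough sketch of how these parameters fit together, the hypothetical helper build_chat_payload below assembles a request body from fields used in this guide's examples (model, messages, max_tokens, stream). It only builds a dictionary and does not call the API; the function name and defaults are assumptions made for illustration.

```python
from typing import Optional


def build_chat_payload(
    prompt: str,
    model: str = "deepseek-reasoner",
    system: Optional[str] = None,
    max_tokens: Optional[int] = None,
    stream: bool = False,
) -> dict:
    """Assemble a chat completions request body (no network call)."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})

    payload = {"model": model, "messages": messages, "stream": stream}
    # Only include max_tokens when the caller sets it explicitly.
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


payload = build_chat_payload("What is 2^10?", max_tokens=512)
print(payload["model"], len(payload["messages"]))  # deepseek-reasoner 1
```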

āš ļø Important Note About R1

The deepseek-reasoner model returns reasoning content in the response. When using streaming mode, you'll receive reasoning tokens separately from the final answer content. Make sure your client code handles both the reasoning_content and content fields in the response.

Code Examples: Building with DeepSeek API

Basic Python Example with Requests

Here's a simple example of calling the DeepSeek API using Python's requests library:

import os
import requests

def call_deepseek_reasoner(prompt, api_key=None):
    """
    Call DeepSeek R1 reasoning model with a prompt.
    """
    api_key = api_key or os.getenv("DEEPSEEK_API_KEY")
    
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    payload = {
        "model": "deepseek-reasoner",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful AI assistant with strong reasoning abilities."
            },
            {
                "role": "user",
                "content": prompt
            }
        ],
        # Note: DeepSeek's docs say deepseek-reasoner accepts temperature but ignores it.
        "temperature": 0.7,
        "max_tokens": 2048
    }
    
    response = requests.post(
        "https://api.deepseek.com/v1/chat/completions",
        headers=headers,
        json=payload
    )
    
    if response.status_code == 200:
        result = response.json()
        return {
            "content": result["choices"][0]["message"]["content"],
            "reasoning_content": result["choices"][0]["message"].get("reasoning_content", ""),
            "usage": result["usage"]
        }
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Example usage
if __name__ == "__main__":
    problem = "Solve: If x + y = 10 and x - y = 4, what is x * y?"
    result = call_deepseek_reasoner(problem)
    
    print("=== Reasoning ===")
    print(result["reasoning_content"])
    print("\n=== Final Answer ===")
    print(result["content"])
    print(f"\n=== Tokens Used: {result['usage']['total_tokens']} ===")

Using Haotokai Unified API

If you're using Haotokai to access DeepSeek (and other models), the code is nearly identical — just change the base URL and use your Haotokai API key:

import os
import requests

def call_haotokai_model(prompt, model="deepseek-reasoner", api_key=None):
    """
    Call any AI model through Haotokai's unified API.
    Supports DeepSeek, Qwen, Claude, and more with one API key.
    """
    api_key = api_key or os.getenv("HAOTOKAI_API_KEY")
    
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    payload = {
        "model": model,
        "messages": [
            {"role": "user", "content": prompt}
        ],
        "temperature": 0.7,
        "stream": False
    }
    
    response = requests.post(
        "https://www.haotokai.com/v1/chat/completions",
        headers=headers,
        json=payload
    )
    
    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    else:
        raise Exception(f"Error: {response.status_code} - {response.text}")

# Try different models with the same code
models = ["deepseek-reasoner", "qwen2.5-72b-instruct", "claude-3-sonnet-20240229"]

for model in models:
    print(f"\n--- Testing {model} ---")
    answer = call_haotokai_model("What is 2^10?", model=model)
    print(f"Answer: {answer[:100]}...")

Streaming Response Example

For real-time applications, you'll want to use streaming. Here's how to handle streaming responses from DeepSeek R1:

import os
import requests
import json

def stream_deepseek_reasoning(prompt, api_key=None):
    """
    Stream reasoning and answer from DeepSeek R1 in real-time.
    """
    api_key = api_key or os.getenv("DEEPSEEK_API_KEY")
    
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "text/event-stream"
    }
    
    payload = {
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True
    }
    
    response = requests.post(
        "https://api.deepseek.com/v1/chat/completions",
        headers=headers,
        json=payload,
        stream=True
    )
    
    reasoning = []
    answer = []
    is_reasoning_complete = False
    
    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')
            if line.startswith('data: '):
                data = line[6:]
                if data == '[DONE]':
                    break
                try:
                    chunk = json.loads(data)
                    delta = chunk["choices"][0]["delta"]
                    
                    if "reasoning_content" in delta and delta["reasoning_content"]:
                        reasoning.append(delta["reasoning_content"])
                        print(f"\r🤔 Reasoning: {''.join(reasoning)[-50:]}", end="", flush=True)
                    elif "content" in delta and delta["content"]:
                        if not is_reasoning_complete:
                            print("\n\n✅ Answer: ", end="", flush=True)
                            is_reasoning_complete = True
                        answer.append(delta["content"])
                        print(delta["content"], end="", flush=True)
                except json.JSONDecodeError:
                    pass
    
    return {
        "reasoning": ''.join(reasoning),
        "answer": ''.join(answer)
    }
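
The parsing logic above can be exercised without a network connection by feeding it recorded lines. The sketch below is a simplified variant under stated assumptions: collect_stream is a hypothetical helper that takes any iterable of raw byte lines in the "data: {...}" format handled above, and the sample chunks are hand-written, not real API output.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[bytes]) -> dict:
    """Split streamed chunks into reasoning text and answer text."""
    reasoning, answer = [], []
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and anything unexpected
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        try:
            delta = json.loads(data)["choices"][0]["delta"]
        except (json.JSONDecodeError, KeyError, IndexError):
            continue  # ignore malformed chunks
        if delta.get("reasoning_content"):
            reasoning.append(delta["reasoning_content"])
        elif delta.get("content"):
            answer.append(delta["content"])
    return {"reasoning": "".join(reasoning), "answer": "".join(answer)}


# Hand-written sample chunks for a quick offline check.
sample = [
    b'data: {"choices": [{"delta": {"reasoning_content": "2^10 = "}}]}',
    b"",
    b'data: {"choices": [{"delta": {"reasoning_content": "1024"}}]}',
    b'data: {"choices": [{"delta": {"content": "1024"}}]}',
    b"data: [DONE]",
]
result = collect_stream(sample)
```

Since response.iter_lines() in requests yields bytes, the same helper could be called as collect_stream(response.iter_lines()) on a live stream.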

Function Calling with DeepSeek

DeepSeek supports function calling (tools), which allows you to connect the model to external systems and data sources. Here's an example:

import os
import requests
import json

def call_deepseek_with_tools(prompt, api_key=None):
    api_key = api_key or os.getenv("DEEPSEEK_API_KEY")
    
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {
                            "type": "string",
                            "description": "The city name"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "default": "celsius"
                        }
                    },
                    "required": ["city"]
                }
            }
        }
    ]
    
    payload = {
        "model": "deepseek-chat",
        "messages": [{"role": "user", "content": prompt}],
        "tools": tools,
        "tool_choice": "auto"
    }
    
    response = requests.post(
        "https://api.deepseek.com/v1/chat/completions",
        headers=headers,
        json=payload
    )
    
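    response.raise_for_status()  # surface HTTP errors instead of failing on missing keys
    result = response.json()
    message = result["choices"][0]["message"]
    
    # tool_calls may be absent or null when the model answers directly
    for tool_call in message.get("tool_calls") or []:
        print(f"Tool call: {tool_call['function']['name']}")
        # arguments is a JSON-encoded string, not a dict
        print(f"Arguments: {tool_call['function']['arguments']}")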
    
    return result
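
The function above only reports which tools the model asked for. To complete the loop, the application typically runs the tool itself and sends the result back as a tool message. The sketch below assumes the OpenAI-style message format that the example above already uses; get_weather is a stub returning made-up data, and run_tool_calls and TOOL_REGISTRY are hypothetical names.

```python
import json


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Stub tool: returns fixed, made-up data instead of calling a weather API."""
    return {"city": city, "temperature": 21, "unit": unit}


# Map tool names from the request's tools list to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(message: dict) -> list:
    """Execute each tool call in an assistant message and build tool replies."""
    replies = []
    for call in message.get("tool_calls") or []:
        name = call["function"]["name"]
        # Arguments arrive as a JSON-encoded string, not a dict.
        args = json.loads(call["function"]["arguments"] or "{}")
        func = TOOL_REGISTRY.get(name)
        result = func(**args) if func else {"error": f"unknown tool: {name}"}
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return replies


# Hand-written assistant message in the shape returned by the API.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
}
tool_messages = run_tool_calls(assistant_message)
```

Appending the assistant message and tool_messages to the conversation and posting it again lets the model write its final answer from the tool output.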

Best Practices for Reasoning Model Integration

1. Design Effective Prompts for Reasoning

While R1 is natively good at reasoning, prompt engineering still matters.

2. Handle Reasoning Content Properly

When working with R1, remember that the reasoning content is separate from the final answer, and your application should store and display the two separately.
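
One detail that is easy to miss in multi-turn conversations: at the time of writing, DeepSeek's documentation for deepseek-reasoner says that reasoning_content from earlier turns should not be sent back in the messages list, only the final content. The sketch below shows one way to do that; append_assistant_turn is a hypothetical helper, and the sample response is hand-written.

```python
def append_assistant_turn(history: list, response_message: dict) -> str:
    """Add only the assistant's final answer to the conversation history.

    Returns the reasoning text separately so the caller can log or display it
    without sending it back to the API on the next turn.
    """
    history.append({
        "role": "assistant",
        "content": response_message.get("content") or "",
    })
    return response_message.get("reasoning_content") or ""


history = [{"role": "user", "content": "What is 2^10?"}]
# Hand-written stand-in for result["choices"][0]["message"].
message = {
    "role": "assistant",
    "content": "1024",
    "reasoning_content": "2^10 means multiplying 2 by itself ten times, which gives 1024.",
}
reasoning_text = append_assistant_turn(history, message)
history.append({"role": "user", "content": "And 2^11?"})
```

The history list now holds three messages, none of which carries a reasoning_content field, while reasoning_text stays available for logging or for a collapsible "show reasoning" view.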

3. Optimize for Cost

Reasoning models use more tokens than standard models because they generate intermediate steps, so token usage is the main cost driver to watch.
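
A rough per-request cost can be computed from the usage block that the API returns. The sketch below is an assumption-based illustration: estimate_cost is a hypothetical helper, the per-million-token prices are supplied by the caller, and the numbers in the usage example are placeholders chosen for easy arithmetic, not real prices.

```python
def estimate_cost(usage: dict, input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate request cost in dollars from token counts and per-million prices."""
    input_cost = usage["prompt_tokens"] / 1_000_000 * input_price_per_m
    output_cost = usage["completion_tokens"] / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Placeholder token counts and prices, chosen only to make the arithmetic easy.
usage = {"prompt_tokens": 2_000, "completion_tokens": 8_000, "total_tokens": 10_000}
cost = estimate_cost(usage, input_price_per_m=1.00, output_price_per_m=2.00)
print(f"${cost:.4f}")  # $0.0180
```

In this placeholder example the output side accounts for most of the cost, which is the typical pattern for reasoning models because the reasoning trace is billed as output tokens.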

💡 Cost-Saving Tip

With Haotokai, you can route simple queries to cheaper models like Qwen and complex reasoning tasks to DeepSeek R1 — all through the same API endpoint. This hybrid approach can significantly reduce your overall AI costs while maintaining quality.
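
The routing idea can be sketched with a simple heuristic. Everything here is an assumption for illustration: pick_model is a hypothetical function, the keyword list and length threshold are arbitrary, and the model names are the ones used in the Haotokai example earlier in this guide.

```python
# Arbitrary keywords that hint a prompt may need multi-step reasoning.
REASONING_HINTS = ("prove", "solve", "step by step", "debug", "derive", "why")


def pick_model(prompt: str, long_prompt_chars: int = 600) -> str:
    """Pick a model name using a crude keyword and length heuristic."""
    text = prompt.lower()
    needs_reasoning = (
        len(text) > long_prompt_chars
        or any(hint in text for hint in REASONING_HINTS)
    )
    return "deepseek-reasoner" if needs_reasoning else "qwen2.5-72b-instruct"


print(pick_model("Translate 'hello' to French"))  # qwen2.5-72b-instruct
print(pick_model("Solve: If x + y = 10 and x - y = 4, what is x * y?"))  # deepseek-reasoner
```

A production router would more likely rely on measured quality and cost per task type than on keywords, but the structure stays the same: choose the model name first, then send the request through the shared endpoint.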

4. Error Handling & Retries

Always implement proper error handling:

import time
import requests

def robust_api_call(url, headers, payload, max_retries=3, base_delay=1):
    """
    Robust API call with exponential backoff retry logic.
    """
    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=payload, timeout=60)
            
            if response.status_code == 200:
                return response.json()
            elif response.status_code == 429:
                # Rate limit hit — wait and retry
                retry_after = int(response.headers.get('Retry-After', base_delay * (2 ** attempt)))
                time.sleep(retry_after)
            elif response.status_code in [500, 502, 503, 504]:
                # Server errors — retry with exponential backoff
                time.sleep(base_delay * (2 ** attempt))
            else:
                # Client errors — don't retry
                raise Exception(f"Client error: {response.status_code}")
                
        except requests.exceptions.Timeout:
            time.sleep(base_delay * (2 ** attempt))
        except requests.exceptions.RequestException as e:
            if attempt == max_retries - 1:
                raise e
            time.sleep(base_delay * (2 ** attempt))
    
    raise Exception("Max retries exceeded")
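
The wait time used above follows base_delay * 2^attempt. A common refinement, which is not part of the function above, is to cap the delay and add random jitter so that many clients do not retry in lockstep. The helper backoff_delay below is a hypothetical sketch of that idea, with an arbitrary 30-second cap.

```python
import random


def backoff_delay(attempt: int, base_delay: float = 1.0, max_delay: float = 30.0,
                  jitter: float = 0.0) -> float:
    """Return the wait time in seconds before retry number `attempt` (0-based).

    The delay doubles each attempt, is capped at max_delay, and is then
    extended by up to `jitter` seconds of uniform random noise.
    """
    delay = min(base_delay * (2 ** attempt), max_delay)
    return delay + random.uniform(0, jitter)


# Without jitter the schedule is deterministic.
schedule = [backoff_delay(a) for a in range(6)]
print(schedule)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

Passing a small jitter value, for example backoff_delay(attempt, jitter=0.5), spreads retries out slightly while keeping the overall doubling pattern.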

DeepSeek R1 vs Other Reasoning Models

How does DeepSeek R1 stack up against other reasoning-capable models? Let's compare:

| Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Context Window | Best For |
| --- | --- | --- | --- | --- |
| DeepSeek R1 | $0.55 ($0.14 on cache hit) | $2.19 | 64K | Cost-effective reasoning |
| OpenAI o1 | $15.00 | $60.00 | 128K | Premium reasoning |
| Claude 3.5 Sonnet | $3.00 | $15.00 | 200K | Balanced performance |
| Qwen 2.5 72B | $0.50 | $1.50 | 128K | Chinese + English |
| Groq Llama 3.3 70B | $0.59 | $0.79 | 128K | Fast inference |

As you can see, DeepSeek R1 offers exceptional value — especially for developers who need strong reasoning capabilities without the premium price tag.

Simplifying DeepSeek Integration with Haotokai

While DeepSeek's direct API is powerful, using it through Haotokai offers several advantages for developers:

One API Key, Multiple Models

With Haotokai, you get access to DeepSeek R1 alongside Qwen, Claude, Llama, and other top models — all with a single API key and unified billing. No more managing accounts across multiple platforms.

PayPal Support

Haotokai supports PayPal payments, making it easy for international developers to get started without needing a Chinese payment method or dealing with currency conversion hassles.

OpenAI-Compatible API

Haotokai uses an OpenAI-compatible API format, which means you can easily switch from other providers or use existing OpenAI SDKs and tools with minimal code changes.
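
Because the format is OpenAI-compatible, the official openai Python package can usually be pointed at a different base URL. The sketch below assumes that package is installed (pip install openai) and that the base URLs given earlier in this guide are current; client_settings, make_client, and PROVIDERS are hypothetical names introduced for illustration.

```python
import os

# Base URLs as listed earlier in this guide; verify them before relying on this.
PROVIDERS = {
    "deepseek": ("https://api.deepseek.com/v1", "DEEPSEEK_API_KEY"),
    "haotokai": ("https://www.haotokai.com/v1", "HAOTOKAI_API_KEY"),
}


def client_settings(provider: str) -> dict:
    """Return the base_url and api_key keyword arguments for a provider."""
    base_url, key_var = PROVIDERS[provider]
    return {"base_url": base_url, "api_key": os.environ.get(key_var, "")}


def make_client(provider: str):
    """Create an OpenAI SDK client aimed at the chosen provider."""
    from openai import OpenAI  # imported lazily; requires the openai package
    return OpenAI(**client_settings(provider))
```

With a client from make_client("haotokai"), a call such as client.chat.completions.create(model="deepseek-reasoner", messages=[{"role": "user", "content": "What is 2^10?"}]) sends the same kind of request as the requests-based examples above.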

Competitive Pricing

Haotokai offers competitive rates on all models, often with volume discounts for heavy users. You'll get the same great DeepSeek pricing plus the convenience of a unified platform.

Getting Started with Haotokai

  1. Visit haotokai.com and create an account
  2. Top up your balance using PayPal (or other payment methods)
  3. Copy your API key from the dashboard
  4. Start building — it's that simple!

Ready to Build with DeepSeek R1?

Get started with Haotokai today — access DeepSeek R1 and 50+ other AI models with one API key, pay with PayPal, and build amazing reasoning-powered applications.


Conclusion

DeepSeek's R1 reasoning model represents a significant leap forward in accessible AI reasoning capabilities. With performance comparable to much more expensive alternatives and a developer-friendly API, it's an excellent choice for building applications that require logical thinking, mathematical problem-solving, or complex analysis.

Whether you choose to use DeepSeek's API directly or access it through a unified platform like Haotokai, the key is to start experimenting. Reasoning models open up entirely new categories of applications — from intelligent tutoring systems to automated code review to scientific research assistants.

The future of AI isn't just about bigger models — it's about models that can think. And with DeepSeek R1, that future is more accessible than ever.