Deploying scripts as an API in Azure

Deploying local Python scripts by exposing them as an API.

Deploying a Python Script as an API on Azure

Goal: Take a local PDF metadata extraction script and deploy it as a production-ready REST API on Azure.


📋 What This Blog Covers

flowchart LR
    A["📄 Local Python Script"] --> B["🌐 REST API (FastAPI/Flask)"]
    B --> C["🐳 Docker Container"]
    C --> D["☁️ Azure Deployment"]
    D --> E["🔒 Auth + Monitoring"]

  1. Sample local script to extract content from PDF
  2. Available options to deploy the script as an API
  3. Designing the deployment architecture
  4. Implementation and testing

Script resources: GitHub Repo


🔧 Background & Prerequisites

1. PDF Content Extraction: The Script

| Library | Strengths | Best For |
|---|---|---|
| PyMuPDF (fitz) | Fastest, handles complex layouts | General text extraction |
| pdfplumber | Excellent table extraction | Tabular data |
| PyPDF2 | Lightweight, basic extraction | Simple PDFs |
| Tesseract OCR | Open-source OCR for scanned PDFs | Image-based PDFs |
| Azure Doc Intelligence | Cloud-based, layout analysis | Enterprise extraction |
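
As a rough illustration of the first row, here is a minimal sketch of text and metadata extraction with PyMuPDF. The function names `build_result` and `extract_pdf` and the shape of the returned dictionary are assumptions made for this example, not the script from the repo. PyMuPDF is imported lazily so the payload helper can be exercised without it installed.

```python
import time
from typing import Any


def build_result(pages: list[str], metadata: dict[str, Any], started: float) -> dict[str, Any]:
    """Assemble a response payload from per-page text (hypothetical shape)."""
    return {
        "text": "\n".join(pages),
        # Drop metadata fields that PyMuPDF reports as empty or None.
        "metadata": {key: value for key, value in metadata.items() if value},
        "page_count": len(pages),
        "processing_time_ms": round((time.perf_counter() - started) * 1000, 2),
    }


def extract_pdf(data: bytes) -> dict[str, Any]:
    """Extract plain text and document metadata from PDF bytes."""
    import fitz  # PyMuPDF; requires `pip install pymupdf`

    started = time.perf_counter()
    with fitz.open(stream=data, filetype="pdf") as doc:
        pages = [page.get_text() for page in doc]
        metadata = dict(doc.metadata or {})
    return build_result(pages, metadata, started)
```

Calling `extract_pdf(open("sample.pdf", "rb").read())` on a local file (the filename is a placeholder) would return the text, non-empty metadata fields, page count, and elapsed time. Table extraction is left out here; pdfplumber from the table above is one option for that.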

Key extraction targets:


2. API Design

sequenceDiagram
    participant Client
    participant API as FastAPI Server
    participant Extractor as PDF Extractor
    Client->>API: POST /api/extract (multipart/form-data)
    API->>API: Validate file type & size
    API->>Extractor: Extract text + metadata
    Extractor-->>API: Structured result
    API-->>Client: 200 JSON response {text, metadata, pages, time_ms}

| Aspect | Design Decision |
|---|---|
| Endpoint | POST /api/extract: upload a PDF, get extracted data |
| Input | multipart/form-data file upload (or URL download) |
| Output | JSON: {text, metadata, page_count, tables, processing_time_ms} |
| Validation | File type check, size limits, malware considerations |
| Status Codes | 200 success, 400 bad request, 413 too large, 500 error |
| Docs | OpenAPI/Swagger auto-generated (built into FastAPI) |
| Auth | API key in header or Azure Entra ID token |
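
The validation and status-code rows can be sketched as a small pure function that the endpoint calls before extraction. The name `check_upload`, the 10 MiB default limit, and the error messages are assumptions for illustration; malware scanning is out of scope for this sketch.

```python
MAX_BYTES = 10 * 1024 * 1024  # assumed 10 MiB limit; tune for your workload
PDF_MAGIC = b"%PDF-"          # header marker found at the start of PDF files


def check_upload(filename: str, data: bytes, max_bytes: int = MAX_BYTES) -> tuple[int, str]:
    """Return (status_code, message) for an uploaded file."""
    if not filename.lower().endswith(".pdf"):
        return 400, "only .pdf files are accepted"
    if len(data) > max_bytes:
        return 413, f"file exceeds {max_bytes} bytes"
    # Some producers put a few bytes before the header, so search the first 1 KiB.
    if PDF_MAGIC not in data[:1024]:
        return 400, "file does not look like a PDF"
    return 200, "ok"
```

Keeping the check separate from the web framework makes it easy to unit test and to reuse if the API later accepts URL downloads as well as uploads.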

3. Framework Choice: Flask vs FastAPI

| Feature | Flask | FastAPI |
|---|---|---|
| Style | WSGI (sync) | ASGI (async) |
| Auto docs | Extension needed | Built-in Swagger UI |
| Validation | Manual | Pydantic models |
| Performance | Good | Excellent |
| Learning curve | Minimal | Minimal |
| Verdict | ✅ Use if integrating into existing Flask app | ✅ Recommended for new APIs |
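
To make the FastAPI column concrete, here is a minimal sketch of the app with a health route and the upload endpoint. `create_app` and `run_extraction` are hypothetical names, the extractor is a placeholder that returns an empty result, and the 10 MiB limit is an assumption. FastAPI is imported inside the factory, and file uploads additionally need the `python-multipart` package.

```python
import time


def run_extraction(data: bytes) -> dict:
    """Placeholder for the real extractor (hypothetical); returns the JSON shape only."""
    started = time.perf_counter()
    return {
        "text": "",
        "metadata": {},
        "page_count": 0,
        "tables": [],
        "processing_time_ms": round((time.perf_counter() - started) * 1000, 2),
    }


def create_app(max_bytes: int = 10 * 1024 * 1024):
    """Build the FastAPI app; requires `fastapi` and `python-multipart`."""
    from fastapi import FastAPI, File, HTTPException, UploadFile

    app = FastAPI(title="PDF Extraction API")

    @app.get("/health")
    def health() -> dict:
        return {"status": "ok"}

    @app.post("/api/extract")
    async def extract(file: UploadFile = File(...)) -> dict:
        data = await file.read()
        if len(data) > max_bytes:
            raise HTTPException(status_code=413, detail="file too large")
        if not data.startswith(b"%PDF-"):
            raise HTTPException(status_code=400, detail="not a PDF")
        return run_extraction(data)

    return app
```

Saved as `main.py` with a module-level `app = create_app()`, this can be served with `uvicorn main:app`, and FastAPI then exposes the generated Swagger UI at `/docs`.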

4. Containerization with Docker

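# Example multi-stage Dockerfile
# Stage 1: install Python dependencies
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: runtime image with only the installed packages and the app code
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
# Console scripts such as uvicorn are installed to /usr/local/bin, so copy them too
COPY --from=builder /usr/local/bin /usr/local/bin
COPY . .
EXPOSE 8000
# python:3.12-slim does not ship curl, so probe /health with the standard library
HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=3)" || exit 1
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]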

💡 Tips: Use multi-stage builds to reduce image size. Install system dependencies such as poppler-utils if the script relies on pdftotext. Pass secrets via environment variables and never hardcode them.
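
The last tip can be sketched as a small settings loader that reads configuration from the environment at startup. The variable names `API_KEY` and `MAX_UPLOAD_MB` and the `Settings` class are assumptions for this example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    max_upload_mb: int


def load_settings(env=os.environ) -> Settings:
    """Read settings from environment variables; fail fast if the secret is missing."""
    api_key = env.get("API_KEY", "")
    if not api_key:
        raise RuntimeError("API_KEY is not set")
    return Settings(
        api_key=api_key,
        max_upload_mb=int(env.get("MAX_UPLOAD_MB", "10")),  # assumed default
    )
```

Locally the values can be supplied with `docker run -e API_KEY=...`, so the same image runs unchanged in every environment and no secret is baked into a layer.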


5. Azure Deployment Options

graph TD
    Script["🐍 Python Script"] --> Container["🐳 Docker Image"]
    Container --> ACR["📦 Azure Container Registry"]
    ACR --> ACA["⭐ Container Apps (Recommended)"]
    ACR --> AppService["🌐 App Service"]
    ACR --> Functions["⚡ Azure Functions"]
    ACR --> ACI["📦 Container Instances"]
    ACR --> AKS["☸️ AKS"]

| Option | Scale to Zero | Complexity | Best For | Monthly Cost |
|---|---|---|---|---|
| Container Apps ⭐ | ✅ | Low | Variable-traffic APIs | Pay per use |
| App Service | ❌ | Low | Steady-traffic APIs | ~$13+ (B1) |
| Azure Functions | ✅ | Low | Infrequent calls | Free tier available |
| Container Instances | ❌ | Minimal | Testing/one-off jobs | Pay per second |
| AKS | ❌ | High | Multi-service architectures | $$$$ |

⭐ Recommendation: Azure Container Apps offers the best balance of simplicity, cost, and scale-to-zero capability.
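
As a hedged sketch of what the recommended option looks like in practice, the helper below assembles an `az containerapp create` command with a minimum replica count of zero, which is what enables scale to zero. The function name and every resource name in the usage note are placeholders, and the command assumes the Azure CLI with the Container Apps commands available and an existing Container Apps environment. Check the flags against the current Azure CLI reference before relying on them.

```python
import shlex


def containerapp_create_cmd(
    name: str,
    resource_group: str,
    environment: str,
    image: str,
    registry_server: str,
    port: int = 8000,
    max_replicas: int = 3,
) -> list[str]:
    """Build the argument list for `az containerapp create` (hypothetical helper)."""
    return [
        "az", "containerapp", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--environment", environment,
        "--image", image,
        "--registry-server", registry_server,
        "--target-port", str(port),   # matches EXPOSE 8000 in the Dockerfile
        "--ingress", "external",      # reachable from the public internet
        "--min-replicas", "0",        # allow scale to zero when idle
        "--max-replicas", str(max_replicas),
    ]
```

For example, `shlex.join(containerapp_create_cmd("pdf-api", "rg-demo", "env-demo", "myregistry.azurecr.io/pdf-api:latest", "myregistry.azurecr.io"))` yields a shell-ready command string, and the same list can be passed to `subprocess.run` in a deployment script.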


6. CI/CD Pipeline

flowchart LR
    Push["📤 Git Push"] --> Build["🏗️ GitHub Actions"]
    Build --> Image["🐳 Build Docker Image"]
    Image --> ACR["📦 Push to ACR"]
    ACR --> Deploy["🚀 Deploy to Container Apps"]
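
One optional addition to the end of such a pipeline is a smoke test that calls the deployed API's health route and fails the job if it does not respond. The sketch below uses only the standard library. The function name `smoke_test` is hypothetical, and it assumes the API exposes `GET /health` returning JSON with a `status` field of `"ok"`.

```python
import json
import urllib.error
import urllib.request


def smoke_test(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers 200 with JSON status 'ok'."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
            return resp.status == 200 and isinstance(body, dict) and body.get("status") == "ok"
    except (urllib.error.URLError, ValueError, OSError):
        # Connection failures, HTTP errors, timeouts, and invalid JSON all count as failure.
        return False
```

In a pipeline step, the script would call `smoke_test` with the app's public URL and exit with a non-zero code when it returns `False`.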

✅ TODO: Remaining Work

| # | Task | Priority |
|---|---|---|
| 1 | Write PDF extraction script (PyMuPDF: metadata + text + tables) | 🔴 High |
| 2 | Wrap in FastAPI with endpoints, validation, error handling | 🔴 High |
| 3 | Add OpenAPI/Swagger documentation | 🔴 High |
| 4 | Write Dockerfile and test locally | 🔴 High |
| 5 | Push image to Azure Container Registry | 🟡 Medium |
| 6 | Deploy to Azure Container Apps | 🟡 Medium |
| 7 | Set up GitHub Actions CI/CD pipeline | 🟡 Medium |
| 8 | Add authentication (API key or Azure Entra ID) | 🟡 Medium |
| 9 | Load test with sample PDFs and document performance | 🟢 Low |
| 10 | Create full architecture diagram with all components | 🟢 Low |

Available options to deploy the script as an API

Designing the architecture to deploy the script

Implementing and testing the API
