scheLLMa¶

Schemas for LLMs and Structured Output

scheLLMa converts Pydantic models and JSON Schemas into clean, simplified type definitions suited to prompting LLMs for structured output.

Unlike verbose JSON Schema formats, scheLLMa produces readable, concise type definitions that are ideal for language model interactions and structured output generation:

Reduce token usage - Concise format saves on API costs
Minimize parsing errors - Simple syntax that is less verbose than JSON Schema and easier for models to follow
Stay readable - Human-friendly format for prompt engineering

Pydantic

class ProductModel(BaseModel):
    """Product with comprehensive field constraints."""

    # String constraints
    name: str = Field(min_length=3, max_length=100, description="Product name")
    sku: str = Field(pattern=r"^[A-Z]{3}-\d{4}$", description="Product SKU")
    email: str = Field(pattern=r"^[^@]+@[^@]+\.[^@]+$", description="Contact email")

    # Numeric constraints
    price: float = Field(ge=0.01, le=999999.99, description="Product price")
    quantity: int = Field(ge=1, description="Stock quantity")
    discount: float = Field(multiple_of=0.05, description="Discount percentage")

    # Array constraints
    categories: list[str] = Field(
        min_length=1, max_length=5, description="Product categories"
    )
    tags: set[str] = Field(description="Unique product tags")

scheLLMa vs JSON Schema

scheLLMa output, followed by the equivalent JSON Schema:

{
    // Product name, length: 3-100, required
    "name": string,
    // Product SKU, pattern: ^[A-Z]{3}-\d{4}$, required
    "sku": string,
    // Contact email, format: email, required
    "email": string,
    // Product price, range: 0.01-999999.99, required
    "price": number,
    // Stock quantity, minimum: 1, required
    "quantity": int,
    // Discount percentage, multipleOf: 0.05 (5% increments), required
    "discount": number,
    // Product categories, items: 1-5, required
    "categories": string[],
    // Unique product tags, uniqueItems: true, required
    "tags": string[],
}

{
    "description": "Product with comprehensive field constraints.",
    "properties": {
        "name": {
            "description": "Product name",
            "maxLength": 100,
            "minLength": 3,
            "title": "Name",
            "type": "string"
        },
        "sku": {
            "description": "Product SKU",
            "pattern": "^[A-Z]{3}-\\d{4}$",
            "title": "Sku",
            "type": "string"
        },
        "email": {
            "description": "Contact email",
            "pattern": "^[^@]+@[^@]+\\.[^@]+$",
            "title": "Email",
            "type": "string"
        },
        "price": {
            "description": "Product price",
            "maximum": 999999.99,
            "minimum": 0.01,
            "title": "Price",
            "type": "number"
        },
        "quantity": {
            "description": "Stock quantity",
            "minimum": 1,
            "title": "Quantity",
            "type": "integer"
        },
        "discount": {
            "description": "Discount percentage",
            "multipleOf": 0.05,
            "title": "Discount",
            "type": "number"
        },
        "categories": {
            "description": "Product categories",
            "items": {
                "type": "string"
            },
            "maxItems": 5,
            "minItems": 1,
            "title": "Categories",
            "type": "array"
        },
        "tags": {
            "description": "Unique product tags",
            "items": {
                "type": "string"
            },
            "title": "Tags",
            "type": "array",
            "uniqueItems": true
        }
    },
    "required": [
        "name",
        "sku",
        "email",
        "price",
        "quantity",
        "discount",
        "categories",
        "tags"
    ],
    "title": "ProductModel",
    "type": "object"
}
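
To make the mapping between the two formats concrete, here is a minimal sketch of how JSON Schema keywords could be folded into a one-line comment like the ones in the scheLLMa output above. It is an illustration only, not scheLLMa's actual implementation: the function name `describe_property` and the exact wording rules are assumptions chosen to reproduce a few of the comments shown.

```python
def describe_property(prop: dict, required: bool) -> str:
    """Build a one-line comment from a JSON Schema property (toy sketch)."""
    parts = []
    if "description" in prop:
        parts.append(prop["description"])

    # String length bounds -> "length: min-max"
    if "minLength" in prop and "maxLength" in prop:
        parts.append(f"length: {prop['minLength']}-{prop['maxLength']}")
    # Numeric bounds -> "range: min-max", or a lone "minimum: n"
    if "minimum" in prop and "maximum" in prop:
        parts.append(f"range: {prop['minimum']}-{prop['maximum']}")
    elif "minimum" in prop:
        parts.append(f"minimum: {prop['minimum']}")
    # Array size bounds -> "items: min-max"
    if "minItems" in prop and "maxItems" in prop:
        parts.append(f"items: {prop['minItems']}-{prop['maxItems']}")

    parts.append("required" if required else "optional")
    return "// " + ", ".join(parts)


# Property dicts copied from the JSON Schema above
name = {"description": "Product name", "minLength": 3, "maxLength": 100, "type": "string"}
price = {"description": "Product price", "minimum": 0.01, "maximum": 999999.99, "type": "number"}
quantity = {"description": "Stock quantity", "minimum": 1, "type": "integer"}

print(describe_property(name, required=True))
print(describe_property(price, required=True))
print(describe_property(quantity, required=True))
```

The sketch covers only a handful of keywords; patterns, formats, `multipleOf` and `uniqueItems` would need their own rules.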

Check out the demo for more examples!

Combine it with parsing libraries, such as the OpenAI SDK or Instructor, for great results!

Features¶

🤖 Optimized for LLM prompts - Clean, readable type definitions
💰 Token-efficient - Reduces LLM API costs
🎯 Support for all common Python types (str, int, bool, datetime, etc.)
🏗️ Handle complex nested structures and collections - Strong support for Pydantic model types
🔗 Support for enums, optional types, and unions - Properly extract and display union types
⚙️ Customizable output formatting - Indentation, compact mode, and more
🎨 Rich Default Values - Automatically shows default values in human-readable comments
📏 Smart Constraints - Displays field constraints (length, range, patterns) in clear language
✅ Clear Field Status - Explicit required/optional marking
📚 Rich Examples - Inline examples and documentation for better LLM understanding
🔀 Advanced Union Types - Full support for allOf, not constraints, and discriminated unions
🔢 Advanced Arrays - Contains constraints, minContains/maxContains, and enhanced tuple support

Quick Start¶

View the demo for more examples and features!

from pydantic import BaseModel
from schellma import schellma
import openai

class TaskRequest(BaseModel):
    title: str
    priority: int
    tags: list[str]
    due_date: str | None = None

# Generate schema for LLM prompt
schema = schellma(TaskRequest)

# Add the scheLLMa schema to the prompt
prompt = f"""
Please create a task with the following structure:

{schema}
"""
print(prompt)

# Use with your favorite LLM API
completion = openai.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{
        "role": "user",
        "content": prompt
    }]
)

content = completion.choices[0].message.content
print(content)

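# clean_content is not defined in this snippet; supply a helper that strips
# any Markdown code fence the model wraps around the JSON
task = TaskRequest.model_validate_json(clean_content(content))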
print(task)
# TaskRequest(title='Task 1', priority=1, tags=['tag1', 'tag2'], due_date=None)
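
The Quick Start snippet calls `clean_content` without defining it. Below is a minimal sketch of what such a helper might look like, assuming its job is to drop a Markdown code fence that a model sometimes wraps around its JSON reply. The implementation is an assumption for illustration, not a function confirmed to ship with scheLLMa.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence

# Optional language tag after the opening fence, then the payload, then the closing fence
_FENCED = re.compile(rf"^{FENCE}[\w-]*[ \t]*\n(.*?)\n?{FENCE}$", flags=re.DOTALL)


def clean_content(content: str) -> str:
    """Return the reply text without a surrounding Markdown code fence."""
    text = content.strip()
    match = _FENCED.match(text)
    return match.group(1).strip() if match else text


fenced_reply = f'{FENCE}json\n{{"title": "Task 1", "priority": 1}}\n{FENCE}'
print(json.loads(clean_content(fenced_reply)))
```

Replies that are already bare JSON pass through unchanged apart from whitespace trimming, so the helper is safe to apply unconditionally.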

Using the new openai chat.completions.parse API¶

# Or parse directly with the OpenAI SDK
completion = openai.chat.completions.parse(
    model="gpt-4.1-mini",
    messages=[{
        "role": "user",
        "content": prompt
    }],
    # Needed so that message.parsed is populated with a TaskRequest
    response_format=TaskRequest,
)
task = completion.choices[0].message.parsed
print(task)
# TaskRequest(title='Task 1', priority=1, tags=['tag1', 'tag2'], due_date=None)

Using the new openai Responses API¶

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

schema = schellma(CalendarEvent)

response = openai.responses.parse(
    model="gpt-4o-2024-08-06",
    input=[
        # Make sure to include the schema in your prompt
        {"role": "system", "content": f"Extract the event information. {schema}"},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    text_format=CalendarEvent,
)

event = response.output_parsed
print(event)
# CalendarEvent(name='Alice and Bob are going to a science fair on Friday.', date='Friday', participants=['Alice', 'Bob'])

Installation¶

pip install schellma

Or using uv:

uv add schellma

Or install from GitHub:

uv add git+https://github.com/andrader/schellma.git

Comparison with JSON Schema¶

JSON Schema (verbose, token-heavy):

{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer" },
    "email": { "type": ["string", "null"], "default": null }
  },
  "required": ["name", "age"],
  "additionalProperties": false
}

scheLLMa (clean, token-efficient):

{
    "name": string,
    "age": int,
    "email": string | null,
}
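
As a rough check of the size difference, the two snippets above can be compared by length. This is a sketch under a loud assumption: character count is only a crude proxy for tokens, and real savings depend on the tokenizer your model uses.

```python
import json

# The JSON Schema from the comparison above
json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": ["string", "null"], "default": None},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}

# The scheLLMa text from the comparison above
schellma_text = """{
    "name": string,
    "age": int,
    "email": string | null,
}"""

# Serialize without indentation so the JSON Schema is not penalized for whitespace
json_text = json.dumps(json_schema)

saving = 1 - len(schellma_text) / len(json_text)
print(len(json_text), len(schellma_text))
print(f"about {saving:.0%} fewer characters")
```

For an actual token count, run both strings through the tokenizer of the model you plan to call.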

Advanced Usage with Type Definitions¶

from pydantic import BaseModel
from typing import List, Optional
from schellma import schellma

class Address(BaseModel):
    street: str
    city: str
    country: str

class User(BaseModel):
    name: str
    age: int
    addresses: List[Address]
    primary_address: Optional[Address] = None

# Generate with separate type definitions
schema = schellma(User, define_types=True)
print(schema)

Output:

Address {
    "street": string,
    "city": string,
    "country": string,
}

{
    "name": string,
    "age": int,
    "addresses": Address[],
    "primary_address": Address | null,
}
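
To illustrate the idea behind named type definitions, here is a toy converter that walks a JSON Schema with `$defs` and prints each definition before the root object. It is not scheLLMa's implementation: the helper names (`type_of`, `render_object`, `render`) are hypothetical, the schema is hand-written to resemble what Pydantic v2 emits for `User`, and only the few keywords needed for this example are handled.

```python
# Hand-written schema resembling Pydantic v2 output for the User model (an assumption)
user_schema = {
    "$defs": {
        "Address": {
            "type": "object",
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
                "country": {"type": "string"},
            },
        }
    },
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "addresses": {"type": "array", "items": {"$ref": "#/$defs/Address"}},
        "primary_address": {
            "anyOf": [{"$ref": "#/$defs/Address"}, {"type": "null"}],
            "default": None,
        },
    },
}

PRIMITIVES = {"string": "string", "integer": "int", "number": "number",
              "boolean": "boolean", "null": "null"}


def type_of(node: dict) -> str:
    """Map one schema node to a short type expression."""
    if "$ref" in node:
        return node["$ref"].rsplit("/", 1)[-1]  # "#/$defs/Address" -> "Address"
    if "anyOf" in node:
        return " | ".join(type_of(option) for option in node["anyOf"])
    if node.get("type") == "array":
        inner = type_of(node["items"])
        return f"({inner})[]" if " | " in inner else f"{inner}[]"
    return PRIMITIVES[node["type"]]


def render_object(node: dict, name: str = "") -> str:
    """Render an object schema, optionally prefixed with a type name."""
    lines = [f'    "{key}": {type_of(prop)},' for key, prop in node["properties"].items()]
    header = f"{name} {{" if name else "{"
    return "\n".join([header, *lines, "}"])


def render(schema: dict) -> str:
    """Emit every named definition first, then the root object."""
    blocks = [render_object(d, name) for name, d in schema.get("$defs", {}).items()]
    blocks.append(render_object(schema))
    return "\n\n".join(blocks)


print(render(user_schema))
```

Defining `Address` once and referring to it by name keeps the root object short when the same model appears in several fields.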

Examples¶

Enum Support¶

from enum import Enum
from pydantic import BaseModel
from schellma import schellma

class Status(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"

class Task(BaseModel):
    title: str
    status: Status

schema = schellma(Task)
# Output: { "title": string, "status": "active" | "inactive" }
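
The union of literals in the output above can be derived from the enum's values. The following standard-library sketch shows one way to build that string; the helper `enum_union` is hypothetical and is not claimed to match how scheLLMa does it internally.

```python
import json
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


def enum_union(enum_cls: type[Enum]) -> str:
    """Render an Enum's values as a union of JSON literals."""
    # json.dumps quotes strings and leaves numbers bare, matching JSON literal syntax
    return " | ".join(json.dumps(member.value) for member in enum_cls)


print(enum_union(Status))  # "active" | "inactive"
```

Spelling out the allowed values tells the model exactly which strings are valid, instead of leaving it to guess from the field name.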

Complex Nested Structures¶

from pydantic import BaseModel
from typing import Dict, List
from schellma import schellma

class Tag(BaseModel):
    name: str
    color: str

class Post(BaseModel):
    title: str
    content: str
    tags: List[Tag]
    metadata: Dict[str, str]

schema = schellma(Post, define_types=True)

Development¶

Setup¶

git clone https://github.com/andrader/schellma.git
cd schellma
uv sync --dev

Running Tests¶

uv run python -m pytest

Type Checking¶

uv run mypy src/schellma/

Linting¶

uv run ruff check src/schellma/
uv run ruff format src/schellma/

Contributing¶

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Guidelines¶

Follow the existing code style (enforced by ruff)
Add tests for any new functionality
Update documentation as needed
Ensure all tests pass and type checking succeeds

License¶

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog¶

See the Changelog for the full list of changes.