scheLLMa¶
Schemas for LLMs and Structured Output
Converts Pydantic models/JSON Schemas to clean, simplified type definitions perfect for generating structured output with LLM prompts.
Unlike verbose JSON Schema formats, scheLLMa produces readable, concise type definitions that are ideal for language model interactions and structured output generation:
- Reduce token usage - Concise format saves on API costs
- Minimize parsing errors - Simple syntax is less verbose than JSON Schema, so models have less to misread
- Stay readable - Human-friendly format for prompt engineering
from pydantic import BaseModel, Field


class ProductModel(BaseModel):
    """Product with comprehensive field constraints."""

    # String constraints
    name: str = Field(min_length=3, max_length=100, description="Product name")
    sku: str = Field(pattern=r"^[A-Z]{3}-\d{4}$", description="Product SKU")
    email: str = Field(pattern=r"^[^@]+@[^@]+\.[^@]+$", description="Contact email")

    # Numeric constraints
    price: float = Field(ge=0.01, le=999999.99, description="Product price")
    quantity: int = Field(ge=1, description="Stock quantity")
    discount: float = Field(multiple_of=0.05, description="Discount percentage")

    # Array constraints
    categories: list[str] = Field(
        min_length=1, max_length=5, description="Product categories"
    )
    tags: set[str] = Field(description="Unique product tags")
scheLLMa output, followed by the equivalent JSON Schema:
{
  // Product name, length: 3-100, required
  "name": string,
  // Product SKU, pattern: ^[A-Z]{3}-\d{4}$, required
  "sku": string,
  // Contact email, format: email, required
  "email": string,
  // Product price, range: 0.01-999999.99, required
  "price": number,
  // Stock quantity, minimum: 1, required
  "quantity": int,
  // Discount percentage, multipleOf: 0.05 (5% increments), required
  "discount": number,
  // Product categories, items: 1-5, required
  "categories": string[],
  // Unique product tags, uniqueItems: true, required
  "tags": string[],
}
{
  "description": "Product with comprehensive field constraints.",
  "properties": {
    "name": {
      "description": "Product name",
      "maxLength": 100,
      "minLength": 3,
      "title": "Name",
      "type": "string"
    },
    "sku": {
      "description": "Product SKU",
      "pattern": "^[A-Z]{3}-\\d{4}$",
      "title": "Sku",
      "type": "string"
    },
    "email": {
      "description": "Contact email",
      "pattern": "^[^@]+@[^@]+\\.[^@]+$",
      "title": "Email",
      "type": "string"
    },
    "price": {
      "description": "Product price",
      "maximum": 999999.99,
      "minimum": 0.01,
      "title": "Price",
      "type": "number"
    },
    "quantity": {
      "description": "Stock quantity",
      "minimum": 1,
      "title": "Quantity",
      "type": "integer"
    },
    "discount": {
      "description": "Discount percentage",
      "multipleOf": 0.05,
      "title": "Discount",
      "type": "number"
    },
    "categories": {
      "description": "Product categories",
      "items": {
        "type": "string"
      },
      "maxItems": 5,
      "minItems": 1,
      "title": "Categories",
      "type": "array"
    },
    "tags": {
      "description": "Unique product tags",
      "items": {
        "type": "string"
      },
      "title": "Tags",
      "type": "array",
      "uniqueItems": true
    }
  },
  "required": [
    "name",
    "sku",
    "email",
    "price",
    "quantity",
    "discount",
    "categories",
    "tags"
  ],
  "title": "ProductModel",
  "type": "object"
}
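To make the size difference concrete, here is a rough sketch that compares the two renderings of the `name` field shown above. It uses only the standard library and counts characters, which is just a proxy: real token counts depend on the model's tokenizer, and the variable names are illustrative.

```python
import json

# JSON Schema for a single field, trimmed from the "name" entry above.
json_schema = {
    "properties": {
        "name": {
            "description": "Product name",
            "maxLength": 100,
            "minLength": 3,
            "title": "Name",
            "type": "string",
        }
    },
    "required": ["name"],
    "title": "ProductModel",
    "type": "object",
}

# The compact rendering of the same field, copied from the scheLLMa output above.
compact = "\n".join(
    [
        "{",
        "  // Product name, length: 3-100, required",
        '  "name": string,',
        "}",
    ]
)

json_chars = len(json.dumps(json_schema, indent=2))
compact_chars = len(compact)

# Character count is only a rough stand-in for tokens.
print(f"JSON Schema: {json_chars} chars, compact: {compact_chars} chars")
print(f"compact is {compact_chars / json_chars:.0%} of the JSON Schema size")
```

For an actual cost estimate, run both strings through the tokenizer of the model you plan to call.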
Check out the demo for more examples!
Combine it with parsing libraries, like the OpenAI SDK or Instructor, for AWESOME results!
Features¶
- 🤖 Optimized for LLM prompts - Clean, readable type definitions
- 💰 Token-efficient - Reduces LLM API costs
- 🎯 Support for all common Python types (str, int, bool, datetime, etc.)
- 🏗️ Handle complex nested structures and collections - Strong support for Pydantic model types
- 🔗 Support for enums, optional types, and unions - Properly extract and display union types
- ⚙️ Customizable output formatting - Indentation, compact mode, and more
- 🎨 Rich Default Values - Automatically shows default values in human-readable comments
- 📏 Smart Constraints - Displays field constraints (length, range, patterns) in clear language
- ✅ Clear Field Status - Explicit required/optional marking
- 📚 Rich Examples - Inline examples and documentation for better LLM understanding
- 🔀 Advanced Union Types - Full support for allOf, not constraints, and discriminated unions
- 🔢 Advanced Arrays - Contains constraints, minContains/maxContains, and enhanced tuple support
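The idea behind the constraint comments and required/optional marking listed above can be sketched with a toy renderer. This is not scheLLMa's implementation: `render_type`, `render_comment`, and `render_object` are hypothetical names, and only a small subset of JSON Schema is handled.

```python
def render_type(prop: dict) -> str:
    """Map a JSON Schema property to a compact type name (toy subset)."""
    names = {"string": "string", "integer": "int", "number": "number", "boolean": "boolean"}
    kind = prop.get("type")
    if kind == "array":
        return render_type(prop.get("items", {})) + "[]"
    return names.get(kind, "any")


def render_comment(name: str, prop: dict, required: set[str]) -> str:
    """Summarize description, constraints, and required status in one comment."""
    parts = []
    if "description" in prop:
        parts.append(prop["description"])
    if "minLength" in prop and "maxLength" in prop:
        parts.append(f"length: {prop['minLength']}-{prop['maxLength']}")
    if "minimum" in prop and "maximum" in prop:
        parts.append(f"range: {prop['minimum']}-{prop['maximum']}")
    elif "minimum" in prop:
        parts.append(f"minimum: {prop['minimum']}")
    parts.append("required" if name in required else "optional")
    return "// " + ", ".join(parts)


def render_object(schema: dict, indent: str = "  ") -> str:
    """Render an object schema as a commented, compact type definition."""
    required = set(schema.get("required", []))
    lines = ["{"]
    for name, prop in schema.get("properties", {}).items():
        lines.append(indent + render_comment(name, prop, required))
        lines.append(f'{indent}"{name}": {render_type(prop)},')
    lines.append("}")
    return "\n".join(lines)


example_schema = {
    "properties": {
        "name": {"description": "Product name", "minLength": 3, "maxLength": 100, "type": "string"},
        "quantity": {"description": "Stock quantity", "minimum": 1, "type": "integer"},
        "tags": {"description": "Product tags", "items": {"type": "string"}, "type": "array"},
    },
    "required": ["name", "quantity"],
    "type": "object",
}

print(render_object(example_schema))
```

The library itself covers far more (defaults, patterns, unions, nested models); the sketch only shows why moving constraints into short comments keeps the definition readable.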
Quick Start¶
View the demo for more examples and features!
from pydantic import BaseModel
from schellma import schellma
import openai


class TaskRequest(BaseModel):
    title: str
    priority: int
    tags: list[str]
    due_date: str | None = None


# Generate schema for LLM prompt
schema = schellma(TaskRequest)

# Add the scheLLMa schema to the prompt
prompt = f"""
Please create a task with the following structure:
{schema}
"""
print(prompt)

# Use with your favorite LLM API
completion = openai.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{
        "role": "user",
        "content": prompt
    }]
)
content = completion.choices[0].message.content
print(content)

# clean_content is not part of scheLLMa or the OpenAI SDK; it stands for your own
# helper that strips any wrapping (for example Markdown code fences) from the reply.
task = TaskRequest.model_validate_json(clean_content(content))
print(task)
# TaskRequest(title='Task 1', priority=1, tags=['tag1', 'tag2'], due_date=None)
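The snippet above calls `clean_content`, which is not defined there. A minimal sketch of such a helper follows, assuming the model may wrap its JSON in a Markdown code fence or add a sentence around it. The name and behavior are assumptions for illustration, not part of scheLLMa.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built programmatically


def clean_content(content: str) -> str:
    """Return the JSON payload of an LLM reply without fences or surrounding prose."""
    text = content.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        # Drop the opening fence line (it may carry a language tag) and the closing fence.
        first_newline = text.find("\n")
        text = text[first_newline + 1 : -len(FENCE)] if first_newline != -1 else ""
    # Keep only the outermost {...} span in case the model added extra prose.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        text = text[start : end + 1]
    return text.strip()


reply = FENCE + 'json\n{"title": "Task 1", "priority": 1, "tags": ["tag1"]}\n' + FENCE
print(json.loads(clean_content(reply)))
```

The result can then be passed to `TaskRequest.model_validate_json`, which still raises a validation error if the cleaned text is not valid JSON for the model.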
Using the new openai chat.completions.parse API¶
# or directly parse with openai sdk
completion = openai.chat.completions.parse(
    model="gpt-4.1-mini",
    messages=[{
        "role": "user",
        "content": prompt
    }],
    # Tell the SDK which model to parse into; without it, message.parsed is None
    response_format=TaskRequest,
)
task = completion.choices[0].message.parsed
print(task)
# TaskRequest(title='Task 1', priority=1, tags=['tag1', 'tag2'], due_date=None)
Using the new openai Responses API¶
class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


schema = schellma(CalendarEvent)

response = openai.responses.parse(
    model="gpt-4o-2024-08-06",
    input=[
        # Make sure to include the schema in your prompt
        {"role": "system", "content": f"Extract the event information. {schema}"},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    text_format=CalendarEvent,
)
event = response.output_parsed
print(event)
# CalendarEvent(name='Alice and Bob are going to a science fair on Friday.', date='Friday', participants=['Alice', 'Bob'])
Installation¶
Or using uv:
Install from GitHub:
Comparison with JSON Schema¶
JSON Schema (verbose, token-heavy):
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer" },
    "email": { "type": ["string", "null"], "default": null }
  },
  "required": ["name", "age"],
  "additionalProperties": false
}
scheLLMa (clean, token-efficient):
Advanced Usage with Type Definitions¶
from pydantic import BaseModel
from typing import List, Optional
from schellma import schellma


class Address(BaseModel):
    street: str
    city: str
    country: str


class User(BaseModel):
    name: str
    age: int
    addresses: List[Address]
    primary_address: Optional[Address] = None


# Generate with separate type definitions
schema = schellma(User, define_types=True)
print(schema)
Output:
Address {
  "street": string,
  "city": string,
  "country": string,
}

{
  "name": string,
  "age": int,
  "addresses": Address[],
  "primary_address": Address | null,
}
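The effect of separate type definitions can be illustrated with a toy sketch that walks a JSON Schema containing `$defs` and `$ref` entries, which is how Pydantic represents nested models. The function names here are hypothetical and this is not how scheLLMa is implemented; it only shows named types being emitted once and then referenced by name.

```python
def type_of(prop: dict) -> str:
    """Resolve a property to a compact type, following $ref and anyOf (toy subset)."""
    if "$ref" in prop:
        return prop["$ref"].rsplit("/", 1)[-1]  # "#/$defs/Address" -> "Address"
    if "anyOf" in prop:
        return " | ".join(type_of(option) for option in prop["anyOf"])
    kind = prop.get("type")
    if kind == "array":
        return type_of(prop.get("items", {})) + "[]"
    return {"string": "string", "integer": "int", "null": "null"}.get(kind, "any")


def render_fields(schema: dict) -> str:
    """Render the properties of one object schema as a braced block."""
    lines = ["{"]
    for name, prop in schema.get("properties", {}).items():
        lines.append(f'  "{name}": {type_of(prop)},')
    lines.append("}")
    return "\n".join(lines)


def render_with_definitions(schema: dict) -> str:
    """Emit each named definition first, then the root object that refers to them."""
    blocks = [f"{name} {render_fields(d)}" for name, d in schema.get("$defs", {}).items()]
    blocks.append(render_fields(schema))
    return "\n\n".join(blocks)


# Hand-written schema in the shape Pydantic produces for the User model above.
user_schema = {
    "$defs": {
        "Address": {
            "properties": {
                "street": {"type": "string"},
                "city": {"type": "string"},
                "country": {"type": "string"},
            },
            "type": "object",
        }
    },
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "addresses": {"items": {"$ref": "#/$defs/Address"}, "type": "array"},
        "primary_address": {"anyOf": [{"$ref": "#/$defs/Address"}, {"type": "null"}]},
    },
    "type": "object",
}

print(render_with_definitions(user_schema))
```

Referencing `Address` by name keeps the root object short when the same nested model appears in several fields.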
Examples¶
Enum Support¶
from enum import Enum
from pydantic import BaseModel
from schellma import schellma


class Status(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


class Task(BaseModel):
    title: str
    status: Status


schema = schellma(Task)
# Output: { "title": string, "status": "active" | "inactive" }
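The enum-to-union mapping shown in the output comment can be reproduced with the standard library alone. This is a small sketch with a hypothetical helper name, `enum_to_union`, not the library's own code.

```python
import json
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


def enum_to_union(enum_cls: type[Enum]) -> str:
    """Render an enum's values as a union of JSON literals."""
    # json.dumps quotes string values and leaves numbers bare.
    return " | ".join(json.dumps(member.value) for member in enum_cls)


print(enum_to_union(Status))  # "active" | "inactive"
```

Listing the literal values tells the model exactly which strings are valid, which a bare `string` type would not.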
Complex Nested Structures¶
from pydantic import BaseModel
from typing import Dict, List
from schellma import schellma


class Tag(BaseModel):
    name: str
    color: str


class Post(BaseModel):
    title: str
    content: str
    tags: List[Tag]
    metadata: Dict[str, str]


schema = schellma(Post, define_types=True)
Development¶
Setup¶
Running Tests¶
Type Checking¶
Linting¶
Contributing¶
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Development Guidelines¶
- Follow the existing code style (enforced by ruff)
- Add tests for any new functionality
- Update documentation as needed
- Ensure all tests pass and type checking succeeds
License¶
This project is licensed under the MIT License - see the LICENSE file for details.
Changelog¶
See the Changelog for the release history.