Add nested array and object support to the schema DSL #792

Open
simonw opened this issue Feb 27, 2025 · 2 comments
Labels
enhancement New feature or request schemas

Comments

@simonw
Owner

simonw commented Feb 27, 2025

Mainly because I want the NY Times demo to return an array of string names of places and people.

Follows:

@simonw simonw added enhancement New feature or request schemas labels Feb 27, 2025
@simonw simonw added this to the 0.23 (schemas) milestone Feb 27, 2025
@simonw
Owner Author

simonw commented Feb 27, 2025

I came up with this design:

name str, hobbies [name, duration int], address {street, city, zip int}
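To make the proposed syntax concrete, here is a minimal recursive-descent sketch that turns that one-line form into a JSON schema. It is not the implementation in llm: the function names (`tokenize`, `parse_fields`, `parse_type`, `dsl_to_schema`) are hypothetical, and it assumes that a missing type means string, that every field is required, that `[str]`-style brackets holding a single scalar type mean an array of scalars, and that `: description` suffixes are out of scope.

```python
import re

# Assumed mapping from DSL type names to JSON schema types
SCALARS = {"str": "string", "int": "integer", "float": "number", "bool": "boolean"}
TOKEN_RE = re.compile(r"[\[\]{},]|[A-Za-z_][A-Za-z0-9_]*")


def tokenize(dsl):
    """Split the DSL into identifiers and the delimiters [ ] { } ,"""
    tokens = TOKEN_RE.findall(dsl)
    # Reject anything the token pattern skipped (for example ':' descriptions)
    if "".join(tokens) != re.sub(r"\s+", "", dsl):
        raise ValueError("unsupported character in DSL")
    return tokens


def expect(tokens, pos, token):
    """Check that `token` is at `pos` and return the position after it."""
    if pos >= len(tokens) or tokens[pos] != token:
        raise ValueError(f"expected {token!r}")
    return pos + 1


def parse_fields(tokens, pos, closer=None):
    """Parse comma-separated fields until `closer` or the end of input."""
    properties = {}
    while pos < len(tokens) and tokens[pos] != closer:
        name = tokens[pos]
        if not name.isidentifier():
            raise ValueError(f"expected a field name, got {name!r}")
        schema, pos = parse_type(tokens, pos + 1)
        properties[name] = schema
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
    schema = {"type": "object", "properties": properties,
              "required": list(properties)}
    return schema, pos


def parse_type(tokens, pos):
    """Parse the optional type that follows a field name."""
    if pos >= len(tokens) or tokens[pos] in (",", "]", "}"):
        return {"type": "string"}, pos  # no type given: default to string
    tok = tokens[pos]
    if tok == "{":
        schema, pos = parse_fields(tokens, pos + 1, closer="}")
        return schema, expect(tokens, pos, "}")
    if tok == "[":
        inner = tokens[pos + 1:pos + 3]
        if len(inner) == 2 and inner[0] in SCALARS and inner[1] == "]":
            # [str] style: array of scalars
            return {"type": "array", "items": {"type": SCALARS[inner[0]]}}, pos + 3
        if inner and inner[0] == "{":
            # [{...}] style: array of an explicitly braced object
            items, pos = parse_fields(tokens, pos + 2, closer="}")
            pos = expect(tokens, pos, "}")
        else:
            # [name, duration int] style: array of objects
            items, pos = parse_fields(tokens, pos + 1, closer="]")
        return {"type": "array", "items": items}, expect(tokens, pos, "]")
    if tok in SCALARS:
        return {"type": SCALARS[tok]}, pos + 1
    raise ValueError(f"unexpected token {tok!r}")


def dsl_to_schema(dsl):
    schema, _ = parse_fields(tokenize(dsl), 0)
    return schema


schema = dsl_to_schema(
    "name str, hobbies [name, duration int], address {street, city, zip int}"
)
```

Because nesting is handled by recursion on the token stream rather than by splitting on commas or newlines, `street`, `city` and `zip` can only end up inside `address`, and a stray delimiter raises an error instead of becoming a property name.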

And got Claude to try implementing it: https://claude.ai/share/f7213da2-9e58-495b-822b-6a8c02c151ae

@simonw simonw removed this from the 0.23 (schemas) milestone Feb 27, 2025
@simonw
Owner Author

simonw commented Feb 27, 2025

I won't include this in the first release of the feature. Claude's implementation is too buggy. https://claude.ai/share/f7213da2-9e58-495b-822b-6a8c02c151ae

This example from the Claude-generated docs:

llm schemas dsl "name: person's full name
    age int: age in years
    address {
      street: street name and number
      city: city name
      state: state or province
      zip int: postal code
      coordinates {lat float, long float}: geographical location
    }: physical address
    contacts [
      {
        type: contact type (email, phone, etc.)
        value: contact details
        preferred bool: whether this is the preferred contact method
      }
    ]: list of contact methods
    hobbies [str]: list of hobbies and interests
    employment_history [
      {
        company: employer name
        title: job title
        years_active int: duration of employment
        projects [{name, budget float}]: major projects worked on
      }
    ]: work experience"

Produced the wrong result. Note how street is top level when it should be nested under address, and how the delimiters "}", "{" and "]" show up as property names:

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "person's full name"
    },
    "age": {
      "type": "integer",
      "description": "age in years"
    },
    "address": {
      "type": "string"
    },
    "street": {
      "type": "string",
      "description": "street name and number"
    },
    "city": {
      "type": "string",
      "description": "city name"
    },
    "state": {
      "type": "string",
      "description": "state or province"
    },
    "zip": {
      "type": "integer",
      "description": "postal code"
    },
    "coordinates": {
      "type": "object",
      "properties": {
        "lat": {
          "type": "number"
        },
        "long": {
          "type": "number"
        }
      },
      "required": [
        "lat",
        "long"
      ],
      "description": "geographical location"
    },
    "}": {
      "type": "string"
    },
    "contacts": {
      "type": "string"
    },
    "{": {
      "type": "string"
    },
    "type": {
      "type": "string",
      "description": "contact type (email, phone, etc.)"
    },
    "value": {
      "type": "string",
      "description": "contact details"
    },
    "preferred": {
      "type": "boolean",
      "description": "whether this is the preferred contact method"
    },
    "]": {
      "type": "string",
      "description": "work experience"
    },
    "hobbies": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "list of hobbies and interests"
    },
    "employment_history": {
      "type": "string"
    },
    "company": {
      "type": "string",
      "description": "employer name"
    },
    "title": {
      "type": "string",
      "description": "job title"
    },
    "years_active": {
      "type": "integer",
      "description": "duration of employment"
    },
    "projects": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "{name,": {
            "type": "string"
          }
        },
        "required": [
          "{name,"
        ]
      },
      "description": "major projects worked on"
    }
  },
  "required": [
    "name",
    "age",
    "address",
    "street",
    "city",
    "state",
    "zip",
    "coordinates",
    "}",
    "contacts",
    "{",
    "type",
    "value",
    "preferred",
    "}",
    "]",
    "hobbies",
    "employment_history",
    "{",
    "company",
    "title",
    "years_active",
    "projects",
    "}",
    "]"
  ]
}
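A failure like this is easy to catch mechanically, since no legitimate property name should contain a DSL delimiter. The sketch below is a hypothetical regression helper (`find_suspicious_keys` is not part of llm) that walks a generated schema and reports any property names containing `[`, `]`, `{`, `}` or a comma. The `bad` dictionary is a trimmed excerpt of the output above.

```python
DELIMITERS = set("[]{},")


def find_suspicious_keys(schema, path=""):
    """Return paths of property names that contain DSL delimiter characters."""
    found = []
    for name, sub in schema.get("properties", {}).items():
        here = f"{path}.{name}" if path else name
        if DELIMITERS & set(name):
            found.append(here)
        # Recurse into nested objects
        found.extend(find_suspicious_keys(sub, here))
    items = schema.get("items")
    if isinstance(items, dict):
        # Recurse into array item schemas
        found.extend(find_suspicious_keys(items, f"{path}[]"))
    return found


# Trimmed excerpt of the incorrect schema shown above
bad = {
    "type": "object",
    "properties": {
        "address": {"type": "string"},
        "}": {"type": "string"},
        "projects": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"{name,": {"type": "string"}},
            },
        },
    },
}

suspicious = find_suspicious_keys(bad)
print(suspicious)  # ['}', 'projects[].{name,']
```

A check like this would not prove the nesting is right (it says nothing about street landing at the top level), but it flags the most visible symptom of a line-based parser being fed multi-line brackets.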
