The required JSON format is defined by the `MeetingRecord` Pydantic model in `src/models/meeting_record.py`. To change the format, modify that model.
`src/models/meeting_record.py` is the primary definition of the format. Edit it to add, remove, or change fields:
```python
from typing import List, Optional

from pydantic import BaseModel, Field, validator


class MeetingRecord(BaseModel):
    # Required fields use Field(...)
    # Optional fields use Field(default=None) or Optional[type]

    # Add a new required field:
    new_required_field: str = Field(..., description="Your new field")

    # Add a new optional field:
    new_optional_field: Optional[str] = Field(None, description="Optional field")

    # Make an existing field optional: change the default from ... to None
    # and keep the field's other arguments (such as description).
    decisions: Optional[List[str]] = Field(None)  # Now optional

    # Remove a field: just delete its line.

    # Add custom validation (the name must match an existing field):
    @validator("new_required_field")
    def validate_new_field(cls, v):
        # Your validation logic
        return v
```

Edit `src/services/ingestion.py` line 40 to match your new required fields:
```python
# Update this list to match your required fields
required_fields = ["id", "date", "participants", "transcript", "new_required_field"]
```

If you change field names that are used elsewhere, also update:

- `src/services/chunking.py`: uses `meeting_record.id`, `meeting_record.date`, and `meeting_record.participants` in metadata
- `src/services/citation_extractor.py`: extracts citations using `meeting_id`, `date`, and `participants`
- Any other services that access `MeetingRecord` fields
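To find the "other services" mentioned above, a small search script can help. The following is a minimal sketch, not part of the project: `find_field_references` is a hypothetical helper that scans `.py` files for attribute access (`.transcript`) or quoted uses (`"transcript"`) of a field name. It is a plain text match, so treat the results as candidates to review rather than a complete list.

```python
import re
from pathlib import Path


def find_field_references(root, field_name):
    """Return (path, line_number, line) for lines in .py files under `root`
    that mention `.field_name` or the quoted string 'field_name'."""
    name = re.escape(field_name)
    # Match attribute access (.name) or a quoted key ('name' / "name")
    pattern = re.compile(rf"\.{name}\b|['\"]{name}['\"]")
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

For example, `find_field_references("src", "transcript")` would list each file and line to check before renaming `transcript`.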
Test with a sample file:

```bash
python -c "
from src.models.meeting_record import MeetingRecord
import json

# Test your new format
with open('your-file.json') as f:
    data = json.load(f)

try:
    mr = MeetingRecord(**data)
    print('✓ Format valid')
except Exception as e:
    print(f'✗ Error: {e}')
"
```

To add an optional field:

```python
# In src/models/meeting_record.py
class MeetingRecord(BaseModel):
    # ... existing fields ...
    location: Optional[str] = Field(None, description="Meeting location")
```

No other changes are needed: optional fields don't need to be in the `required_fields` list.
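The actual check in `src/services/ingestion.py` is not shown here, so the following is only a sketch of how a required-fields check of this kind might behave. `find_missing_fields` and the sample record values are hypothetical. It illustrates why an optional field such as `location` can be left out of both the list and the data.

```python
def find_missing_fields(record, required_fields):
    """Return the required field names that are absent from `record` or set to None."""
    return [name for name in required_fields if record.get(name) is None]


required_fields = ["id", "date", "participants", "transcript"]

# Hypothetical sample record; "location" is optional, so omitting it is fine
record = {
    "id": "meeting-001",
    "date": "2024-01-15T10:00:00Z",
    "participants": ["Alice", "Bob"],
    "transcript": "Alice: Let's begin.",
}

print(find_missing_fields(record, required_fields))  # []
print(find_missing_fields({"id": "meeting-002"}, required_fields))
# ['date', 'participants', 'transcript']
```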
To add a required field:

```python
# In src/models/meeting_record.py
class MeetingRecord(BaseModel):
    # ... existing fields ...
    summary: str = Field(..., description="Meeting summary")
```

Also update `src/services/ingestion.py`:

```python
required_fields = ["id", "date", "participants", "transcript", "summary"]
```

To rename a field:

```python
# In src/models/meeting_record.py
class MeetingRecord(BaseModel):
    # ... existing fields ...
    content: str = Field(..., description="Meeting content")  # Renamed from transcript
```

Update all references to `transcript`:

- `src/services/chunking.py`: change `meeting_record.transcript` to `meeting_record.content`
- `src/services/ingestion.py`: update `required_fields` and validation
- Any other files using the `transcript` field
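Renaming a field in the model also means existing JSON files still carry the old key. One possible way to handle that is a one-off migration script. The sketch below assumes each `.json` file holds a single JSON object at the top level; `rename_key` and `migrate_directory` are hypothetical helpers, not part of the project. Back up the data before running anything like this.

```python
import json
from pathlib import Path


def rename_key(record, old, new):
    """Return a copy of `record` with key `old` renamed to `new`.

    Records without `old`, or that already have `new`, are returned unchanged.
    """
    if old not in record or new in record:
        return dict(record)
    renamed = dict(record)
    renamed[new] = renamed.pop(old)
    return renamed


def migrate_directory(data_dir, old="transcript", new="content"):
    """Rewrite top-level *.json files in `data_dir`; return how many files changed."""
    changed = 0
    for path in sorted(Path(data_dir).glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        migrated = rename_key(record, old, new)
        if migrated != record:
            text = json.dumps(migrated, indent=2, ensure_ascii=False)
            path.write_text(text, encoding="utf-8")
            changed += 1
    return changed
```

Calling `migrate_directory("your-data-dir")` would rewrite only the files that still use `transcript`.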
To add custom validation, for example to accept more than one date format:

```python
# In src/models/meeting_record.py
from datetime import datetime


class MeetingRecord(BaseModel):
    # ... existing fields ...

    @validator("date")
    def validate_date_format(cls, v):
        # Accept both ISO 8601 and a custom format
        try:
            datetime.fromisoformat(v.replace("Z", "+00:00"))
        except (ValueError, AttributeError):
            # Try the custom format: "YYYY-MM-DD HH:MM"
            try:
                datetime.strptime(v, "%Y-%m-%d %H:%M")
            except ValueError:
                raise ValueError(f"Invalid date format: {v}")
        return v
```

These fields are heavily used, and changing them requires updates elsewhere:

- `id`: used everywhere (indexing, citations, metadata)
- `date`: used in citations (`[meeting_id | date | speaker]`)
- `participants`: used in citations for speaker names
- `transcript`: the main content that gets chunked and embedded
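Since `date` is one of these heavily used fields, it can be worth checking the date-format logic in isolation before wiring it into the model. The sketch below mirrors the two formats accepted by the `validate_date_format` validator (ISO 8601, then `YYYY-MM-DD HH:MM`) using only the standard library. `parse_meeting_date` is a hypothetical helper, and unlike the validator it returns the parsed `datetime` instead of the original string.

```python
from datetime import datetime


def parse_meeting_date(value):
    """Parse ISO 8601 (with an optional trailing 'Z') or 'YYYY-MM-DD HH:MM'."""
    try:
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except (ValueError, AttributeError):
        pass  # fall through to the custom format
    try:
        return datetime.strptime(value, "%Y-%m-%d %H:%M")
    except (ValueError, TypeError):
        raise ValueError(f"Invalid date format: {value!r}") from None


# Hypothetical sample inputs: two accepted formats and one rejected format
for sample in ["2024-01-15T10:00:00Z", "2024-01-15 10:00", "15/01/2024"]:
    try:
        print(sample, "->", parse_meeting_date(sample).isoformat())
    except ValueError as exc:
        print(sample, "->", exc)
```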
If you want to support both old and new formats:
```python
class MeetingRecord(BaseModel):
    # Support both old and new field names
    transcript: Optional[str] = Field(None, description="Meeting transcript (legacy)")
    content: Optional[str] = Field(None, description="Meeting content (new)")

    # Validate only `content`. It is declared after `transcript`, so `values`
    # already holds the validated `transcript` when this runs. Attaching the
    # validator to `transcript` as well would reject records that only have `content`.
    @validator("content", always=True)
    def ensure_content(cls, v, values):
        # Use content if available, fall back to transcript
        if v:
            return v
        if values.get("transcript"):
            return values["transcript"]
        raise ValueError("Either 'content' or 'transcript' must be provided")
```

| Change Type | Files to Modify |
|---|---|
| Add optional field | src/models/meeting_record.py only |
| Add required field | src/models/meeting_record.py + src/services/ingestion.py |
| Rename field | src/models/meeting_record.py + all files using that field |
| Change field type | src/models/meeting_record.py + validation logic |
| Add validation | src/models/meeting_record.py (add @validator) |
After modifying the format:
- Validate a sample file:

  ```bash
  python -c "from src.models.meeting_record import MeetingRecord; import json; data = json.load(open('your-file.json')); mr = MeetingRecord(**data); print('✓ Valid')"
  ```

- Test indexing:

  ```bash
  archive-rag index your-data-dir/ indexes/test.faiss
  ```

- Test querying:

  ```bash
  archive-rag query indexes/test.faiss "Test query"
  ```
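Before indexing a whole directory, it may help to validate every file rather than a single sample. The sketch below is one way to do that, assuming the data directory contains top-level `*.json` files with one record each. `validate_directory` is a hypothetical helper that takes any callable, so it can be tried without importing the project.

```python
import json
from pathlib import Path


def validate_directory(data_dir, validate):
    """Run `validate(data)` on every *.json file in `data_dir`.

    Returns a dict mapping file name to error message for each file that failed.
    """
    failures = {}
    for path in sorted(Path(data_dir).glob("*.json")):
        try:
            with open(path, encoding="utf-8") as f:
                data = json.load(f)
            validate(data)
        except Exception as exc:  # collect every failure instead of stopping at the first
            failures[path.name] = str(exc)
    return failures
```

With the project importable, a call such as `validate_directory("your-data-dir", lambda data: MeetingRecord(**data))` would report which files no longer match the modified format.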
If you need help with specific changes, let me know what format you want to use!