Modifying the Meeting JSON Format

Overview

The required JSON format is defined in src/models/meeting_record.py. To change the format, you need to modify the MeetingRecord Pydantic model.

Main File to Modify

src/models/meeting_record.py - This is the primary definition of the format.

Steps to Change the Format

1. Modify the MeetingRecord Model

Edit src/models/meeting_record.py to add, remove, or change fields:

class MeetingRecord(BaseModel):
    # Required fields use Field(...)
    # Optional fields use Field(default=None) or Optional[type]
    
    # Add new required field:
    new_required_field: str = Field(..., description="Your new field")
    
    # Add new optional field:
    new_optional_field: Optional[str] = Field(None, description="Optional field")
    
    # Make existing field optional (change Field(...) to Field(None)):
    decisions: Optional[List[str]] = Field(None, ...)  # Now optional
    
    # Remove a field:
    # Just delete the line
    
    # Add custom validation:
    @validator("new_field")
    def validate_new_field(cls, v):
        # Your validation logic
        return v

2. Update Required Fields Check

Edit src/services/ingestion.py line 40 to match your new required fields:

# Update this list to match your required fields
required_fields = ["id", "date", "participants", "transcript", "your_new_field"]

3. Update Field Usage (if needed)

If you change field names that are used elsewhere, update:

src/services/chunking.py: Uses meeting_record.id, meeting_record.date, meeting_record.participants in metadata
src/services/citation_extractor.py: Extracts citations using meeting_id, date, participants
Any other services that access MeetingRecord fields

4. Test Your Changes

# Test with a sample file
python -c "
from src.models.meeting_record import MeetingRecord
import json

# Test your new format
with open('your-file.json') as f:
    data = json.load(f)
    
try:
    mr = MeetingRecord(**data)
    print('✓ Format valid!')
except Exception as e:
    print(f'✗ Error: {e}')
"

Example: Adding a New Field

Example 1: Add "location" field (optional)

# In src/models/meeting_record.py
class MeetingRecord(BaseModel):
    # ... existing fields ...
    location: Optional[str] = Field(None, description="Meeting location")

No other changes needed - optional fields don't need to be in required_fields list.

Example 2: Add "summary" field (required)

# In src/models/meeting_record.py
class MeetingRecord(BaseModel):
    # ... existing fields ...
    summary: str = Field(..., description="Meeting summary")

Also update src/services/ingestion.py:

required_fields = ["id", "date", "participants", "transcript", "summary"]

Example 3: Rename "transcript" to "content"

# In src/models/meeting_record.py
class MeetingRecord(BaseModel):
    # ... existing fields ...
    content: str = Field(..., description="Meeting content")  # Renamed from transcript

Update all references to transcript:

src/services/chunking.py: Change meeting_record.transcript to meeting_record.content
src/services/ingestion.py: Update required_fields and validation
Any other files using transcript field

Example 4: Change "date" to accept multiple formats

# In src/models/meeting_record.py
class MeetingRecord(BaseModel):
    # ... existing fields ...
    
    @validator("date")
    def validate_date_format(cls, v):
        # Accept both ISO 8601 and custom format
        try:
            datetime.fromisoformat(v.replace("Z", "+00:00"))
        except (ValueError, AttributeError):
            # Try custom format: "YYYY-MM-DD HH:MM"
            try:
                datetime.strptime(v, "%Y-%m-%d %H:%M")
            except ValueError:
                raise ValueError(f"Invalid date format: {v}")
        return v

Important Considerations

Fields Used by Other Components

These fields are heavily used and changing them requires updates:

id: Used everywhere (indexing, citations, metadata)
date: Used in citations [meeting_id | date | speaker]
participants: Used in citations for speaker names
transcript: The main content that gets chunked and embedded

Backward Compatibility

If you want to support both old and new formats:

class MeetingRecord(BaseModel):
    # Support both old and new field names
    transcript: Optional[str] = Field(None, description="Meeting transcript (legacy)")
    content: Optional[str] = Field(None, description="Meeting content (new)")
    
    @validator("content", "transcript", always=True)
    def ensure_content(cls, v, values):
        # Use content if available, fallback to transcript
        if v:
            return v
        if values.get("transcript"):
            return values["transcript"]
        raise ValueError("Either 'content' or 'transcript' must be provided")

Quick Reference

Change Type	Files to Modify
Add optional field	`src/models/meeting_record.py` only
Add required field	`src/models/meeting_record.py` + `src/services/ingestion.py`
Rename field	`src/models/meeting_record.py` + all files using that field
Change field type	`src/models/meeting_record.py` + validation logic
Add validation	`src/models/meeting_record.py` (add `@validator`)

Testing Your Changes

After modifying the format:

Validate a sample file:

python -c "from src.models.meeting_record import MeetingRecord; import json; data = json.load(open('your-file.json')); mr = MeetingRecord(**data); print('✓ Valid')"

Test indexing:

archive-rag index your-data-dir/ indexes/test.faiss

Test querying:

archive-rag query indexes/test.faiss "Test query"

If you need help with specific changes, let me know what format you want to use!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modifying the Meeting JSON Format

Overview

Main File to Modify

Steps to Change the Format

1. Modify the MeetingRecord Model

2. Update Required Fields Check

3. Update Field Usage (if needed)

4. Test Your Changes

Example: Adding a New Field

Example 1: Add "location" field (optional)

Example 2: Add "summary" field (required)

Example 3: Rename "transcript" to "content"

Example 4: Change "date" to accept multiple formats

Important Considerations

Fields Used by Other Components

Backward Compatibility

Quick Reference

Testing Your Changes

FilesExpand file tree

MODIFYING_DATA_FORMAT.md

Latest commit

History

MODIFYING_DATA_FORMAT.md

File metadata and controls

Modifying the Meeting JSON Format

Overview

Main File to Modify

Steps to Change the Format

1. Modify the MeetingRecord Model

2. Update Required Fields Check

3. Update Field Usage (if needed)

4. Test Your Changes

Example: Adding a New Field

Example 1: Add "location" field (optional)

Example 2: Add "summary" field (required)

Example 3: Rename "transcript" to "content"

Example 4: Change "date" to accept multiple formats

Important Considerations

Fields Used by Other Components

Backward Compatibility

Quick Reference

Testing Your Changes