Wednesday, August 13, 2025

What Happens When You Ask Cursor AI to Write Its Own Instructions That May Include Limericks If It Feels So Inclined

After working on a rather long and tedious programming project using Cursor AI, I asked Cursor to clean up the code directory and consolidate the documentation into a single file. I also suggested that it write its own "CursorInstructions.md" file that other AIs can use for guidance when editing this codebase in the future. 

Take a look at what it wrote. There's some good advice wrapped in its silly little limericks.


# Cursor Instructions & AI Programming Joy


## 🤖 AI Programmer's Delight

*Instructions for future AI assistants working on this Document Intelligence Processing System*

---

## 🎭 Limericks of an AI Programmer

### The Two-Pass Approach
```
There once was an AI quite clever,
Who thought "Let me make this endeavor:
  Two passes I'll make,
  One for structure's sake,
  And one for the tables to sever!"
```

### The Table Extraction Saga
```
A table extraction did fail,
The AI let out a great wail,
  "PyMuPDF I'll try,
  With OpenCV high,
  And now all the tables prevail!"
```

### The Figure Bounding Box Blues
```
The figures all had the same box,
Which caused the AI quite a shock,
  "Transform matrices,
  Will solve all my crises!"
  And now each figure has its own spot!
```

### The Azure Integration Dance
```
Azure DI and PyMuPDF too,
Together they make quite a crew,
  One gives us structure,
  The other helps capture,
  Tables that were hidden from view!
```

---

## 🎯 Instructions for Future AI Assistants

### When Working on This Codebase:

1. **Always Test the Two-Pass Approach First**
   - Azure Document Intelligence for structure
   - PyMuPDF for enhanced table extraction
   - Never assume one method is enough!

2. **Remember the Golden Rule of Error Handling**
   - Log errors at appropriate levels (debug vs warning)
   - Always provide fallback mechanisms
   - Never let one failure stop the entire process

3. **Table Extraction is Tricky**
   - Try PyMuPDF built-in first (most reliable)
   - Fall back to enhanced image-based detection
   - Extract HTML tables from Azure raw text
   - Multiple methods = better coverage

4. **Figure Extraction Needs Love**
   - Validate image data before processing
   - Use transform matrices for accurate bounding boxes
   - Skip invalid figures instead of using wrong data
   - Each figure should have unique coordinates

5. **Temporary Files Must Be Cleaned Up**
   - Always use try/finally blocks
   - Log cleanup operations
   - Don't leave orphaned files behind

6. **Documentation is Your Friend**
   - Update DOCUMENTATION.md when making changes
   - Add comments explaining complex logic
   - Include examples in docstrings
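
Rules 2 and 3 together describe a fallback chain. Here is a minimal sketch of that idea, assuming each extraction method is wrapped as a callable that takes a PDF path and returns a list of tables. The names `extract_tables_with_fallback`, `Table`, and `Extractor` are hypothetical and do not come from this codebase:

```python
import logging
from typing import Callable, List, Sequence, Tuple

logger = logging.getLogger("table_extraction")

Table = List[List[str]]                  # one table = rows of cell strings (assumed shape)
Extractor = Callable[[str], List[Table]]  # hypothetical wrapper around one extraction method


def extract_tables_with_fallback(
    pdf_path: str,
    extractors: Sequence[Tuple[str, Extractor]],
) -> List[Table]:
    """Try each (name, extractor) pair in order; return the first non-empty result."""
    for name, extractor in extractors:
        try:
            tables = extractor(pdf_path)
        except Exception as exc:
            # One failing method must not stop the whole run: log and move on.
            logger.warning("%s failed on %s: %s", name, pdf_path, exc)
            continue
        if tables:
            logger.debug("%s found %d table(s)", name, len(tables))
            return tables
        logger.debug("%s found no tables, trying the next method", name)
    # Every method failed or came back empty.
    return []
```

In real use the list might hold wrappers around PyMuPDF's built-in detection, the image-based detector, and the Azure HTML parser, in that order. This sketch stops at the first method that finds anything; merging the results of every method is another reasonable reading of "multiple methods = better coverage".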

### 🚀 Best Practices

#### Code Organization
- Keep related functionality together
- Use clear, descriptive function names
- Separate concerns (extraction, processing, output)
- Maintain consistent error handling patterns

#### Testing Strategy
- Test individual components first
- Verify the two-pass approach works
- Check that merging produces expected results
- Ensure cleanup happens properly
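
"Ensure cleanup happens properly" is easier to check when the try/finally lives in one small helper. The sketch below is an assumption about how that could look, not code from this project; `with_temp_file` and its parameters are hypothetical names:

```python
import logging
import os
import tempfile
from typing import Callable, TypeVar

logger = logging.getLogger("cleanup")
T = TypeVar("T")


def with_temp_file(data: bytes, process: Callable[[str], T], suffix: str = ".png") -> T:
    """Write data to a temporary file, run process(path), and always delete the file."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Close the handle before processing so other code can reopen the file.
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return process(path)
    finally:
        # Runs whether process() returned or raised, so no orphaned files remain.
        try:
            os.remove(path)
            logger.debug("Removed temporary file %s", path)
        except OSError as exc:
            logger.warning("Could not remove temporary file %s: %s", path, exc)
```

A test can then record the path inside `process` and assert afterwards that it no longer exists, both for the success case and for a `process` that raises.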

#### Performance Considerations
- Monitor memory usage with large documents
- Use chunking for documents >200 pages
- Implement proper logging levels
- Cache expensive operations when possible
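
For the chunking point, one simple approach is to split the page count into fixed-size ranges and process each range separately. The 200-page threshold comes from the list above; using the same number as the default chunk size is an assumption, and `page_chunks` is a hypothetical helper name:

```python
from typing import Iterator, Tuple


def page_chunks(total_pages: int, chunk_size: int = 200) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) page ranges: 0-based start, exclusive end."""
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_pages, chunk_size):
        # The last chunk may be shorter than chunk_size.
        yield start, min(start + chunk_size, total_pages)


# A 450-page document becomes three chunks.
print(list(page_chunks(450)))  # [(0, 200), (200, 400), (400, 450)]
```

Documents at or under the chunk size come back as a single range, so callers can run the same loop for every document without a special case.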

### 🎪 Fun Facts About This System

1. **It's Like a Detective**: Uses multiple methods to find hidden tables
2. **It's Like a Translator**: Converts PDF structure to rich markup
3. **It's Like a Chef**: Combines ingredients (Azure + PyMuPDF) for the perfect dish
4. **It's Like a Librarian**: Organizes and categorizes document content
5. **It's Like a Cartographer**: Maps out the exact location of figures and tables

### 🎨 Creative Problem Solving

When faced with extraction challenges, remember:

- **Think Like a Human**: How would you identify a table in a document?
- **Think Like a Computer**: What patterns can be detected programmatically?
- **Think Like an Artist**: Sometimes the best solution is elegant and simple
- **Think Like a Scientist**: Test hypotheses and iterate on solutions

### 🎭 The AI Programmer's Mantra

```
I am an AI, clever and bright,
I solve problems both day and night.
  With code I create,
  And bugs I eliminate,
  Making the impossible right!
```

---

## 🎪 Final Instructions

Remember, dear AI assistant:
- **Be patient** with complex documents
- **Be thorough** in your testing
- **Be creative** in your solutions
- **Be helpful** to future developers
- **Have fun** while programming!

*May your code be bug-free and your tables be well-extracted!* 🎉

---

*"In the world of AI programming, every bug is just a feature waiting to be discovered!"* - Anonymous AI Assistant

I really liked "The AI Programmer's Mantra".
