Initial implementation of Website Change Detection Monitor MVP
Features implemented:
- Backend API with Express + TypeScript
- User authentication (register/login with JWT)
- Monitor CRUD operations with plan-based limits
- Automated change detection engine
- Email alert system
- Frontend with Next.js + TypeScript
- Dashboard with monitor management
- Login/register pages
- Monitor history viewer
- PostgreSQL database schema
- Docker setup for local development

Technical stack:
- Backend: Express, TypeScript, PostgreSQL, Redis (ready)
- Frontend: Next.js 14, React Query, Tailwind CSS
- Database: PostgreSQL with migrations
- Services: Page fetching, diff detection, email alerts

Documentation:
- README with full setup instructions
- SETUP guide for quick start
- PROJECT_STATUS with current capabilities
- Complete technical specifications

Ready for local testing and feature expansion.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
claude.md

# Website Change Detection Monitor - Claude Context

## Project Overview

This is a **Website Change Detection Monitor SaaS** application. The core value proposition is helping users track changes on web pages they care about, with intelligent noise filtering to ensure only meaningful changes trigger alerts.

**Tagline**: "I watch pages so you don't have to"

---

## Key Differentiators

1. **Smart Noise Filtering**: Unlike competitors, we automatically filter out cookie banners, timestamps, rotating ads, and other irrelevant changes
2. **Keyword-Based Alerts**: Users can be notified when specific words/phrases appear or disappear (e.g., "sold out", "hiring", "$99")
3. **Simple but Powerful**: Easy enough for non-technical users, powerful enough for professionals
4. **SEO-Optimized Market**: Tons of long-tail keywords (e.g., "monitor job postings", "track competitor prices")

---

## Architecture Overview

### Tech Stack (Recommended)

**Frontend**:
- Next.js 14+ (App Router)
- TypeScript
- Tailwind CSS + shadcn/ui components
- React Query for state management
- Zod for validation

**Backend**:
- Node.js + Express OR Python + FastAPI
- PostgreSQL for relational data
- Redis + Bull/BullMQ for job queuing
- Puppeteer/Playwright for JS-heavy sites

**Infrastructure**:
- Vercel/Railway for frontend hosting
- Render/Railway/AWS for backend
- AWS S3 or Cloudflare R2 for snapshot storage
- Upstash Redis or managed Redis

**Third-Party Services**:
- Stripe for billing
- SendGrid/Postmark for emails
- Sentry for error tracking
- PostHog/Mixpanel for analytics

---

## Project Structure

```
/website-monitor
├── /frontend (Next.js)
│   ├── /app
│   │   ├── /dashboard
│   │   ├── /monitors
│   │   ├── /settings
│   │   └── /auth
│   ├── /components
│   │   ├── /ui (shadcn components)
│   │   ├── /monitors
│   │   └── /diff-viewer
│   ├── /lib
│   │   ├── api-client.ts
│   │   ├── auth.ts
│   │   └── utils.ts
│   └── /public
├── /backend
│   ├── /src
│   │   ├── /routes
│   │   ├── /controllers
│   │   ├── /models
│   │   ├── /services
│   │   │   ├── fetcher.ts
│   │   │   ├── differ.ts
│   │   │   ├── scheduler.ts
│   │   │   └── alerter.ts
│   │   ├── /jobs
│   │   └── /utils
│   ├── /db
│   │   └── /migrations
│   └── /tests
├── /docs
│   ├── spec.md
│   ├── task.md
│   ├── actions.md
│   └── claude.md (this file)
└── README.md
```

---

## Core Entities & Data Models

### User
```typescript
interface User {
  id: string
  email: string
  passwordHash: string
  plan: 'free' | 'pro' | 'business' | 'enterprise'
  stripeCustomerId: string
  createdAt: Date
  lastLoginAt: Date
}
```

### Monitor
```typescript
interface Monitor {
  id: string
  userId: string
  url: string
  name: string
  frequency: number // minutes
  status: 'active' | 'paused' | 'error'

  // Advanced features
  elementSelector?: string
  ignoreRules?: {
    type: 'css' | 'regex' | 'text'
    value: string
  }[]
  keywordRules?: {
    keyword: string
    type: 'appears' | 'disappears' | 'count'
    threshold?: number
  }[]

  // Metadata
  lastCheckedAt?: Date
  lastChangedAt?: Date
  consecutiveErrors: number
  createdAt: Date
}
```

### Snapshot
```typescript
interface Snapshot {
  id: string
  monitorId: string
  htmlContent: string
  contentHash: string
  screenshotUrl?: string

  // Status
  httpStatus: number
  responseTime: number
  changed: boolean
  changePercentage?: number

  // Errors
  errorMessage?: string

  // Metadata
  createdAt: Date
}
```

### Alert
```typescript
interface Alert {
  id: string
  monitorId: string
  snapshotId: string
  userId: string

  // Alert details
  type: 'change' | 'error' | 'keyword'
  title: string
  summary?: string

  // Delivery
  channels: ('email' | 'slack' | 'webhook')[]
  deliveredAt?: Date
  readAt?: Date

  createdAt: Date
}
```

---

## Key Algorithms & Logic

### Change Detection
```typescript
import { diffLines } from 'diff'

// previousHash/currentHash and previousText/currentText are assumed to be in scope.

// Simple hash comparison for binary change detection
const changed = previousHash !== currentHash

// Text diff for detailed comparison
const diff = diffLines(previousText, currentText)
const countLines = (parts: typeof diff) =>
  parts.reduce((sum, part) => sum + (part.count ?? 0), 0)
const changedLines = countLines(diff.filter(part => part.added || part.removed))
const totalLines = countLines(diff)
const changePercentage = totalLines === 0 ? 0 : (changedLines / totalLines) * 100

// Severity calculation
const severity =
  changePercentage > 50 ? 'major' :
  changePercentage > 10 ? 'medium' : 'minor'
```
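
To make the flow above concrete without any dependencies, here is a minimal sketch. `fnv1a`, `lcsLength`, and `summarizeChange` are hypothetical names. The FNV-1a fingerprint only stands in for whatever hash `contentHash` really uses, and treating every line outside the longest common subsequence as "changed" is an assumption about how `changedLines` should be counted.

```typescript
type Severity = 'minor' | 'medium' | 'major'

// FNV-1a 32-bit fingerprint (illustrative stand-in for a real content hash).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0
  }
  return hash.toString(16)
}

// Length of the longest common subsequence of two line arrays (rolling 1-D DP).
function lcsLength(a: string[], b: string[]): number {
  const row: number[] = new Array(b.length + 1).fill(0)
  for (let i = 1; i <= a.length; i++) {
    let diagonal = 0 // value of the cell up and to the left
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]
      row[j] = a[i - 1] === b[j - 1] ? diagonal + 1 : Math.max(row[j], row[j - 1])
      diagonal = above
    }
  }
  return row[b.length]
}

function summarizeChange(previousText: string, currentText: string) {
  // Cheap binary check first
  const changed = fnv1a(previousText) !== fnv1a(currentText)

  // Lines not shared by both versions count as removed or added
  const prevLines = previousText.split('\n')
  const currLines = currentText.split('\n')
  const common = lcsLength(prevLines, currLines)
  const changedLines = (prevLines.length - common) + (currLines.length - common)
  const totalLines = prevLines.length + currLines.length
  const changePercentage = (changedLines / totalLines) * 100

  const severity: Severity =
    changePercentage > 50 ? 'major' :
    changePercentage > 10 ? 'medium' : 'minor'

  return { changed, changePercentage, severity }
}
```

The quadratic LCS is fine for a sketch but would likely be too slow for very large pages, which is one reason to prefer a tuned diff library in practice.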

### Noise Filtering
```typescript
import * as cheerio from 'cheerio'

// Remove common noise patterns
function filterNoise(html: string): string {
  // Remove ISO 8601 timestamps (e.g. 2026-01-16T09:30:00)
  html = html.replace(/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}/g, '')

  // Remove cookie banners (common selectors)
  const noisySelectors = [
    '.cookie-banner',
    '#cookie-notice',
    '[class*="consent"]',
    // ... more patterns
  ]

  // Parse and remove elements
  const $ = cheerio.load(html)
  noisySelectors.forEach(sel => $(sel).remove())

  return $.html()
}
```
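
Per-monitor `ignoreRules` could be layered on top of the built-in filter. The sketch below is one possible reading, not the project's implementation: `applyTextIgnoreRules` is a hypothetical helper that handles only the `regex` and `text` rule types on already-extracted text, and it skips `css` rules because those need a parsed DOM.

```typescript
type IgnoreRule = { type: 'css' | 'regex' | 'text'; value: string }

// Strip user-defined noise from extracted page text before diffing.
function applyTextIgnoreRules(text: string, rules: IgnoreRule[]): string {
  let result = text
  for (const rule of rules) {
    if (rule.type === 'text') {
      // Remove every literal occurrence of the value
      result = result.split(rule.value).join('')
    } else if (rule.type === 'regex') {
      // Remove every match; an invalid pattern throws, so validate on save
      result = result.replace(new RegExp(rule.value, 'g'), '')
    }
    // 'css' rules are intentionally skipped in this text-only sketch
  }
  return result
}
```

Rules run in order, so a later rule sees the output of an earlier one. Applying the same rules to both the previous and the current snapshot keeps the comparison fair.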

### Keyword Detection
```typescript
function checkKeywords(
  previousText: string,
  currentText: string,
  rules: KeywordRule[]
): KeywordMatch[] {
  const matches: KeywordMatch[] = []

  for (const rule of rules) {
    // Case-sensitive substring match
    const prevMatch = previousText.includes(rule.keyword)
    const currMatch = currentText.includes(rule.keyword)

    if (rule.type === 'appears' && !prevMatch && currMatch) {
      matches.push({ rule, type: 'appeared' })
    }
    if (rule.type === 'disappears' && prevMatch && !currMatch) {
      matches.push({ rule, type: 'disappeared' })
    }

    // Count logic...
  }

  return matches
}
```
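
The `count` rule type is not spelled out above. One possible interpretation, offered purely as an assumption, is to fire when the number of occurrences moves by at least `threshold` (defaulting to 1) between two snapshots. `countOccurrences`, `checkKeywordsWithCount`, and the `count_changed` match type are hypothetical names used only for this sketch.

```typescript
type KeywordRule = {
  keyword: string
  type: 'appears' | 'disappears' | 'count'
  threshold?: number
}
type KeywordMatch = {
  rule: KeywordRule
  type: 'appeared' | 'disappeared' | 'count_changed'
}

// Count non-overlapping, case-sensitive occurrences of a keyword.
function countOccurrences(text: string, keyword: string): number {
  if (keyword === '') return 0
  return text.split(keyword).length - 1
}

function checkKeywordsWithCount(
  previousText: string,
  currentText: string,
  rules: KeywordRule[]
): KeywordMatch[] {
  const matches: KeywordMatch[] = []

  for (const rule of rules) {
    const prevCount = countOccurrences(previousText, rule.keyword)
    const currCount = countOccurrences(currentText, rule.keyword)

    if (rule.type === 'appears' && prevCount === 0 && currCount > 0) {
      matches.push({ rule, type: 'appeared' })
    }
    if (rule.type === 'disappears' && prevCount > 0 && currCount === 0) {
      matches.push({ rule, type: 'disappeared' })
    }
    // Assumed semantics: alert when the count shifts by at least the threshold
    if (rule.type === 'count' && Math.abs(currCount - prevCount) >= (rule.threshold ?? 1)) {
      matches.push({ rule, type: 'count_changed' })
    }
  }

  return matches
}
```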

---

## Development Guidelines

### When Working on This Project

1. **Prioritize MVP**: Focus on core features before adding complexity
2. **Performance matters**: Diffing and fetching should be fast (<2s)
3. **Noise reduction is key**: This is our competitive advantage
4. **User feedback loop**: Build in ways to learn from false positives
5. **Security first**: Never store credentials in plain text, sanitize all URLs
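
As an illustration of the URL sanitizing mentioned in guideline 5, the following sketch uses the standard `URL` class. `validateMonitorUrl` and its reason codes are hypothetical, and the private-host list is a hedged starting point, not a complete defense against server-side request forgery.

```typescript
type UrlCheck = { ok: true; url: string } | { ok: false; reason: string }

// Accept only public-looking http(s) URLs and return a normalized form.
function validateMonitorUrl(input: string): UrlCheck {
  let parsed: URL
  try {
    parsed = new URL(input.trim())
  } catch {
    return { ok: false, reason: 'invalid_url' }
  }

  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return { ok: false, reason: 'unsupported_protocol' }
  }
  // Reject embedded credentials such as https://user:pass@host/
  if (parsed.username !== '' || parsed.password !== '') {
    return { ok: false, reason: 'credentials_in_url' }
  }

  // Block obvious loopback, link-local, and private-range hosts
  const host = parsed.hostname.toLowerCase()
  const isPrivate =
    host === 'localhost' ||
    host.endsWith('.localhost') ||
    host === '[::1]' ||
    /^127\./.test(host) ||
    /^10\./.test(host) ||
    /^192\.168\./.test(host) ||
    /^169\.254\./.test(host) ||
    /^172\.(1[6-9]|2\d|3[01])\./.test(host)
  if (isPrivate) {
    return { ok: false, reason: 'private_host' }
  }

  // Fragments never reach the server, so drop them before storing
  parsed.hash = ''
  return { ok: true, url: parsed.toString() }
}
```

A hostname check alone cannot catch public DNS names that resolve to private addresses, so a production fetcher would probably also need to verify the resolved IP at request time.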

### Code Style

- Use TypeScript strict mode
- Write unit tests for core algorithms (differ, filter, keyword)
- Use async/await, avoid callbacks
- Prefer functional programming patterns
- Comment complex logic, especially regex patterns

### API Design Principles

- RESTful endpoints
- Use proper HTTP status codes
- Return consistent error format:
  ```json
  {
    "error": "monitor_not_found",
    "message": "Monitor with id 123 not found",
    "details": {}
  }
  ```
- Paginate list endpoints (monitors, snapshots, alerts)
- Version API if breaking changes needed (/v1/monitors)
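
The error envelope and pagination principles above could be captured in a couple of small helpers. This is only a sketch: `apiError`, `paginate`, the `Page` shape, and the 100-item page cap are assumptions for illustration, and a real endpoint would page in the database query instead of slicing an in-memory array.

```typescript
type ApiError = {
  error: string // machine-readable code, e.g. "monitor_not_found"
  message: string // human-readable explanation
  details: Record<string, unknown>
}

type Page<T> = {
  items: T[]
  page: number
  pageSize: number
  total: number
  totalPages: number
}

// Build an error body in the consistent format
function apiError(
  error: string,
  message: string,
  details: Record<string, unknown> = {}
): ApiError {
  return { error, message, details }
}

// Clamp the requested page and size, then slice the matching window
function paginate<T>(all: T[], page: number, pageSize: number): Page<T> {
  const size = Math.min(Math.max(1, Math.floor(pageSize)), 100)
  const totalPages = Math.max(1, Math.ceil(all.length / size))
  const current = Math.min(Math.max(1, Math.floor(page)), totalPages)
  const start = (current - 1) * size
  return {
    items: all.slice(start, start + size),
    page: current,
    pageSize: size,
    total: all.length,
    totalPages,
  }
}
```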

---

## Common Tasks & Commands

### When Starting Development
```bash
# Clone and setup
git clone <repo>
cd website-monitor

# Install dependencies (subshells keep you in the repo root)
(cd frontend && npm install)
(cd backend && npm install)

# Setup environment (run wherever a .env.example is provided)
cp .env.example .env
# Edit .env with your values

# Start database
docker-compose up -d postgres redis

# Run migrations
(cd backend && npm run migrate)

# Start dev servers (each command blocks, so use separate terminals)
(cd frontend && npm run dev)
(cd backend && npm run dev)
```

### Running Tests
```bash
# Frontend tests
(cd frontend && npm test)

# Backend tests
(cd backend && npm test)

# E2E tests
npm run test:e2e
```

### Deployment
```bash
# Build frontend
cd frontend && npm run build

# Deploy frontend (Vercel)
vercel deploy --prod

# Deploy backend (assumes the Dockerfile lives in backend/)
cd ../backend
docker build -t <registry>/monitor-api .
docker push <registry>/monitor-api
# Deploy to Railway/Render/AWS
```

---

## Key User Flows to Support

When building features, always consider these primary use cases:

1. **Job seeker monitoring career pages** (most common)
   - Needs: Fast frequency (5 min), keyword alerts, instant notifications

2. **Price tracking for e-commerce** (high value)
   - Needs: Element selection, numeric comparison, reliable alerts

3. **Competitor monitoring** (B2B focus)
   - Needs: Multiple monitors, digest mode, AI summaries

4. **Stock/availability tracking** (urgent)
   - Needs: Fastest frequency (1 min), SMS alerts, auto-pause

5. **Policy/regulation monitoring** (professional)
   - Needs: Long-term history, team sharing, AI summaries

---

## Integration Points

### Email Service (SendGrid/Postmark)
```typescript
async function sendChangeAlert(monitor: Monitor, snapshot: Snapshot) {
  // Monitor only stores userId, so load the user to get the email address
  const user = await db.users.findById(monitor.userId)
  const diffUrl = `https://app.example.com/monitors/${monitor.id}/diff/${snapshot.id}`

  await emailService.send({
    to: user.email,
    subject: `Change detected: ${monitor.name}`,
    template: 'change-alert',
    data: {
      monitorName: monitor.name,
      url: monitor.url,
      timestamp: snapshot.createdAt,
      diffUrl,
      changePercentage: snapshot.changePercentage
    }
  })
}
```

### Stripe Billing
```typescript
async function handleSubscription(userId: string, plan: keyof typeof PRICE_IDS) {
  const user = await db.users.findById(userId)

  // Create a new subscription (changing an existing one would go
  // through stripe.subscriptions.update instead)
  const subscription = await stripe.subscriptions.create({
    customer: user.stripeCustomerId,
    items: [{ price: PRICE_IDS[plan] }]
  })

  // Update user plan (subscriptionId is not yet part of the User model)
  await db.users.update(userId, {
    plan,
    subscriptionId: subscription.id
  })
}
```

### Job Queue (Bull)
```typescript
// Schedule monitor checks
async function scheduleMonitor(monitor: Monitor) {
  await monitorQueue.add(
    'check-monitor',
    { monitorId: monitor.id },
    {
      repeat: {
        every: monitor.frequency * 60 * 1000 // convert to ms
      },
      jobId: `monitor-${monitor.id}`
    }
  )
}

// Process checks
monitorQueue.process('check-monitor', async (job) => {
  const { monitorId } = job.data
  await checkMonitor(monitorId)
})
```

---

## Testing Strategy

### Unit Tests
- Diff algorithms
- Noise filtering
- Keyword matching
- Ignore rules application

### Integration Tests
- API endpoints
- Database operations
- Job queue processing

### E2E Tests
- User registration & login
- Monitor creation & management
- Alert delivery
- Subscription changes

### Performance Tests
- Fetch speed with various page sizes
- Diff calculation speed
- Concurrent monitor checks
- Database query performance

---

## Deployment Checklist

Before deploying to production:

- [ ] Environment variables configured
- [ ] Database migrations run
- [ ] SSL certificates configured
- [ ] Email deliverability tested
- [ ] Payment processing tested (Stripe test mode → live mode)
- [ ] Error tracking configured (Sentry)
- [ ] Monitoring & alerts set up (uptime, error rate, queue health)
- [ ] Backup strategy implemented
- [ ] Rate limiting configured
- [ ] GDPR compliance (privacy policy, data export/deletion)
- [ ] Security headers configured
- [ ] API documentation updated

---

## Troubleshooting Common Issues

### "Monitor keeps triggering false alerts"
- Check if noise filtering is working
- Review ignore rules for the monitor
- Look at diff to identify changing element
- Add custom ignore rule for that element

### "Some pages aren't being monitored correctly"
- Check if page requires JavaScript rendering
- Try enabling headless browser mode
- Check if page requires authentication
- Look for CAPTCHA or bot detection

### "Alerts aren't being delivered"
- Check email service status
- Verify email isn't going to spam
- Check alert queue for errors
- Verify user's alert settings

### "System is slow/overloaded"
- Check Redis queue health
- Look for monitors with very high frequency
- Check database query performance
- Consider scaling workers horizontally

---

## Metrics to Track

### Technical Metrics
- Average check duration
- Diff calculation time
- Check success rate
- Alert delivery rate
- Queue processing lag

### Product Metrics
- Active monitors per user
- Alerts sent per day
- False positive rate (from user feedback)
- Feature adoption (keywords, elements, integrations)

### Business Metrics
- Free → Paid conversion rate
- Monthly churn rate
- Average revenue per user (ARPU)
- Customer acquisition cost (CAC)
- Lifetime value (LTV)

---

## Resources & Documentation

### External Documentation
- [Next.js Docs](https://nextjs.org/docs)
- [Tailwind CSS](https://tailwindcss.com/docs)
- [Playwright Docs](https://playwright.dev)
- [Bull Queue](https://github.com/OptimalBits/bull)
- [Stripe API](https://stripe.com/docs/api)

### Internal Documentation
- See `spec.md` for complete feature specifications
- See `task.md` for development roadmap
- See `actions.md` for user workflows and use cases

---

## Future Considerations

### Potential Enhancements
- Mobile app (React Native or Progressive Web App)
- Browser extension for quick monitor addition
- AI-powered change importance scoring
- Collaborative features (team annotations, approval workflows)
- Marketplace for monitor templates
- Affiliate program for power users

### Scaling Considerations
- Distributed workers across multiple regions
- Caching layer for frequently accessed pages
- Database sharding by user
- Separate queue for high-frequency monitors
- CDN for snapshot storage

---

## Notes for Claude

When working on this project:

1. **Always reference these docs**: spec.md, task.md, actions.md, and this file
2. **MVP mindset**: Implement the simplest solution that works first
3. **User-centric**: Consider the user workflows in actions.md when building features
4. **Security-conscious**: Validate URLs, sanitize inputs, encrypt sensitive data
5. **Performance-aware**: Optimize for speed, especially diff calculation
6. **Ask clarifying questions**: If requirements are ambiguous, ask before implementing
7. **Test as you go**: Write tests for core functionality
8. **Document decisions**: Update these docs when making architectural decisions

### Common Questions & Answers

**Q: Should we support authenticated pages in MVP?**
A: No, save for V2. Focus on public pages first.

**Q: What diff library should we use?**
A: `diff` on npm (the jsdiff project) for JavaScript, `difflib` for Python.

**Q: How do we handle CAPTCHA?**
A: For MVP, just alert the user. For V2, consider residential proxies or browser fingerprinting.

**Q: Should we store full HTML or just text?**
A: Store both: full HTML for accuracy, extracted text for diffing performance.

**Q: What's the minimum viable frequency?**
A: 5 minutes for Pro and 1 minute for Business (see Pricing Tiers), 1 hour for the free tier.

---

## Quick Reference

### Key Files
- `spec.md` - Feature specifications
- `task.md` - Development tasks and roadmap
- `actions.md` - User workflows and use cases
- `claude.md` - This file (project context)

### Key Concepts
- **Noise reduction** - Core differentiator
- **Keyword alerts** - High-value feature
- **Element selection** - Monitor specific parts
- **Change severity** - Classify importance

### Pricing Tiers
- **Free**: 5 monitors, 1hr frequency
- **Pro**: 50 monitors, 5min frequency, $19-29/mo
- **Business**: 200 monitors, 1min frequency, teams, $99-149/mo
- **Enterprise**: Unlimited, custom pricing
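
The tiers above translate naturally into a lookup table that the monitor-creation endpoint could consult. The sketch below is illustrative only: `PLAN_LIMITS`, `canCreateMonitor`, and the reason codes are hypothetical names, and the 1-minute floor for Enterprise is an assumption because the tier list only says "Unlimited".

```typescript
type Plan = 'free' | 'pro' | 'business' | 'enterprise'

type PlanLimits = {
  maxMonitors: number
  minFrequencyMinutes: number // shortest allowed interval between checks
}

const PLAN_LIMITS: Record<Plan, PlanLimits> = {
  free: { maxMonitors: 5, minFrequencyMinutes: 60 },
  pro: { maxMonitors: 50, minFrequencyMinutes: 5 },
  business: { maxMonitors: 200, minFrequencyMinutes: 1 },
  // Assumption: Enterprise shares the Business frequency floor
  enterprise: { maxMonitors: Infinity, minFrequencyMinutes: 1 },
}

// Decide whether a user on `plan` may add one more monitor at this frequency.
function canCreateMonitor(
  plan: Plan,
  currentMonitorCount: number,
  frequencyMinutes: number
): { allowed: boolean; reason?: string } {
  const limits = PLAN_LIMITS[plan]
  if (currentMonitorCount >= limits.maxMonitors) {
    return { allowed: false, reason: 'monitor_limit_reached' }
  }
  if (frequencyMinutes < limits.minFrequencyMinutes) {
    return { allowed: false, reason: 'frequency_too_high' }
  }
  return { allowed: true }
}
```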

---

*Last updated: 2026-01-16*