ROLLerUP AI Implementation Guide
project/docs/implementation-guide.md document
Updated: 2026-03-07 02:45
# ROLLerUP AI Implementation Guide
> Complete documentation for setting up Zoho data extraction and knowledge base infrastructure.
> Created: 2026-02-12 | Author: Shraga (AI) + Eugene (Human)
---
## Table of Contents
1. [Overview](#overview)
2. [Prerequisites](#prerequisites)
3. [Phase 1: Zoho API Setup](#phase-1-zoho-api-setup)
4. [Phase 2: Database Setup](#phase-2-database-setup)
5. [Phase 3: Extraction Scripts](#phase-3-extraction-scripts)
6. [Troubleshooting](#troubleshooting)
7. [Appendix: Full Schema](#appendix-full-schema)
---
## Overview
This system extracts data from the Zoho ecosystem into a PostgreSQL database for AI-powered analysis and knowledge base construction.
**Zoho Apps Integrated:**
- Zoho CRM (Leads, Contacts, Accounts, Deals, Activities, Notes)
- Zoho Books (Contacts, Invoices, Estimates, Payments, Sales Orders)
- Zoho SalesIQ (Chat Conversations)
- Zoho Desk (Tickets, Articles) — pending
- Zoho Mail (Messages) — pending
- Zoho Connect (Posts, Groups) — pending
- Zoho WorkDrive (Files) — pending
**Infrastructure:**
- PostgreSQL on DigitalOcean Managed Database
- Python extraction scripts on DigitalOcean Droplet
- OpenClaw AI for orchestration
---
## Prerequisites
### Required Accounts
- Zoho account with admin access to all apps
- DigitalOcean account
- API Console access: https://api-console.zoho.com/
### Server Requirements
- Ubuntu 22.04+ droplet (or similar Linux server)
- Python 3.10+
- Network access to Zoho APIs and DigitalOcean database
---
## Phase 1: Zoho API Setup
### Step 1.1: Create Self Client in API Console
1. Go to https://api-console.zoho.com/
2. Click **Add Client** → **Self Client**
3. Note your **Client ID** and **Client Secret**
### Step 1.2: Generate Scopes
Different Zoho apps require different OAuth scopes. Generate separate tokens if needed.
**Main Token Scopes (CRM, Books, Desk, Mail, Connect, WorkDrive):**
```
ZohoCRM.modules.ALL,ZohoCRM.settings.ALL,ZohoCRM.users.ALL,ZohoCRM.org.ALL,ZohoCRM.bulk.ALL,ZohoCRM.coql.READ,ZohoCRM.Files.READ,ZohoCRM.modules.notes.ALL,ZohoCRM.modules.attachments.all,ZohoBooks.fullaccess.all,Desk.tickets.ALL,Desk.contacts.ALL,Desk.activities.READ,Desk.articles.READ,Desk.settings.READ,Desk.search.READ,Desk.basic.READ,Desk.feeds.READ,ZohoMail.messages.ALL,ZohoMail.folders.ALL,ZohoMail.accounts.READ,ZohoMail.tags.READ,ZohoPulse.post.READ,ZohoPulse.feedList.READ,ZohoPulse.grouplist.READ,ZohoPulse.networklist.READ,ZohoPulse.files.READ,ZohoPulse.blogs.READ,WorkDrive.files.READ,WorkDrive.teamfolders.READ,WorkDrive.team.READ,WorkDrive.organization.READ,WorkDrive.libraries.READ
```
**SalesIQ Token Scopes (separate token required):**
```
SalesIQ.portals.READ,SalesIQ.conversations.READ,SalesIQ.chattranscript.READ,SalesIQ.visitors.READ
```
### Step 1.3: Generate Grant Token
1. In API Console → Self Client
2. Paste scopes
3. Set duration: 10 minutes
4. Click **Create**
5. Copy the grant token immediately
### Step 1.4: Exchange for Refresh Token
**IMPORTANT:** Do this within 10 minutes of generating the grant token!
```bash
curl -X POST "https://accounts.zoho.com/oauth/v2/token" \
-d "grant_type=authorization_code" \
-d "client_id=YOUR_CLIENT_ID" \
-d "client_secret=YOUR_CLIENT_SECRET" \
-d "code=YOUR_GRANT_TOKEN"
```
**For Canadian data center, use:** `https://accounts.zoho.ca/oauth/v2/token`
Response:
```json
{
"access_token": "...",
"refresh_token": "1000.xxxxx.xxxxx", ← SAVE THIS
"expires_in": 3600
}
```
**Save the refresh_token** — this is permanent and used for all API calls.
### Token Flow Explained
```
Grant Token (10 min) → Exchange → Refresh Token (permanent)
↓
Access Token (1 hour, auto-renewed)
```
---
## Phase 2: Database Setup
### Step 2.1: Create DigitalOcean Managed Database
1. DigitalOcean Dashboard → **Databases** → **Create Database**
2. Choose **PostgreSQL 16**
3. Select region (same as droplet)
4. Choose size ($15/month starter is fine)
5. Name: `openclaw-knowledge-db`
### Step 2.2: Configure Trusted Sources
**IMPORTANT:** DigitalOcean blocks all connections by default.
1. Database → **Settings** → **Trusted Sources**
2. Add your droplet (select from dropdown)
3. Optionally add your IP for local access
### Step 2.3: Get Connection Details
From database overview page:
- **Host:** `xxx.db.ondigitalocean.com`
- **Port:** `25060`
- **Database:** `defaultdb` (or create new one)
- **User:** `doadmin`
- **Password:** (shown in dashboard)
### Step 2.4: Connect and Create Schema
Connect via droplet (internal network works even if external blocked):
```bash
# Install psql client
apt-get install -y postgresql-client
# Connect
psql "host=YOUR_HOST port=25060 dbname=YOUR_DB user=doadmin password=YOUR_PASSWORD sslmode=require"
```
Create schema:
```sql
CREATE SCHEMA IF NOT EXISTS zoho;
```
### Step 2.5: Create Tables
See [Appendix: Full Schema](#appendix-full-schema) for complete SQL.
Key tables:
- `zoho.crm_leads`
- `zoho.crm_contacts`
- `zoho.crm_accounts`
- `zoho.crm_deals`
- `zoho.crm_activities`
- `zoho.crm_notes`
- `zoho.books_contacts`
- `zoho.books_invoices`
- `zoho.books_estimates`
- `zoho.books_payments`
- `zoho.books_sales_orders`
- `zoho.salesiq_chats`
- `zoho.salesiq_visitors`
- `zoho.desk_tickets`
- `zoho.desk_articles`
- `zoho.mail_messages`
- `zoho.connect_posts`
- `zoho.workdrive_files`
---
## Phase 3: Extraction Scripts
### Step 3.1: Setup Python Environment
On your droplet:
```bash
# Install dependencies
apt-get install -y python3-pip python3.12-venv
# Create project directory
mkdir -p /opt/zoho-extract
cd /opt/zoho-extract
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install packages
pip install requests psycopg2-binary python-dotenv
```
### Step 3.2: Create Credentials File
```bash
cat > /opt/zoho-extract/.env << 'EOF'
ZOHO_CLIENT_ID=your_client_id
ZOHO_CLIENT_SECRET=your_client_secret
ZOHO_REFRESH_TOKEN=your_main_refresh_token
ZOHO_SALESIQ_REFRESH_TOKEN=your_salesiq_refresh_token
DB_HOST=your_db_host.db.ondigitalocean.com
DB_PORT=25060
DB_NAME=openclaw_kb
DB_USER=doadmin
DB_PASSWORD=your_db_password
EOF
```
**IMPORTANT:** Keep `.env` secure. Never commit to git.
### Step 3.3: CRM Extraction Script
File: `/opt/zoho-extract/extract_crm.py`
Key features:
- Gets fresh access token using refresh token
- Paginates through all records (200 per page)
- Upserts records (insert or update on conflict)
- Stores raw JSON for full data preservation
**Run:**
```bash
cd /opt/zoho-extract
source venv/bin/activate
python extract_crm.py
```
### Step 3.4: Books Extraction Script
File: `/opt/zoho-extract/extract_books.py`
**Note:** Requires Organization ID. Find it in Zoho Books → Settings → Organization, or the script will auto-detect.
**Run:**
```bash
python extract_books.py
```
### Step 3.5: SalesIQ Extraction Script
File: `/opt/zoho-extract/extract_salesiq.py`
**Notes:**
- Uses separate refresh token (different scopes)
- Uses `screenname` from portal (e.g., "rollerup")
- Timestamps are in milliseconds, need conversion
- Visitor data may not be available on all plans
**Run:**
```bash
python extract_salesiq.py
```
### Step 3.6: Running All Extractions
Create a master script:
```bash
cat > /opt/zoho-extract/extract_all.sh << 'EOF'
#!/bin/bash
cd /opt/zoho-extract
source venv/bin/activate
echo "=== Starting Full Extraction ==="
echo "Started: $(date)"
python extract_crm.py
python extract_books.py
python extract_salesiq.py
# python extract_desk.py # Coming soon
# python extract_mail.py # Coming soon
echo "=== Completed ==="
echo "Finished: $(date)"
EOF
chmod +x /opt/zoho-extract/extract_all.sh
```
### Step 3.7: Schedule Regular Extractions (Optional)
Add to crontab for daily extraction:
```bash
crontab -e
```
Add:
```
0 2 * * * /opt/zoho-extract/extract_all.sh >> /var/log/zoho-extract.log 2>&1
```
---
## Troubleshooting
### "Invalid OAuthScope" Error
- The refresh token doesn't have the required scope
- Generate new grant token with correct scopes
- Exchange for new refresh token
### "invalid_code" Error
- Grant token expired (10 min limit)
- Refresh token is wrong (might have pasted grant token)
- Generate fresh grant token and exchange immediately
### Connection Timeout to Database
- Check Trusted Sources in DigitalOcean
- Add your IP or droplet to allowed sources
- Internal DO network usually works; external may be blocked
### "Could not resolve host"
- DNS issue or no internet
- Try using IP address directly
- Check network configuration
### psql Hangs / Connection Stuck
- Port 25060 may be blocked by firewall
- Try from droplet (internal DO network)
- Check if database is online in DO dashboard
### Timestamps Out of Range
- Zoho returns milliseconds, not seconds
- Convert: `datetime.fromtimestamp(ms/1000)`
---
## Appendix: Full Schema
See file: `project/database/schema.sql`
### Quick Table Count Query
```sql
SELECT 'crm_leads' as tbl, COUNT(*) FROM zoho.crm_leads
UNION ALL SELECT 'crm_contacts', COUNT(*) FROM zoho.crm_contacts
UNION ALL SELECT 'crm_accounts', COUNT(*) FROM zoho.crm_accounts
UNION ALL SELECT 'crm_deals', COUNT(*) FROM zoho.crm_deals
UNION ALL SELECT 'crm_activities', COUNT(*) FROM zoho.crm_activities
UNION ALL SELECT 'crm_notes', COUNT(*) FROM zoho.crm_notes
UNION ALL SELECT 'books_contacts', COUNT(*) FROM zoho.books_contacts
UNION ALL SELECT 'books_invoices', COUNT(*) FROM zoho.books_invoices
UNION ALL SELECT 'salesiq_chats', COUNT(*) FROM zoho.salesiq_chats
ORDER BY 1;
```
---
## Next Steps
1. **Complete remaining extractors:** Desk, Mail, Connect, WorkDrive
2. **Build knowledge base:** Process extracted data into KB articles
3. **Deploy AI agents:** SalesIQ bots, Desk ZIA, phone bots
4. **Create dashboards:** Zoho Analytics integration
5. **Implement feedback loops:** Continuous learning from interactions
---
## Key Files Reference
| File | Purpose |
|------|---------|
| `/opt/zoho-extract/.env` | Credentials (never share!) |
| `/opt/zoho-extract/extract_crm.py` | CRM extraction |
| `/opt/zoho-extract/extract_books.py` | Books extraction |
| `/opt/zoho-extract/extract_salesiq.py` | SalesIQ extraction |
| `/opt/zoho-extract/venv/` | Python virtual environment |
---
*Document version: 1.0*
*Last updated: 2026-02-12*