Why Your social.org Files Can Have Millions of Lines Without Any Performance Issues
As Org Social grows, users follow more feeds, and individual social.org files accumulate hundreds of posts over time. Traditional approaches that download entire feeds sequentially create two major bottlenecks:
- Bandwidth waste: Downloading complete files when users only need recent posts
- Time inefficiency: Sequential downloads that block the user interface
This article explores how Org Social 2.3+ solves both problems with a sophisticated combination of concurrent queue processing and HTTP Range-based partial fetching while maintaining complete compatibility with all servers.
The Challenge
flowchart TB
A[User Opens Timeline] --> B[20 Feeds to Download]
B --> C[Traditional Approach: Sequential Downloads]
C --> D[Feed 1: 27KB 150 posts]
D --> E[Feed 2: 15KB 80 posts]
E --> F[Feed 3: 12KB 60 posts]
F --> G[... 17 more feeds]
G --> H[Total: ~300KB and ~1500 posts]
H --> I[Filter to last 14 days]
I --> J[Actually needed: ~10 posts=first page]
style C fill:#ffcccc,color:black
style H fill:#ffcccc,color:black
style J fill:#ccffcc,color:black
Downloading 300KB and processing 1,500 posts just to display 10 of them... that is not good!
Optimization
Org-social.el implements a three-layer approach:
Layer 1: Concurrent Queue Processing
A process queue is a data structure that manages tasks to be executed. In Org Social, each feed to download is added to the queue as a pending task. The system then processes these tasks concurrently (multiple at the same time) using a worker pool—a limited number of threads that execute downloads in parallel.
This smart queue system manages parallel downloads without overwhelming system resources.
flowchart LR
A[Feed Queue] --> B[Worker Pool Max 20 concurrent]
B --> C[Worker 1 Feed A]
B --> D[Worker 2 Feed B]
B --> E[Worker 3 Feed C]
B --> F[...]
B --> G[Worker 20 Feed T]
C --> H{Done?}
D --> H
E --> H
G --> H
H -->|Yes| I[Process Next Pending Feed]
H -->|Error| J[Mark Failed, Continue]
I --> B
J --> I
style B fill:#e1f5ff,color:black
style H fill:#fff4e1,color:black
style I fill:#ccffcc,color:black
Key Features:
- Configurable concurrency: org-social-max-concurrent-downloads (default: 20)
- Non-blocking threads: Each download runs in a separate thread
- Automatic recovery: Failed downloads don't block the queue
- Smart scheduling: New downloads start immediately when slots free up
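In Emacs Lisp, the pattern looks roughly like this. It is a minimal sketch using the asynchronous url-retrieve (callbacks rather than real threads); the names are illustrative and not the actual org-social.el internals:

```elisp
;; -*- lexical-binding: t; -*-
(require 'cl-lib)
(require 'url)

(defvar my/feed-queue nil
  "URLs of feeds still waiting to be downloaded.")
(defvar my/active-downloads 0
  "Number of downloads currently in flight.")
(defvar my/max-concurrent-downloads 20
  "Upper bound on parallel downloads (cf. org-social-max-concurrent-downloads).")

(defun my/process-feed-queue ()
  "Start pending downloads while free worker slots remain."
  (while (and my/feed-queue
              (< my/active-downloads my/max-concurrent-downloads))
    (let ((url (pop my/feed-queue)))
      (cl-incf my/active-downloads)
      (url-retrieve
       url
       (lambda (status)
         ;; Success or failure, free the slot and keep the queue moving.
         (cl-decf my/active-downloads)
         (when (plist-get status :error)
           (message "Download failed: %s" url))
         (kill-buffer)
         (my/process-feed-queue))
       nil t))))
```

A failed feed only logs a message and frees its slot, so one bad URL never stalls the rest of the queue.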
Layer 2: HTTP Range-Based Partial Fetching
HTTP has a built-in feature that allows downloading only specific parts of a file using the Range header. When a client sends a request with Range: bytes=0-999, the server responds with just the first 1000 bytes of the file instead of the entire content. This capability is commonly used for video streaming and resumable downloads, but it can also be used to paginate files—downloading them in chunks rather than all at once.
Instead of downloading entire social.org files, Org Social uses HTTP Range requests to fetch only what's needed: the header section and recent posts.
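As a rough illustration with Emacs' built-in url library, a single byte range can be requested like this (a sketch; my/fetch-byte-range is a made-up name, not the package's API):

```elisp
(require 'url)

(defun my/fetch-byte-range (url start end)
  "Return the body of URL between byte positions START and END (inclusive)."
  (let ((url-request-extra-headers
         `(("Range" . ,(format "bytes=%d-%d" start end)))))
    (with-current-buffer (url-retrieve-synchronously url t)
      (goto-char (point-min))
      (re-search-forward "\r?\n\r?\n")   ; skip the HTTP response headers
      (prog1 (buffer-substring (point) (point-max))
        (kill-buffer)))))

;; For example, fetch only the first kilobyte (the header section) of a feed:
;; (my/fetch-byte-range "https://example.org/social.org" 0 999)
```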
This approach does not work with every provider. While most traditional web servers (Apache, Nginx, Caddy) support HTTP Range requests natively, some hosting platforms have limitations:
- Cloudflare CDN: Does not provide Content-Length or Content-Range headers, making it impossible to determine file size or download specific byte ranges. The system automatically falls back to downloading the complete file and filtering client-side.
- Codeberg.org: Implements aggressive rate limiting when multiple Range requests are made in quick succession. When HTTP 429 (Too Many Requests) is detected, the system falls back to a full download without filtering to avoid being blocked.
- GitHub Raw Content: Provides proper HTTP Range support and works optimally with partial downloads.
The system detects these limitations automatically and adapts its strategy to ensure 100% compatibility across all hosting platforms.
sequenceDiagram
participant C as Client
participant S as Server
participant F as social.org (27KB, 150 posts)
Note over C,F: Step 1: Find Header
C->>S: Range: bytes=0-999
S->>C: First 1KB (headers)
Note over C: Found "* Posts" at byte 800
Note over C,F: Step 2: Get File Size
C->>S: Range: bytes=0-0
S->>C: Content-Range: bytes 0-0/27656
Note over C: Total size: 27656 bytes
Note over C,F: Step 3: Fetch Recent Posts
C->>S: Range: bytes=26656-27655
S->>C: Last 1KB (recent posts)
C->>S: Range: bytes=25656-26655
S->>C: Previous 1KB
Note over C: Found post older than 14 days
Note over C,F: Result: Downloaded 3KB instead of 27KB
Algorithm:
- Header Discovery (bytes 0 → forwards)
  - Download the first 1KB chunk (bytes 0-999)
  - If * Posts is not found, download the next 1KB chunk (bytes 1000-1999)
  - Continue downloading subsequent chunks until * Posts is found
  - Typical header size: 500-1500 bytes
- Backward Post Fetching (end → backwards)
  - Start from the end of the file (most recent posts)
  - Download 1KB chunks moving backwards
  - Parse each post's :ID: property (e.g., :ID: 2025-10-24T10:00:00+0200)
  - Stop when reaching posts older than org-social-max-post-age-days (default: 14 days)
- Date Filtering (a minimal sketch follows this list)
  - Parse post IDs (RFC 3339 timestamps)
  - Keep only posts ≥ start date
  - Discard older posts without downloading
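The date check behind the filtering step is compact. Here is a minimal sketch, assuming post IDs are RFC 3339 timestamps as shown above (my/post-recent-p is an illustrative name, and org-social-max-post-age-days would supply the age limit):

```elisp
(require 'iso8601)
(require 'time-date)

(defun my/post-recent-p (post-id max-age-days)
  "Return non-nil if POST-ID, an RFC 3339 timestamp, is at most MAX-AGE-DAYS old."
  (let ((post-time (encode-time (iso8601-parse post-id)))
        (cutoff (time-subtract (current-time) (days-to-time max-age-days))))
    (time-less-p cutoff post-time)))

;; (my/post-recent-p "2025-10-24T10:00:00+0200" 14) ;; => t while the post is recent
```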
Trick for Range Support Detection
The system sends a test request with the Range: bytes=0-0 header to check if the server responds with Content-Range or Accept-Ranges: bytes headers, indicating Range support.
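A sketch of that probe with the built-in url library (illustrative names, not the package's actual function):

```elisp
(require 'url)

(defun my/server-supports-range-p (url)
  "Return non-nil if the server behind URL appears to honour HTTP Range requests."
  (let ((url-request-extra-headers '(("Range" . "bytes=0-0"))))
    (with-current-buffer (url-retrieve-synchronously url t)
      (goto-char (point-min))
      (prog1 (and (re-search-forward
                   "^\\(Content-Range:\\|Accept-Ranges: bytes\\)" nil t)
                  t)
        (kill-buffer)))))
```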
Edge Cases and Fallbacks: Compressed Content
Servers using gzip compression (e.g., Caddy) report compressed sizes:
Problem: HEAD request returns compressed size, but content arrives uncompressed
Solution: Use a Range request (bytes=0-0) for size detection instead. The server responds with Content-Range: bytes 0-0/TOTAL where TOTAL is the actual uncompressed file size.
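Building on the same bytes=0-0 probe, the uncompressed total can be read from the Content-Range header. Again a sketch under the assumptions above, not the package's actual code:

```elisp
(require 'url)

(defun my/remote-file-size (url)
  "Return the uncompressed size of URL in bytes, or nil if it cannot be determined."
  (let ((url-request-extra-headers '(("Range" . "bytes=0-0"))))
    (with-current-buffer (url-retrieve-synchronously url t)
      (goto-char (point-min))
      (prog1 (when (re-search-forward
                    "^Content-Range: bytes 0-0/\\([0-9]+\\)" nil t)
               (string-to-number (match-string 1)))
        (kill-buffer)))))
```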
Layer 3: UI Pagination
Even after downloading only recent posts, rendering all of them at once would overwhelm Emacs. The Org Social UI uses widgets to create an interactive interface with buttons, images, and formatted text. Each widget consumes memory and processing power.
To keep the interface responsive, the system implements pagination that displays only 10 posts per page. This means:
- When a user opens the timeline, only the first 10 most recent posts are rendered
- Images, avatars, and interactive widgets are created only for these 10 visible posts
- The remaining downloaded posts stay in memory but aren't rendered
- Users can navigate to the next page, which then renders the next 10 posts
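The slicing behind this is simple. A minimal sketch (the names are illustrative; the real rendering code deals with widgets and images, which this ignores):

```elisp
(require 'seq)

(defvar my/posts-per-page 10
  "How many posts to render per timeline page.")

(defun my/posts-for-page (posts page)
  "Return the slice of POSTS that should be rendered for zero-based PAGE."
  (seq-take (seq-drop posts (* page my/posts-per-page))
            my/posts-per-page))

;; Page 0 renders posts 1-10, page 1 renders posts 11-20, and so on;
;; everything else stays in memory, unrendered.
```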
flowchart LR
A[Downloaded Posts: 50] --> B[Page 1: Render 10 posts]
A --> C[Page 2: 10 posts in memory]
A --> D[Page 3: 10 posts in memory]
A --> E[Page 4: 10 posts in memory]
A --> F[Page 5: 10 posts in memory]
B --> G[User sees: 10 posts with widgets & images]
C -.->|User clicks Next| H[Render next 10 posts]
style A fill:#e1f5ff,color:black
style B fill:#ccffcc,color:black
style G fill:#d5ffe1,color:black
style H fill:#ffe1cc,color:black
Emacs widgets and image rendering are resource-intensive. Rendering 50 posts with avatars and buttons could slow down the editor. By rendering only 10 at a time, the UI stays fast and responsive regardless of how many posts were downloaded.
This is the final optimization layer: even if your social.org has 10,000 posts, and you download 50 recent ones, you only render 10 on screen. The rest wait in memory until needed.
Performance Benchmarks
The following table shows how the system scales with different feed sizes, assuming an average post size of 250 bytes and a 14-day filter (capturing approximately 20-30 recent posts):
| Scenario | Total Posts | Full File Size | With Partial Fetch (14 days) | Description |
|---|---|---|---|---|
| Empty feed | 0 | ~1 KB | ~1 KB | Only headers downloaded |
| New user | 1 | ~1.5 KB | ~1.5 KB | Single post, no optimization needed |
| Light user | 10 | ~3.5 KB | ~3.5 KB | All posts fit in 14-day window |
| Regular user | 100 | ~26 KB | ~8 KB | Headers + ~30 recent posts |
| Active user | 1,000 | ~250 KB | ~8 KB | Headers + ~30 recent posts |
| Power user | 10,000 | ~2.5 MB | ~8 KB | Headers + ~30 recent posts |
Key insight: Once a feed exceeds ~100 posts, partial fetching maintains a consistent download size of ~8 KB (roughly 1 KB of headers plus 25-30 posts at ~250 bytes each) regardless of total feed size. A feed with 10,000 posts downloads the same amount of data as one with 1,000 posts.
Tuning Recommendations
Users can customize two main parameters:
- org-social-max-concurrent-downloads: Maximum parallel downloads (default: 20)
- org-social-max-post-age-days: Maximum age of posts to fetch, in days (default: 14)
So...
- Fast connection + many feeds: Increase to 30 concurrent
- Slow connection: Decrease to 10 concurrent
- Large feeds + limited bandwidth: Decrease max-post-age-days to 7
- Small feeds: Set max-post-age-days to nil (no optimization needed)
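For example, someone on a fast connection who follows many large feeds but wants to limit bandwidth might put something like this in their init file:

```elisp
(setq org-social-max-concurrent-downloads 30  ; fast connection, many feeds
      org-social-max-post-age-days 7)         ; fetch only the last week of posts
```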
Conclusion
The three-layer optimization approach (concurrent queue processing, HTTP Range-based partial fetching, and UI pagination) delivers significant bandwidth savings on large feeds through date filtering and keeps the interface non-blocking. This architecture positions Org Social to scale efficiently as both individual feeds and follower counts grow, while maintaining the simplicity and decentralization that make Org Social unique.
Your social.org can have millions of lines because:
- Only recent posts are downloaded (14 days by default)
- Downloads happen in parallel without blocking
- Only 10 posts are rendered on screen at once
Enjoy it!
Technical References
- HTTP Range Requests: RFC 7233
- RFC 3339 Timestamps: RFC 3339
- Org Social Specification: github.com/tanrax/org-social
This work is under an Attribution-NonCommercial-NoDerivatives 4.0 International license.
Will you buy me a coffee?
You can use the terminal.
ssh customer@andros.dev -p 5555