Why Your social.org Files Can Have Millions of Lines Without Any Performance Issues
As Org Social grows, users follow more feeds, and individual social.org files accumulate hundreds of posts over time. Traditional approaches that download entire feeds sequentially create two major bottlenecks:
- Bandwidth waste: Downloading complete files when users only need recent posts
- Time inefficiency: Sequential downloads that block the user interface
This article explores how Org Social 2.3+ solves both problems with a sophisticated combination of concurrent queue processing and HTTP Range-based partial fetching while maintaining complete compatibility with all servers.
The Challenge
flowchart TB
A[User Opens Timeline] --> B[20 Feeds to Download]
B --> C[Traditional Approach: Sequential Downloads]
C --> D[Feed 1: 27KB 150 posts]
D --> E[Feed 2: 15KB 80 posts]
E --> F[Feed 3: 12KB 60 posts]
F --> G[... 17 more feeds]
G --> H[Total: ~300KB and ~1500 posts]
H --> I[Filter to last 14 days]
I --> J[Actually needed: ~10 posts=first page]
style C fill:#ffcccc,color:black
style H fill:#ffcccc,color:black
style J fill:#ccffcc,color:black
Downloading 300KB and processing 1,500 posts just to display 10 of them... that is not good!
Optimization
Org-social.el implements a three-layer approach:
Layer 1: Concurrent Queue Processing
A process queue is a data structure that manages tasks to be executed. In Org Social, each feed to download is added to the queue as a pending task. The system then processes these tasks concurrently (multiple at the same time) using a worker pool—a limited number of threads that execute downloads in parallel.
This smart queue system manages parallel downloads without overwhelming system resources.
flowchart LR
A[Feed Queue] --> B[Worker Pool Max 20 concurrent]
B --> C[Worker 1 Feed A]
B --> D[Worker 2 Feed B]
B --> E[Worker 3 Feed C]
B --> F[...]
B --> G[Worker 20 Feed T]
C --> H{Done?}
D --> H
E --> H
G --> H
H -->|Yes| I[Process Next Pending Feed]
H -->|Error| J[Mark Failed, Continue]
I --> B
J --> I
style B fill:#e1f5ff,color:black
style H fill:#fff4e1,color:black
style I fill:#ccffcc,color:black
Key Features:
- Configurable concurrency: org-social-max-concurrent-downloads (default: 20)
- Non-blocking threads: Each download runs in a separate thread
- Automatic recovery: Failed downloads don't block the queue
- Smart scheduling: New downloads start immediately when slots free up
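In Emacs Lisp, the pattern looks roughly like this. It is a minimal sketch using the asynchronous url-retrieve (callbacks rather than real threads); the names are illustrative and not the actual org-social.el internals:

```elisp
;; -*- lexical-binding: t; -*-
(require 'cl-lib)
(require 'url)

(defvar my/feed-queue nil
  "URLs of feeds still waiting to be downloaded.")
(defvar my/active-downloads 0
  "Number of downloads currently in flight.")
(defvar my/max-concurrent-downloads 20
  "Upper bound on parallel downloads (cf. org-social-max-concurrent-downloads).")

(defun my/process-feed-queue ()
  "Start pending downloads while free worker slots remain."
  (while (and my/feed-queue
              (< my/active-downloads my/max-concurrent-downloads))
    (let ((url (pop my/feed-queue)))
      (cl-incf my/active-downloads)
      (url-retrieve
       url
       (lambda (status)
         ;; Success or failure, free the slot and keep the queue moving.
         (cl-decf my/active-downloads)
         (when (plist-get status :error)
           (message "Download failed: %s" url))
         (kill-buffer)
         (my/process-feed-queue))
       nil t))))
```

A failed feed only logs a message and frees its slot, so one bad URL never stalls the rest of the queue.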
Layer 2: HTTP Range-Based Partial Fetching
HTTP has a built-in feature that allows downloading only specific parts of a file using the Range header. When a client sends a request with Range: bytes=0-999, the server responds with just the first 1000 bytes of the file instead of the entire content. This capability is commonly used for video streaming and resumable downloads, but it can also be used to paginate files—downloading them in chunks rather than all at once.
Instead of downloading entire social.org files, Org Social uses HTTP Range requests to fetch only what's needed: the header section and recent posts.
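As a rough illustration with Emacs' built-in url library, a single byte range can be requested like this (a sketch; my/fetch-byte-range is a made-up name, not the package's API):

```elisp
(require 'url)

(defun my/fetch-byte-range (url start end)
  "Return the body of URL between byte positions START and END (inclusive)."
  (let ((url-request-extra-headers
         `(("Range" . ,(format "bytes=%d-%d" start end)))))
    (with-current-buffer (url-retrieve-synchronously url t)
      (goto-char (point-min))
      (re-search-forward "\r?\n\r?\n")   ; skip the HTTP response headers
      (prog1 (buffer-substring (point) (point-max))
        (kill-buffer)))))

;; For example, fetch only the first kilobyte (the header section) of a feed:
;; (my/fetch-byte-range "https://example.org/social.org" 0 999)
```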
This approach does not work with every provider. While most traditional web servers (Apache, Nginx, Caddy) support HTTP Range requests natively, some hosting platforms have limitations:
- Cloudflare CDN: Does not provide Content-Length or Content-Range headers, making it impossible to determine file size or download specific byte ranges. The system automatically falls back to downloading the complete file and filtering client-side.
- Codeberg.org: Implements aggressive rate limiting when multiple Range requests are made in quick succession. When HTTP 429 (Too Many Requests) is detected, the system falls back to a full download without filtering to avoid being blocked.
- GitHub Raw Content: Provides proper HTTP Range support and works optimally with partial downloads.
The system detects these limitations automatically and adapts its strategy to ensure 100% compatibility across all hosting platforms.
sequenceDiagram
participant C as Client
participant S as Server
participant F as social.org (27KB, 150 posts)
Note over C,F: Step 1: Find Header
C->>S: Range: bytes=0-999
S->>C: First 1KB (headers)
Note over C: Found "* Posts" at byte 800
Note over C,F: Step 2: Get File Size
C->>S: Range: bytes=0-0
S->>C: Content-Range: bytes 0-0/27656
Note over C: Total size: 27656 bytes
Note over C,F: Step 3: Fetch Recent Posts
C->>S: Range: bytes=26656-27655
S->>C: Last 1KB (recent posts)
C->>S: Range: bytes=25656-26655
S->>C: Previous 1KB
Note over C: Found post older than 14 days
Note over C,F: Result: Downloaded 3KB instead of 27KB
Algorithm:
- Header Discovery (bytes 0 → forwards)
  - Download the first 1KB chunk (bytes 0-999)
  - If * Posts is not found, download the next 1KB chunk (bytes 1000-1999)
  - Continue downloading subsequent chunks until * Posts is found
  - Typical header size: 500-1500 bytes
- Backward Post Fetching (end → backwards)
  - Start from the end of the file (most recent posts)
  - Download 1KB chunks moving backwards
  - Parse each post's :ID: property (e.g., :ID: 2025-10-24T10:00:00+0200)
  - Stop when reaching posts older than org-social-max-post-age-days (default: 14 days)
- Date Filtering (a minimal sketch follows this list)
  - Parse post IDs (RFC 3339 timestamps)
  - Keep only posts ≥ start date
  - Discard older posts without downloading
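The date check behind the filtering step is compact. Here is a minimal sketch, assuming post IDs are RFC 3339 timestamps as shown above (my/post-recent-p is an illustrative name, and org-social-max-post-age-days would supply the age limit):

```elisp
(require 'iso8601)
(require 'time-date)

(defun my/post-recent-p (post-id max-age-days)
  "Return non-nil if POST-ID, an RFC 3339 timestamp, is at most MAX-AGE-DAYS old."
  (let ((post-time (encode-time (iso8601-parse post-id)))
        (cutoff (time-subtract (current-time) (days-to-time max-age-days))))
    (time-less-p cutoff post-time)))

;; (my/post-recent-p "2025-10-24T10:00:00+0200" 14) ;; => t while the post is recent
```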
Trick for Range Support Detection
The system sends a test request with the Range: bytes=0-0 header to check if the server responds with Content-Range or Accept-Ranges: bytes headers, indicating Range support.
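A sketch of that probe with the built-in url library (illustrative names, not the package's actual function):

```elisp
(require 'url)

(defun my/server-supports-range-p (url)
  "Return non-nil if the server behind URL appears to honour HTTP Range requests."
  (let ((url-request-extra-headers '(("Range" . "bytes=0-0"))))
    (with-current-buffer (url-retrieve-synchronously url t)
      (goto-char (point-min))
      (prog1 (and (re-search-forward
                   "^\\(Content-Range:\\|Accept-Ranges: bytes\\)" nil t)
                  t)
        (kill-buffer)))))
```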
Edge Cases and Fallbacks: Compressed Content
Servers using gzip compression (e.g., Caddy) report compressed sizes:
Problem: HEAD request returns compressed size, but content arrives uncompressed
Solution: Use a Range request (bytes=0-0) for size detection instead. The server responds with Content-Range: bytes 0-0/TOTAL where TOTAL is the actual uncompressed file size.
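Building on the same bytes=0-0 probe, the uncompressed total can be read from the Content-Range header. Again a sketch under the assumptions above, not the package's actual code:

```elisp
(require 'url)

(defun my/remote-file-size (url)
  "Return the uncompressed size of URL in bytes, or nil if it cannot be determined."
  (let ((url-request-extra-headers '(("Range" . "bytes=0-0"))))
    (with-current-buffer (url-retrieve-synchronously url t)
      (goto-char (point-min))
      (prog1 (when (re-search-forward
                    "^Content-Range: bytes 0-0/\\([0-9]+\\)" nil t)
               (string-to-number (match-string 1)))
        (kill-buffer)))))
```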
Layer 3: UI Pagination
Even after downloading only recent posts, rendering all of them at once would overwhelm Emacs. The Org Social UI uses widgets to create an interactive interface with buttons, images, and formatted text. Each widget consumes memory and processing power.
To keep the interface responsive, the system implements pagination that displays only 10 posts per page. This means:
- When a user opens the timeline, only the first 10 most recent posts are rendered
- Images, avatars, and interactive widgets are created only for these 10 visible posts
- The remaining downloaded posts stay in memory but aren't rendered
- Users can navigate to the next page, which then renders the next 10 posts
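The slicing behind this is simple. A minimal sketch (the names are illustrative; the real rendering code deals with widgets and images, which this ignores):

```elisp
(require 'seq)

(defvar my/posts-per-page 10
  "How many posts to render per timeline page.")

(defun my/posts-for-page (posts page)
  "Return the slice of POSTS that should be rendered for zero-based PAGE."
  (seq-take (seq-drop posts (* page my/posts-per-page))
            my/posts-per-page))

;; Page 0 renders posts 1-10, page 1 renders posts 11-20, and so on;
;; everything else stays in memory, unrendered.
```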
flowchart LR
A[Downloaded Posts: 50] --> B[Page 1: Render 10 posts]
A --> C[Page 2: 10 posts in memory]
A --> D[Page 3: 10 posts in memory]
A --> E[Page 4: 10 posts in memory]
A --> F[Page 5: 10 posts in memory]
B --> G[User sees: 10 posts with widgets & images]
C -.->|User clicks Next| H[Render next 10 posts]
style A fill:#e1f5ff,color:black
style B fill:#ccffcc,color:black
style G fill:#d5ffe1,color:black
style H fill:#ffe1cc,color:black
Emacs widgets and image rendering are resource-intensive. Rendering 50 posts with avatars and buttons could slow down the editor. By rendering only 10 at a time, the UI stays fast and responsive regardless of how many posts were downloaded.
This is the final optimization layer: even if your social.org has 10,000 posts, and you download 50 recent ones, you only render 10 on screen. The rest wait in memory until needed.
Performance Benchmarks
The following table shows how the system scales with different feed sizes, assuming an average post size of 250 bytes and a 14-day filter (capturing approximately 20-30 recent posts):
| Scenario | Total Posts | Full File Size | With Partial Fetch (14 days) | Description |
|---|---|---|---|---|
| Empty feed | 0 | ~1 KB | ~1 KB | Only headers downloaded |
| New user | 1 | ~1.5 KB | ~1.5 KB | Single post, no optimization needed |
| Light user | 10 | ~3.5 KB | ~3.5 KB | All posts fit in 14-day window |
| Regular user | 100 | ~26 KB | ~8 KB | Headers + ~30 recent posts |
| Active user | 1,000 | ~250 KB | ~8 KB | Headers + ~30 recent posts |
| Power user | 10,000 | ~2.5 MB | ~8 KB | Headers + ~30 recent posts |
Key insight: Once a feed exceeds ~100 posts, partial fetching maintains a consistent download size of ~8 KB (roughly 1 KB of headers plus 25-30 posts at ~250 bytes each) regardless of total feed size. A feed with 10,000 posts downloads the same amount of data as one with 1,000 posts.
Tuning Recommendations
Users can customize two main parameters:
- org-social-max-concurrent-downloads: Maximum parallel downloads (default: 20)
- org-social-max-post-age-days: Maximum age of posts to fetch, in days (default: 14)
So...
- Fast connection + many feeds: Increase to 30 concurrent
- Slow connection: Decrease to 10 concurrent
- Large feeds + limited bandwidth: Decrease max-post-age-days to 7
- Small feeds: Set max-post-age-days to nil (no optimization needed)
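For example, someone on a fast connection who follows many large feeds but wants to limit bandwidth might put something like this in their init file:

```elisp
(setq org-social-max-concurrent-downloads 30  ; fast connection, many feeds
      org-social-max-post-age-days 7)         ; fetch only the last week of posts
```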
Conclusion
The three-layer optimization approach (concurrent queue processing, HTTP Range-based partial fetching, and UI pagination) delivers significant bandwidth savings on large feeds through date filtering and keeps the interface non-blocking. This architecture positions Org Social to scale efficiently as both individual feeds and follower counts grow, while maintaining the simplicity and decentralization that make Org Social unique.
Your social.org can have millions of lines because:
- Only recent posts are downloaded (14 days by default)
- Downloads happen in parallel without blocking
- Only 10 posts are rendered on screen at once
Enjoy it!
Technical References
- HTTP Range Requests: RFC 7233
- RFC 3339 Timestamps: RFC 3339
- Org Social Specification: github.com/tanrax/org-social
This work is under an Attribution-NonCommercial-NoDerivatives 4.0 International license.
Will you buy me a coffee?
You can use the terminal.
ssh customer@andros.dev -p 5555