Pratyush Mishra
The problem with every watch-party app ever made

You open Teleparty. Your friend opens Teleparty. You both navigate to the same Netflix URL. You count to three. Someone's internet hiccups. Now you're 4 seconds ahead and the joke lands for only one of you.

The fundamental issue isn't synchronization. It's architecture. Every mainstream watch-party tool works by syncing a cursor position on top of a third-party stream. You're both still pulling separate streams from Netflix's CDN, hoping latency is kind, and papering over the cracks with a shared play/pause event.

SameRow approaches this differently. Instead of syncing a cursor on someone else's platform, it syncs playback state between two self-hosted Jellyfin instances. Each user streams true 4K from their own server, to their own screen. The only thing traveling over the network is a lightweight state signal — play, pause, seek, timestamp. No screen capture. No transcoding penalty. No DRM fights.

This is the technical breakdown of how it works.


The Architecture at a Glance

Before diving into individual components, here's what the full system looks like:

User A                          Signaling Server              User B
┌─────────────────┐            ┌──────────────────┐          ┌─────────────────┐
│  Jellyfin       │            │  Room State      │          │  Jellyfin       │
│  Instance (4K)  │            │  WebSocket Hub   │          │  Instance (4K)  │
│                 │◄──sync─────│                  │─────sync►│                 │
│  SameRow Client │            │  Clock Sync      │          │  SameRow Client │
│  (WebRTC)       │◄──p2p─────────────────────────────p2p───►│  (WebRTC)       │
└─────────────────┘            └──────────────────┘          └─────────────────┘
        │                                                              │
        │                                                              │
Cloudflare Tunnel                                            Cloudflare Tunnel
(CGNAT bypass)                                               (CGNAT bypass)

Three layers working simultaneously:

  • Signaling layer — a lightweight server managing room state and clock synchronization
  • P2P layer — WebRTC direct connection for video calling and screen sharing
  • Media layer — each client's local Jellyfin instance, playing content independently but in sync

The CGNAT Problem and Why It Matters

Most home internet connections in India — and increasingly everywhere — use Carrier-Grade NAT. Your ISP assigns you a private IP shared with hundreds of other subscribers. Port forwarding is impossible. Your Jellyfin server is invisible to the public internet.

The standard advice is "just buy a VPS and reverse proxy it." That works, but it routes all your 4K media traffic through a server you're paying for by the gigabyte. Expensive and unnecessarily slow.

SameRow uses a split-tunneling approach instead:

Cloudflare Tunnels for Jellyfin access:

# Install cloudflared
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o cloudflared
chmod +x cloudflared

# Authenticate and create tunnel
./cloudflared tunnel login
./cloudflared tunnel create samerow-jellyfin

# Point DNS at the tunnel
./cloudflared tunnel route dns samerow-jellyfin jellyfin.yourdomain.com

# Configure the tunnel
cat > ~/.cloudflared/config.yml << EOF
tunnel: <YOUR_TUNNEL_ID>
credentials-file: /root/.cloudflared/<YOUR_TUNNEL_ID>.json

ingress:
  - hostname: jellyfin.yourdomain.com
    service: http://localhost:8096
  - service: http_status:404
EOF

# Run the tunnel (or use `cloudflared service install` to run it as a service)
./cloudflared tunnel run samerow-jellyfin

This gives each user a stable public HTTPS endpoint for their Jellyfin instance with zero open ports and zero VPS costs. The media streams directly from their machine to their own browser. Only the Jellyfin API calls — the lightweight state signals — travel through the tunnel.

Tailscale for the signaling server:

# Install Tailscale
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

# Signaling server is now accessible at the Tailscale IP
# No public exposure needed for the coordination layer

The Signaling Server

The signaling server has one job: coordinate room state between clients. It does not touch media. It does not proxy streams. It is intentionally thin.

Built with Node.js and Socket.io:

// server/index.js
const express = require('express')
const { createServer } = require('http')
const { Server } = require('socket.io')

const app = express()
const httpServer = createServer(app)
const io = new Server(httpServer, {
  cors: { origin: '*' }
})

// Room state store
const rooms = new Map()

io.on('connection', (socket) => {
  console.log(`Client connected: ${socket.id}`)

  // Room creation
  socket.on('create-room', ({ roomId, jellyfinUrl }) => {
    rooms.set(roomId, {
      host: socket.id,
      jellyfinUrl,
      playbackState: {
        isPlaying: false,
        currentTime: 0,
        itemId: null,
        lastUpdated: Date.now()
      },
      clients: new Set([socket.id])
    })
    socket.join(roomId)
    socket.emit('room-created', { roomId })
  })

  // Room joining
  socket.on('join-room', ({ roomId }) => {
    const room = rooms.get(roomId)
    if (!room) return socket.emit('error', { message: 'Room not found' })

    room.clients.add(socket.id)
    socket.join(roomId)

    // Send current state to joining client
    socket.emit('room-state', room.playbackState)
    socket.to(roomId).emit('peer-joined', { peerId: socket.id })
  })

  // Playback state sync
  socket.on('playback-update', ({ roomId, state }) => {
    const room = rooms.get(roomId)
    if (!room) return

    // Only host can update state
    // (prevents feedback loops from multiple simultaneous updates)
    if (socket.id !== room.host) return

    room.playbackState = { ...state, lastUpdated: Date.now() }
    socket.to(roomId).emit('playback-sync', room.playbackState)
  })

  socket.on('disconnect', () => {
    rooms.forEach((room, roomId) => {
      room.clients.delete(socket.id)
      if (room.clients.size === 0) rooms.delete(roomId)
    })
  })
})

httpServer.listen(3001)

The host-only write pattern in the playback-update handler is important. It's what prevents the feedback loop problem — when multiple clients can all emit state updates, you get an infinite ping-pong of play/pause events. One source of truth, broadcast to everyone else.
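The client code in the sections below also assumes the server relays clock pings and WebRTC signaling messages. Those handlers aren't shown above; here's a minimal sketch of what they might look like (the handler shapes are my assumption, matched to the events the clients emit):

```javascript
// Clock sync: answer a ping with server receive/send timestamps.
// Extracted as a pure function; `now` is injectable for testing.
function handleClockPing({ t1 }, now = Date.now) {
  const t2 = now() // server receive time
  const t3 = now() // server send time (same tick here; a busy server may differ)
  return { t1, t2, t3 } // payload for the 'clock-pong' reply
}

// Wires the handler into Socket.io and relays WebRTC signaling
// events to the addressed peer. Would live inside io.on('connection').
function registerRelays(io, socket) {
  socket.on('clock-ping', (msg) => socket.emit('clock-pong', handleClockPing(msg)))

  for (const event of ['webrtc-offer', 'webrtc-answer', 'ice-candidate']) {
    socket.on(event, (payload) => {
      // Forward to the target peer, rewriting peerId to the sender's id
      // so the recipient knows who to answer
      io.to(payload.peerId).emit(event, { ...payload, peerId: socket.id })
    })
  }
}
```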


WebRTC: Video Calls and Screen Sharing

The media synchronization and the video calling are separate concerns in SameRow. WebRTC handles the human layer — seeing your friend's face, sharing your screen — while Jellyfin handles the content layer.

// client/webrtc.js
class SameRowPeer {
  constructor(socket, roomId) {
    this.socket = socket
    this.roomId = roomId
    this.peers = new Map()
  }

  async initializeMedia() {
    // Get camera and microphone
    this.localStream = await navigator.mediaDevices.getUserMedia({
      video: { width: 1280, height: 720 },
      audio: true
    })
    return this.localStream
  }

  async startScreenShare() {
    // Capture display — this is traditional screen sharing
    // but in SameRow it's used for the UI overlay, not the media
    this.screenStream = await navigator.mediaDevices.getDisplayMedia({
      video: { frameRate: 30 },
      audio: true
    })
    return this.screenStream
  }

  async createPeerConnection(peerId) {
    const pc = new RTCPeerConnection({
      iceServers: [
        { urls: 'stun:stun.l.google.com:19302' },
        // Add TURN server here for production
      ]
    })

    // Add local tracks
    this.localStream.getTracks().forEach(track => {
      pc.addTrack(track, this.localStream)
    })

    // ICE candidate handling
    pc.onicecandidate = ({ candidate }) => {
      if (candidate) {
        this.socket.emit('ice-candidate', {
          roomId: this.roomId,
          peerId,
          candidate
        })
      }
    }

    // Handle incoming tracks
    pc.ontrack = ({ streams }) => {
      const remoteVideo = document.getElementById('remote-video')
      remoteVideo.srcObject = streams[0]
    }

    this.peers.set(peerId, pc)
    return pc
  }

  async makeOffer(peerId) {
    const pc = await this.createPeerConnection(peerId)
    const offer = await pc.createOffer()
    await pc.setLocalDescription(offer)

    this.socket.emit('webrtc-offer', {
      roomId: this.roomId,
      peerId,
      offer
    })
  }

  async handleOffer(peerId, offer) {
    const pc = await this.createPeerConnection(peerId)
    await pc.setRemoteDescription(offer)

    const answer = await pc.createAnswer()
    await pc.setLocalDescription(answer)

    this.socket.emit('webrtc-answer', {
      roomId: this.roomId,
      peerId,
      answer
    })
  }
}

The Jellyfin Integration

This is where SameRow diverges from every other watch-party implementation.

Instead of capturing and re-encoding your screen, SameRow reads playback state from Jellyfin's API and replicates it on the other client's Jellyfin instance. Both users are playing the same file from their own library. The streams never leave their respective machines.

// client/jellyfin.js
class JellyfinSync {
  constructor(serverUrl, apiKey) {
    this.serverUrl = serverUrl
    this.apiKey = apiKey
    this.headers = {
      'X-Emby-Token': apiKey,
      'Content-Type': 'application/json'
    }
  }

  // Poll current playback state
  async getPlaybackState(sessionId) {
    const response = await fetch(
      `${this.serverUrl}/Sessions?api_key=${this.apiKey}`
    )
    const sessions = await response.json()
    const session = sessions.find(s => s.Id === sessionId)

    if (!session?.NowPlayingItem) return null

    return {
      itemId: session.NowPlayingItem.Id,
      currentTime: session.PlayState.PositionTicks / 10000000, // Convert ticks to seconds
      isPlaying: !session.PlayState.IsPaused,
      mediaTitle: session.NowPlayingItem.Name
    }
  }

  // Apply playback state to local Jellyfin instance
  async applyPlaybackState(sessionId, state) {
    const positionTicks = Math.floor(state.currentTime * 10000000)

    // Seek to position — Jellyfin's API takes the target as a query parameter
    await fetch(
      `${this.serverUrl}/Sessions/${sessionId}/Playing/Seek?seekPositionTicks=${positionTicks}`,
      { method: 'POST', headers: this.headers }
    )

    // Play or pause
    const command = state.isPlaying ? 'Unpause' : 'Pause'
    await fetch(
      `${this.serverUrl}/Sessions/${sessionId}/Playing/${command}`,
      { method: 'POST', headers: this.headers }
    )
  }

  // Start polling for state changes (host only)
  startPolling(sessionId, onStateChange, interval = 1000) {
    this.pollingInterval = setInterval(async () => {
      const state = await this.getPlaybackState(sessionId)
      if (state) onStateChange(state)
    }, interval)
  }

  stopPolling() {
    clearInterval(this.pollingInterval)
  }
}

The polling approach is used here rather than webhooks for simplicity. Jellyfin does support a webhooks plugin for real-time push events — that's the production-grade version — but for an MVP, 1-second polling introduces acceptable latency and is far easier to implement and debug.
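On the host side, the poller has to feed the signaling channel without spamming it, since currentTime advances on every poll even when nothing interesting happened. A sketch of that glue, with a hypothetical shouldEmit helper and a jump threshold I picked for illustration:

```javascript
// Decide whether a freshly polled state differs enough from the last
// emitted one to be worth broadcasting. With 1-second polling, playback
// normally advances ~1s per poll; a bigger jump means the user seeked.
function shouldEmit(last, state, jumpThreshold = 1.5) {
  if (!last) return true                                // first poll
  if (last.isPlaying !== state.isPlaying) return true   // play/pause toggled
  if (last.itemId !== state.itemId) return true         // different media
  const expectedAdvance = state.isPlaying ? 1.0 : 0     // seconds per poll
  return Math.abs(state.currentTime - (last.currentTime + expectedAdvance)) > jumpThreshold
}

// Usage sketch on the host (socket, roomId, sessionId assumed from earlier setup):
// let last = null
// jellyfin.startPolling(sessionId, (state) => {
//   if (shouldEmit(last, state)) socket.emit('playback-update', { roomId, state })
//   last = state
// })
```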


The Drift Compensation System

This is the most technically interesting part of SameRow and the problem most WebRTC tutorials skip entirely.

When two clients receive a "seek to timestamp X" command, they don't execute it at exactly the same moment. Network latency means Client B receives the command some milliseconds after Client A. Over time, these small offsets compound into visible desync.

SameRow handles this with a three-tier system:

Tier 1: NTP-Style Clock Offset Calculation

On session start, both clients calculate the true network offset between them:

// client/clockSync.js
class ClockSync {
  constructor(socket) {
    this.socket = socket
    this.offset = 0
    this.rtt = 0
  }

  async calculateOffset() {
    return new Promise((resolve) => {
      const t1 = Date.now()

      this.socket.emit('clock-ping', { t1 })

      this.socket.once('clock-pong', ({ t1, t2, t3 }) => {
        const t4 = Date.now()

        // NTP offset formula
        this.rtt = (t4 - t1) - (t3 - t2)
        this.offset = ((t2 - t1) + (t3 - t4)) / 2

        console.log(`Clock offset: ${this.offset}ms, RTT: ${this.rtt}ms`)
        resolve(this.offset)
      })
    })
  }

  // Schedule playback to start at a future agreed timestamp
  // Both clients receive the same startAt value (peer-clock time)
  // Subtracting the measured offset converts it to this client's clock
  schedulePlayback(startAt) {
    const localStartAt = startAt - this.offset
    const delay = localStartAt - Date.now()

    if (delay > 0) {
      setTimeout(() => this.triggerPlayback(), delay)
    } else {
      this.triggerPlayback()
    }
  }
}
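To sanity-check the formula, here's a worked example with made-up timestamps: a true offset of 100 ms and a symmetric 25 ms one-way delay.

```javascript
// NTP offset math, extracted as a pure function for a worked example
function ntpOffset(t1, t2, t3, t4) {
  return {
    rtt: (t4 - t1) - (t3 - t2),          // total time spent on the network
    offset: ((t2 - t1) + (t3 - t4)) / 2  // peer clock minus local clock
  }
}

// Client sends at t1=1000; the peer's clock runs 100ms ahead and each
// direction takes 25ms, so the peer receives at t2=1125 and replies at
// t3=1130; the client hears back at t4=1055.
ntpOffset(1000, 1125, 1130, 1055)
// → { rtt: 50, offset: 100 }
```

The two one-way delays cancel in the offset term, which is why the estimate stays accurate as long as the path is roughly symmetric.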

Tier 2: Gradual Drift Correction (The Silent Fix)

For small ongoing drift under 2 seconds, SameRow adjusts playback rate rather than seeking. This is the same technique streaming platforms use — imperceptibly playing at 1.05x or 0.95x until the clients converge:

// client/driftCompensation.js
class DriftCompensation {
  constructor(jellyfinClient) {
    this.jellyfin = jellyfinClient
    this.checkInterval = null
  }

  start(sessionId, getExpectedTime) {
    this.checkInterval = setInterval(async () => {
      const state = await this.jellyfin.getPlaybackState(sessionId)
      if (!state || !state.isPlaying) return

      const expectedTime = getExpectedTime()
      const drift = state.currentTime - expectedTime

      await this.compensate(sessionId, drift)
    }, 1000)
  }

  async compensate(sessionId, drift) {
    const absDrift = Math.abs(drift)

    if (absDrift < 0.1) {
      // Under 100ms — within acceptable tolerance; make sure any
      // earlier rate adjustment is undone
      await this.jellyfin.setPlaybackRate(sessionId, 1.0)
      return
    }

    if (absDrift >= 0.1 && absDrift < 0.5) {
      // 100ms to 500ms — silent rate adjustment
      // User never notices a 5% speed change
      const rate = drift > 0 ? 0.95 : 1.05
      await this.jellyfin.setPlaybackRate(sessionId, rate)

    } else if (absDrift >= 0.5 && absDrift < 2.0) {
      // 500ms to 2s — more aggressive rate adjustment
      const rate = drift > 0 ? 0.90 : 1.10
      await this.jellyfin.setPlaybackRate(sessionId, rate)

    } else {
      // Over 2s — hard resync, pause both clients
      await this.hardResync(sessionId)
    }
  }

  async hardResync(sessionId) {
    // Pause, seek to correct position, resume
    // This is the last resort — visible to user but necessary
    console.log('Drift exceeded 2s threshold — executing hard resync')
    // Implementation: emit resync event to signaling server
    // Server broadcasts pause + seek + resume to all clients
  }

  stop() {
    clearInterval(this.checkInterval)
  }
}
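One gap worth flagging: DriftCompensation calls this.jellyfin.setPlaybackRate(), which the JellyfinSync class above doesn't define, and I'm not aware of a session-level rate command in Jellyfin's remote-control API. One option is to nudge the player's HTML5 video element directly. A sketch, clamping to a range that stays subtle:

```javascript
// Hypothetical stand-in for setPlaybackRate: adjust the <video> element
// directly (playbackRate is a standard HTMLMediaElement property) and
// clamp so corrections never become audible speed changes.
function setPlaybackRate(videoEl, rate) {
  const clamped = Math.min(1.1, Math.max(0.9, rate))
  videoEl.playbackRate = clamped
  return clamped
}
```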

Tier 3: Hard Resync

Only triggered when drift exceeds 2 seconds — network congestion, a client that was backgrounded, a machine that went to sleep. At this point invisible correction isn't possible and both clients pause, seek to the correct timestamp, and resume together.
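The resync message itself can stay small. A sketch of what the host might broadcast (the shape is my assumption; startAt is meant to feed ClockSync.schedulePlayback so both clients resume on the same tick):

```javascript
// Build the payload for a hard resync: both clients pause, seek to
// seekTo, then resume together at the shared future start time.
function buildResyncCommand(hostState, leadMs = 1500) {
  return {
    itemId: hostState.itemId,
    seekTo: hostState.currentTime,  // seconds into the media
    startAt: Date.now() + leadMs    // host-clock timestamp; peers convert via their offset
  }
}
```

The lead time gives slow clients room to complete the seek before playback resumes; 1.5 seconds is a guess that would need tuning against real seek latencies.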


Docker Deployment

The entire stack ships as a single docker-compose.yml. Any user with Docker installed can run SameRow with their own Jellyfin instance in under five minutes:

# docker-compose.yml
version: '3.8'

services:
  signaling:
    build: ./signaling
    ports:
      - "3001:3001"
    environment:
      - NODE_ENV=production
    restart: unless-stopped

  client:
    build: ./client
    ports:
      - "3000:3000"
    environment:
      - NEXT_PUBLIC_SIGNALING_URL=http://localhost:3001
      - JELLYFIN_URL=${JELLYFIN_URL}
      - JELLYFIN_API_KEY=${JELLYFIN_API_KEY}
    depends_on:
      - signaling
    restart: unless-stopped

networks:
  default:
    driver: bridge
# signaling/Dockerfile
FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

COPY . .
EXPOSE 3001

CMD ["node", "index.js"]

The Jellyfin instances are not containerized here — they're external. Users bring their own. The JELLYFIN_URL and JELLYFIN_API_KEY are runtime environment variables, meaning SameRow works with any Jellyfin instance anywhere, including behind a Cloudflare Tunnel.

# One command deployment
JELLYFIN_URL=https://jellyfin.yourdomain.com \
JELLYFIN_API_KEY=your_api_key_here \
docker-compose up -d

Key Features Summary

| Feature | Implementation | Why it matters |
| --- | --- | --- |
| Synchronized playback | Jellyfin API polling + signaling server | True 4K, no quality loss |
| CGNAT bypass | Cloudflare Tunnels | Works on any home connection |
| Drift compensation | Three-tier rate adjustment | No jarring pause-and-resync |
| Video calling | WebRTC P2P | See your friend while watching |
| Screen sharing | getDisplayMedia() | Share UI context, not media |
| Room management | Socket.io + host authority model | Prevents feedback loops |
| Portable deployment | Single Docker Compose file | Anyone can self-host it |

What's Next

The current implementation uses polling to read Jellyfin state. The production upgrade is Jellyfin's webhooks plugin — real-time push events instead of 1-second polls, dropping the baseline latency from ~1000ms to near-zero.

The TURN server situation also needs addressing for production. STUN works when both clients have relatively open NATs. Behind stricter firewalls — corporate networks, some mobile connections — WebRTC P2P fails and you need a TURN relay. Coturn is the standard self-hosted option and slots into the Docker Compose setup cleanly.
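For reference, a coturn service could slot into the compose file roughly like this (the image name is the official coturn/coturn; the flags are a sketch to verify against coturn's documentation):

```yaml
  # Assumed addition to docker-compose.yml — check flags against coturn docs
  turn:
    image: coturn/coturn
    network_mode: host            # TURN relays need a wide range of UDP ports
    command: >-
      -n --log-file=stdout
      --lt-cred-mech
      --realm=yourdomain.com
      --user=samerow:change_this_password
    restart: unless-stopped
```

The matching client-side change is adding a turn: entry with those credentials to the iceServers array in createPeerConnection.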

The GitHub repository with full source, deployment documentation, and architecture diagrams is at: github.com/devpratyushh/samerow


SameRow is open source. If you run a Jellyfin instance and want synchronized watch parties without giving up 4K quality, this is the setup.
