Content freshness is critical for a platform that shows trending videos. If your data is 24 hours old, it's not "trending" anymore. Here's how I built a cron-based content pipeline for DailyWatch that keeps content fresh across 8 regions while staying within API quotas.
The Pipeline Architecture
The entire fetch process runs as a single PHP script called by cron:
20 */2 * * * php /var/www/html/cron/fetch_videos.php >> /var/log/fetch.log 2>&1
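One detail worth checking here: cron starts jobs with a minimal environment, so a variable such as FETCH_REGIONS (which the script reads with getenv()) is not inherited from a login shell. A sketch of one way to supply it, assuming a cron implementation that accepts variable assignments in the crontab (Vixie cron and cronie do). The region list is the one that appears in the sample log later in this post:

```
# Assumed crontab layout: the variable applies to every job listed below it
FETCH_REGIONS=US,GB,DE,FR,IN,BR,AU,CA
20 */2 * * * php /var/www/html/cron/fetch_videos.php >> /var/log/fetch.log 2>&1
```

Without the assignment, the script falls back to its built-in default of US,GB.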
The script runs 8 sequential steps:
// fetch_videos.php - Main content pipeline
$startTime = microtime(true);
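$db = Database::get();
// Assumption: the post never shows where $apiKey is defined; reading it from an
// environment variable named YOUTUBE_API_KEY is illustrative only.
$apiKey = getenv('YOUTUBE_API_KEY');
if (!$apiKey) {
    exit("Missing API key\n");
}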
echo "[" . date('Y-m-d H:i:s') . "] Starting fetch pipeline\n";
// Step 1: Fetch popular/trending videos (global)
echo "STEP 1: Popular videos\n";
$popular = fetchPopularVideos($db, $apiKey);
echo " Fetched {$popular} popular videos\n";
// Step 2: Update category list
echo "STEP 2: Categories\n";
$categories = fetchCategories($db, $apiKey);
echo " Updated {$categories} categories\n";
// Step 3: Multi-region fetch
echo "STEP 3: Regional trending\n";
// Trim each code once and drop empty entries so the log shows clean region names
$regions = array_filter(array_map('trim', explode(',', getenv('FETCH_REGIONS') ?: 'US,GB')));
foreach ($regions as $region) {
    $count = fetchRegionTrending($db, $apiKey, $region);
    echo " {$region}: {$count} videos\n";
    usleep(300000); // 300ms between regions
}
// Step 4: Refresh stale video data
echo "STEP 4: Stale refresh\n";
$refreshed = refreshStaleVideos($db, $apiKey, maxAge: 86400);
echo " Refreshed {$refreshed} stale videos\n";
// Step 5: Cleanup old content
echo "STEP 5: Cleanup\n";
$cleaned = cleanupOldVideos($db, maxAge: 604800); // 7 days
echo " Removed {$cleaned} expired videos\n";
// Step 6: Rebuild search index
echo "STEP 6: FTS rebuild\n";
rebuildSearchIndex($db);
echo " Search index updated\n";
// Step 7: Clear page cache
echo "STEP 7: Cache clear\n";
clearPageCache();
// Step 8: Submit new URLs to IndexNow
echo "STEP 8: IndexNow\n";
$submitted = submitToIndexNow($db);
echo " Submitted {$submitted} URLs\n";
$elapsed = round(microtime(true) - $startTime, 2);
echo "Pipeline complete in {$elapsed}s\n\n";
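The post does not show fetchPopularVideos(), but the sample log reports 200 popular videos while the request in Step 3 caps maxResults at 50, so Step 1 presumably pages through results. Here is a minimal sketch of that paging loop, under that assumption. collectPages() and the $fetchPage callable are hypothetical names; nextPageToken is the field the YouTube Data API returns for the next page, and each page costs one quota unit.

```php
<?php
/**
 * Collect items from a paginated endpoint, stopping after $maxPages pages.
 * $fetchPage is a hypothetical callable: given a page token (null for the
 * first page) it returns the decoded response array, or null on failure.
 */
function collectPages(callable $fetchPage, int $maxPages = 4): array
{
    $items = [];
    $token = null;
    for ($page = 0; $page < $maxPages; $page++) {
        $data = $fetchPage($token);
        if (!is_array($data) || !isset($data['items'])) {
            break; // network error or unexpected payload: keep what we have
        }
        $items = array_merge($items, $data['items']);
        $token = $data['nextPageToken'] ?? null;
        if ($token === null) {
            break; // last page reached
        }
    }
    return $items;
}

// Stand-in for the real HTTP call so the sketch runs offline
$fakePages = [
    ''   => ['items' => ['a', 'b'], 'nextPageToken' => 'p2'],
    'p2' => ['items' => ['c'], 'nextPageToken' => 'p3'],
    'p3' => ['items' => ['d']],
];
$fetch = fn(?string $token): ?array => $fakePages[$token ?? ''] ?? null;

$all = collectPages($fetch);
echo count($all) . " items\n"; // prints "4 items"
```

In the real pipeline the callable would wrap the same file_get_contents() call used in Step 3, passing the token as the pageToken query parameter.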
Step 3 in Detail: Regional Fetching
function fetchRegionTrending(PDO $db, string $apiKey, string $region): int {
    $url = 'https://www.googleapis.com/youtube/v3/videos?' . http_build_query([
        'part' => 'snippet,statistics,contentDetails',
        'chart' => 'mostPopular',
        'regionCode' => $region,
        'maxResults' => 50,
        'key' => $apiKey,
    ]);
    $response = @file_get_contents($url);
    if ($response === false) {
        echo " WARNING: Failed to fetch region {$region}\n";
        return 0;
    }
    $data = json_decode($response, true);
    if (!isset($data['items'])) return 0;
    // Prepare both statements once and reuse them for every item.
    // SQL string literals take single quotes; SQLite accepts "now" only via a legacy fallback.
    $upsertVideo = $db->prepare("
        INSERT INTO videos (video_id, title, description, channel_title,
            channel_id, category_id, thumbnail_url, published_at,
            duration, view_count)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        ON CONFLICT(video_id) DO UPDATE SET
            view_count = MAX(view_count, excluded.view_count),
            fetched_at = datetime('now')
    ");
    $tagRegion = $db->prepare("
        INSERT OR IGNORE INTO video_regions (video_id, region, fetched_at)
        VALUES (?, ?, datetime('now'))
    ");
    $count = 0;
    $db->beginTransaction();
    try {
        foreach ($data['items'] as $item) {
            $videoId = $item['id'];
            $snippet = $item['snippet'];
            // Upsert video
            $upsertVideo->execute([
                $videoId,
                $snippet['title'],
                mb_substr($snippet['description'] ?? '', 0, 500),
                $snippet['channelTitle'] ?? '',
                $snippet['channelId'] ?? '',
                (int)($snippet['categoryId'] ?? 0),
                $snippet['thumbnails']['medium']['url'] ?? '',
                $snippet['publishedAt'] ?? '',
                $item['contentDetails']['duration'] ?? '',
                (int)($item['statistics']['viewCount'] ?? 0),
            ]);
            // Tag region
            $tagRegion->execute([$videoId, $region]);
            $count++;
        }
        $db->commit();
    } catch (Throwable $e) {
        $db->rollBack(); // do not leave the transaction open on failure
        throw $e;
    }
    return $count;
}
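The upsert is what lets the same video appear in several regional feeds without creating duplicates or losing data. The following sketch shows that behavior on an in-memory SQLite database. The three-column schema is a reduced, assumed version of the real videos table, and it needs the pdo_sqlite extension with SQLite 3.24 or newer for ON CONFLICT support.

```php
<?php
// Sketch only: a cut-down videos table, not the real DailyWatch schema
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("
    CREATE TABLE videos (
        video_id   TEXT PRIMARY KEY,
        view_count INTEGER NOT NULL DEFAULT 0,
        fetched_at TEXT NOT NULL DEFAULT (datetime('now'))
    )
");

$upsert = $db->prepare("
    INSERT INTO videos (video_id, view_count)
    VALUES (?, ?)
    ON CONFLICT(video_id) DO UPDATE SET
        view_count = MAX(view_count, excluded.view_count),
        fetched_at = datetime('now')
");

$upsert->execute(['abc123', 1000]); // first sighting, e.g. in the US feed
$upsert->execute(['abc123', 900]);  // lower count from another fetch: ignored
$upsert->execute(['abc123', 1500]); // higher count: stored

$row = $db->query("SELECT COUNT(*) AS n, MAX(view_count) AS views FROM videos")
          ->fetch(PDO::FETCH_ASSOC);
echo "{$row['n']} row, {$row['views']} views\n"; // prints "1 row, 1500 views"
```

Taking MAX() of the stored and incoming counts means a slightly older response can never roll the number backwards.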
Step 4: Stale Content Refresh
function refreshStaleVideos(PDO $db, string $apiKey, int $maxAge): int {
    // Find videos with outdated data ($maxAge is typed int, so interpolating it is safe)
    $stale = $db->query("
        SELECT video_id FROM videos
        WHERE fetched_at < datetime('now', '-{$maxAge} seconds')
        ORDER BY view_count DESC
        LIMIT 50
    ")->fetchAll(PDO::FETCH_COLUMN);
    if (empty($stale)) return 0;
    // Batch fetch updated data (50 IDs per API call = 1 quota unit)
    $ids = implode(',', $stale);
    $url = 'https://www.googleapis.com/youtube/v3/videos?' . http_build_query([
        'part' => 'statistics',
        'id' => $ids,
        'key' => $apiKey,
    ]);
    $response = @file_get_contents($url);
    if ($response === false) return 0;
    $data = json_decode($response, true);
    // Prepare once, execute per item; SQL string literals take single quotes
    $update = $db->prepare("
        UPDATE videos SET
            view_count = ?,
            fetched_at = datetime('now')
        WHERE video_id = ?
    ");
    $updated = 0;
    // IDs missing from the response (e.g. deleted or private videos) are left untouched
    foreach ($data['items'] ?? [] as $item) {
        $update->execute([
            (int)($item['statistics']['viewCount'] ?? 0),
            $item['id'],
        ]);
        $updated++;
    }
    return $updated;
}
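The function above refreshes at most 50 videos per run, which fits in a single request. If the stale backlog ever needed a higher limit, the IDs would have to be split into groups of 50, since that is the most a single videos.list call accepts. Here is a sketch of that batching. buildRefreshUrls() is a hypothetical helper, not part of the original script.

```php
<?php
/**
 * Build one videos.list URL per batch of up to $batchSize IDs.
 * Hypothetical helper for illustration; each resulting request costs 1 quota unit.
 */
function buildRefreshUrls(array $videoIds, string $apiKey, int $batchSize = 50): array
{
    $urls = [];
    $unique = array_values(array_unique($videoIds)); // avoid paying twice for one ID
    foreach (array_chunk($unique, $batchSize) as $batch) {
        $urls[] = 'https://www.googleapis.com/youtube/v3/videos?' . http_build_query([
            'part' => 'statistics',
            'id'   => implode(',', $batch),
            'key'  => $apiKey,
        ]);
    }
    return $urls;
}

// 120 made-up IDs split into batches of 50, 50 and 20
$ids  = array_map(fn(int $i): string => "video{$i}", range(1, 120));
$urls = buildRefreshUrls($ids, 'DEMO_KEY');
echo count($urls) . " requests\n"; // prints "3 requests"
```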
Monitoring Output
A typical cron run at dailywatch.video produces:
[2026-02-28 20:20:01] Starting fetch pipeline
STEP 1: Popular videos
Fetched 200 popular videos
STEP 2: Categories
Updated 16 categories
STEP 3: Regional trending
US: 50 videos
GB: 50 videos
DE: 50 videos
FR: 50 videos
IN: 50 videos
BR: 50 videos
AU: 50 videos
CA: 50 videos
STEP 4: Stale refresh
Refreshed 48 stale videos
STEP 5: Cleanup
Removed 127 expired videos
STEP 6: FTS rebuild
Search index updated
STEP 7: Cache clear
STEP 8: IndexNow
Submitted 89 URLs
Pipeline complete in 28.34s
The entire pipeline completes in under 30 seconds and uses approximately 12 API quota units per run. At 12 runs per day (every 2 hours), that's 144 units, about 1.4% of the YouTube Data API's default 10,000-unit daily quota.
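The quota arithmetic is easy to re-run when the schedule or region list changes. A small sketch using the figures quoted above; dailyQuotaUnits() is a hypothetical helper name:

```php
<?php
/** Estimate daily quota use for a job that runs every $intervalHours hours. */
function dailyQuotaUnits(int $unitsPerRun, int $intervalHours): int
{
    $runsPerDay = intdiv(24, $intervalHours);
    return $unitsPerRun * $runsPerDay;
}

$used   = dailyQuotaUnits(12, 2); // 12 units per run, one run every 2 hours
$budget = 10000;                  // daily budget quoted in the post
printf("%d units/day (%.2f%% of budget)\n", $used, $used / $budget * 100);
// prints "144 units/day (1.44% of budget)"
```

Even an hourly schedule would double this to only 288 units, so the quota leaves plenty of room for more regions.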