Draft Date: 2026-03-03
Status: DRAFT, do not publish (save to file only)
Target Platform: Medium (The Startup or Towards Data Science)
Target Audience: Publishing industry analysts, trend researchers, developers interested in K-literature, translation agents
Actor URL: https://apify.com/oxygenated_quagmire/yes24-book-scraper
Introduction
Every week, millions of South Koreans decide what to read next — and most of them check YES24 first.
YES24 is South Korea's largest online bookstore. It's not just a retailer; it's a cultural barometer. The books that climb its bestseller charts reflect what a society of 52 million people is thinking about, worrying about, and curious about. Self-help surges before exam season. Political memoirs spike after elections. English-language study guides never leave the top 20.
For publishers, literary agents, translators, and data analysts, this creates an extraordinary opportunity: Korea's reading habits are publicly visible, updated in real-time, and organized into 13 clean categories — from literature to science to cooking.
There's just one problem: YES24 has no public API.
The YES24 Book Scraper on Apify solves this. It reads YES24's server-side-rendered HTML and embedded JSON-LD structured data, and extracts clean book data in three modes (bestseller charts, keyword search, and individual book details) without requiring a full browser automation stack.
Why YES24 Data is Valuable
Korea's Reading Economy in Numbers
YES24 commands roughly 40-45% of South Korea's online book retail market (2025 estimates). In a country where physical bookstores have declined sharply, YES24 is where the market reveals itself:
- 13 bestseller categories updated weekly (Literature, Economy/Business, Self-Help, Children's, Comics, Science, History, Arts, Religion, Foreign Language, Cooking, Travel, and more)
- 100,000+ new titles/year listed
- Customer reviews with star ratings on most titles
- Sales rank updates continuously for top-ranked books
Beyond commerce, YES24 data is a cultural intelligence layer:
- Which genres are growing vs. declining in Korea?
- What does Korea's business reading list say about economic sentiment?
- Which Korean titles might be candidates for international licensing?
- How does a new Korean author's book rank against established names in its genre?
The Translation and Publishing Opportunity
For international publishers and literary agents, YES24 bestseller data is particularly valuable:
Rights acquisition intelligence: Korean books that sustain Top 10 positions for 4+ weeks in Fiction or Non-Fiction are proven commodities. Many become candidates for translation rights deals — but identifying them early requires monitoring the charts consistently.
Genre trend arbitrage: Genres that dominate Korean bestseller lists often presage global trends. The Courage to Be Disliked, a Japanese title, topped Korean bestseller charts for much of 2015, years before its English edition reached Western readers. The data was always there.
K-content ecosystem alignment: As Korean films, dramas, and music achieve global reach, books by the same authors or on the same themes often lag 12-18 months in international attention. Early chart monitoring = early rights opportunities.
Why Standard Tools Fail
- No public API — YES24 doesn't expose sales rank data via API
- SSR rendering — Unlike SPAs, YES24 uses server-side rendering, making the HTML parseable — but the structure changes frequently enough that maintaining a custom scraper is brittle
- JSON-LD structured data — YES24 embeds rich structured data for SEO purposes, which the actor exploits for clean extraction
- Volume — Monitoring 13 categories weekly, with 50-100 books each, is too large for manual tracking
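To make the JSON-LD point concrete, here is a minimal sketch of pulling structured data out of server-rendered HTML with only the Python standard library. The sample page fragment is hypothetical and is not real YES24 markup, and this is not necessarily how the actor itself does it; it only illustrates the general technique.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def extract_jsonld(html):
    """Return every JSON-LD document found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return parser.documents


# Hypothetical page fragment, for illustration only (not real YES24 markup)
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Book", "name": "불편한 편의점", "isbn": "9791190932264"}
</script>
</head><body><h1>...</h1></body></html>
"""

books = [
    d for d in extract_jsonld(SAMPLE_HTML)
    if isinstance(d, dict) and d.get("@type") == "Book"
]
print(books[0]["name"], books[0]["isbn"])
```

Because the structured data is already JSON, this approach avoids CSS selectors for the core fields, which is one reason JSON-LD tends to be less brittle than scraping visible markup.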
The Solution: YES24 Book Scraper on Apify
The YES24 Book Scraper is a cloud-hosted Apify actor with three operational modes designed for different research needs.
Three Operating Modes
1. Bestseller Mode — Pull the current bestseller chart for any of 13 categories
{
  "mode": "bestseller",
  "category": "literature",
  "limit": 50
}
2. Search Mode — Search the YES24 catalog by keyword
{
  "mode": "search",
  "keyword": "인공지능",
  "limit": 100
}
3. Detail Mode — Extract full metadata for a specific book by URL
{
  "mode": "detail",
  "bookUrl": "https://www.yes24.com/Product/Goods/12345678"
}
What You Get
For each book, the actor extracts:
{
  "rank": 1,
  "title": "불편한 편의점",
  "author": "김호연",
  "publisher": "나무옆의자",
  "publishedDate": "2021-04-20",
  "isbn": "9791190932264",
  "price": 14000,
  "discountedPrice": 12600,
  "rating": 4.8,
  "reviewCount": 12847,
  "description": "따뜻한 위로와 공감의 이야기...",
  "category": "소설",
  "coverImageUrl": "https://image.yes24.com/goods/...",
  "bookUrl": "https://www.yes24.com/Product/Goods/...",
  "tags": ["소설", "한국소설", "베스트셀러"],
  "scraped_at": "2026-03-03T06:00:00+09:00"
}
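As a rough sketch of how a record like this might be loaded into typed Python, the snippet below parses a subset of the fields and derives a discount percentage. The `BookRecord` class and `parse_record` helper are hypothetical names introduced here for illustration; only the field names come from the sample output above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BookRecord:
    rank: int
    title: str
    isbn: str
    price: int             # list price in KRW
    discounted_price: int  # sale price in KRW
    rating: float
    review_count: int
    scraped_at: datetime

    @property
    def discount_pct(self):
        """Discount as a percentage of list price, rounded to one decimal."""
        if not self.price:
            return 0.0
        return round((self.price - self.discounted_price) / self.price * 100, 1)


def parse_record(raw):
    """Convert one raw output dict into a BookRecord (hypothetical helper)."""
    return BookRecord(
        rank=int(raw["rank"]),
        title=raw["title"],
        isbn=raw["isbn"],
        price=int(raw["price"]),
        discounted_price=int(raw["discountedPrice"]),
        rating=float(raw["rating"]),
        review_count=int(raw["reviewCount"]),
        # ISO 8601 with a +09:00 offset parses to a timezone-aware datetime
        scraped_at=datetime.fromisoformat(raw["scraped_at"]),
    )


record = parse_record({
    "rank": 1,
    "title": "불편한 편의점",
    "isbn": "9791190932264",
    "price": 14000,
    "discountedPrice": 12600,
    "rating": 4.8,
    "reviewCount": 12847,
    "scraped_at": "2026-03-03T06:00:00+09:00",
})
print(record.title, record.discount_pct)
```

Keeping `scraped_at` timezone-aware matters once snapshots from different days are combined into a time series.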
Available Categories (13)
| Category ID | Korean Name | Description |
|---|---|---|
| literature | 소설/시/희곡 | Fiction, Poetry, Drama |
| economy | 경제경영 | Business, Economics |
| self_help | 자기계발 | Self-Help, Personal Development |
| children | 어린이 | Children's Books |
| comics | 만화 | Comics, Manga |
| science | 과학 | Science, Technology |
| history | 역사 | History, Archaeology |
| arts | 예술 | Arts, Music, Film |
| religion | 종교 | Religion, Philosophy |
| foreign_lang | 외국어 | Language Learning |
| cooking | 가정/요리 | Cooking, Lifestyle |
| travel | 여행 | Travel, Geography |
| teens | 청소년 | Young Adult |
Step-by-Step: How to Use the YES24 Book Scraper
Step 1: Create an Apify Account
- Go to apify.com and sign up (free tier available)
- Free tier includes $5/month credit, enough for about 10,000 book records at the actor's rate of $0.50 per 1,000 items
- No credit card required to start
Step 2: Open the Actor
Navigate to:
👉 https://apify.com/oxygenated_quagmire/yes24-book-scraper
Click "Try for free" to open the input console.
[Screenshot: Actor page with "YES24 Book Scraper" title, 3-mode selector, category dropdown]
Step 3: Configure Your Query
For a weekly bestseller snapshot across all business books:
{
  "mode": "bestseller",
  "category": "economy",
  "limit": 100
}
For tracking a specific topic across the catalog:
{
  "mode": "search",
  "keyword": "ChatGPT 활용",
  "limit": 200
}
Key configuration fields:
| Field | Description | Default |
|---|---|---|
| mode | bestseller, search, or detail | Required |
| category | Category for bestseller mode | literature |
| keyword | Search term for search mode | Required in search mode |
| limit | Max number of books to return | 100 |
| bookUrl | Direct URL for detail mode | Required in detail mode |
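The rules in this table can be checked locally before starting a run. The sketch below is a hypothetical helper (`build_run_input` is not part of the actor or of apify-client); it assumes the field names, defaults, and category IDs listed in this article, and it is not known how the actor itself handles invalid input.

```python
VALID_MODES = {"bestseller", "search", "detail"}

# Category IDs as listed in the Available Categories table
CATEGORY_IDS = {
    "literature", "economy", "self_help", "children", "comics", "science",
    "history", "arts", "religion", "foreign_lang", "cooking", "travel", "teens",
}


def build_run_input(mode, category="literature", keyword=None,
                    book_url=None, limit=100):
    """Build an input dict for one run, raising ValueError on bad combinations."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if mode == "bestseller":
        if category not in CATEGORY_IDS:
            raise ValueError(f"unknown category: {category!r}")
        return {"mode": mode, "category": category, "limit": limit}
    if mode == "search":
        if not keyword:
            raise ValueError("keyword is required in search mode")
        return {"mode": mode, "keyword": keyword, "limit": limit}
    # Only detail mode remains at this point
    if not book_url:
        raise ValueError("bookUrl is required in detail mode")
    return {"mode": mode, "bookUrl": book_url}


print(build_run_input("bestseller", category="economy"))
print(build_run_input("search", keyword="ChatGPT 활용", limit=200))
```

Failing fast on a typo in the category ID is cheaper than discovering an empty dataset after the run finishes.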
[Screenshot: Input console showing mode selector set to "bestseller" and category set to "economy"]
Step 4: Run and Export
- Click "Start" — runs on Apify cloud infrastructure in Seoul region
- Monitor progress in the Live Log tab
- Typical runtime: ~30 seconds for 100 books
- Export as JSON, CSV, or XLSX from the Results tab
[Screenshot: Results table showing columns: rank, title, author, price, rating, reviewCount]
Step 5: Analyze with Python
import pandas as pd
# Load exported CSV
df = pd.read_csv('yes24_bestsellers.csv')
# Quick overview
print(f"Books scraped: {len(df)}")
print(f"\nTop 10 by rating:")
print(df.nlargest(10, 'rating')[['rank', 'title', 'author', 'rating', 'reviewCount']])
# Price analysis
print(f"\nAverage list price: ₩{df['price'].mean():,.0f}")
print(f"Average discount: {((df['price'] - df['discountedPrice']) / df['price'] * 100).mean():.1f}%")
# Most prolific publishers
print(f"\nTop publishers in bestseller chart:")
print(df['publisher'].value_counts().head(10))
Real-World Use Cases
Use Case 1: Weekly Publishing Intelligence Report
Scenario: A global literary agency wants to identify Korean fiction titles for international rights acquisition before they appear on mainstream radar.
Approach:
- Schedule weekly runs of the scraper in Bestseller mode across Literature and Self-Help
- Flag any book that:
- Has been in Top 20 for 3+ consecutive weeks
- Has 500+ reviews with a rating above 4.5
- Is not yet translated into English
- Pull full detail records for flagged titles
- Feed into a rights-inquiry workflow
Result: Early identification of titles like 불편한 편의점 (published in English as The Second Chance Convenience Store), months before an English translation was announced.
Data cost: ~$0.20/week for 400 books across 4 categories (at $0.50 per 1,000 items)
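The flagging rule in this use case can be sketched in plain Python. The example assumes weekly snapshots are kept as lists of record dicts, oldest first, and uses made-up ISBNs and numbers. Translation status is not part of the scraper output shown in this article, so that criterion still needs a manual check.

```python
def longest_streaks(snapshots, top_n=20):
    """Longest run of consecutive weekly charts each ISBN spent in the Top N.

    snapshots: list of weekly charts, oldest first; each chart is a list of dicts.
    """
    current, longest = {}, {}
    for chart in snapshots:
        in_top = {b["isbn"] for b in chart if b["rank"] <= top_n}
        # A streak continues only for books that are in this week's Top N too
        current = {isbn: current.get(isbn, 0) + 1 for isbn in in_top}
        for isbn, run in current.items():
            longest[isbn] = max(longest.get(isbn, 0), run)
    return longest


def flag_candidates(snapshots, min_weeks=3, min_reviews=500, min_rating=4.5):
    """Books in the latest chart that meet the streak, review, and rating rules."""
    streaks = longest_streaks(snapshots)
    return [
        b for b in snapshots[-1]
        if streaks.get(b["isbn"], 0) >= min_weeks
        and b["reviewCount"] >= min_reviews
        and b["rating"] > min_rating
    ]


def week(rows):
    """Build a made-up chart from (isbn, rank) pairs, for illustration only."""
    return [{"isbn": i, "rank": r, "rating": 4.7, "reviewCount": 800}
            for i, r in rows]


snapshots = [
    week([("A", 3), ("B", 10)]),
    week([("A", 5), ("B", 35)]),  # B drops out of the Top 20 this week
    week([("A", 2), ("B", 12)]),
]
flagged = flag_candidates(snapshots)
print([b["isbn"] for b in flagged])
```

Resetting the streak when a book leaves the Top 20 is a deliberate choice here: it implements "consecutive weeks" rather than "total weeks".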
Use Case 2: Korean Market Sentiment Tracker
Scenario: A macroeconomic research firm wants to track Korean consumer sentiment via reading behavior — what people choose to read reflects economic anxiety, optimism, or social mood.
Approach:
- Scrape Economy/Business and Self-Help bestseller charts weekly
- Classify books into thematic buckets: survival/frugality, wealth-building, career anxiety, entrepreneurship
- Track the share of "anxiety-driven" vs. "opportunity-driven" content over time
- Cross-reference with economic indicators (KOSPI, unemployment data)
Key hypothesis: Periods of high economic uncertainty see rises in "frugality/survival" self-help books; bull markets see "wealth-building/entrepreneurship" books climb.
Data cost: ~$2/month for continuous weekly monitoring
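A first pass at the thematic bucketing could be as simple as keyword matching. The keyword lists and sample titles below are hypothetical placeholders chosen for illustration; a real study would need a validated Korean-language lexicon or a trained classifier.

```python
from collections import Counter

# Hypothetical keyword lists, for illustration only
THEME_KEYWORDS = {
    "anxiety": ["절약", "생존", "불안", "퇴사"],
    "opportunity": ["투자", "부자", "창업", "성장"],
}


def classify(book):
    """Return the first theme whose keywords appear in the title or description."""
    text = f"{book.get('title', '')} {book.get('description', '')}"
    for theme, keywords in THEME_KEYWORDS.items():
        if any(k in text for k in keywords):
            return theme
    return "other"


def theme_shares(chart):
    """Share of each theme in one chart snapshot, as fractions summing to 1."""
    if not chart:
        return {}
    counts = Counter(classify(b) for b in chart)
    return {theme: n / len(chart) for theme, n in counts.items()}


# Made-up titles, not real chart data
sample_chart = [
    {"title": "절약의 기술"},
    {"title": "부자의 습관"},
    {"title": "창업 입문"},
    {"title": "요리 일기"},
]
print(theme_shares(sample_chart))
```

Note that the first matching theme wins, so a book that mentions both frugality and investing is counted once; tracking the shares week over week gives the series to compare against KOSPI or unemployment data.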
Use Case 3: K-Literature Trend Analysis for Translators
Scenario: A literary translator specializing in Korean-to-English wants to identify which Korean genres and themes have the highest potential for Western audiences.
Approach:
- Collect 6 months of monthly bestseller snapshots across Literature, Self-Help, and Science
- Extract keywords from book descriptions using NLP
- Compare trending themes in Korea to current Western publishing trends (using Goodreads/Amazon data as baseline)
- Identify gap opportunities: themes popular in Korea not yet prominent in Western publishing
Example finding: Environmental philosophy and "slow living" themes appear consistently in Korean self-help charts 12-18 months before equivalent titles appear in Western bestseller lists.
Data cost: ~$5 for 6-month historical snapshot
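Once themes have been counted on both sides, the gap step reduces to comparing shares. The sketch below assumes theme counts are already available as dictionaries; the counts and thresholds are made up for illustration and are not findings.

```python
def theme_gaps(korea_counts, west_counts, min_korea_share=0.05, max_ratio=0.5):
    """Themes prominent in Korea but comparatively rare in the Western baseline.

    A theme qualifies when its Korean share is at least min_korea_share and
    its Western share is at most max_ratio times the Korean share.
    """
    k_total = sum(korea_counts.values())
    w_total = sum(west_counts.values())
    if not k_total:
        return []
    gaps = []
    for theme, k in korea_counts.items():
        k_share = k / k_total
        w_share = west_counts.get(theme, 0) / w_total if w_total else 0.0
        if k_share >= min_korea_share and w_share <= k_share * max_ratio:
            gaps.append((theme, round(k_share, 3), round(w_share, 3)))
    # Largest share difference first
    return sorted(gaps, key=lambda g: g[1] - g[2], reverse=True)


# Made-up counts, for illustration only
korea = {"slow living": 30, "AI": 40, "finance": 30}
west = {"AI": 50, "finance": 45, "slow living": 5}
print(theme_gaps(korea, west))
```

Comparing shares rather than raw counts keeps the result meaningful when the two datasets differ in size.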
Use Case 4: Academic Research — Digital Bookstore as Cultural Mirror
Scenario: A cultural studies researcher studying how algorithmic curation shapes Korean reading culture.
Data collected:
- 12 months of weekly bestseller charts across all 13 categories
- Full metadata including review counts, ratings, publisher, and publication date
- ~15,000 book-week data points
Research questions:
- How long does a book stay on the chart? Is there a "decay curve"?
- Do publisher size and pre-publication marketing correlate with initial chart position?
- How does YES24's recommendation system affect new author discoverability?
Sample analysis:
import pandas as pd
import matplotlib.pyplot as plt
# Load 12 months of weekly data
df = pd.read_csv('yes24_12months.csv')
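# Label rows with ISO year + week so the same week number in different years stays distinct
iso = pd.to_datetime(df['scraped_at']).dt.isocalendar()
df['week'] = iso['year'].astype(str) + '-W' + iso['week'].astype(str).str.zfill(2)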
# Chart decay analysis: weeks a book spent in Top 20
longevity = df[df['rank'] <= 20].groupby('isbn')['week'].nunique().reset_index()
longevity.columns = ['isbn', 'weeks_in_top20']
# Plot distribution
longevity['weeks_in_top20'].hist(bins=20, figsize=(10, 6))
plt.title('Distribution: How Long Books Stay in YES24 Top 20')
plt.xlabel('Weeks in Top 20')
plt.ylabel('Number of Books')
plt.savefig('chart_longevity.png', dpi=150)
Data cost: ~$10 for full 12-month research dataset
Python Integration: Full Pipeline Example
from apify_client import ApifyClient
import pandas as pd
from datetime import datetime
client = ApifyClient("YOUR_APIFY_API_TOKEN")
# === STEP 1: Scrape weekly bestsellers across 4 key categories ===
CATEGORIES = ["literature", "economy", "self_help", "science"]
all_books = []
for category in CATEGORIES:
    print(f"Scraping: {category}...")
    run_input = {
        "mode": "bestseller",
        "category": category,
        "limit": 100
    }
    run = client.actor("oxygenated_quagmire/yes24-book-scraper").call(
        run_input=run_input
    )
    items = client.dataset(run["defaultDatasetId"]).list_items().items
    # Tag each record with the category it was scraped from
    for item in items:
        item["scrape_category"] = category
    # Extend once per category, outside the inner loop, to avoid duplicates
    all_books.extend(items)
    print(f"  → {len(items)} books extracted")
df = pd.DataFrame(all_books)
print(f"\nTotal books: {len(df)}")
# === STEP 2: Flag Rights Acquisition Candidates ===
candidates = df[
    (df['rank'] <= 20) &
    (df['rating'] >= 4.5) &
    (df['reviewCount'] >= 500)
].copy()
print("\n=== Rights Acquisition Candidates ===")
print(f"Books meeting criteria: {len(candidates)}")
print(candidates[['rank', 'title', 'author', 'rating', 'reviewCount', 'scrape_category']].to_string())
# === STEP 3: Export ===
timestamp = datetime.now().strftime("%Y%m%d")
filename = f"yes24_bestsellers_{timestamp}.csv"
df.to_csv(filename, index=False, encoding='utf-8-sig')  # utf-8-sig for Excel compatibility
print(f"\nSaved to {filename}")
Scheduling Weekly Monitoring
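import json
# Schedule a weekly run every Monday at 9:00 AM KST (00:00 UTC).
# apify-client's schedules().create() takes a list of actions rather than
# actor_id/run_input arguments, and requires is_enabled and is_exclusive.
schedule = client.schedules().create(
    name="yes24-weekly-bestsellers",
    cron_expression="0 0 * * 1",  # Monday 00:00 UTC = 09:00 KST
    is_enabled=True,
    is_exclusive=True,  # do not start a run while the previous one is still going
    actions=[{
        "type": "RUN_ACTOR",
        # If the name form is rejected, use the actor ID shown in Apify Console
        "actorId": "oxygenated_quagmire/yes24-book-scraper",
        "runInput": {
            "body": json.dumps({"mode": "bestseller", "category": "literature", "limit": 100}),
            "contentType": "application/json; charset=utf-8",
        },
    }],
)
print(f"Weekly schedule created: {schedule['id']}")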
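Cron expressions on a UTC schedule are easy to get wrong by a day when the local time is early in the morning. The small helper below (hypothetical, standard library only) converts a KST weekday and hour into a five-field UTC cron string, which is a quick way to confirm the "Monday 00:00 UTC = 09:00 KST" comment. It assumes a fixed UTC+9 offset, which holds because Korea does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9), "KST")  # fixed offset, no daylight saving


def kst_to_utc_cron(weekday, hour, minute=0):
    """Convert a weekly KST time to a 5-field cron expression in UTC.

    weekday uses the Python convention: 0 = Monday ... 6 = Sunday.
    """
    # 2026-03-02 is a Monday; use that week as a reference
    local = datetime(2026, 3, 2, hour, minute, tzinfo=KST) + timedelta(days=weekday)
    utc = local.astimezone(timezone.utc)
    cron_dow = (utc.weekday() + 1) % 7  # cron convention: 0 = Sunday, 1 = Monday
    return f"{utc.minute} {utc.hour} * * {cron_dow}"


print(kst_to_utc_cron(0, 9))  # Monday 09:00 KST
print(kst_to_utc_cron(0, 6))  # Monday 06:00 KST falls on Sunday in UTC
```

Any KST time before 09:00 lands on the previous day in UTC, so the day-of-week field has to shift along with the hour.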
Understanding the Output
Key Fields for Analysis
rank: Current chart position. Changes weekly. Combining rank + scraped_at creates time-series rank trajectory data.
reviewCount: Proxy for commercial success. Korean readers are prolific reviewers — 1,000+ reviews typically indicates a sustained bestseller.
rating: YES24 ratings skew slightly higher than Western equivalents (4.0+ is strong, 4.5+ is excellent). Low rating with high rank = divisive/controversial book — often culturally significant.
publishedDate: Combined with rank data, reveals how quickly a book rose to the charts. A book published 3 years ago still in Top 20 = a cultural touchstone, not just a recent release.
description: Rich text source for NLP. Korean book descriptions tend to be more detailed than Western equivalents, often including chapter-level summaries.
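As a sketch of how rank, scraped_at, and publishedDate combine, the helpers below build a per-ISBN rank trajectory and compute a book's age at scrape time. The function names are hypothetical, and the two sample records are illustrative values in the output format shown earlier, not real chart history.

```python
from collections import defaultdict
from datetime import date, datetime


def rank_trajectories(records):
    """Group records by ISBN into time-ordered lists of (timestamp, rank)."""
    by_isbn = defaultdict(list)
    for r in records:
        by_isbn[r["isbn"]].append(
            (datetime.fromisoformat(r["scraped_at"]), r["rank"])
        )
    return {isbn: sorted(points) for isbn, points in by_isbn.items()}


def rank_change(trajectory):
    """Positive means the book moved up between first and last observation."""
    return trajectory[0][1] - trajectory[-1][1]


def days_since_publication(record):
    """Age of the book, in days, at the moment it was scraped."""
    scraped = datetime.fromisoformat(record["scraped_at"]).date()
    published = date.fromisoformat(record["publishedDate"])
    return (scraped - published).days


# Illustrative records, listed newest first to show that sorting handles order
records = [
    {"isbn": "9791190932264", "rank": 1,
     "scraped_at": "2026-03-03T06:00:00+09:00", "publishedDate": "2021-04-20"},
    {"isbn": "9791190932264", "rank": 5,
     "scraped_at": "2026-02-24T06:00:00+09:00", "publishedDate": "2021-04-20"},
]
trajectory = rank_trajectories(records)["9791190932264"]
print([rank for _, rank in trajectory], rank_change(trajectory))
print(days_since_publication(records[0]))
```

A long-lived title shows up as a large `days_since_publication` value combined with a stable or improving trajectory.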
Pricing
The YES24 Book Scraper charges $0.50 per 1,000 items.
| Use Case | Estimated Items | Est. Cost |
|---|---|---|
| Single category snapshot | 100 books | ~$0.05 |
| All 13 categories | 1,300 books | ~$0.70 |
| Weekly monitoring (4 categories) | ~400/week | ~$0.20/week |
| 12-month research dataset | ~25,000 records | ~$12.50 |
At this rate, weekly monitoring of all 13 categories (about 5,600 items per month) costs roughly $2.80 per month, which fits within the Apify free tier's $5 monthly credit.
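The arithmetic behind these estimates is simple enough to script. The sketch below assumes the per-item rate and free-tier credit stated in this article and models item charges only; any other platform costs are not included.

```python
PRICE_PER_1000_ITEMS = 0.50      # USD, the rate stated in this article
FREE_TIER_MONTHLY_CREDIT = 5.00  # USD, the free-tier credit stated in this article


def estimate_cost(items):
    """Item charges in USD for a given number of scraped records."""
    return items * PRICE_PER_1000_ITEMS / 1000


def monthly_cost(categories, books_per_category=100, runs_per_month=52 / 12):
    """Approximate monthly item charges for weekly runs over several categories."""
    return estimate_cost(categories * books_per_category * runs_per_month)


for label, items in [("Single category", 100),
                     ("All 13 categories", 1300),
                     ("12-month dataset", 25000)]:
    print(f"{label}: {items} items -> ${estimate_cost(items):.2f}")

all_categories = monthly_cost(13)
print(f"Weekly monitoring, 13 categories: ${all_categories:.2f}/month")
print("Within free tier:", all_categories <= FREE_TIER_MONTHLY_CREDIT)
```

Changing `books_per_category` or `runs_per_month` shows quickly how far a monitoring plan can grow before it exceeds the monthly credit.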
Conclusion
YES24's bestseller charts are one of the most accessible windows into Korean cultural consumption available anywhere. What Koreans choose to read — and which books sustain their chart positions — encodes signals about economic sentiment, social trends, and the early movements of ideas that often become global.
The YES24 Book Scraper makes this data accessible to anyone: publishers scouting for translation rights, researchers studying Korean cultural production, analysts tracking sentiment, or developers building publishing intelligence tools.
Whether you're:
- Identifying Korean titles for international licensing before they appear on Western radar
- Tracking economic sentiment through the lens of business and self-help reading patterns
- Building a publishing analytics tool that includes Korea in its coverage
- Researching K-content culture beyond film and music
...Korea's reading data is now a query away.
Get Started
👉 Try the YES24 Book Scraper: https://apify.com/oxygenated_quagmire/yes24-book-scraper
Free Apify account includes $5/month — extract your first bestseller chart within minutes.
Questions or feature requests? Leave a review on the actor page.
The author maintains a portfolio of Korean data infrastructure actors on Apify. All 12 actors available at: https://apify.com/oxygenated_quagmire
Tags: #Korea #WebScraping #YES24 #Korean #Books #Publishing #DataScience #Apify #Python #KCulture #LiteraryAgency
Suggested Publication: The Startup, Towards Data Science, Better Programming