NexGenData

Posted on • Originally published at thenextgennexus.com

Scraping China A-Share Stock Data from Eastmoney

If you cover Chinese equities for a living, you already know the data problem. Bloomberg's mainland coverage is exhaustive but priced for the buy-side. Refinitiv and FactSet are similar. Wind and Choice are excellent if you can read Chinese, negotiate a domestic license, and route payment onshore. And the free Western feeds -- Yahoo Finance, Google Finance, Alpha Vantage -- have spotty Shenzhen coverage and almost no Beijing Stock Exchange tickers at all. For a quant team trying to build a multi-factor model across the full A-share universe, that gap is a real research blocker.

This post walks through the Eastmoney China A-Shares Screener actor on Apify -- what it pulls, how to wire it into a screening workflow, and how to combine it with adjacent China and Hong Kong data sources to build a coherent EM research stack without paying terminal-class fees.

1. The problem: China A-share data coverage is broken outside China

The structural issue is simple. Eastmoney (东方财富, eastmoney.com) is the de facto retail and semi-professional data portal for Chinese equities. It aggregates exchange data from the Shanghai Stock Exchange (SSE), Shenzhen Stock Exchange (SZSE), and the Beijing Stock Exchange (BSE), plus derived analytics, sector classifications, northbound flow snapshots, and consensus estimates. If you ask a Chinese sell-side analyst where they check a quote intraday, the answer is usually Eastmoney or Tonghuashun.

The catch: Eastmoney's UI is Chinese-only, there is no documented public REST API, and the underlying JSON endpoints rotate, paginate inconsistently, and gzip-encode with non-standard headers. Western vendors either skip the full A-share universe or charge enterprise prices. Bloomberg with mainland entitlements runs $24,000+ per seat-year. Free vendors like Yahoo Finance and Alpha Vantage carry the SSE Composite and a curated slice of large caps but miss the Shenzhen Main Board long tail, almost all of ChiNext, most of the STAR Market, and effectively none of the Beijing Stock Exchange.

MSCI's progressive A-share inclusion has dragged passive flow into mainland names since 2018, but coverage of the long tail -- Shenzhen mid-caps, ChiNext growth names, STAR semis, and the smaller BSE listings -- remains expensive or unavailable through Western channels. Scraping Eastmoney directly is the pragmatic workaround, and an Apify actor is the cleanest way to do it without maintaining your own headless-browser pipeline.

2. Why this data matters for global allocators

China's A-share market is the world's second-largest equity market by capitalization, behind only the United States. Combined market cap across SSE, SZSE, and BSE sits in the USD 10-12 trillion range depending on the day, with roughly 5,300 listed companies as of mid-2026. For comparison, that is roughly twice the listed universe of Hong Kong and about a third more names than Japan.

Several investor archetypes need clean A-share data:

  • EM and global allocators rebalancing against MSCI EM, MSCI ACWI, and FTSE Global All Cap need security-master and fundamental data for every A-share constituent. MSCI's A-share inclusion factor sits at 20% for large caps; further upweights will mechanically move billions of passive flow.
  • Stock Connect strategists running northbound and southbound flows need eligibility flags and daily quota awareness. Northbound daily quota is RMB 52 billion per channel; southbound RMB 42 billion. Quota utilization is a tracked positioning signal.
  • QFII and RQFII funds holding A-shares directly need fundamentals normalized against international accounting concepts (IFRS / US GAAP) rather than Chinese Accounting Standards.
  • China-focused hedge funds running long-short books need the full universe, not the curated index slice. Alpha in China concentrates outside the largest 300 names where sell-side coverage is thinnest.
  • Quant researchers backtesting factor strategies (value, quality, momentum, low-vol) need historical fundamentals across the full listed history, including delisted tickers to avoid survivorship bias.

In every case, missing the bottom half of the cap distribution -- where alpha is most likely to live -- is a non-starter.

3. What the Eastmoney actor extracts

The Eastmoney China A-Shares Screener returns a normalized JSON record per ticker with both Chinese and English-language fields. Coverage spans all three mainland exchanges (Shanghai, Shenzhen including Main Board and ChiNext, and Beijing) plus the STAR Market segment of Shanghai. The current field set includes:

  • ticker -- six-digit code (e.g. 600519 Kweichow Moutai, 000858 Wuliangye, 300750 CATL, 688981 SMIC)
  • name_cn / name_en -- Chinese and best-effort English company name
  • exchange -- SH (Shanghai), SZ (Shenzhen), BJ (Beijing)
  • board -- Main, ChiNext (创业板), STAR (科创板), BSE (北交所)
  • sector / industry -- CSRC top-level sector and Eastmoney sub-industry classification
  • market_cap_rmb -- total market cap in CNY
  • free_float_market_cap_rmb -- float-adjusted cap, relevant for index-weighted exposure
  • pe_ttm, pb, ps_ttm -- valuation ratios
  • dividend_yield -- trailing twelve-month dividend yield
  • roe_ttm, net_margin_ttm -- quality factors
  • revenue_growth_yoy, eps_growth_yoy -- growth factors
  • eps_ttm, bvps -- per-share metrics
  • price, change_pct, volume, turnover_rmb -- live quote fields
  • week_52_high, week_52_low -- trailing range
  • stock_connect_eligible -- boolean for northbound (HK -> mainland) eligibility

A sample output record looks like this:


    {
      "ticker": "600519",
      "name_cn": "贵州茅台",
      "name_en": "Kweichow Moutai",
      "exchange": "SH",
      "board": "Main",
      "sector": "Consumer Staples",
      "industry": "Baijiu",
      "market_cap_rmb": 1932000000000,
      "free_float_market_cap_rmb": 1816000000000,
      "pe_ttm": 22.4,
      "pb": 7.8,
      "ps_ttm": 11.2,
      "dividend_yield": 0.0341,
      "roe_ttm": 0.342,
      "net_margin_ttm": 0.521,
      "revenue_growth_yoy": 0.151,
      "eps_growth_yoy": 0.139,
      "eps_ttm": 68.42,
      "bvps": 197.30,
      "price": 1538.20,
      "change_pct": -0.0042,
      "volume": 1820000,
      "turnover_rmb": 2802000000,
      "week_52_high": 1812.50,
      "week_52_low": 1428.10,
      "stock_connect_eligible": true
    }
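A quick way to sanity-check a record like the one above is to recompute the valuation ratios from the per-share fields. The sketch below assumes that pe_ttm is price divided by eps_ttm and that pb is price divided by bvps; those are the usual definitions, but the actor's documentation is not quoted here, so treat them as assumptions. The helper name and the 2% tolerance are illustrative.

```python
import json

# Subset of the sample record shown above.
SAMPLE = json.loads("""
{
  "ticker": "600519",
  "price": 1538.20,
  "eps_ttm": 68.42,
  "bvps": 197.30,
  "pe_ttm": 22.4,
  "pb": 7.8
}
""")


def ratio_mismatches(record, rel_tol=0.02):
    """Return the ratio fields that disagree with the per-share inputs.

    Assumes pe_ttm = price / eps_ttm and pb = price / bvps.
    rel_tol is an arbitrary tolerance for rounding and timing differences.
    Records with eps_ttm or bvps equal to zero must be filtered out first.
    """
    implied = {
        "pe_ttm": record["price"] / record["eps_ttm"],
        "pb": record["price"] / record["bvps"],
    }
    return [
        field
        for field, value in implied.items()
        if abs(value - record[field]) > rel_tol * abs(record[field])
    ]


print(ratio_mismatches(SAMPLE))
```

For the sample record both ratios agree with the per-share fields within the tolerance, so the list comes back empty. A non-empty list on live data is a hint that the quote and the fundamentals were captured at different times, or that the ratio uses a different definition.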

Pricing moves to $0.10 per stock record effective May 26, 2026, under Apify's Pay-Per-Event model. A single full A-share universe pull of roughly 5,300 names lands at about $530 -- a small fraction of one seat of a Western terminal that covers the same universe, and well below the multi-year contracts demanded by Wind or Choice for offshore institutions.
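Because billing is per record, the bill depends on how many tickers you pull and how often. A minimal sketch of that arithmetic follows; it assumes the $0.10 figure is the only charge (no platform or storage fees), and the 200-name watchlist and the 242 trading days per year are hypothetical inputs chosen for illustration.

```python
from decimal import Decimal

PRICE_PER_RECORD_USD = Decimal("0.10")  # per-record price quoted in this post


def run_cost(n_records, runs=1, price=PRICE_PER_RECORD_USD):
    """Total USD cost of pulling n_records tickers, repeated `runs` times."""
    return price * n_records * runs


full_pull = run_cost(5300)           # one full-universe pull
watchlist_year = run_cost(200, 242)  # hypothetical daily watchlist for a year
print(full_pull, watchlist_year)
```

Decimal keeps the dollar amounts exact. Plugging in your own ticker count and schedule gives a rough annual budget before you commit to a daily full-universe snapshot.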

4. Example workflow: a deep-value screen across the A-share universe

A canonical use case for this dataset is a classical Graham-style value screen. Most Western screeners can't even express this query against the full A-share universe because half the tickers are missing.

The recipe:

  1. Run the Eastmoney actor with no ticker filter to pull the full universe.
  2. Filter for pe_ttm < 15, pb < 1.5, and dividend_yield > 0.03.
  3. Drop names with negative trailing EPS, a missing or non-positive free-float market cap, or a total market cap below RMB 5 billion (basic liquidity screen).
  4. Tag each survivor with its board (Main / ChiNext / STAR / BSE) using the board field, and segment results by board to see where value is concentrated.
  5. Optionally enrich with the China ETF Flow Tracker to see whether sector ETFs covering survivors are seeing accumulation or redemption -- a useful confirmation signal for sector-level value rotation.
  6. Cross-reference survivors against the China A-Share Insider Trades actor to flag names where the executive cohort is buying (高管增持). Insider buys are a high-information signal in the mainland market because mandatory disclosure thresholds are tighter than in many comparable jurisdictions.
  7. For each survivor with a Hong Kong dual listing, join the H-share equivalent and compute the AH Premium spread -- useful both as a relative-value entry signal and as a hedge mechanism.
  8. Export the joined dataset to CSV and load into your portfolio tooling -- Portfolio Visualizer, a custom risk model, or a Jupyter notebook running pandas, numpy, and statsmodels.
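Steps 2 through 4 of the recipe can be sketched in plain Python over the actor's JSON records. The field names come from the field list earlier in the post; dropping records with missing inputs and requiring positive ratios are assumptions added here, and the three records at the bottom are synthetic, not real tickers. The same conditions translate directly into pandas boolean masks.

```python
from collections import defaultdict

MIN_MARKET_CAP_RMB = 5_000_000_000  # RMB 5 billion liquidity floor from step 3

REQUIRED = (
    "pe_ttm", "pb", "dividend_yield", "eps_ttm",
    "market_cap_rmb", "free_float_market_cap_rmb",
)


def passes_value_screen(rec):
    """Steps 2-3: cheap on earnings and book, paying a dividend, liquid."""
    if any(rec.get(field) is None for field in REQUIRED):
        return False  # assumption: records with missing inputs are dropped
    return (
        rec["eps_ttm"] > 0
        and rec["free_float_market_cap_rmb"] > 0
        and 0 < rec["pe_ttm"] < 15
        and 0 < rec["pb"] < 1.5
        and rec["dividend_yield"] > 0.03
        and rec["market_cap_rmb"] >= MIN_MARKET_CAP_RMB
    )


def value_screen_by_board(records):
    """Step 4: group the surviving tickers by the `board` field."""
    by_board = defaultdict(list)
    for rec in records:
        if passes_value_screen(rec):
            by_board[rec.get("board", "Unknown")].append(rec["ticker"])
    return dict(by_board)


# Synthetic records for illustration only.
DEMO = [
    {"ticker": "T1", "board": "Main", "pe_ttm": 8.0, "pb": 0.9,
     "dividend_yield": 0.05, "eps_ttm": 1.2,
     "market_cap_rmb": 2e10, "free_float_market_cap_rmb": 1.5e10},
    {"ticker": "T2", "board": "ChiNext", "pe_ttm": 40.0, "pb": 5.0,
     "dividend_yield": 0.0, "eps_ttm": 0.5,
     "market_cap_rmb": 3e10, "free_float_market_cap_rmb": 2e10},
    {"ticker": "T3", "board": "Main", "pe_ttm": 6.0, "pb": 0.7,
     "dividend_yield": 0.06, "eps_ttm": 0.8,
     "market_cap_rmb": 1e9, "free_float_market_cap_rmb": 8e8},
]
print(value_screen_by_board(DEMO))
```

In the demo only T1 survives: T2 fails the valuation tests and T3 falls under the market-cap floor.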

A typical screen against May 2026 data returns 180-220 names. Banks, steel, coal, highways, and selected consumer staples dominate. ChiNext and STAR names are largely absent -- growth-oriented boards rarely trade below book. Invert the screen -- revenue_growth_yoy > 0.25, roe_ttm > 0.15, pe_ttm < 30, restricted to ChiNext and STAR -- and you get a very different list: semi equipment, EV supply chain, biotech, industrial robotics. Both screens take a single actor run plus a few lines of pandas.

5. Use cases across the EM research stack

  • China A-share quant research: build value, quality, momentum, and low-vol factor portfolios against the full mainland universe with monthly rebalances and CSRC sector neutralization.
  • EM portfolio rebalancing: reconcile your benchmark weights when MSCI updates A-share inclusion factors or when index providers reclassify names between the Main, ChiNext, and STAR boards.
  • Stock Connect through-channel analysis: filter on stock_connect_eligible to isolate the Hong Kong-accessible subset, then overlay HKEX disclosed northbound positioning to track foreign flow concentration by name.
  • Sector rotation backtests: group by CSRC sector and build long-only or long-short sector momentum strategies; the industry field gives finer granularity for sub-sector trades like baijiu within consumer staples or rare-earth processors within materials.
  • QFII and RQFII strategy backtests: foreign institutional investors operating under the QFII/RQFII regime need fundamental data normalized for international comparison, particularly around accounting differences between CAS and IFRS.
  • China-focused factor models: construct Fama-French-style factor portfolios (SMB, HML, RMW, CMA) calibrated to A-share data rather than re-using developed-market loadings, which empirically misprice the value premium in China.
  • Cross-listing arbitrage with H-shares: pair the A-share series with the corresponding H-share Hong Kong listing and trade the Hang Seng AH Premium spread mechanically, with the A-share fundamentals as your anchor.
  • Sentiment overlays: combine fundamentals with retail sentiment from the China Trends Tracker (Weibo, Baidu, Douyin) to detect retail-driven momentum on small-cap names before it shows up in price.
  • Insider-flow event studies: use the insider-trades actor to test whether executive buying predicts forward returns on A-share names, conditioning on board and market-cap quintile.
  • Investigative journalism: reporters covering Chinese listed companies -- particularly around accounting concerns, related-party transactions, or sanctioned entities -- need a quick way to pull fundamentals and ownership color without a Bloomberg seat.
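For the cross-listing use case above, the per-name A/H premium is a one-line calculation once both prices are in the same currency. The sketch below is a generic formula, not the methodology of the Hang Seng AH Premium index (which aggregates across constituents); the function name and the FX rate in the example are illustrative assumptions, and the H-share price has to come from a separate Hong Kong feed.

```python
def ah_premium(a_price_cny, h_price_hkd, cny_per_hkd):
    """A-share premium over the H-share, with both prices expressed in CNY.

    A return value of 0.25 means the A-share trades 25% above the H-share;
    a negative value means the A-share trades at a discount.
    """
    if h_price_hkd <= 0 or cny_per_hkd <= 0:
        raise ValueError("H-share price and FX rate must be positive")
    h_price_cny = h_price_hkd * cny_per_hkd
    return a_price_cny / h_price_cny - 1.0


# Illustrative numbers only: A-share at CNY 12.00, H-share at HKD 10.00,
# and an assumed rate of 0.92 CNY per HKD.
premium = ah_premium(12.0, 10.0, 0.92)
print(f"{premium:.1%}")
```

Both legs should be priced at the same timestamp and adjusted for any difference in share ratio before the spread is treated as a signal.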

6. Run it on Apify

The fastest way to start is to run the actor directly against a handful of Shanghai or Shenzhen tickers and inspect the output before wiring it into a pipeline. The interface is straightforward: paste a ticker list (or leave it blank to pull the full universe), pick the fields you want, and hit run.

Run the Eastmoney China A-Shares Screener on Apify ->

The actor runs on Apify's standard infrastructure -- no proxy setup, no headless-browser maintenance, no Chinese-language UI to navigate. Pull a single ticker for a sanity check, then scale to the full universe. Results land in Apify's dataset format and can export to JSON, CSV, Excel, S3, Google Sheets, BigQuery, Snowflake, or a webhook. Schedule end-of-day for daily snapshots and you have a self-maintaining A-share data archive.
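For scripted runs, a thin wrapper around the Apify Python client is usually enough. The sketch below is written against the call pattern of the apify-client package as I understand it (actor(...).call(run_input=...) returning a run dictionary with a defaultDatasetId, and dataset(...).iterate_items() yielding records); check the client documentation before relying on it. The actor ID and the "tickers" input field shown in the usage comment are placeholders, not confirmed names.

```python
def pull_records(client, actor_id, run_input):
    """Run an actor synchronously and return its dataset items as a list.

    `client` is expected to behave like apify_client.ApifyClient:
    client.actor(id).call(run_input=...) returns a run dict, and
    client.dataset(id).iterate_items() yields the stored records.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    dataset_id = run["defaultDatasetId"]
    return list(client.dataset(dataset_id).iterate_items())


# Usage sketch (requires `pip install apify-client` and a real token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("YOUR_APIFY_TOKEN")
#   records = pull_records(
#       client,
#       "PUBLISHER/ACTOR-NAME",       # placeholder: copy the ID from the actor page
#       {"tickers": ["600519"]},      # hypothetical input field name
#   )
```

Passing the client in as an argument keeps the function easy to test with a stub and keeps the token out of the helper itself.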

For broader context see our Chinese-language deep dive 免费抓取中国A股数据, the Best Free Stock Market APIs guide, and the FX Dashboard with Apify and Google Sheets.

7. Related actors for an integrated China and Asia stack

The Eastmoney actor is the foundation, but a serious China research workflow usually combines several feeds. The following are all publicly available on Apify under the same publisher and stitch together cleanly via shared ticker keys (mainland six-digit codes for A-shares, four-digit codes for HKEX, exchange-specific codes for KRX and NSE).

  • China ETF Flow Tracker (东方财富ETF资金流向) -- daily ETF subscription and redemption flows. Useful as a sector-level demand signal that can lead price by hours to days, particularly for thematic ETFs covering semis, EV, and biotech.
  • China A-Share Insider Trades (高管增减持) -- executive and major-shareholder buying and selling disclosed to CSRC. The canonical insider-flow dataset for the mainland market.
  • HKEX Insider Trades and Short Interest Tracker -- pairs naturally with the A-share insider feed when you are trading the A/H cross-listing spread or hedging A-share exposure via HKEX shorts.
  • HKEX IPO Calendar -- Hong Kong new listings, including mainland companies adding an H-share dual listing and the growing number of US-listed Chinese ADRs converting to Hong Kong primary listings.
  • KOSPI Stock Screener -- Korea is the obvious adjacency for a North Asia equity strategy; structure is similar to the Eastmoney actor and Korean semis are often correlated with Chinese supply chain names.
  • NSE India Stock Indices Screener -- the other big EM Asia market; useful for cross-region factor work and for any allocator running a barbelled EM Asia book.

8. FAQ

How fresh is the data?

Eastmoney refreshes intraday quotes in near real time during mainland trading hours (09:30-11:30 and 13:00-15:00 China Standard Time, UTC+8). The actor pulls live values at run time. Ratios refresh daily; the underlying financial statements refresh quarterly, in line with CSRC filing deadlines.
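If a pipeline needs to know whether a run happened during the sessions quoted above (live quote) or outside them (last close), a small check is enough. This is a sketch: it hardcodes the two sessions from this answer, uses a fixed UTC+8 offset since mainland China does not observe daylight saving time, and deliberately ignores exchange holidays, which would need a separate calendar.

```python
from datetime import datetime, time, timedelta, timezone

CHINA_TZ = timezone(timedelta(hours=8))  # China Standard Time, no DST
SESSIONS = ((time(9, 30), time(11, 30)), (time(13, 0), time(15, 0)))


def in_trading_session(moment):
    """True if a timezone-aware datetime falls inside the quoted sessions.

    Weekends are excluded; exchange holidays are NOT handled here.
    """
    local = moment.astimezone(CHINA_TZ)
    if local.weekday() >= 5:  # Saturday or Sunday
        return False
    return any(start <= local.time() < end for start, end in SESSIONS)


# 02:00 UTC on a Tuesday is 10:00 in Shanghai, inside the morning session.
print(in_trading_session(datetime(2026, 5, 26, 2, 0, tzinfo=timezone.utc)))
```

Tagging each record with this flag at ingestion time makes it obvious later whether a stored price was an intraday print or a closing value.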

Does this include Shenzhen ChiNext and the Beijing Stock Exchange?

Yes. The actor covers all three mainland exchanges: Shanghai (SH), including the Main Board and the STAR Market; Shenzhen (SZ), including the Main Board and ChiNext; and the Beijing Stock Exchange (BJ). BSE coverage is the main differentiator versus most Western vendors, which historically skipped it entirely. BSE launched in 2021 for innovation-oriented SMEs and remains effectively invisible to most non-Chinese feeds.

Can I get historical price data?

The current actor focuses on snapshot screener fields and does not return open, high, or low prices. To approximate a price history, schedule the actor end-of-day and accumulate the price and volume fields as a daily close-and-volume series in object storage or your warehouse. A dedicated historical-price actor with adjusted prices and corporate-action handling is on the roadmap.
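Accumulating scheduled snapshots into a series can be sketched as follows. It assumes each run happens after the close, so that the price field can be read as that day's closing price, and it keeps the history in a plain dictionary; the function name and the in-memory layout are illustrative, and a real pipeline would persist the result to a file or warehouse table.

```python
from datetime import date


def append_snapshot(history, records, as_of):
    """Add one day's records to `history`, keyed by ticker and then ISO date.

    Re-running for the same date overwrites that day's entry instead of
    duplicating it, so a retried scheduled run stays idempotent.
    """
    day = as_of.isoformat()
    for rec in records:
        series = history.setdefault(rec["ticker"], {})
        series[day] = {"close": rec.get("price"), "volume": rec.get("volume")}
    return history


history = {}
append_snapshot(history, [{"ticker": "600519", "price": 1538.20, "volume": 1820000}],
                date(2026, 5, 26))
print(history["600519"])
```

The prices collected this way are unadjusted, so splits and dividends will show up as jumps until a corporate-action adjustment is applied.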

Are foreign-language names included?

Yes. Each record includes name_cn (official Chinese name) and name_en (best-effort English name). English names are not always present for the smallest BSE listings, in which case fall back to a transliteration or the company's own annual-report English name.

How does pricing compare to Bloomberg or Wind?

At $0.10 per stock record, a full A-share universe pull is roughly $530. Bloomberg seats covering mainland China run $24,000+ per year. Wind pricing typically lands in the high four to low five figures per year in RMB and requires onshore payment routing. A single full-universe pull therefore costs roughly 45 times less than one Bloomberg seat-year. Because billing is per record, total spend scales with how many tickers you pull and how often you pull them.

Can I screen by Stock Connect eligibility?

Yes. The stock_connect_eligible boolean flags whether a name is northbound-accessible across both Shanghai and Shenzhen Connect channels. Re-pull after each quarterly Stock Connect list revision. Current northbound daily quota is RMB 52 billion per channel.
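After a list revision, comparing two pulls shows which names entered or left the eligible set. A minimal sketch, assuming both pulls are lists of records carrying the ticker and stock_connect_eligible fields described earlier; the function names are illustrative.

```python
def eligible_set(records):
    """Tickers flagged as northbound-eligible in one pull."""
    return {r["ticker"] for r in records if r.get("stock_connect_eligible") is True}


def eligibility_changes(previous, current):
    """Compare two pulls and return (added, removed) ticker sets."""
    before, after = eligible_set(previous), eligible_set(current)
    return after - before, before - after


# Synthetic example: B drops out of the list, C joins it.
old_pull = [
    {"ticker": "A", "stock_connect_eligible": True},
    {"ticker": "B", "stock_connect_eligible": True},
    {"ticker": "C", "stock_connect_eligible": False},
]
new_pull = [
    {"ticker": "A", "stock_connect_eligible": True},
    {"ticker": "B", "stock_connect_eligible": False},
    {"ticker": "C", "stock_connect_eligible": True},
]
added, removed = eligibility_changes(old_pull, new_pull)
print(sorted(added), sorted(removed))
```

Tickers that disappear from the second pull entirely also land in the removed set, which is usually the desired behavior for delistings.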

Does the actor handle Chinese character encoding correctly?

All Chinese fields are returned as UTF-8 strings. Python pandas, JavaScript, Excel (with UTF-8 BOM), and BigQuery handle the output natively. Use Excel's Data -> From Text/CSV wizard with explicit UTF-8 to avoid character mangling.
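If the CSV is written from your own script rather than exported from Apify, the BOM mentioned above can be added with Python's "utf-8-sig" codec. The sketch uses only the standard library; the function name and the choice to silently skip fields not listed in `fields` are illustrative.

```python
import csv


def write_csv_for_excel(records, path, fields):
    """Write records to CSV with a UTF-8 BOM so Excel detects the encoding."""
    # "utf-8-sig" prepends the BOM; newline="" is what the csv module expects.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(records)
```

A call such as write_csv_for_excel(records, "a_shares.csv", ["ticker", "name_cn", "pe_ttm"]) produces a file that opens in Excel with the Chinese names intact. Read it back with encoding="utf-8-sig" so the BOM is not treated as part of the first column name.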

Can I use this dataset for production trading?

Yes, with caveats. Reconcile against your prime broker's security master and official exchange feeds (SSE Datafeed, SZSE Datafeed) where regulatory requirements apply. Best suited to research, backtesting, screening, and reporting.

See also: New -- Dividend Aristocrats Tracker

See also: New -- Short Interest Tracker

See also: New -- Kuaishou Trending Tracker
