AI Value Capture Contract Cycle

Stance

Directionally, we believe SemiAnalysis is right about the operating regime change: useful agentic tokens have become more valuable while the cost to serve them is falling, so a model-lab surplus window is real. The record supports the existence of that window through Anthropic's public run-rate above $30B, SemiAnalysis's stronger claim of $44B-plus ARR, the alleged inference-infrastructure gross-margin move from 38% to above 70%, and gigawatt-scale capacity commitments.

We do not believe that proves durable public-market value capture by model labs or workflow owners. The marginal surplus is now entering the 2H26-2027 contract reset across Rubin/SOCAMM, HBM/DRAM, TSMC N3/CoWoS, cloud capacity, neocloud terms, and post-reset lab margins. That reset, not the existence of demand, is the case being litigated.

The current action is no live public-market position. The ranked research docket is MU and memory peers first for memory/HBM/SOCAMM pass-through, NVDA second for Rubin/SOCAMM quote structure, TSM third for N3/CoWoS allocation and prepayments, AVGO fourth for custom-silicon and networking discipline, and cloud/neocloud/lab proxies only after contract denominators or post-reset margins become visible.

No-position is the sizing decision, not the prose stance. The substance view is that supplier-side evidence moves first, but first observable is not investable. The trade stays at zero until pass-through, market, valuation, instrument, event, and contract denominators appear in the same packet.

What The Source Or Consensus Argues

The source argues that AI value capture has shifted toward model labs because agentic AI changed the customer ROI curve. It says tasks that once took hours and thousands of dollars can now be done in minutes with a few dollars of tokens, and it uses SemiAnalysis's own token consumption to illustrate willingness to pay. It claims Anthropic ARR moved from roughly $9B to more than $44B and inference-infrastructure gross margin moved from 38% to more than 70%.

The source's production-cost framework is that tokens are getting cheaper to produce. It names software and hardware as the joint mechanism: software can produce up to 14x throughput on the same B300 setup, the most optimized GB300 NVL72 setup is about 17x faster than the most optimized H100 setup in FP8 and 32x in FP4, and GB300 TCO per GPU is only about 70% higher than H100. The claim is not merely "demand up"; it is "value per token up while cost per token down."
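The "value per token up while cost per token down" mechanism reduces to ratio arithmetic: cost per token scales as TCO divided by throughput. A minimal sketch using only the multiples quoted above; the 17x/32x speedups and the roughly 70% TCO premium are source claims, not audited figures:

```python
# Relative cost per token versus an H100 baseline.
# All inputs are the source's claimed multiples, normalized to H100 = 1.0.
gb300_tco = 1.7           # source: GB300 TCO per GPU ~70% higher than H100
gb300_fp8_speedup = 17.0  # source: most-optimized GB300 NVL72 vs H100, FP8
gb300_fp4_speedup = 32.0  # source: same comparison in FP4

cost_per_token_fp8 = gb300_tco / gb300_fp8_speedup  # TCO up 1.7x, speed up 17x
cost_per_token_fp4 = gb300_tco / gb300_fp4_speedup

print(round(cost_per_token_fp8, 3), round(cost_per_token_fp4, 3))
```

If the claimed multiples hold, FP8 serving cost falls to roughly a tenth of the H100 baseline and FP4 to roughly a twentieth, which is the arithmetic behind the cost floor falling while willingness to pay rises.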

The source's model-provider margin claim is that the low-margin era for frontier labs is over. It argues that Opus price cuts do not imply lower realized economics because input-to-output ratios near 300:1 and cache hit rates above 90% push realized blended price toward a lower-cost cached-token mix, while customers shift volume to higher-value frontier SKUs. That source framework is plausible but not independently settled in the round record.
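The cache-mix claim can be illustrated with a blended-price calculation. Only the 300:1 input-to-output ratio and the 90% cache-hit rate come from the source; the per-MTok prices below are hypothetical placeholders, not any lab's list prices:

```python
# Blended realized $/MTok for a heavy-input, heavy-cache token mix.
# Ratio and hit rate are source claims; all prices are HYPOTHETICAL.
input_per_output = 300.0   # source: ~300:1 input-to-output token ratio
cache_hit = 0.90           # source: cache hit rates above 90%

p_input_uncached = 5.00    # hypothetical $/MTok
p_input_cached = 0.50      # hypothetical $/MTok (cached reads priced far lower)
p_output = 25.00           # hypothetical $/MTok

input_share = input_per_output / (input_per_output + 1.0)
output_share = 1.0 - input_share

blended = (input_share * (cache_hit * p_input_cached
                          + (1.0 - cache_hit) * p_input_uncached)
           + output_share * p_output)
print(round(blended, 2))
```

Even with a $25/MTok output price, the realized blended price lands near $1/MTok because cached input dominates the mix. That is the source's point about headline price cuts versus realized economics, though the round record has not verified the mix itself.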

The source's TSMC/Nvidia underpricing claim is that both firms are venting value downstream. It says TSMC could raise prices or demand larger prepayments in an N3 and advanced-packaging shortage, and it says Nvidia has not repriced Rubin-class systems to capture the full value that its hardware enables. The round record supports upstream scarcity: Nvidia FY2026 Data Center revenue was $193.7B, Q4 Data Center revenue was $62.3B, TSMC Q1 2026 revenue was $35.90B, TSMC gross margin was 66.2%, advanced nodes were 74% of wafer revenue, and DRAM fabs were above 90% utilization.

The source frames SOCAMM as Nvidia's next margin lever. It argues that Rubin's socketed SOCAMM module can be priced separately from the base system. The source model has Nvidia SOCAMM contract cost around $8/GB in 1Q26, possible exit-2026 SOCAMM pricing above $13/GB, a reasonable cost assumption near $10/GB, and a possible 60% SOCAMM margin.
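The SOCAMM figures imply a sale price the text never states. A hedged back-of-envelope, assuming the 60% figure is a gross margin defined on sale price; that definition, and treating each $/GB figure as Nvidia's input cost, are our assumptions, not the source's:

```python
# Implied SOCAMM sale price if margin is defined on price:
# margin = (price - cost) / price  =>  price = cost / (1 - margin).
margin = 0.60  # source: possible 60% SOCAMM margin

# Source cost points: ~$8/GB 1Q26 contract, ~$10/GB base case; the $13/GB
# exit-2026 contract figure is read here as an input cost (one interpretation).
for cost_per_gb in (8.0, 10.0, 13.0):
    implied_price = cost_per_gb / (1.0 - margin)
    print(cost_per_gb, round(implied_price, 2))
```

At the $10/GB base-case cost, a 60% margin implies roughly $25/GB to the buyer. The tripwire is whether any Rubin quote packet actually shows a separable SOCAMM line item near that level.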

The source's capex/W stagnation puzzle is that GB300 moves to VR NVL72 with only a slight capex-per-watt increase, from $37.4/W to $38.1/W, despite chip TDP rising from 1400W to 2300W and performance per watt more than doubling. The source treats that as evidence Nvidia has pricing room because vendors usually capture some of the performance/W improvement in system price.
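The stagnation puzzle is easier to see as dollars per chip and price per unit of performance. Illustrative arithmetic from the source's figures; the performance multiple is a lower bound because "more than doubling" is not quantified:

```python
# Capex per chip and implied price per unit of performance, GB300 vs VR NVL72.
gb300_capex_per_w, gb300_tdp = 37.4, 1400.0  # source figures
vr_capex_per_w, vr_tdp = 38.1, 2300.0        # source figures
perf_per_w_gain = 2.0                        # lower bound on "more than doubling"

gb300_capex = gb300_capex_per_w * gb300_tdp  # ~$52,360 per chip
vr_capex = vr_capex_per_w * vr_tdp           # ~$87,630 per chip
capex_multiple = vr_capex / gb300_capex      # ~1.67x system price per chip

perf_multiple = perf_per_w_gain * (vr_tdp / gb300_tdp)  # >= ~3.29x performance
price_per_perf = capex_multiple / perf_multiple         # <= ~0.51x

print(round(capex_multiple, 2), round(perf_multiple, 2), round(price_per_perf, 2))
```

On these numbers, price per unit of delivered performance falls by roughly half or more across the generation, which is the sense in which the source says Nvidia has pricing room it is not yet using.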

The source's networking-price-discrimination framework is that Nvidia already price discriminates through networking. It cites a survey where an SN5610 can cost a neocloud about 2x the hyperscaler price, while the total cluster cost impact is only around 10% because networking is a smaller share of rack-scale capital cost. The source then says the lever is real but may have limited remaining room.

The source's "Nvidia as the Central Bank of AI" framework is regulatory and strategic restraint. Nvidia may avoid fully pricing scarcity because aggressive repricing could increase antitrust scrutiny, upset customers, and accelerate alternative compute platforms. The source analogizes this to TSMC's historical policy of preserving ecosystem stability rather than fully pricing to scarcity. We accept the rationale as a possible explanation for delayed capture, not as proof that Nvidia will never capture more.

The source's TSMC framework, styled as "The Fairest and Most Just Company In The World," says TSMC leaves value on the table but can still capture through prepayments, longer-term agreements, guaranteed capacity commitments, allocation priority, and customer-funded capacity. The round record accepts TSMC as a durable upstream claimant but says that visibility is weaker if capture appears through non-price terms rather than dramatic headline price increases.

The source's cost-floor/value-ceiling derivation is its most useful trading framework. Cost-based pricing says a neocloud must charge at least about $4.92/hr/GPU on VR NVL72 to achieve the same 15.6% five-year project IRR with 15% prepay as a GB300 deployment. Value-based pricing says a GB300 parity rental cost of about $0.70/PFLOP implies a VR NVL72 ceiling near $12.25/hr/GPU, while a conservative $0.55/PFLOP case implies $9.63/hr/GPU. The gap between $4.92 and $9.63 to $12.25 is the source's claimed rent pool.
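The floor-and-ceiling numbers are internally consistent: both value-based cases imply the same per-GPU throughput, which can be backed out. The 17.5 PFLOP/GPU figure below is inferred from the source's ratios, not stated by the source:

```python
# Back out the implied VR NVL72 per-GPU throughput from the two value-based
# cases, then size the claimed rent pool per GPU-hour.
floor = 4.92                 # $/hr/GPU cost-based floor (15.6% IRR, 15% prepay)
ceiling_base = 12.25         # $/hr/GPU at $0.70/PFLOP GB300-parity value
ceiling_conservative = 9.63  # $/hr/GPU at $0.55/PFLOP

implied_pflops = ceiling_base / 0.70          # ~17.5 PFLOP per GPU (inferred)
check = ceiling_conservative / 0.55           # ~17.5 again, to source rounding

rent_pool_low = ceiling_conservative - floor  # ~$4.71/hr/GPU
rent_pool_high = ceiling_base - floor         # ~$7.33/hr/GPU

print(round(implied_pflops, 1), round(check, 1),
      round(rent_pool_low, 2), round(rent_pool_high, 2))
```

The roughly $4.71 to $7.33 per GPU-hour gap is the contested rent; the One Chart question is whose observed prices move first to claim it.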

The source's One Chart to Rule Them All combines the cost floor, value ceiling, and neocloud IRR curve. Moving up and right on the curve gives neoclouds more bargaining power; shifting the curve up and left through higher system prices gives Nvidia and system suppliers more bargaining power. The chart is the source's answer to who captures the AI demand benefits.

What We Accept

We accept the operating premise that the surplus pool is larger than it was. Agentic workflows can raise willingness to pay, and the round record has enough support to treat model-lab surplus as real now: Anthropic's public run-rate exceeded $30B, SemiAnalysis alleges $44B-plus ARR, and SemiAnalysis alleges infrastructure gross margin moved from 38% to more than 70%.

We accept the token-cost mechanism directionally. The rounds did not independently audit the source's 14x software-only B300 throughput, 17x GB300-versus-H100 FP8 throughput, 32x FP4 throughput, 300:1 input/output ratio, 90% cache-hit rate, or $0.99/MTok blended Opus 4.7 estimate. The memo therefore treats those numbers as source claims that explain the mechanism, not as independently verified investable evidence. The accepted point is narrower: useful-token value and serving-cost improvement can coexist, so a downstream surplus window is not fantasy.

We accept that the first public evidence is more likely to be upstream than lab audited margins. Round 002 ranked memory/HBM/SOCAMM pass-through and Rubin/SOCAMM quote structure ahead of TSMC allocation, cloud/neocloud terms, and lab/workflow margin persistence because supplier margins, memory readiness, and Rubin timing are visible earlier.

We accept memory as the leading watch lane. Micron Q2 FY26 revenue was $23.86B, GAAP gross margin was 74.4%, Cloud Memory and Core Data Center gross margins were around 74%, and Q3 FY26 gross margin was guided near 81%. Micron also had Q1 2026 volume shipments of 12-high 36GB HBM4 for Nvidia Vera Rubin and 192GB SOCAMM2 in volume production. Those are harder public supplier datapoints than confidential cloud term sheets.

We accept Rubin/SOCAMM as the cleanest quote-watch lane. Rubin partner availability begins in 2H26, SOCAMM may be quoted separately, and the source's $8/GB to more than $13/GB SOCAMM curve creates a concrete gross-margin bridge test for Nvidia. Nvidia enters that test with Q4 FY26 revenue of $68.1B, Q4 Data Center revenue of $62.3B, FY26 Data Center revenue of $193.7B, and Q4 GAAP gross margin of 75.0%.

We accept the cost-floor/value-ceiling derivation as a useful map of possible rent, not as an allocation verdict. A $4.92/hr/GPU floor for 15.6% IRR and a $9.63 to $12.25/hr/GPU value zone show that VR NVL72 may leave economic room among Nvidia, memory suppliers, neoclouds, hyperscalers, labs, and end users. The source's own One Chart says the same price gap can be captured by different actors depending on where the observed rental price and system price land.

We accept the capex/W stagnation puzzle as a live Nvidia-pricing question. If VR NVL72 only creeps from $37.4/W to $38.1/W while TDP rises from 1400W to 2300W and performance/W more than doubles, the price does look restrained relative to value delivered. But the round record does not yet show the actual buyer-tier quote, SOCAMM line item, rack ASP, networking attach, or customer acceptance, so the accepted conclusion is "watch the quote bridge," not "buy NVDA."

We accept the Nvidia central-bank rationale as a plausible reason for delayed repricing. Regulatory scrutiny, ecosystem stability, and fear of accelerating custom silicon are real enough to explain why Nvidia might not immediately take every dollar. That explanation does not negate the later repricing option, especially if customers can still see lower cost per useful token after higher system prices.

We accept TSMC as a durable upstream claimant. Q1 2026 revenue was $35.90B, gross margin was 66.2%, operating margin was 58.1%, 3nm was 25% of wafer revenue, advanced nodes were 74%, HPC was 61%, 2026 revenue growth expectations moved above 30% in U.S. dollar terms, and 2026 CapEx moved toward the high end of $52B-$56B. The accepted watch item is allocation economics, prepayments, LTAs, customer-funded capacity, N3 margin, CoWoS tightness, and capex recovery.

We accept the networking-price-discrimination framework only as evidence that Nvidia already knows how to price scarce system components to buyer willingness to pay. The source's claimed 2x SN5610 neocloud/hyperscaler price gap and only about 10% all-in cluster capital-cost impact make networking a useful analogy for SOCAMM. But the source itself says the current gap leaves limited incremental room, so networking is not the primary reset lane. The primary lane remains Rubin/SOCAMM quote structure and memory pass-through.

What We Reject

We reject the stronger source implication that model labs have already won durable residual ownership. The missing denominators are still central: ARR reconciliation, infrastructure gross-margin definition, useful-token volume, realized useful-token pricing, compute-capacity translation, supplier economics, contract pass-through, and market expression evidence.

We reject treating the $44B-plus ARR and 70%-plus infrastructure gross-margin claims as sufficient public-market proof. Anthropic's public hard marker in the round record is above $30B run-rate, not $44B-plus ARR, and the record does not establish whether "infrastructure gross margin" includes depreciation, power, networking, cloud credits, prepayments, idle capacity, serving staff, or amortized training.

We reject cloud and neocloud capacity headlines as current surplus-allocation evidence. Anthropic/AWS, Anthropic/Google, OpenAI/Stargate, CoreWeave/Anthropic, and Nebius/Meta show scale, duration, delivery, and product categories, but they omit minimum spend, take-or-pay mechanics, prepayments, cancellation protection, utilization floors, financing spread, power/networking pass-through, cloud-credit accounting, service credits, and residual GPU risk.

We reject "Nvidia alone wins" from the SOCAMM model. The source's $8/GB 1Q26 cost, possible above $13/GB exit-2026 price, $10/GB cost assumption, and 60% SOCAMM margin are useful arithmetic, but they do not prove actual quotes, buyer-tier pricing, volume, attach, customer acceptance, or whether memory suppliers keep the spread.

We reject "frontier model-provider margins will only rise" as an investable public-market conclusion. The source may be right that cache hits, 300:1 input/output ratios, frontier SKU mix, and hardware throughput gains lift lab gross margins. The round record still needs renewed-contract costs, depreciation, power, networking, prepayments, idle capacity, discounts, refunds, free-tier load, training amortization, and staff before we can say those margins survive the reset.

We reject "memory is already an investment" from Micron's margin signal. Micron's 74.4% Q2 FY26 GAAP gross margin and near-81% Q3 guide make MU the leading watch item, but the record lacks HBM/SOCAMM versus commodity DRAM split, contract pricing, duration, allocation, backlog, inventory, utilization, and pass-through.

We reject treating the One Chart to Rule Them All as a trade signal. The chart is a bargaining-power map. It says where rents might be captured once observed prices land between the cost floor and the value ceiling. The round record has not supplied observed VR NVL72 rental prices, actual Rubin rack prices, or public-market confirmation, so the chart cannot rank MU, NVDA, TSM, AVGO, cloud, or neocloud expected returns today.

We reject current live expression across the whole public proxy set. The Round 003 packet has no usable fresh factors, OHLCV, volume, spreads, options, event calendar, relative strength, valuation, revisions, defined baskets, beta controls, or hedge ratios. That blocks entry, sizing, stops, pair construction, and expected-return ranking.

We reject using Nvidia's central-bank restraint as a reason to short suppliers or ignore the supplier docket. Strategic restraint delays capture; it does not eliminate the option value of later repricing, SOCAMM line items, networking attach, prepayments, allocation priority, or customer-funded capacity. The right implication is patience for quote and contract evidence, not a negative supplier stance.

What We Refuse To Average

We refuse to average "labs are capturing value now" with "upstream bottlenecks have pricing power" into a harmless barbell. Round 001 made the right cut: labs have a surplus window, but durable public-market value capture must be adjudicated by contract-reset evidence, not demand scale alone.

We also refuse to average "cloud terms are decisive if disclosed" with "cloud headlines are already decisive." The former is true. The latter is false. Contract fields allocate surplus; gigawatts, dollar value, duration, and partner logos show urgency but not incidence.

Most important, we refuse to average "first observable" with "investable." The docket is ranked, but the position is no-position. A ranked evidence queue is not a disguised trade.

This is the core litigation holding. The source is strongest as a contract-cycle map. It is weakest when it moves from modeled economic headroom to public equity ownership without the denominators that tell us who absorbs memory inflation, who reprices racks, who preserves margin, and what is already priced.

What We'd Own, Avoid, Short, Wait For, Or No-Position

No-position: no live public-market trade, no position size, no entry, no stop, no hedge ratio, and no expected-return rank from this packet. The decision is research-only because market, valuation, pass-through, event, instrument, and contract denominators are missing.

Own, only after tripwire: MU or a memory expression becomes a candidate only if public evidence connects AI memory product mix to HBM4/SOCAMM2 pricing, duration, allocation, margin persistence, inventory, utilization, and pass-through, with fresh relative market confirmation. Micron is first in the diligence queue because it has issuer-specific hard numbers, not because the trade is already proven.

Own, only after tripwire: NVDA becomes a candidate if VR NVL72 quotes separate base system, SOCAMM, networking, power, service, delivery, buyer tier, and rack ASP, and if Nvidia provides or lets the market infer a gross-margin bridge showing memory inflation, SOCAMM markup, and customer acceptance. Until then, Rubin/SOCAMM is a quote watch, not a long.

Wait for TSM. TSMC may capture through N3/CoWoS allocation, prepayments, LTAs, customer-funded capacity, N3 margin, CoWoS tightness, or capex recovery. The record does not yet turn that into a direct trade because management restraint and non-price allocation can delay visible equity evidence.

Wait for AVGO. Custom silicon and networking can either capture value or discipline Nvidia pricing, but the record lacks fresh AVGO market data, TPU/ASIC disclosure, networking demand, hyperscaler capex read-through, customer concentration, revenue quality, and margin visibility.

Avoid cloud/neocloud capacity headline trades. Cloud and neocloud terms are future decision channels, not current proof. A $100B ten-year headline, multiple-gigawatt TPU capacity, a 10GW infrastructure target, a phased Anthropic/CoreWeave rollout, or a $12B plus possible $15B Nebius/Meta capacity agreement can all be economically ambiguous without minimums, cancellation, utilization, financing, pass-through, and residual-risk terms.

Avoid pure model-lab/workflow-owner public proxies. The clean economics sit in private labs or inside broad public companies whose price action can be dominated by ad, retail, legacy enterprise software, cloud capex, rates, QQQ beta, semiconductor beta, or memory cyclicality. The operating view may be right and the public proxy may still be wrong.

Short nothing from this packet. The record is a no-position because it lacks market data and denominators, not because the thesis is bearish. Shorting suppliers because Nvidia may act like a central bank of AI or because TSMC may restrain headline pricing would be the same denominator shortcut in reverse.

Wait for the One Chart to become observed rather than modeled. The investable version requires actual VR NVL72 rental prices, actual rack/system quotes, disclosed or inferable supplier cost changes, and the market's same-window reaction. Until then the chart identifies the courtroom, not the verdict.

Rejected Expressions And Why They Lose

Rejected expression: pure public-market long model-lab/workflow-owner basket. It loses because the strongest lab economics are private or diluted inside broad public platforms, and the record does not prove post-reset residual ownership after GPU, memory, cloud, power, networking, prepayment, idle-capacity, staff, and training-amortization costs.

Rejected expression: cloud/neocloud capacity headlines as decisive evidence. It loses because disclosed scale, dollar value, duration, delivery window, and product category omit the terms that allocate surplus. Minimums, take-or-pay, cancellation, prepayments, utilization floors, financing spread, power/network pass-through, service credits, cloud-credit treatment, and residual equipment risk are the actual evidence.

Rejected expression: long MU now from memory margin strength. It loses because first public signal is not the same as investable public edge. The record still lacks HBM/SOCAMM versus commodity DRAM split, AI contract prices, duration, allocation, inventory, utilization, margin pass-through, valuation, event timing, liquidity, options, and relative strength.

Rejected expression: long NVDA now from Rubin/SOCAMM optionality. It loses because the source's SOCAMM economics are modeled, not a quote packet. We need actual base-system price, SOCAMM line item, rack ASP, buyer-tier pricing, networking attach, delivery terms, customer acceptance, and a gross-margin bridge.

Rejected expression: TSMC headline-margin screen. It loses because TSMC may capture value through allocation, prepayments, LTAs, or customer-funded capacity rather than dramatic price changes. Without explicit N3/CoWoS economics and market confirmation, the expression is structurally plausible but not current.

Rejected expression: networking price discrimination as the main Nvidia trade. It loses because the source's own framework says the existing neocloud/hyperscaler networking gap is meaningful but has limited room left; the more important live question is whether SOCAMM and Rubin rack quotes become separable enough to show a fresh margin bridge.

Rejected expression: neutral barbell across labs, suppliers, cloud, neoclouds, and memory. It loses because it evades the decision. The correct decision is sharper: supplier-side lanes are first observable, but no public-market position is sized until denominators arrive.

Rejected expression: any intraday, level-based, options, or pair trade from the current packet. It loses because the packet has no usable factor state, no current OHLCV, no volume, no bid-ask spreads, no options context, no event calendar, no relative strength, no valuation/revision packet, and no hedge ratio.

Tripwires And Review Date

Upgrade MU or another memory expression only if disclosures provide AI memory product mix, HBM4/SOCAMM2 pricing, duration, allocation, margin persistence, inventory, utilization, and fresh relative market confirmation. The next evidence must separate contracted AI memory economics from commodity DRAM cyclicality.

Upgrade NVDA only if VR NVL72/Rubin quote evidence separates base system, SOCAMM, networking, power, service, delivery, buyer tier, and rack ASP, and if Nvidia preserves or explains gross margin after memory inflation. The source's $8/GB to above-$13/GB SOCAMM curve becomes actionable only when actual pricing and customer acceptance appear.

Upgrade TSM only if N3/CoWoS commentary includes explicit allocation economics, prepayments, LTAs, customer-funded capacity, capex recovery, or margin evidence, plus market confirmation. TSMC can be right structurally while still being too opaque for current expression.

Upgrade cloud/neocloud only with minimums, take-or-pay, cancellation rights, utilization floors, prepayments, financing spread, power/network pass-through, service-credit limits, cloud-credit treatment, and residual-risk allocation. The memo will not treat future capacity headlines as enough.

Upgrade lab/workflow capture only if post-reset serving margin includes cloud costs, GPU depreciation, power, networking, prepayments, idle capacity, free-tier load, discounts, refunds, reliability penalties, training amortization, and staff, and still remains high while realized useful-token pricing holds.

Any upgrade also needs same-window market confirmation: OHLCV, volume, liquidity, options, relative strength, beta controls, valuation, revisions, event timing, factor regime, and a defined instrument or basket.

Review date: review at the first public HBM4/SOCAMM2 contract disclosure, first VR NVL72/SOCAMM quote packet, next major memory supplier results, next Nvidia Rubin/SOCAMM commentary, next TSMC N3/CoWoS commentary, or by 2026-07-31, whichever comes first.

Stance Ledger