Wirestock raises $23M as AI labs scramble for training data

Wirestock, a content marketplace connecting creators with AI companies, has closed a $23 million Series A funding round to expand its supply of multimodal training data to frontier AI laboratories, according to TechCrunch. The raise underscores mounting pressure on AI developers to secure high-quality, legally compliant datasets as competition intensifies and regulatory scrutiny increases.

The funding round, led by Khosla Ventures with participation from existing investors including Gradient Ventures and Day One Ventures, values the company at approximately $100 million post-money, according to Crunchbase News. Wirestock operates a two-sided marketplace where content creators license images, video, audio and text to AI labs for model training, positioning itself as infrastructure for the data supply chain that underpins generative AI development.

Founded in 2019 as a distribution platform for stock content, Wirestock pivoted in 2023 to focus explicitly on AI training data after observing surging demand from model developers. The company now claims more than 500,000 creators contributing content across modalities on its platform, with revenue growing 340 per cent year-over-year, according to company figures cited by SiliconANGLE.

The business model addresses a critical bottleneck in AI development: securing training data that is both high-quality and legally defensible. Multiple AI companies face ongoing litigation over alleged copyright infringement in their training datasets, including cases brought by Getty Images, The New York Times, and various artist coalitions. Wirestock’s approach—paying creators upfront for explicit licensing rights—offers labs a potential shield against such claims.

“The data supply chain for AI is fundamentally broken,” Wirestock CEO Alex Shvets told TechCrunch. “Labs are spending enormous resources either scraping the internet and dealing with legal consequences, or building internal data operations that don’t scale. We’re building the infrastructure layer that should have existed from the start.”

The company declined to disclose specific customers, but multiple sources familiar with the matter say Wirestock supplies data to at least three of the five largest AI labs by compute spending. Pricing varies by modality and exclusivity, with video content commanding a premium of 5 to 10 times over static images, according to Techloy reporting.

Market implications

The funding signals investor confidence that data infrastructure represents a durable business opportunity rather than a temporary arbitrage during AI’s scaling phase. Labs gain access to vetted, licensed content without building procurement operations in-house. Content creators gain a new monetisation channel as traditional stock photography markets contract under pressure from AI-generated alternatives. Meanwhile, established stock content providers including Shutterstock and Getty Images face potential disintermediation if direct creator-to-lab marketplaces gain traction.

The raise also highlights a strategic shift in AI development economics. As model architectures converge and compute costs decline, data quality increasingly differentiates model performance. Labs are consequently allocating larger budgets to data acquisition—a trend that benefits specialised infrastructure providers like Wirestock whilst potentially disadvantaging smaller AI companies without comparable procurement resources.

Regulatory developments may further entrench this dynamic. The EU AI Act includes provisions requiring documentation of training data provenance, whilst the UK government’s consultation on copyright and AI, published on GOV.UK, suggests potential licensing requirements for commercial AI training. Such regulations would advantage platforms offering auditable licensing trails.

What comes next

Wirestock plans to deploy the funding primarily towards geographic expansion, particularly in Southeast Asia and Latin America, where creator acquisition costs remain lower than in saturated North American and European markets. The company also intends to develop automated quality assessment tools to filter submissions before they reach lab customers, addressing concerns about dataset contamination and adversarial content.

The competitive landscape will likely intensify as adjacent players recognise the opportunity. Adobe, Shutterstock, and Getty Images all operate creator networks and have announced AI training licensing programmes, whilst startups including Scale AI and Snorkel AI approach the problem from different technical angles. Whether Wirestock’s creator-first marketplace model proves more durable than vertically integrated alternatives will depend largely on its ability to maintain content quality whilst scaling supply—a challenge that has defeated many two-sided marketplaces in other sectors.

The funding round confirms that data infrastructure, long overshadowed by model development in AI investment, now commands serious capital allocation as the industry matures beyond its experimental phase.