COMPUTATIONAL MATERIALS DISCOVERY | Negatives / eval-data license

Lattice Graph × Lila Sciences

Scientific superintelligence & autonomous experimentation

Lila's science-foundation-model platform needs hard negatives and calibrated uncertainty to make materials evals trustworthy.

Why now

Lila is scaling autonomous experimentation campaigns now, and every campaign run on a positives-only prior wastes scarce instrument time rediscovering documented failures. Licensing the negatives corpus of more than 23,000 edges, together with the cross-engine trust signals, before the next wave of campaigns is the fastest path to hardening Lila's evals while this data advantage remains proprietary and unmatched in the public market.

What our platform does for Lila Sciences

Lattice Graph operates a computational materials-discovery platform built around a knowledge graph that spans millions of compositions, linking structures, properties, synthesis routes, patent claims, and experimental outcomes into a single governed evidence graph. The core differentiator is not just breadth but depth of evidence: every node and edge carries provenance, so any property value can be traced back to the source calculation, experiment, or literature entry that produced it. For a company like Lila Sciences, whose foundation models need to learn not just what materials do but why certain predictions should be trusted and which experiments have already been run, that provenance chain is the layer that turns a static dataset into a queryable source of truth.

Validation on the platform runs through multiple independent physics engines in parallel. Machine-learning interatomic potentials, including MACE and CHGNet, are combined with density-functional-theory (DFT) calculations to produce phonon spectra, thermodynamic stability envelopes, and formation-energy predictions that are cross-checked against each other before any signal reaches downstream consumers. The disagreement between these engines is itself a signal: where MACE and DFT converge, confidence is high; where they diverge, a flag surfaces. That cross-engine adjudication layer gives science-foundation models something they rarely have access to in standard computed datasets: a calibrated uncertainty signal grounded in physics, not just statistical variance.

Perhaps the most unusual asset on the platform is the labeled negative corpus: more than 23,000 failed-experiment and kill edges representing compositions, dopants, annealing conditions, and interfaces that were tried and did not work. Public materials databases and the scientific literature are overwhelmingly records of success; failures are suppressed by publication bias and largely absent from any training corpus built by re-scraping the open web. Lattice Graph captures these negatives through internal experimental pipelines, labels them with the same provenance discipline applied to positive results, and makes them available through a governed API, giving training and evaluation pipelines access to the side of materials reality that most models have never seen.
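To make the cross-engine idea concrete, here is a minimal sketch of how disagreement between engines could be turned into a flag. It is an illustration only, not the platform's adjudication logic: the function name `adjudicate`, the 0.05 eV/atom threshold, and the sample energies are all assumptions made for this example.

```python
from statistics import mean, pstdev

# Hypothetical threshold (eV/atom) above which engines count as disagreeing.
DISAGREEMENT_THRESHOLD_EV = 0.05


def adjudicate(predictions: dict[str, float],
               threshold: float = DISAGREEMENT_THRESHOLD_EV) -> dict:
    """Summarize formation-energy predictions from several engines
    for a single composition and flag contested values."""
    if not predictions:
        raise ValueError("at least one engine prediction is required")
    values = list(predictions.values())
    spread = max(values) - min(values)  # widest gap between any two engines
    return {
        "consensus_ev_per_atom": mean(values),
        "spread_ev_per_atom": spread,
        "std_ev_per_atom": pstdev(values),
        "contested": spread > threshold,
    }


# Illustrative numbers only, not real calculations.
agree = adjudicate({"MACE": -1.52, "CHGNet": -1.50, "DFT": -1.51})
diverge = adjudicate({"MACE": -1.52, "CHGNet": -1.31, "DFT": -1.40})
print(agree["contested"], diverge["contested"])
```

A simple max-minus-min spread is used here because it is easy to audit; a production signal would more plausibly weight engines by their known error on comparable chemistries.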

Why Lattice Graph × Lila Sciences

Lila Sciences is building toward scientific superintelligence: science-foundation models coupled to autonomous, self-driving experimentation loops intended to close the hypothesis-test-refine cycle far faster than human-paced R&D. That ambition is compelling, but it rests on a structural vulnerability in how publicly available materials data is distributed. The positive-result bias in the scientific literature means that any foundation model trained predominantly on public sources will develop an optimistic prior: it learns what works and is systematically underexposed to what does not. For a platform whose value proposition is the throughput and reliability of an autonomous experimental loop, that optimistic prior is not a minor calibration issue; it is a direct driver of wasted bench cycles and eroded credibility every time the loop proposes a known dead end.

Lattice Graph's strategic fit with Lila is not as a competitor or a parallel discovery engine but as the missing data infrastructure layer. The three products that map most directly to Lila's needs address three distinct gaps: the absence of labeled failure data in training and evaluation sets, the absence of a principled uncertainty signal for ranking predictions before committing autonomous-lab time, and the absence of grounded provenance for the computed values Lila's models ingest and generate. These are precisely the gaps that Lila cannot close by scraping more literature or running more DFT, because the failures were never published and the uncertainty was never quantified in the sources that already exist.

The commercial relationship Lattice Graph envisions for Lila is a data and evaluation license, not an asset sale. Lila keeps its models, its lab infrastructure, and its IP; it licenses the missing training and evaluation signal (the negatives, the disagreement flags, the evidence graph) and uses that signal to harden its own benchmarks and improve the fidelity of its own experiment-prioritization logic. That structure is straightforward to scope, price, and expand incrementally as the engagement matures and Lila identifies additional data products that move its metrics.

Lila Sciences business lines

→Scientific superintelligence platform
→Autonomous experimentation
→Materials & chemistry foundation models

Where we fit

Foundation models for science need hard negatives and calibrated uncertainty. License the negatives/eval atlas and trust signals to harden materials evals; query provenance and evidence via the KG API. $40–75K negatives audit to start.

The Lattice Graph fit for Lila Sciences

The challenge

Name a computational feat you think we can't do.

Here is the specific problem we would take on for Lila. Start with a family of candidate solid-state ionic conductors that Lila's foundation model has ranked in the top decile for predicted Li-ion conductivity. First, identify what fraction of those candidates have documented failure edges in the negatives corpus, whether from decomposition under synthesis conditions, interfacial instability with common electrode materials, or off-stoichiometry collapse. Then, for the subset with no documented failures, quantify the cross-engine disagreement on formation energy and phonon stability to produce a calibrated confidence rank that Lila's autonomous loop could use to sequence experiments. If Lila's current benchmarks cannot tell a high-confidence top candidate from a contested prediction that has quietly failed in three internal campaigns, this is the problem we solve first, and we can demonstrate the delta on Lila's own candidate set within the scope of a negatives audit.
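The two-step triage described in this challenge can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Candidate` class, the `triage` function, the placeholder identifiers (`cand-A` and so on), and all energies are hypothetical, and only the formation-energy spread is used for ranking (phonon stability is left out for brevity).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    composition: str                   # placeholder identifier, not a real material
    engine_energies: dict[str, float]  # formation energy (eV/atom) per engine


def energy_spread(candidate: Candidate) -> float:
    """Widest gap between engines; smaller means better agreement."""
    values = candidate.engine_energies.values()
    return max(values) - min(values)


def triage(candidates: list[Candidate], failure_edges: dict[str, list[str]]):
    """Return the fraction of candidates with documented failures and the
    remaining candidates ranked from least to most contested."""
    failed = [c for c in candidates if c.composition in failure_edges]
    clean = [c for c in candidates if c.composition not in failure_edges]
    failed_fraction = len(failed) / len(candidates) if candidates else 0.0
    ranked = sorted(clean, key=energy_spread)
    return failed_fraction, ranked


# Illustrative inputs only.
top_decile = [
    Candidate("cand-A", {"MACE": -1.20, "DFT": -1.22}),
    Candidate("cand-B", {"MACE": -1.00, "DFT": -1.02}),
    Candidate("cand-C", {"MACE": -0.80, "DFT": -0.95}),
    Candidate("cand-D", {"MACE": -0.90, "DFT": -1.10}),
]
failure_edges = {
    "cand-A": ["decomposition under synthesis conditions"],
    "cand-C": ["interfacial instability with electrode material"],
}

failed_fraction, ranked = triage(top_decile, failure_edges)
print(failed_fraction, [c.composition for c in ranked])
```

In a real audit the ranking key would combine several disagreement measures; keeping it to a single spread keeps the ordering easy to inspect.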

Data & eval products for Lila Sciences

Live data and API products running on our production platform — licensed to your team, with full schemas and access terms on request.

The first and most immediately valuable product for Lila is the Negatives and Eval-Data Atlas, a labeled corpus of more than 23,000 failed-experiment and kill edges representing the compositions, processing conditions, and material interfaces that were tried and documented not to work. Because most of these negatives originate from internal experimental pipelines and have never appeared in the published literature, they cannot be reproduced by competitors who re-scrape the open web. For Lila, the practical value is twofold: the negatives can be incorporated into training to reduce the model's optimistic bias, and they can be used as an independent evaluation set to measure how well the model's predictions cover the failure space, a benchmark Lila cannot build from public data alone. In the context of autonomous experimentation, the negatives database can also function as a pre-flight filter: before the loop commits instrument time to a new candidate, it can check whether that composition or processing route already appears as a documented failure, protecting throughput at the decision layer rather than after the fact.

The Trust and Disagreement Signals product provides the calibration layer that autonomous prioritization requires. It surfaces cross-source disagreement between independent computational engines (situations where DFT and machine-learning potentials produce substantially different formation-energy or property estimates) along with calibrated prediction bounds derived from that multi-engine evidence. For a foundation model that ingests computed datasets as training signal or uses them for candidate scoring, knowing that a particular value is contested across engines is more useful than knowing only the value itself: it allows the model to propagate uncertainty forward into the experiment queue rather than treating all computed inputs as equally reliable.

The Knowledge Graph API completes the picture by providing structured, queryable provenance for every signal: composition-to-structure relationships, property-to-evidence links, natural-language graph traversal, and source attribution so that any value the model encounters or produces can be anchored to the calculations and experiments that underlie it. Together the three products give Lila an independent, physics-grounded, and provenance-backed eval infrastructure, the opposite of grading its own models on their own optimistic priors.
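One way to use labeled negatives as an independent evaluation set is to ask how many documented failures the model still scores as viable. The sketch below shows that calculation under assumptions made for illustration: the `failure_recall` function, the 0.5 viability threshold, and the identifiers and scores are hypothetical and do not describe the Atlas schema.

```python
def failure_recall(model_scores: dict[str, float],
                   documented_failures: set[str],
                   viable_threshold: float = 0.5):
    """Fraction of documented failures that the model scores below the
    viability threshold, plus the failures it still rates as viable."""
    # Only failures the model actually scored can be evaluated.
    scored = [c for c in documented_failures if c in model_scores]
    if not scored:
        raise ValueError("no documented failures overlap with the model's scored set")
    missed = sorted(c for c in scored if model_scores[c] >= viable_threshold)
    recall = 1 - len(missed) / len(scored)
    return recall, missed


# Illustrative scores only: higher means the model expects the candidate to work.
model_scores = {"neg-1": 0.9, "neg-2": 0.2, "neg-3": 0.1, "neg-4": 0.7, "pos-1": 0.8}
documented_failures = {"neg-1", "neg-2", "neg-3", "neg-4", "neg-5"}

recall, missed = failure_recall(model_scores, documented_failures)
print(recall, missed)
```

The `missed` list is the useful part in practice: each entry is a known dead end the model would still propose, which is the case the pre-flight filter is meant to catch.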

Negatives & Eval-Data Atlas

23,196 failed-experiment / kill edges plus the honest-negatives set — the labeled negative results most models never see. License for training, eval, and benchmark hardening.

Trust & Disagreement Signals

Cross-source disagreement flags and calibrated prediction bounds — the uncertainty layer for eval pipelines and model QA.

Knowledge-Graph API

Provenance, composition-360, evidence neighborhoods, and natural-language graph queries across the materials knowledge graph.
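As a rough picture of what tracing provenance means, the sketch below walks a tiny in-memory evidence graph from a property value back to its root sources. It does not use or describe the actual Knowledge-Graph API: the node identifiers, the `DERIVED_FROM` mapping, and the `trace_provenance` function are all hypothetical.

```python
from collections import deque

# Hypothetical evidence graph: node id -> ids of the nodes it was derived from.
DERIVED_FROM = {
    "property:formation_energy#1": ["calc:dft-run-17", "calc:mlip-run-42"],
    "calc:dft-run-17": ["structure:relaxed-3"],
    "calc:mlip-run-42": ["structure:relaxed-3"],
    "structure:relaxed-3": ["source:literature-entry-9"],
    "source:literature-entry-9": [],
}


def trace_provenance(node: str, graph: dict[str, list[str]]):
    """Breadth-first walk upstream from `node`, returning every ancestor in
    visit order and the subset with no further parents (root sources)."""
    seen = {node}
    order = []
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in graph.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    roots = [n for n in order if not graph.get(n)]
    return order, roots


ancestors, roots = trace_provenance("property:formation_energy#1", DERIVED_FROM)
print(ancestors, roots)
```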

In the platform for Lila Sciences

For a data-and-evaluation engagement, the most useful interfaces are the knowledge-graph explorer and the composition-intelligence reporting layer, which serve as the human-readable window onto the same signals Lila's team consumes through the API. Lila's ML engineers and research scientists would use composition-360 views and evidence-neighborhood traversal to audit any sampled material candidate, inspecting its stability evidence, the specific negatives attached to it, and the cross-engine disagreement profile on its computed properties. This makes eval failures traceable: instead of a black-box benchmark score, the team sees the specific provenance node or kill edge that drove a discrepancy, which is exactly the kind of interpretable feedback needed to iterate on training data curation and benchmark design.

The batch-screening workflow and the formation-energy predictor complement the explorer for day-to-day use. Lila's team can run candidate families through batch screening and apply the negatives check and trust signals as a programmatic filter before routing candidates to autonomous-lab queues, while the explorer makes the kill and negatives data human-inspectable so the team can validate coverage, assess label quality, and sample the distribution of failure modes during an evaluation period. These are grounding and auditing tools positioned to harden Lila's own model and loop, not to replace the discovery logic Lila is building.

How an engagement works

The natural entry point for Lila is a scoped negatives audit: a bounded, time-limited engagement in which Lila evaluates the Negatives and Eval-Data Atlas and the Trust and Disagreement Signals against its own model outputs and benchmarks, with the Knowledge Graph API providing provenance context throughout. The goal is concrete and measurable: quantify how much the labeled-failure and calibrated-uncertainty data move Lila's eval metrics before committing to a broader license. Deliverables from the audit include a coverage report mapping the negatives corpus against Lila's current training and evaluation sets, a disagreement-signal integration study showing where cross-engine flags would have changed Lila's experiment-queue rankings, and a provenance-sampling exercise anchoring a representative slice of Lila's computed inputs to their underlying evidence. That audit is scoped in the range of $40,000 to $75,000 as a starting engagement.

Following a successful audit, the relationship converts to a recurring data and evaluation license covering the full suite of products, priced against seat count, query volume, and refresh cadence. The license is non-exclusive and refreshes as the negatives corpus and trust signals grow. If Lila later wants to pull in additional capabilities (freedom-to-operate and patent whitespace screening, synthesis-route intelligence, or the opportunity-identification engine), those are scoped as incremental add-ons to the base license rather than requiring a new negotiation from scratch.

No IP transfer is implied at any stage; Lila licenses the missing signal, keeps its models and lab infrastructure, and retains full ownership of what it builds on top.

Build the Lila Sciences package

Request a sample of the negatives/eval set, the data dictionary, and license terms.

Company names, logos, and trademarks are the property of their respective owners and are referenced here for identification and illustrative purposes only. Their inclusion reflects Lattice Graph's own analysis of where its portfolio may be relevant and does not imply any partnership, endorsement, affiliation, sponsorship, or existing commercial relationship.

Results are informational and should be validated by qualified professionals. See the Terms of Service.