<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Jiawei Li | UCSC OSPO</title><link>https://ucsc-ospo.github.io/author/jiawei-li/</link><atom:link href="https://ucsc-ospo.github.io/author/jiawei-li/index.xml" rel="self" type="application/rss+xml"/><description>Jiawei Li</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Thu, 18 Jun 2026 00:00:00 +0000</lastBuildDate><image><url>https://ucsc-ospo.github.io/media/logo_hub6795c39d7c5d58c9535d13299c9651f_74810_300x300_fit_lanczos_3.png</url><title>Jiawei Li</title><link>https://ucsc-ospo.github.io/author/jiawei-li/</link></image><item><title>CauST: Causal Gene Intervention for Robust Spatial Domain Identification</title><link>https://ucsc-ospo.github.io/report/osre26/ucsc/caust/20260618-jiawei-li/</link><pubDate>Thu, 18 Jun 2026 00:00:00 +0000</pubDate><guid>https://ucsc-ospo.github.io/report/osre26/ucsc/caust/20260618-jiawei-li/</guid><description>&lt;h2 id="hello-world-">Hello, world 👋&lt;/h2>
&lt;p>I&amp;rsquo;m &lt;strong>Jiawei Li&lt;/strong>, a graduate student at the University of Southern California working at the
intersection of machine learning and computational biology. I&amp;rsquo;m thrilled to be part of
the &lt;strong>Open Source Research Experience (OSRE'26)&lt;/strong> this year, contributing to
&lt;a href="https://ucsc-ospo.github.io/project/osre26/ucsc/caust">UC Santa Cruz OSPO&lt;/a> under the mentorship of
&lt;strong>Lijinghua Zhang&lt;/strong>. This post briefly introduces me and the project I&amp;rsquo;ll be
building over the coming months.&lt;/p>
&lt;h2 id="the-project-caust">The project: CauST&lt;/h2>
&lt;p>As part of OSRE'26, my &lt;a href="https://summerofcode.withgoogle.com/media/user/af71a455291d/proposal/gAAAAABqM_mFg7Tevk5gpESIoYTWQJEwp7ino2Sk1bL27ndGikmQyZzxHMXUir1n4mz7qNhu3UZpMPdclfY6baYaL_wWfsTcesvczmVeH0MfaEGJKFz2TMc=.pdf" target="_blank" rel="noopener">proposal&lt;/a>
introduces &lt;strong>CauST: Causal Gene Intervention for Robust Spatial Domain Identification&lt;/strong>.&lt;/p>
&lt;p>Spatial transcriptomics lets us measure gene expression while preserving the physical
location of each cell or spot within a tissue. A core task on this data is &lt;strong>spatial
domain identification&lt;/strong> — partitioning the tissue into coherent regions (for example, the
cortical layers of the human brain) by combining what genes are expressed with where they
are expressed.&lt;/p>
&lt;p>State-of-the-art methods, such as graph attention autoencoders, do this well on clean
data. But they remain vulnerable to &lt;strong>technical confounders&lt;/strong> — batch effects, platform
differences, and noise — that can be correlated with biology and quietly distort the
domains a model recovers. When that happens, the &amp;ldquo;domains&amp;rdquo; reflect the experiment as much
as the tissue.&lt;/p>
&lt;p>&lt;strong>CauST asks a causal question:&lt;/strong> which genes &lt;em>cause&lt;/em> a spot to belong to a given spatial
domain, as opposed to merely being &lt;em>associated&lt;/em> with it through some confounder? By
framing domain identification as a problem of &lt;strong>causal gene intervention&lt;/strong> — intervening
on gene expression and observing how domain assignments respond — CauST aims to learn
representations that are robust to these confounders and that generalize across tissue
sections, platforms, and batches.&lt;/p>
&lt;h2 id="why-it-matters">Why it matters&lt;/h2>
&lt;p>Robust, reproducible spatial domains are the foundation for downstream biology:
identifying disease-associated regions, mapping cell-type organization, and comparing
tissue across patients. If the domains shift when you change the platform or the batch, so
does every conclusion built on top of them. Bringing causal reasoning into the pipeline is
a step toward results we can trust across labs and datasets.&lt;/p>
&lt;h2 id="a-first-look">A first look&lt;/h2>
&lt;p>The figure below shows spatial domains recovered on a human dorsolateral prefrontal cortex
(DLPFC) section — the kind of benchmark CauST is designed to handle. The left panel is the
model&amp;rsquo;s raw clustering, the middle panel aligns those clusters to the annotated cortical
layers, and the right panel is the expert &amp;ldquo;ground-truth&amp;rdquo; annotation. Getting that middle
panel to match the right one &lt;em>robustly&lt;/em> — across every section, not just the easy ones —
is exactly the problem CauST sets out to solve.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img alt="Spatial domains on DLPFC slice 151676: raw clusters, layer-matched clusters, and the manual ground-truth annotation." srcset="
/report/osre26/ucsc/caust/20260618-jiawei-li/featured_hu523d66e1ed6823469d24a6bd1644ca73_864093_33501b82af07718ce8a5a54ea004a7d1.webp 400w,
/report/osre26/ucsc/caust/20260618-jiawei-li/featured_hu523d66e1ed6823469d24a6bd1644ca73_864093_a8a83f12e8f7492e4fc1fb23e54d4898.webp 760w,
/report/osre26/ucsc/caust/20260618-jiawei-li/featured_hu523d66e1ed6823469d24a6bd1644ca73_864093_1200x1200_fit_q75_h2_lanczos_3.webp 1200w"
src="https://ucsc-ospo.github.io/report/osre26/ucsc/caust/20260618-jiawei-li/featured_hu523d66e1ed6823469d24a6bd1644ca73_864093_33501b82af07718ce8a5a54ea004a7d1.webp"
width="760"
height="230"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="whats-next">What&amp;rsquo;s next&lt;/h2>
&lt;p>Over the course of the OSRE'26 program, I&amp;rsquo;ll be:&lt;/p>
&lt;ul>
&lt;li>formalizing the causal model behind spatial domain identification,&lt;/li>
&lt;li>implementing the gene-intervention mechanism on top of a graph-based spatial encoder,&lt;/li>
&lt;li>and benchmarking robustness against existing methods across multiple tissue sections and
platforms.&lt;/li>
&lt;/ul>
&lt;p>I&amp;rsquo;ll be sharing progress, design decisions, and results here as the project develops.
Thanks for reading — and feel free to follow along!&lt;/p></description></item></channel></rss>