Pipettes → GPUs. These are some notes for me.

Sequence-First

Four summers ago, AlphaFold 2 cracked single-chain protein folding. Now AlphaFold 3 and a bunch of others handle multi-chain assemblies, protein–nucleic-acid hybrids, and even small-molecule docking. In drug discovery, structure is everything: if you find a pocket, you can design a key. And designing it is a lot like sudoku.

Sequence-first discovery, on the other hand, starts from the sequence itself: you design short oligos (siRNA, antisense, CRISPR guides), often ~20-mers of RNA or DNA, that tell the cell what to do. Just a string of bases and instructions.

The pipeline consists of a bunch of ML tools that cover RNA structure, chemistry, delivery, immunology, and PK/PD — I’ll explain in-depth in another post. It obviously varies with what you’re dealing with, but a brief overview:

Target nomination/enumeration --> potency scoring (DeepSilencer) --> safety filters (AttSiOff) --> secondary structure & accessibility --> chemical-modification design --> manufacturability --> synthesis / in-vitro screening / IND-enabling --> and finally clinical

Some metrics the models look for are:

  • GC-content sweet spot, usually around 40-60%
  • minimal self-folding ΔG so the guide stays linear
  • target-site accessibility
  • absence of seed complements (to heighten specificity)
  • chemical-mods that survive nucleases but keep RNA hybridization
  • predicted secondary structure during synthesis
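Some of these checks fall straight out of the sequence. A minimal sketch of the first two — the thresholds are illustrative, and `self_folds` is a crude reverse-complement window proxy, not a real ΔG calculation:

```python
def gc_content(seq: str) -> float:
    """Fraction of G/C bases."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def revcomp(seq: str) -> str:
    """Reverse complement for RNA."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[b] for b in reversed(seq))

def self_folds(seq: str, stem: int = 5) -> bool:
    """Crude hairpin proxy: does any stem-length window of the sequence
    appear in its own reverse complement (i.e., could pair with another
    stretch of itself)? Not a real ΔG calculation."""
    rc = revcomp(seq)
    return any(seq[i:i + stem] in rc for i in range(len(seq) - stem + 1))

def passes_basic_filters(seq: str) -> bool:
    """GC sweet spot plus no obvious self-folding."""
    return 0.40 <= gc_content(seq) <= 0.60 and not self_folds(seq)

demo = "GGGGGUUUUUCCCCC"  # deliberately hairpin-prone toy sequence
print(gc_content(demo), self_folds(demo))  # GC ≈ 0.67, self-folds: True
```

A real pipeline would swap the proxy for a proper folding engine, but the shape of the filter stage is the same: sequence in, pass/fail out.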

DeepASO: A Sequence-Centric Model

2023’s DeepASO coupled a convolutional encoder with bidirectional gated recurrent units (GRUs). The encoder spots local motifs while the bidirectional GRUs read what nucleotides sit upstream and downstream of each motif to build global context — basically a CNN + RNN combination. This is a sequence-centric model: all learning happens in 1D, letters-only space. It never needed 3D structural coordinates.

**Note:** while DeepASO never ingests 3D coordinates or AlphaFold models, it does include pre-computed secondary structures and energy terms, like many other ASO/siRNA predictors. These are formatted as 1D numeric summaries of folding, though, not a full 3D pipeline. So it's not a pure sequence-only model.

DeepASO predicted exon-skipping efficiency for Duchenne antisense oligos and was trained on ~9,000 splice-modulating experiments. It showed a +15% AUC over heuristics and flagged three novel sequences now being tested in mouse models.

I think sequence-first is important. It’s cheap and lets startups skip the structure bottleneck: you have the data, which I think is the hardest part. The atoms will follow.

I don’t know about a “sequence-only” pipeline though. Sequence-first is a good jumping-off point and tells you which oligos deserve the expensive chemistry (DeepASO is proof of this), but structural context, backbone chemistry, and delivery tech/safety are still just as important.

Like, you have to ensure that siRNA won’t fold into a pesky G-quadruplex, right? Or find out how that backbone chem-mod might alter shape? Structural generators are still needed.

Structure

Diffusion Graphs

DiffDock (2023) treated docking as an image-in-reverse problem — starting with the ligand as noise in 3D space (XYZ coordinates) and denoising step by step until it fits the binding pocket. This was different from the docking heuristics of the time, which just churned through like a million random poses per ligand and then scored them (search-based).

Tangent: Point Clouds

3D point clouds = list of atoms with x/y/z coordinates. Atoms are points, bonds become edges.
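In code that's just a coordinate array plus a pair list — here's a toy sketch where edges come from a distance cutoff (the coordinates and the 1.6 Å cutoff are made up for illustration):

```python
import numpy as np

# Each atom is a row of XYZ coordinates (a toy 4-atom fragment).
coords = np.array([
    [0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0],
    [3.0, 0.0, 0.0],
    [1.5, 1.4, 0.0],
])

# Edges: atom pairs closer than a cutoff (a crude stand-in for bonds).
CUTOFF = 1.6
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
edges = [(i, j) for i in range(len(coords))
         for j in range(i + 1, len(coords))
         if dists[i, j] < CUTOFF]
print(edges)  # → [(0, 1), (1, 2), (1, 3)]
```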

SE(3) = the 3D “Special Euclidean” group: translations + rotations. Drug-binding physics are invariant under both.

Equivariance = rotating/translating the input causes latent features to rotate/translate the same way (or stay unchanged, if invariant).

**f(Sx) = Sf(x), where S is symmetry**

SE(3)-equivariant GNN = a graph neural network that sees a rotated protein once and implicitly knows every other orientation, maintaining the original rotation/translation in every intermediate feature map.

The equivariant point network uses relative geometry only: atoms carry relative position vectors and distances, which rotate consistently with the whole molecule. Features are stored as tensors and multiplied by scalars to preserve equivariance. The network also pools over neighbors to track how each vector would rotate, then aggregates them.
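You can verify the f(Sx) = Sf(x) property numerically with a toy layer built from exactly these ingredients — relative vectors weighted by distance-only scalars. This is a sketch of the principle, not any particular published architecture:

```python
import numpy as np

def equivariant_layer(coords: np.ndarray) -> np.ndarray:
    """Toy equivariant update: each atom sums relative position vectors
    to every other atom, weighted by a scalar function of distance
    (distances are rotation-invariant, so the weights are too)."""
    rel = coords[None, :, :] - coords[:, None, :]       # (N, N, 3) relative vectors
    dist = np.linalg.norm(rel, axis=-1, keepdims=True)  # (N, N, 1) pairwise distances
    weights = np.exp(-dist)                             # invariant scalar weights
    return (weights * rel).sum(axis=1)                  # (N, 3) vector output per atom

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))                   # toy 5-atom point cloud

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
if np.linalg.det(Q) < 0:                      # flip one axis to make it a proper rotation
    Q[:, 0] *= -1

# Equivariance: rotating the input rotates the output the same way...
lhs = equivariant_layer(x @ Q)
rhs = equivariant_layer(x) @ Q
print(np.allclose(lhs, rhs))  # → True

# ...and translating the input leaves the output unchanged (relative
# geometry only). Together, that's SE(3) equivariance.
shifted = equivariant_layer(x + np.array([1.0, 2.0, 3.0]))
print(np.allclose(shifted, equivariant_layer(x)))  # → True
```

Real architectures stack many such layers with learned weights, but the symmetry argument is the same: every operation is built from quantities that transform predictably under rotation.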

This shares similar principles with 2D CNNs, which are translation-equivariant — except we’re extending the idea to 3D molecules. More on kernel point convolutions in another post.

Okay Back to Diffusion

David Baker’s group developed RFdiffusion, which can generate novel backbones that accomplish a user-defined task:

  1. Start with random Cα traces (pure noise).
  2. Diffusion model denoises → plausible backbone respecting bond geometry. New coords for every atom.
  3. Secondary network fits side‑chains & checks rotamers.
  4. AlphaFold‑style confidence filter.
  5. X‑ray validation.
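The denoising in step 2 can be caricatured in a few lines. Here a toy 1D "backbone" is pulled out of pure noise toward a known target; a real diffusion model predicts the noise with a trained network, while this oracle just subtracts toward the answer:

```python
import numpy as np

rng = np.random.default_rng(42)

target = np.array([0.0, 1.0, 2.0, 3.0])   # stand-in for an ideal Cα trace
x = rng.normal(size=4) * 5.0              # step 1: start from pure noise

# Step 2 caricature: iteratively remove "predicted noise".
# A trained diffusion model would estimate the noise from x alone;
# this toy oracle cheats and computes it from the known target.
for step in range(50):
    predicted_noise = x - target
    x = x - 0.1 * predicted_noise         # small denoising step

print(np.round(x, 3))  # ≈ target after 50 steps
```

The residual shrinks geometrically (by 0.9 per step here), which is why a few dozen steps suffice in the toy; real models need the learned predictor precisely because the target backbone isn't known in advance.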

In their ’23 Nature paper, ~30% of designs folded in vitro and many bound their targets at low-nM affinity. Keep in mind these were all-artificial scaffolds.

Here in 2025, diffusion is going all in on chemistry and small-molecules. Conditioning is becoming smarter, and I’m interested in seeing how the sequence+structure pipeline improves.

Takeaway

The gap between screening and hit-generation is now measured in GPUs and weeks, not chemists and years. The regulatory drag still exists though.