Why this tutorial exists
About 200 million people worldwide are slowly losing their central vision to a disease called age-related macular degeneration (AMD). For the worst form of it — wet AMD — we already have drugs that can save sight, all of which target a single protein. This tutorial walks through how scientists found that protein, how they drug it, and how you could use modern AI tools like Boltz to design the next generation of medicines.
By the end, you should be able to:
- Explain what part of the eye breaks down in AMD
- Name the protein (VEGFA) that drives the worst form of the disease, and why blocking it works
- Describe how a researcher uses a database called UniProt to grab a protein's amino-acid sequence
- Walk through a real Boltz workflow: create a project, add a UniProt target, kick off AI structure prediction
- Connect each click in the software to a real step in modern drug discovery
This is a companion to the KRAS-cancer tutorial. Same software, completely different disease — that's the point: the workflow generalizes.
The eye, the macula, and what goes wrong
Light enters your eye through the pupil, gets focused by the lens, and lands on the retina at the back. The retina is packed with photoreceptor cells that convert light into electrical signals your brain reads as vision.
Right in the center of the retina is a tiny spot — about 5 mm across — called the macula. The macula has the densest cluster of color-sensing cone cells in your whole eye. It's what you're using to read this sentence. Lose your macula and you lose the ability to read, recognize faces, drive, or do detail work — even though your peripheral vision still works.
Two flavors of AMD
Dry AMD is the slower kind. Over years, fatty deposits called drusen build up under the retina, and the macula's photoreceptor cells gradually die off. About 85–90% of AMD cases are dry. There's no cure, though we now have some drugs (pegcetacoplan, avacincaptad pegol) that slow down its advanced stage, called geographic atrophy.
Wet AMD is the faster, more destructive kind. Abnormal new blood vessels grow from the layer behind the retina and leak fluid and blood into the macula, distorting and destroying vision sometimes within weeks. It's only 10–15% of AMD cases but causes most of the severe vision loss. Wet AMD is what this tutorial focuses on, because it's the form we know how to drug effectively.
Go deeper: why "wet" and "dry"?
The names come from what an ophthalmologist sees in the eye. In dry AMD the retina looks dry — just thinning and yellowish deposits. In wet AMD there's actual leaking fluid (and sometimes blood) pooling under and inside the retina. That fluid is what makes vision warp and disappear so fast. The technical name for the abnormal blood-vessel growth is choroidal neovascularization, abbreviated CNV.
Meet VEGFA — the protein at the center of it all
Why do abnormal blood vessels suddenly grow into the back of an aging eye? In healthy tissue, blood vessels grow only when something tells them to grow. That "something" is a small signaling protein your cells release when they're short on oxygen or nutrients, and it's called VEGFA (pronounced "veg-eff-A"). It stands for Vascular Endothelial Growth Factor A.
In the aging retina, the supporting layer underneath (the retinal pigment epithelium, or RPE) starts producing too much VEGFA. The cells lining nearby blood vessels see this signal, switch on, and start sprouting new vessels. But those vessels are leaky and fragile, and they grow in the wrong place: right into the macula. Block VEGFA and you cut the "grow!" signal at the source.
Go deeper: VEGFA in normal biology
VEGFA isn't a "bad" protein. It's essential. Embryos need it to build the circulatory system. Adults need it for wound healing (you grow new tiny vessels to feed a healing cut), for the menstrual cycle, for exercising muscle. The problem in wet AMD is too much of it in the wrong place. That's why anti-VEGF drugs are given as eye injections — you only block VEGFA right at the back of the eye, not everywhere in the body.
Go deeper: VEGFA's UniProt entry
UniProt is the protein equivalent of Wikipedia for biologists — a free database with the full amino-acid sequence, structure, function, mutations, and references for every known protein. The entry for human VEGFA has the ID P15692. The protein is 395 amino acids long. That ID is the only thing you need to tell Boltz to pull the whole sequence — you'll see it happen in the walkthrough.
How wet AMD is treated today
The breakthrough came in the mid-2000s. Researchers reasoned: if VEGFA is the signal driving the bad blood-vessel growth, just intercept VEGFA before it can reach its receptor. The first drug to do this was ranibizumab (brand name Lucentis), approved by the FDA in 2006. It's a piece of an antibody — the part that latches onto VEGFA — and it's injected directly into the eye every 4–8 weeks.
Aflibercept (Eylea), approved in 2011, took a different approach: it's a "decoy receptor" — a lab-built protein that looks like the VEGFA receptor's grabbing end. VEGFA binds the decoy instead of the real receptor and is taken out of circulation. Aflibercept lasts longer in the eye, so injections can be every 8–16 weeks.
Bevacizumab (Avastin) is a full antibody originally designed for colon cancer that doctors discovered also worked beautifully for wet AMD. It's much cheaper than the eye-specific drugs and gets used off-label all over the world.
Go deeper: how good are these drugs?
Before anti-VEGF drugs, most wet AMD patients went legally blind in the affected eye within two years. With regular anti-VEGF injections, the average patient now keeps their vision over the same period, and many actually gain back some lost vision. It's one of the clearest success stories in modern medicine. The catch: injections every 1–2 months for the rest of your life, and the drugs don't work for everyone.
Go deeper: what's next?
Active areas of research include: (1) longer-lasting anti-VEGF drugs and formulations, so patients need injections only twice a year (brolucizumab is a recent step in that direction); (2) drugs that block VEGFA plus another growth factor, Ang-2, for harder cases (faricimab is the first approved example); (3) gene therapy that turns the eye itself into a permanent anti-VEGF factory after a single injection; (4) oral small molecules that block VEGFA signaling. None of these pills is approved yet for AMD, but they are the biggest prize, because pills are far easier than eye injections.
How to pick a target: UniProt 101
Before you can drug a protein, you need its sequence — the exact string of amino acids that fold into the 3D shape. That sequence is what AI tools like Boltz use to predict structure and where drugs could bind.
UniProt is the world's standard protein database. Nearly every protein that has been sequenced, from any organism, has a UniProt entry. Each entry has a unique ID (called an accession number): a short string of letters and digits like P15692. Give Boltz that ID and it pulls the entire sequence automatically.
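For readers who want to see what "pulling the sequence" looks like outside of Boltz, here is a minimal Python sketch. It assumes UniProt's public REST endpoint at rest.uniprot.org, which serves one FASTA file per accession. How Boltz itself retrieves sequences is not described in this tutorial, and the function names below are made up for illustration.

```python
from urllib.request import urlopen

# Assumption: UniProt's public REST endpoint for a single entry in FASTA format.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text: str) -> tuple[str, str]:
    """Split a single-record FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    header = lines[0][1:]            # drop the leading '>'
    sequence = "".join(lines[1:])    # sequence is wrapped over several lines
    return header, sequence


def fetch_sequence(accession: str, timeout: float = 10.0) -> str:
    """Download one UniProt entry and return its sequence (needs network access)."""
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return parse_fasta(text)[1]


# Offline demonstration on a made-up record (NOT the real VEGFA sequence).
example = ">sp|X00000|DEMO_HUMAN Illustrative record\nMKTAY\nIAKQR\n"
header, sequence = parse_fasta(example)
print(header)
print(sequence, len(sequence))
```

With a network connection, `fetch_sequence("P15692")` should return the human VEGFA sequence that the walkthrough imports by ID.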
Go deeper: what's an "accession number"?
UniProt accessions are 6-character codes for most proteins (newer ones are 10). They're permanent and unique — P15692 will mean human VEGFA forever, even if the protein gets renamed or reclassified. Scientists cite accession numbers in papers exactly the way you'd cite a book's ISBN. This standardization is what lets databases and software all talk to each other.
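Because accessions follow a fixed format, software can check that an ID is at least well-formed before looking it up. The sketch below uses the regular expression published in UniProt's help pages; the helper name `is_uniprot_accession` is an illustrative choice. A well-formed ID is not necessarily one that exists in the database.

```python
import re

# Accession pattern as published in UniProt's documentation:
# 6 characters for most entries, 10 for newer ones.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_uniprot_accession(text: str) -> bool:
    """Return True if text has the shape of a UniProt accession."""
    return ACCESSION_RE.fullmatch(text) is not None


for candidate in ["P15692", "A0A024R161", "VEGFA", "p15692"]:
    print(candidate, is_uniprot_accession(candidate))
```

Note that a gene name like VEGFA fails the check: names can change, which is exactly why the accession is the thing to cite.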
Go deeper: how did we pick VEGFA out of all proteins?
Working backward from the disease. We knew: wet AMD = abnormal blood vessel growth in the retina. We asked: what biological signal drives new blood vessel growth? Decades of basic research had answered that: VEGFA. So VEGFA became the prime suspect. This is the standard pattern in drug discovery: identify the disease mechanism first, then identify the protein at the controlling node of that mechanism, then drug that protein. Picking the wrong target is one of the most common reasons new drugs fail in clinical trials.
Boltz walkthrough: building the project, end to end
This section recreates exactly what happens when you click through Boltz to set up a new VEGFA drug-discovery project. Each diagram below shows a real screen with arrows pointing at every button. The project was actually created in a live Boltz workspace — these mockups document the workflow step by step.
The choice of Small Molecule vs Protein matters. Current approved AMD drugs (aflibercept, ranibizumab) are all proteins. But proteins are expensive and need eye injections. A working small-molecule VEGFA blocker — a pill — would be a huge deal. So we pick Small Molecule and tell Boltz to hunt for chemicals.
Boltz greets a new project with a three-step roadmap so you never wonder what to do next: add a target → build an experiment → add candidate molecules. Same shape every project takes.
Boltz gives you four ways to hand it a protein. UniProt Import is the easiest — you just type the accession number you looked up earlier, and the entire amino-acid sequence comes in automatically. (RCSB Import is similar but pulls from the Protein Data Bank, where solved 3D structures live. Sequence and File are for when you have the data in hand already.)
In about two seconds, all 395 amino acids of human VEGFA show up in your project, and you never have to copy-paste a thing. This is the magic of standard identifiers like UniProt accessions: every database speaks the same language.
Go deeper: what do those letters mean?
Each letter is a one-letter abbreviation for one amino acid. M = methionine (almost every protein is built starting with M, the "start" amino acid coded by the start codon AUG in your mRNA). T = threonine, D = aspartate, R = arginine, Q = glutamine, and so on. There are 20 standard amino acids and each has a single-letter code. The order of letters is what determines the protein's 3D shape, and the 3D shape determines what the protein does. With 20 choices at each of 395 positions, there are 20^395 possible sequences of this length, so the odds of landing on this exact one at random are astronomically small. Every detail of that sequence was shaped by evolution.
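To make the letter code and the size of that number concrete, here is a small Python sketch. The lookup table covers the 20 standard amino acids; the function names are illustrative, and the five-letter input is just the example letters named above.

```python
# One-letter codes for the 20 standard amino acids.
ONE_LETTER = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartate",
    "C": "cysteine", "Q": "glutamine", "E": "glutamate", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}


def spell_out(sequence: str) -> list[str]:
    """Translate one-letter codes into amino-acid names."""
    return [ONE_LETTER[residue] for residue in sequence.upper()]


def sequence_space_digits(length: int, alphabet: int = 20) -> int:
    """Number of decimal digits in alphabet**length (count of possible sequences)."""
    return len(str(alphabet ** length))


print(spell_out("MTDRQ"))
print(sequence_space_digits(395))  # 514
```

The second line of output says that 20^395 is a 514-digit number. For comparison, the number of atoms in the observable universe is usually estimated at around 80 digits.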
This is the moment the AI earns its keep. Boltz takes your sequence and predicts the 3D fold — where every amino acid sits in space, where the binding pockets are, what shape the surface takes. Three minutes for an answer that, ten years ago, required experimental work that could take a graduate student a year.
Go deeper: why "unbound" structure?
Proteins move. They often look slightly different when nothing is stuck to them ("unbound" or "apo" form) than when a drug or another protein is sitting in their pocket ("bound" or "holo" form). Boltz starts by predicting the unbound shape so you can see the natural resting state of VEGFA, including the pocket where a drug would eventually go. Later steps will show the protein with candidate drugs docked in.
Go deeper: VEGFA is actually a dimer
VEGFA in real life doesn't float around as one chain. Two copies of it lock together (one upside down relative to the other) to form a working "dimer." That's why a single VEGFA dimer can grab two receptors on the blood-vessel cell at once and pull them together, which is the step that switches the receptors on. Boltz can model this if you tell it to add two copies of the sequence. For high-school purposes the single chain is fine, but it's a good reminder that real biology is usually messier than the simple picture.
How you actually build the drug
The previous section ended with Boltz predicting the 3D structure of VEGFA. Now what? Predicting a protein is interesting, but it's not a drug. This section walks through exactly how you go from "I have a target protein" to "I have a candidate molecule that blocks it" — using screens I just captured by clicking through the live Boltz interface.
Boltz gives you three ways to get candidate molecules:
- Generate with AI: let a neural network invent brand-new molecules custom-designed to fit your protein's pocket. Pick a quantity from 100 to 100,000.
- Screen a library: give Boltz a list of existing molecules (your own CSV file, or a pre-loaded one like the Enamine Kinase Inhibitor Library) and let it predict which ones bind your target.
- Draw your own: sketch a single molecule in the Design view, click Submit, and Boltz predicts how (and whether) it binds. This is the "manual" route a medicinal chemist uses to test specific hypotheses.
The minute Boltz finishes the structure prediction, it asks you the central question of drug discovery: do you want the AI to generate brand-new molecules for your target, or do you want it to evaluate molecules you already have in mind? In a real project you'd often do both — generate some, screen some.
Every experiment in Boltz is a self-contained run with a name and a hypothesis. The hypothesis box isn't decoration — writing one in plain English forces you to be specific about what you're testing. Good science makes a falsifiable claim before running the experiment, not after.
Go deeper: why the workflow is hypothesis-shaped
Drug-discovery projects can take 10+ years and cost billions. Every step you take should narrow down the space of "what we still don't know." Writing a hypothesis up front ("a molecule that binds the receptor-binding surface of VEGFA should block its binding to VEGFR2") makes it possible months later to look back and ask: did we test what we thought we were testing? Even AI tools work better when you tell them a clear goal: "design molecules that bind here, with these constraints" gives much better candidates than "make me something cool."
Go deeper: what does "Generative" actually mean?
Generative chemistry models are AI systems trained on millions of known molecules. They learn the grammar of chemistry — which atoms go next to which, which functional groups stabilize which others, what makes a molecule drug-like. Then, given a target protein's binding pocket, they propose new molecules that should fit that specific shape. Modern generative models can produce molecules nobody has ever seen before — but that obey all the rules of plausible chemistry. It's analogous to how language models generate sentences nobody has written, by learning the grammar of English.
Library mode is the other half of the workflow. You either upload a CSV of molecules in SMILES format (a text encoding of chemical structure; any molecule can be written as a SMILES string, often in more than one equivalent way), or pick from libraries the Boltz deployment has pre-loaded. The default option shown above is the Enamine Kinase Inhibitor Library; in a real project you'd find libraries for many drug classes.
Go deeper: SMILES, the language of molecules
SMILES stands for Simplified Molecular Input Line Entry System. It's a way of writing a molecule's structure as a string of characters. For example, caffeine is "CN1C=NC2=C1C(=O)N(C(=O)N2C)C" and aspirin is "CC(=O)OC1=CC=CC=C1C(=O)O". Nearly every chemical database speaks SMILES. A CSV file with one SMILES per row is a common way to hand a list of molecules to a computational chemistry tool. Boltz reads SMILES, builds the 3D shape of each molecule, and tries to dock it into your protein's pocket.
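As a sketch of what such a file might look like, the Python below writes the two example molecules to CSV text using only the standard library. The column names `name` and `smiles` are assumptions; this tutorial does not say which header names Boltz expects, so check the upload screen before relying on them.

```python
import csv
import io

# The two SMILES strings quoted in the text above.
molecules = [
    {"name": "caffeine", "smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"},
    {"name": "aspirin", "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"},
]


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize molecules to CSV text, one SMILES per row (hypothetical column names)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "smiles"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(molecules)
print(csv_text)

# Reading the text back confirms nothing was mangled on the way out.
round_trip = list(csv.DictReader(io.StringIO(csv_text)))
print(round_trip[1]["smiles"])
```

Using the `csv` module rather than joining strings by hand matters once molecule names contain commas or quotes.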
Once the screen finishes, you're back in the familiar Table / Triage / Design interface (covered in the KRAS tutorial). Filter by binding confidence, walk through the top hits in Triage, flag the promising ones, and either order them for actual lab testing or start a new generative round that uses your best hits as starting points. Real projects loop through this cycle many times before settling on a final candidate.
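The filter-then-rank step described above is simple enough to sketch in a few lines of Python. This is an illustration of the idea only, not Boltz's implementation: the `Candidate` record, the threshold, and the scores are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    molecule_id: str
    binding_confidence: float  # assumed to run from 0 (no binding) to 1


def shortlist(candidates: list[Candidate], min_confidence: float, top_n: int) -> list[Candidate]:
    """Drop weak candidates, then keep the top_n strongest for closer review."""
    kept = [c for c in candidates if c.binding_confidence >= min_confidence]
    kept.sort(key=lambda c: c.binding_confidence, reverse=True)
    return kept[:top_n]


# Made-up scores, for illustration only.
candidates = [
    Candidate("mol-a", 0.12),
    Candidate("mol-b", 0.42),
    Candidate("mol-c", 0.31),
    Candidate("mol-d", 0.05),
]
picked = shortlist(candidates, min_confidence=0.10, top_n=2)
print([c.molecule_id for c in picked])  # ['mol-b', 'mol-c']
```

In a real project the shortlist would then go to a human for triage, because a high score alone does not make a molecule worth ordering.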
Go deeper: why iteration matters more than one big screen
A single 50,000-molecule screen gives you maybe 100 plausible hits. But the top hit usually isn't good enough to be a drug on its own: it might bind, but be too greasy, too large, or have a flaw that makes it toxic. What real drug chemists do is take the best hit and vary it: swap one atom, add a ring, shrink a tail. Each variation gets re-scored. After 5–10 cycles of "best hit → 100 variations → new best hit," you've optimized into something genuinely drug-like. This is called lead optimization, and it's where most of the real intellectual work of medicinal chemistry happens. Boltz can run the computational side of this loop in days; the old make-and-test way took years.
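The "best hit → 100 variations → new best hit" cycle is a greedy search, and its shape can be sketched in Python. Everything chemical here is a stand-in: a single number plays the role of a molecule, `toy_variants` plays the role of a chemist's edits, and `toy_score` plays the role of a binding prediction. None of this is Boltz's actual method.

```python
import random
from typing import Callable, Iterable, TypeVar

M = TypeVar("M")  # whatever represents a molecule


def optimize(
    start: M,
    make_variants: Callable[[M], Iterable[M]],
    score: Callable[[M], float],
    cycles: int,
) -> tuple[M, list[float]]:
    """Greedy loop: vary the current best, re-score, keep any improvement."""
    best, best_score = start, score(start)
    history = [best_score]
    for _ in range(cycles):
        parent = best  # all variants in this cycle come from the same parent
        for variant in make_variants(parent):
            variant_score = score(variant)
            if variant_score > best_score:
                best, best_score = variant, variant_score
        history.append(best_score)
    return best, history


rng = random.Random(0)  # fixed seed so the toy run is repeatable


def toy_score(x: float) -> float:
    """Toy stand-in for a binding score: highest at x = 3."""
    return -((x - 3.0) ** 2)


def toy_variants(x: float) -> list[float]:
    """Toy stand-in for 100 small chemical edits."""
    return [x + rng.uniform(-0.5, 0.5) for _ in range(100)]


best, history = optimize(0.0, toy_variants, toy_score, cycles=8)
print(round(best, 2), [round(h, 2) for h in history])
```

The history list never goes down, which mirrors the text: each cycle either finds a better molecule or keeps the one it had. A greedy loop like this can get stuck on a local optimum, which is one reason real projects mix in fresh generative rounds.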
Go deeper: why most candidates still fail
About 90% of molecules that look great in computational screens fail in the lab — they don't actually bind, or they bind but can't get into cells, or they get into cells but get destroyed by liver enzymes within minutes, or they cause unrelated side effects. Drug discovery has gotten much better with AI but it's still fundamentally an empirical business. The goal isn't "find the perfect molecule on the computer" — it's "find the top 50 worth spending lab time on, instead of the top 50,000 you'd have to test blindly." That's the speedup.
To make this concrete, I actually ran a "Tiny" generative screen against the demo target while writing this tutorial. Boltz produced 152 candidate molecules in about ten minutes. Below is the actual top candidate it returned, plus the progress chart that filled in as molecules came back. These are real numbers from a real run — not mockups.
The green line is the best-scoring molecule so far. It jumped quickly — within the first 10 molecules Boltz had already found something scoring ~0.42 binding confidence — and then leveled off. The blue line (10th best) climbed more gradually as more candidates came in. The orange (100th best) is still flat at zero because at Tiny size you don't even have 100 molecules with non-zero binding scores yet.
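One way to compute lines like these is to track the k-th best score seen so far as each molecule arrives. The Python sketch below does that with a small heap. It is an assumption about how such a chart could be built, not a description of Boltz's code, and the scores are invented.

```python
import heapq


def running_kth_best(scores: list[float], k: int) -> list[float]:
    """For each arriving score, report the k-th best seen so far.

    Reports 0.0 until at least k scores have arrived.
    """
    top_k: list[float] = []  # min-heap holding the k largest scores so far
    line = []
    for score in scores:
        if len(top_k) < k:
            heapq.heappush(top_k, score)
        elif score > top_k[0]:
            heapq.heapreplace(top_k, score)  # evict the weakest of the top k
        line.append(top_k[0] if len(top_k) == k else 0.0)
    return line


# Made-up scores in arrival order, for illustration only.
scores = [0.10, 0.42, 0.05, 0.30, 0.20]
print(running_kth_best(scores, 1))    # [0.1, 0.42, 0.42, 0.42, 0.42]
print(running_kth_best(scores, 3))    # [0.0, 0.0, 0.05, 0.1, 0.2]
print(running_kth_best(scores, 100))  # [0.0, 0.0, 0.0, 0.0, 0.0]
```

The k = 1 line jumps early and then plateaus, the k = 3 line climbs more slowly, and the k = 100 line stays at zero, which is the same pattern as the green, blue, and orange lines described above.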
Go deeper: why the top score was "only" 0.421
Three reasons. First, the target was a small 147-residue protein with a fairly shallow pocket, which is challenging compared to a deep enzyme active site. Second, we ran Tiny (100 molecules); the model needs more samples to find genuinely good binders. Third, we used no expert constraints: no known binders as starting points, no specified pocket residues, no chemistry filters. A real project would seed the AI with a known active molecule as a starting scaffold and ask it to vary that, which usually improves results a great deal. So 0.421 from a no-guidance Tiny run is actually a reasonable starting point. It shows the system works end-to-end and gives you a feel for what to optimize next.
Go deeper: reading the structure of SM-2YDN1KSL
Decoding the molecule: at the center is a pyridine ring (like benzene, but with one ring carbon swapped for a nitrogen), a very common drug scaffold because the ring nitrogen can hydrogen-bond with protein side chains. Hanging off it: a nitrile group (C≡N), a phenyl ring, an isobutyl group, and an amide linker (NH-C=O) connecting to a chlorophenyl group. The molecular weight (403 Da) is in the typical drug range (Lipinski's rule of 5 caps it at 500). The slight ⚠ on CLogP means it's a touch too oily; a real chemist would swap one of the phenyl rings for something more polar before going further.
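Lipinski's rule of 5 is a checklist with four thresholds, so it translates directly into code. In the Python sketch below, only the molecular weight (403 Da) comes from the run described above. The CLogP and hydrogen-bond counts are placeholder values chosen to illustrate a CLogP warning; they are not measurements of SM-2YDN1KSL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Properties:
    molecular_weight: float  # in daltons
    clogp: float             # calculated oil/water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(p: Properties) -> list[str]:
    """Return the Lipinski criteria that a molecule breaks (empty list = passes all)."""
    violations = []
    if p.molecular_weight > 500:
        violations.append("molecular weight > 500 Da")
    if p.clogp > 5:
        violations.append("CLogP > 5")
    if p.h_bond_donors > 5:
        violations.append("more than 5 H-bond donors")
    if p.h_bond_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


# Molecular weight is from the text; the other three values are hypothetical placeholders.
example = Properties(molecular_weight=403.0, clogp=5.3, h_bond_donors=1, h_bond_acceptors=3)
print(rule_of_five_violations(example))  # ['CLogP > 5']
```

The rule is a screening heuristic for oral drugs, not a law: plenty of approved drugs break one criterion, which is why a single warning prompts a tweak rather than a rejection.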
Discussion prompts
Use these in pairs, small groups, or as a whole-class conversation. Connect back to what you read above; there are no single "right" answers.
- Same tool, different disease. The KRAS tutorial and this AMD tutorial use the exact same software. What does that tell you about how modern drug discovery actually works as a process? Why might one general tool be more useful than many specialized ones?
- Why VEGFA? Once researchers understood that wet AMD was caused by abnormal blood-vessel growth, why did they zero in on VEGFA specifically instead of going after the receptor (VEGFR) or the blood-vessel cells themselves? What are the tradeoffs of attacking different points in the same pathway?
- Proteins vs. pills. The approved AMD drugs are all proteins given by injection into the eye. A new small-molecule drug could be a pill. What advantages would a pill have? What might be lost?
- Side effects of blocking VEGFA everywhere. If a small molecule blocked VEGFA throughout the body (rather than just in the eye, like an injection does), what biological processes might get disrupted? Use what you read about VEGFA's normal jobs.
- The UniProt step. Why does the entire modern biology and pharma industry agree on a database like UniProt? What would happen if every lab made up its own naming system for proteins?
- AI predictions vs. experiments. Boltz predicts a structure in 3 minutes. X-ray crystallography might take a year. Should we trust the AI's answer the same way we'd trust the experimental answer? When would you want both? Why?
- Equity and access. Anti-VEGF eye injections cost hundreds to thousands of dollars per dose, and people need them every month or two for life. In wealthy countries this is covered by insurance. In low-income countries many people who'd benefit don't get treated. What responsibilities do drug developers, governments, and citizens have here?
- Your turn. Pick any disease you care about. With everything you now know, what would you Google to figure out which protein to target? Where would you go next?