Molecular Systems Biology 3 Article number: 140 doi:10.1038/msb4100180
Published online: 16 October 2007
Citation: Molecular Systems Biology 3:140
Network-based classification of breast cancer metastasis
There is a News and Views associated with this document.
Han-Yu Chuang1,a, Eunjung Lee2,3,a, Yu-Tsueng Liu4, Doheon Lee3 & Trey Ideker1,2,4
1. Bioinformatics Program, University of California San Diego, La Jolla, CA, USA
2. Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
3. Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea
4. Cancer Genetics Program, Moores Cancer Center, University of California San Diego, La Jolla, CA, USA
Correspondence to: Trey Ideker1,2,4 Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA. Tel.: +1 858 822 4558; Fax: +1 858 534 5722; Email: trey@bioeng.ucsd.edu
Received 11 June 2007; Accepted 20 August 2007; Published online 16 October 2007
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits distribution, and reproduction in any medium, provided the original author and source are credited. This license does not permit commercial exploitation or the creation of derivative works without specific permission.
aThese authors contributed equally to this work
Abstract
Mapping the pathways that give rise to metastasis is one of the key challenges of breast cancer research. Recently, several large-scale studies have shed light on this problem through analysis of gene expression profiles to identify markers correlated with metastasis. Here, we apply a protein-network-based approach that identifies markers not as individual genes but as subnetworks extracted from protein interaction databases. The resulting subnetworks provide novel hypotheses for pathways involved in tumor progression. Although genes with known breast cancer mutations are typically not detected through analysis of differential expression, they play a central role in the protein network by interconnecting many differentially expressed genes. We find that the subnetwork markers are more reproducible than individual marker genes selected without network information, and that they achieve higher accuracy in the classification of metastatic versus non-metastatic tumors.
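To make the idea of a subnetwork marker concrete, here is a minimal sketch; it is not the authors' implementation. It assumes, purely for illustration, that a subnetwork's activity in a tumor is the average of its member genes' z-scored expression, and that discriminative power can be summarized as the difference between class means divided by the overall standard deviation. The gene names, expression values, labels and scoring function are all invented.

```python
from statistics import mean, pstdev

# Toy expression matrix: gene -> values across 6 tumors (hypothetical numbers).
expression = {
    "GENE_A": [2.0, 2.4, 1.9, 1.2, 1.0, 1.5],
    "GENE_B": [5.1, 4.6, 5.3, 4.4, 4.0, 4.7],
    "GENE_C": [0.9, 1.4, 1.1, 0.6, 0.9, 0.5],
}
# 1 = metastatic, 0 = non-metastatic (hypothetical labels).
labels = [1, 1, 1, 0, 0, 0]


def zscore(values):
    """Center and scale one gene's expression across all tumors."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def subnetwork_activity(genes):
    """Average the z-scored expression of the member genes, tumor by tumor."""
    z = [zscore(expression[g]) for g in genes]
    return [mean(column) for column in zip(*z)]


def separation(activity):
    """Absolute difference of class means, in units of overall standard deviation."""
    pos = [a for a, y in zip(activity, labels) if y == 1]
    neg = [a for a, y in zip(activity, labels) if y == 0]
    return abs(mean(pos) - mean(neg)) / pstdev(activity)


single_gene_scores = {g: separation(zscore(v)) for g, v in expression.items()}
subnetwork_score = separation(subnetwork_activity(["GENE_A", "GENE_B", "GENE_C"]))
```

With these toy numbers the aggregated score separates the two classes more cleanly than any single gene does, which is the intuition behind scoring groups of interacting genes rather than individual ones. Real data would also need a search procedure over the protein network and a test on held-out tumors, neither of which is shown here.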
YEAR OF MIRACLES
Ever since 1900, when Gregor Mendel’s work on peas and inheritance was rediscovered, scientists have regarded the “gene” as the fundamental unit of heredity (just as the atom was regarded as the bedrock of pre-Einsteinian physics). Crick and Watson’s discovery of the DNA double helix as the carrier of hereditary information did little to disturb the status quo. In recent months, however, a perfect storm of new technology and research has blown apart 20th-century dogma. The notion of the Mendelian gene as a unit of heredity, scientists now realise, is a fiction.
Many scientists now believe that heredity is the result of an incredibly complex interplay among the basic components of the genome, scattered among many different genes and even the vast stretches of “junk DNA” once thought to serve no purpose. Biology has been building up to this insight for years, but some big puzzle pieces have now fallen into place. Once scientists abandoned their preconceived notions of genes and looked instead at individual DNA “letters” in the genome—the four bases A, C, T and G— they immediately began to see cause-and-effect connections to myriad diseases and human traits.
The result of this seemingly modest conceptual breakthrough has been a torrent of new discoveries. In five months this year, from April through August, geneticists at the Harvard/MIT Broad Institute, founded by Eric Lander; at deCODE Genetics in Iceland, founded by Kari Stefansson, and several other institutions have published papers suggesting that the key to a deeper understanding of the human genome may finally be in hand. These scientists have identified specific alterations in the sequence of DNA that play causative roles in a broad range of common diseases, including type 1 and type 2 diabetes; schizophrenia; bipolar disorder; glaucoma; inflammatory bowel disease; rheumatoid arthritis; hypertension; restless legs syndrome; susceptibility to gallstone formation; lupus; multiple sclerosis; coronary heart disease; colorectal, prostate and breast cancer, and the pace at which HIV infection causes full-blown AIDS. Unlike so many previous “disease gene” discoveries, these findings are being replicated and validated. “The race to discover disease-linked genes reaches fever pitch,” declared the leading British science journal Nature. Its American counterparts at Science chimed in: “After years of chasing false leads, gene hunters feel they have finally cornered their prey. They are experiencing a rush this spring as they find, time after time, that a new strategy is enabling them to identify genetic variations that likely lie behind common diseases.” That the world’s top two scientific journals still use the old language of “genes” to describe these discoveries shows how new the new thinking really is.
These findings are just a prelude to what’s shaping up as a true conceptual and technological revolution. Just as physics shocked the world in the 20th century, it is now clear that the life sciences will shake up the world in the 21st. In a handful of years, your doctor may be able to run a computer analysis of your personal genome to get a detailed profile of your health. This goes well beyond merely making predictions. A new technology called RNA interference may also allow doctors to control how your DNA is “expressed,” helping you circumvent potential health risks. Many common diseases that have preyed on humans for eons, including cancer and devastating neurological conditions such as Alzheimer’s and Parkinson’s, could be eradicated. If this sounds outrageously optimistic, so did the promise of eliminating smallpox and polio to previous generations.
Why is all this happening now? What has changed between this year and last? To answer these questions, we need to trace the story of how mainstream biomedical scientists tried to link the cause of diseases to single genes and, despite early success, hit a brick wall. Meanwhile, a handful of renegade scientists, pursuing their own pet projects, happened to develop exactly the intellectual tools needed to break through that wall. These biologists are now the leaders of the new revolution in biomedical science.
The seeds of our new understanding were first sown in the 1960s, when molecular biologists figured out how genetic information is organised, regulated and reproduced inside single-cell bacteria. In bacteria, a gene is a discrete segment of DNA that contains the “code” that tells the cell how to make a particular type of protein. Bacterial genes are arranged along a single DNA molecule, with only tiny gaps in between. Since all organisms have DNA and work by essentially the same biochemistry, scientists assumed that a human genome would look like a larger version of a bacterium’s.
Clues that something was amiss came quickly with the development of DNA-sequencing methods in the 1970s. The first surprising result was: genes accounted for only 2 per cent of the human genome —the rest of the DNA didn’t seem to have any purpose at all. Biologists Phillip Sharp and Richard Roberts made things worse with a discovery that won them a Nobel Prize in 1993. If the gene were the basic unit of heredity, the DNA required to make any particular protein should be contained in its corresponding gene. But Sharp and Roberts found DNA codes for individual proteins are often split and scattered throughout the genome.
A visionary physician-scientist named Leroy Hood, now at the Institute for Systems Biology in Seattle, grasped the far-reaching implications of a fundamental fact: while even the simplest organism is immensely complicated, the primary structures of its most complicated parts, DNA and proteins, are very simple. The alphabet of DNA contains only the four chemical letters (or bases) A, C, G and T, and proteins are made from just 20 amino acids. Hood saw that this simplicity would make it possible for robots and computers to read and write DNA and proteins more quickly, accurately and cheaply than human beings.
Hood completely transformed the biomedical enterprise. DNA-writing machines gave genetic engineers an unlimited capacity to create novel genes that can be studied in test tubes or added to the genomes of living organisms. And protein-writing and -reading machines provided drug firms with the ability to create a new generation of protein-based drugs. The DNA-reading machines suddenly made it conceivable to crack the 3 billion-base sequence of an entire human genome. In 1990 the U.S. government embarked on a 15-year, $3 billion project to do just that.
Eight years later, however, the project — parceled out to many US scientists — was still less than 10 percent complete. Now it was biotech entrepreneur Craig Venter who was frustrated. Convinced that government-funded workers were the problem rather than the solution, Venter enlisted private funding of $200 million to build an enormous lab filled with hundreds of automated machines, working 24/7, overseen by a handful of technicians. Within three years, the first reading of a human genome was essentially complete.
Armed with data from the genome project, scientists figured they’d surely be able to crack the really hard diseases, like cancer and heart disease. But a funny thing happened when they began to look closely at this vast storehouse of genetic information. Geneticists Andrew Fire and Craig Mello galvanised the field by discovering a key mechanism that had been completely overlooked: the cellular process of RNA interference. (They shared a Nobel Prize in 2006 for the work.)
Geneticists had taken for granted that the machinery of cells involved genes directing the production of proteins, and proteins doing the work of the cell. Here was a process that didn’t involve proteins at all. Instead, tens of thousands of hitherto mysterious regions of the human genome — part of the so-called junk DNA — directed the production of specific molecules called microRNAs (consisting of bits of RNA, a well-known component of cells). These microRNAs then oversaw a whole new process, called RNA interference (RNAi), that served to modulate the expression of DNA.
The good news was that RNAi could open up a whole new approach to biomedical therapy (more on that later). But RNAi also made it clear that the fundamental unit of heredity and genetic function is not the gene but the position of each individual DNA letter.
To make it all harder to fathom, each bit of DNA is susceptible to mutation and variation among individuals. Of the 3 billion DNA bases in the human genome, geneticists identified about one tenth of one percent (millions) that differ from one person to another. Variations in these particular letters — called “snips,” or SNPs, for single nucleotide polymorphisms — have replaced genes as the unit of heredity.
Many scientists responded to this devastating realization by going into a funk. Fortunately, another visionary scientist, Kari Stefansson of Iceland, was already blazing a trail out of this thicket. If the genome was far more complex than scientists had thought, they would need to test for many more variables, and to do that they would need more test subjects. To find the cause of diseases would now require the participation of very large groups of genetically related people.
Stefansson decided to solve this problem by taking aim at the largest well-documented extended family that he knew: his own. Nearly all the 300,000 citizens of Iceland can trace their ancestors back, through detailed, public genealogical records, to the Vikings who settled this desolate European island more than 1,000 years ago. He persuaded the Icelandic government to provide his company, deCODE, with exclusive access to the health records of its citizens in return for bringing investment capital and hi-tech jobs to the capital, Reykjavik. So far, more than 100,000 Icelandic volunteers have donated their DNA to deCODE.
The power of large numbers was soon apparent. In a study of obesity, Stefansson directed his software to look for SNPs associated with subsets of the population who were either extremely overweight or very thin. Within just a few hours, it began finding evidence that variations among particular DNA letters indeed played a causative role, confirming SNPs as the new unit of inheritance.
As of September, deCODE has made progress in identifying SNPs that may play a role in 28 common diseases, including glaucoma, schizophrenia, diabetes, heart disease, prostate cancer, hypertension and stroke. In some cases, such as glaucoma and prostate cancer, deCODE’s findings could lead to diagnostic tests for identifying people at risk of developing the disease. In other instances, such as schizophrenia, links to particular proteins have led to insight about the cause of the disease, which could lead to therapies.
Buoyed by Stefansson’s success, other geneticists were eager to perform large-scale family studies, yet few had similar access to ancient genealogical records. But serendipity would deliver an epiphany: it’s possible to study the entire human population as a single extended family, provided scientists collect enormous amounts of data. Eric Lander, an MIT professor and the intellectual leader of the US government effort to sequence the first human genome, realised scaling up would require a new approach. In 2004, Lander persuaded MIT and Harvard to combine their enormous resources toward the creation of the Broad Institute. Backed by $200 million from billionaire philanthropists Eli and Edythe Broad, the institute is driving the development of ever more advanced genetic technologies. One technology, based on computer-chip fabrication, can identify DNA base letters present at 500,000 SNPs in the genomes of 40,000 or more people.
Think of this as a spreadsheet with 500,000 columns (each representing a specific SNP) and 40,000 rows (one for each person). To hunt for a genetic basis for, say, bipolar disease, the computer searches rows of people who have the disorder, checking column by column for an unusually high frequency of particular letters in comparison with people without the disease. As it turns out, a collaboration of American and German researchers has done this work—and found that variations of DNA letters in 20 different positions are influential in bipolar disease.
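The column-by-column search described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the method used by the researchers: each person carries a single letter per position, the data are invented, and association is scored with a plain Pearson chi-square statistic on a 2x2 table (letter present or absent, disease present or absent). The function names `chi_square_2x2` and `scan` are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def scan(genotypes, has_disease):
    """For each column, find the letter whose frequency differs most between groups.

    Returns (column, letter, statistic) tuples sorted from strongest to weakest.
    """
    results = []
    for col in range(len(genotypes[0])):
        best = (col, None, 0.0)
        for letter in "ACGT":
            # a, b: people with the disease who do / do not carry the letter.
            a = sum(1 for row, y in zip(genotypes, has_disease) if y and row[col] == letter)
            b = sum(1 for row, y in zip(genotypes, has_disease) if y and row[col] != letter)
            # c, d: people without the disease who do / do not carry the letter.
            c = sum(1 for row, y in zip(genotypes, has_disease) if not y and row[col] == letter)
            d = sum(1 for row, y in zip(genotypes, has_disease) if not y and row[col] != letter)
            stat = chi_square_2x2(a, b, c, d)
            if stat > best[2]:
                best = (col, letter, stat)
        results.append(best)
    return sorted(results, key=lambda r: r[2], reverse=True)


# Toy data: 8 people (rows) x 3 SNP positions (columns); all values are invented.
genotypes = ["ACG", "ACT", "ACG", "ATG", "GCG", "GTT", "GCT", "GTG"]
has_disease = [True, True, True, True, False, False, False, False]

ranked = scan(genotypes, has_disease)
```

In this toy table the first position stands out because every affected person carries one letter and every unaffected person carries another. With 500,000 columns, many positions would look striking by chance alone, so a real analysis would have to correct for the number of comparisons.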
Incredibly, most disease-causing variants are the most common ones present in the human population: the strongest-acting one, for instance, exists in 80 percent of people without bipolar disease and 85 percent of people with the disease. The implication is that these variants are beneficial in some way, and cause problems only when their number exceeds a threshold.
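To put the 80 percent and 85 percent figures in perspective, one common summary is the odds ratio. The short calculation below is only a worked illustration of those two numbers; the choice of the odds ratio as the measure is an assumption, not something stated in the article.

```python
def odds(p):
    """Convert a proportion into odds (carriers per non-carrier)."""
    return p / (1 - p)


freq_with_disease = 0.85     # carriers among people with bipolar disease
freq_without_disease = 0.80  # carriers among people without it

odds_ratio = odds(freq_with_disease) / odds(freq_without_disease)
```

The ratio comes out at roughly 1.4, a modest effect for any single variant, which fits the article's point that such variants matter mainly in combination.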
To make sense of this complexity, scientists would like ultimately to build a vast international database that contains the complete sequence of DNA bases in the genomes of hundreds of millions of people. Ideally, such a database would be available for analysis by all biomedical researchers and would provide the foundation for understanding the genetic components of all human traits. That sounds like a lot of data — think of a spreadsheet with 3 billion columns and 100 million rows — but computing power is getting cheaper by the year.
The explosion of genetic discoveries shows no sign of letting up any time soon. New diseases are being added to the list every month, and biologists are rapidly parlaying gene- and SNP-disease links into a deeper understanding of how proteins and other molecules can misbehave to cause different medical problems in different people. Other scientists are working to advance the biology revolution (accompanying interviews). As a result of their efforts, many children born this year could very well be alive and healthy at the dawn of the next century, when they may look back in awe at the annus mirabilis of biomedical genetics in 2007.
-LEE SILVER (Newsweek)
MIT and Novartis in new partnership aimed at transforming pharmaceutical manufacturing
September 28, 2007
Novartis and the Massachusetts Institute of Technology have launched a long-term research collaboration aimed at transforming the way pharmaceuticals are produced.
The 10-year partnership, known as the Novartis-MIT Center for Continuous Manufacturing, will work to develop new technologies that could replace the conventional batch-based system in the pharmaceuticals industry - which often includes many interruptions and work at separate sites - with continuous manufacturing processes from start to finish.
The Novartis-MIT Center for Continuous Manufacturing combines the industrial expertise of Novartis with MIT's leadership in scientific and technological innovation. Novartis will invest USD 65 million in research activities at MIT over the next 10 years.
"This partnership demonstrates our commitment to lead not only in discovering innovative treatments for patients but also in improving manufacturing processes, which are critical to ensuring a high-quality, efficient and reliable supply of medicines to patients. Our collaboration with MIT, a worldwide leader in developing cutting edge technologies, holds the promise to achieve a quantum leap in the production of pharmaceuticals, a field which has received rather little attention in the past," said Dr. Daniel Vasella, Chairman and CEO of Novartis.
Novartis and MIT expect the technologies created in this collaboration will benefit patients and healthcare providers through a positive impact on supply availability and the quality of medicines. These technologies will also seek to reduce the environmental impact of manufacturing activities.
"The Novartis-MIT Center for Continuous Manufacturing has the potential to revolutionize drug development and production," said Susan Hockfield, MIT President. "We are delighted to collaborate with Novartis to help improve the way that drugs are manufactured so that patients have quicker and more reliable access to the medications they need. The new educational opportunities that this program will provide for our students make this partnership even more exciting."
The pharmaceutical industry currently uses batch-based manufacturing that has been common for several years, even though other industries have moved to continuous manufacturing.
In this often time-consuming process, pharmaceutical active ingredients are synthesized in a chemical manufacturing plant. These ingredients are then shipped to a manufacturing facility, often at another site, where they are converted through defined processes into giant batches of pills, liquid or cream. With multiple interruptions, including transport to separate locations, each batch may take weeks to produce. In addition, manufacturing design and scale-up for a new drug are very costly and time-consuming.
Expected benefits of continuous manufacturing include accelerating the introduction of new drugs by designing production processes earlier; using smaller production facilities, with lower building and capital costs; minimizing waste, energy consumption and raw material use; monitoring quality assurance on a continuous basis instead of post-production batch-based testing; and enhancing process reliability and flexibility to respond to market needs.
The initial research of the Novartis-MIT Center for Continuous Manufacturing will be conducted primarily through Ph.D. programs at MIT laboratories, and then transferred to Novartis for further development to industrial-scale projects.
The partners expect the Center's work to involve seven to ten MIT faculty members, as well as students, postdoctoral fellows and staff scientists. Novartis will commit its manufacturing and R&D resources and will pilot new manufacturing processes with one of its pharmaceutical products.
Harvard Medical School researchers have successfully synthesized a DNA-based memory loop in yeast cells, findings that mark a significant step forward in the emerging field of synthetic biology.
After constructing genes from random bits of DNA, researchers in the lab of Professor Pamela Silver, a faculty member in Harvard Medical School’s Department of Systems Biology, not only reconstructed the dynamics of memory, but also created a mathematical model that predicted how such a memory “device” might work.
“Synthetic biology is an incredibly exciting field, with more possibilities than many of us can imagine,” says Silver, lead author of the paper to be published in the September 15 issue of the journal Genes and Development. “While this proof-of-concept experiment is simply one step forward, we’ve established a foundational technology that just might set the standard of what we should expect in subsequent work.”
As with many emerging fields, there’s still a bit of uncertainty over what, exactly, synthetic biology is. Ask any three scientists for a definition, and you’ll probably get four answers. Some see it as a means to boost the production of biotech products, such as proteins for pharmaceutical uses or other kinds of molecules for, say, environmental clean-up. Others see it as a means of creating computer platforms that may bypass many of the onerous stages of clinical trials. In such a scenario, a scientist would type the chemical structure of a drug candidate into a computer, and a program containing models of cellular metabolism could generate information on how people would react to that compound.
Either way, at its core, synthetic biology boils down to gleaning insights into how biological systems work by reconstructing them. If you can build it, it forces you to understand it.
A team in Silver’s Harvard Medical School lab led by Caroline Ajo-Franklin, now at Lawrence Berkeley National Laboratory, and postdoctoral scientist David Drubin decided to demonstrate that not only could they construct circuits out of genetic material, but they could also develop mathematical models whose predictive abilities match those of any electrical engineering system.
“That’s the litmus test,” says Drubin, “namely, building a biological device that does precisely what you predicted it would do.”
The components of this memory loop were simple: two genes that coded for proteins called transcription factors. Transcription factors regulate gene activity. Like a hand on a faucet, the transcription factor will grab onto a specific gene and control how much, or how little, of a particular protein the gene should make.
The researchers placed two of these newly synthesized, transcription factor-coding genes into a yeast cell, and then exposed the cell to galactose (a kind of sugar). The first gene, which was designed to switch on when exposed to galactose, created a transcription factor that grabbed on to, and thus activated, the second gene. It was at this point that the feedback loop began.
The second gene also created a transcription factor. But this transcription factor, like a boomerang, swung back around and bound to that same gene from which it had originated, reactivating it. This caused the gene to once again create that very same transcription factor, which once again looped back and reactivated the gene. In other words, the second gene continually switched itself on via the very transcription factor it created when it was switched on.
The researchers then eliminated the galactose, causing the first synthetic gene, the one that had initiated this whole process, to shut off. Even with this gene gone, the feedback loop continued. “Essentially what happened is that the cell remembered that it had been exposed to galactose, and continued to pass this memory on to its descendants,” says Ajo-Franklin. “So after many cell divisions, the feedback loop remained intact without galactose or any other sort of molecular trigger.”
Most important, the entire construction of the device was guided by the mathematical model that the researchers developed. “Think of how engineers build bridges,” says Silver. “They design quantitative models to help them understand what sorts of pressure and weight the bridge can withstand, and then use these equations to improve the actual physical model. We really did the same thing. In fact, our mathematical model not only predicted exactly how our memory loop would work, but it informed how we synthesized the genes.”
For synthetic biology, this kind of specificity is crucial. “If we ever want to create biological black boxes, that is, gene-based circuits like this one that you can plug into a cell and have it perform a specified task, we need levels of mathematical precision as exact as the kind that go into creating computer chips,” she adds.
The researchers are now working to scale up the memory device into a larger, more complex circuit, one that can, for example, respond to DNA damage in cells.
“One day we’d like to have a comprehensive library of these so-called black boxes,” says Drubin. “In the same way you take a component off the shelf and plug it into a circuit and get a predicted reaction, that’s what we’d one day like to do in cells.”
Source: Harvard Medical School
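The self-sustaining feedback loop described in the Harvard Medical School item above can be illustrated with a very small simulation. This is a sketch under stated assumptions, not the researchers' model: it tracks only the level `y` of the second gene's transcription factor, treats galactose exposure as a constant extra production term, uses a cooperative (Hill-type) self-activation term, and lumps degradation and dilution by cell division into one decay rate. Every parameter value is hypothetical.

```python
def simulate(pulse_end, t_end=40.0, dt=0.01,
             input_rate=1.0, max_rate=2.5, K=1.0, decay=1.0):
    """Euler integration of dy/dt = input + max_rate*y^2/(K^2 + y^2) - decay*y.

    `input` equals `input_rate` while galactose is present (t < pulse_end)
    and zero afterwards. All parameter values are illustrative assumptions.
    """
    y = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        drive = input_rate if t < pulse_end else 0.0   # first gene, galactose-driven
        feedback = max_rate * y ** 2 / (K ** 2 + y ** 2)  # second gene activating itself
        y += dt * (drive + feedback - decay * y)
    return y


exposed = simulate(pulse_end=10.0)       # galactose for 10 time units, then removed
never_exposed = simulate(pulse_end=0.0)  # no galactose at any point
```

With these assumed parameters the cell that saw galactose settles at a high level of the transcription factor long after the sugar is gone, while the cell that never saw it stays off. That pair of outcomes from identical components is what "memory" means here; whether a real circuit behaves this way depends on the actual production and dilution rates.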
BioMarket Trends: The Future of Genome Synthesis and Design
Implications for U.S. Economy
Rob Carlson, Ph.D., Jim Newcomb, Steven Aldrich
Rapid advances in biological engineering are poised to dramatically impact the economy. Significant improvements in key technologies used to study and manipulate biological systems at the molecular level, in particular tools for sequencing and synthesizing DNA, are opening the door to a new era of genome engineering and design. “Genome Synthesis and Design Futures,” recently published by Bio Economic Research Associates, examines ways in which these advances in technology could affect the U.S. economy over the next two decades.
The report assesses the rate of improvement in the performance of key biological technologies and evaluates the potential implications based on analogies to the development of other major technology systems. New approaches to biological engineering are recapitulating developmental stages and pathways experienced in other fields, including aviation, industrial engineering, automotive design, and computer software. Major technological and market trends include the following:
• Productivity of DNA-sequencing tools increased more than 500-fold over the past decade, doubling every 24 months. Costs, on the other hand, declined by more than three orders of magnitude from $1.00 per base pair to less than $0.001 per base pair.
• Productivity of DNA-synthesis methods increased 700-fold over the past decade, doubling every 12 months. Again, costs fell from approximately $30 per base pair to less than $1 per base pair.
• The global market for DNA sequencing technology and services exceeded $7 billion in 2006. The market for synthesis reagents and services reached nearly $1 billion.
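The fold-increase and doubling-time figures in the bullets above are related by simple arithmetic, sketched below. The calculation assumes steady exponential growth over exactly ten years, which the article does not state; the function name is hypothetical.

```python
import math


def doubling_time_months(fold_increase, years):
    """Doubling time implied by a total fold increase over a span of years,
    assuming steady exponential growth (an assumption, not from the report)."""
    return 12 * years / math.log2(fold_increase)


sequencing = doubling_time_months(500, 10)  # 500-fold over a decade
synthesis = doubling_time_months(700, 10)   # 700-fold over a decade

# Conversely: total gain from doubling every 24 months for ten years.
fold_from_24_month_doubling = 2 ** (10 * 12 / 24)
```

Under this assumption the 700-fold synthesis figure implies doubling a little more slowly than every 12 months, close to the quoted rate. A 500-fold gain in ten years would imply doubling roughly every 13 months, whereas doubling every 24 months yields only a 32-fold gain, so the two sequencing figures may refer to different time spans or measures.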
We analyze these developments in the context of historical patterns of technology development in the economy. From an economic perspective, the real impact of technology revolutions often lags by several decades behind the emergence of fundamental enabling techniques. The macroeconomic effects of technology revolutions—often measured in terms of productivity improvements or effects on balance of trade—appear late in the cycle of buildout and diffusion.
If new approaches to biological engineering are successful in creating systems of easily combined biological parts, the potential for serial innovation and a rapid development of useful tools is high. However, the ensuing buildout of biological technology will require overcoming formidable technical challenges.
We explore three industry segments in the vanguard of applying these emerging technologies:
• In the chemical sector, increasingly powerful tools and methods for metabolic pathway engineering could open the door to production of a wide variety of chemical products. New technologies could enable the penetration rate for biological production processes to reach 15–20% of the global chemicals industry by 2015.
• Genome engineering and design methods also promise to play important roles in the development of new energy production and conversion methods. The near-term contributions from these technologies are likely to be significant in accelerating the growth of the liquid biofuels industry that could increase from $22 billion in revenues globally in 2006 to as much as $150 billion by 2020.
• Synthetic vaccines could soon account for as much as one-third of the global vaccine market.
Looking ahead, ongoing performance improvements are likely to deliver significant further increases in productivity and reductions in cost over the next decade. Intensifying global competition among companies and countries, coupled with abundant innovation, is driving the rapid diffusion of new technology.
Combinatorial engineering approaches that have transformed the fields of electrical engineering and software design are now being leveraged to accelerate biological engineering. These methods are being utilized to produce high-value products for a variety of commercial purposes, and the range of potential applications is huge.
However, the continuing buildout of these technologies will be shaped in large measure by an array of outstanding legal, ethical, economic, social, regulatory, and political questions and issues that have yet to be resolved.
Cited from http://www.genengnews.com