Computing the Genome


By BIO-IT World

Horizons

CONVERSATION | Charles DeLisi, who helped conceive the Human Genome Project, turns to systems biology and an AIDS vaccine

Interview by Kevin Davies



According to the citation that accompanied his Presidential Citizens Medal, Boston University's Charles DeLisi was the first government scientist to conceive and outline the feasibility, goals, and parameters of the Human Genome Project. Currently the senior associate provost for biosciences at Boston University, DeLisi remains a leading figure in computational biology, with interests ranging from biosimulation to AIDS vaccine development. His long-range goal is "to relate expression patterns to pathways, pathways to networks, and networks to function." His bioinformatics graduate program will shortly move into a new 10-story, 184,000-square-foot multidisciplinary research building. Kevin Davies spoke with DeLisi in his office.

Q: What is your research background?
A: My A.B. and Ph.D. (from City College of New York and NYU) are both in physics. I had an early interest in genetics, but never took biology in college. I did watch it from a distance, was fascinated by its increasing conceptualization, and by the end of my graduate school days was pretty much hooked. At the time I felt my biggest obstacle to really delving into biology was not knowing chemistry — I had taken only a semester in college. So I spent the next three years in the Yale chemistry department learning the language of organic chemistry, and developing methods to calculate RNA secondary structure. My first position was in the theoretical division at Los Alamos, where, ironically, I developed an interest in immunology and cell biology.

After three years, I moved to the NIH intramural program as a visiting scientist. It was — and still is — an immunological mecca, and when I was offered tenure a year later, I couldn't resist staying. At NIH, I was doing mostly immunology, but I was watching molecular biology, and spent some time in the director's office when they were starting to consider the development of GenBank. NIH understood the importance of computers in storing and managing data, but the extramural system didn't appear ready to accept the computer as an analytical engine. The idea of devising algorithms that would look for biological function was pretty much beyond the way most people thought about things. The intramural program offered more flexibility, and I invited Minoru Kanehisa (now a professor at Kyoto University, Japan) — a brilliant computer person and first-rate biologist — to NIH, where we developed the first relational database management system for protein sequences, DNA sequences, structures, etc.

Q: How did you get involved in planning the Human Genome Project?
A: The obvious question [around 1982] was, would we ever get the whole genome sequence? I dismissed the possibility because I didn't feel the biomedical culture was going to accept a project of that size and complexity. We were developing codes for intron-exon boundaries, using Bayesian statistics in the early '80s, but when I gave talks, I'd get these blank stares. It didn't take hold, it was so foreign to people ... People didn't feel overwhelmed by data — they felt they could still handle things in the laboratory. And at that particular moment in history, they were right — but they didn't see the tsunami.


I went to DOE (Department of Energy) [in 1985] as director of their health and environmental research programs, and after three months a report appeared on my desk from the Congressional Office of Technology Assessment (OTA). That report mentioned sequencing the entire human genome. People on that committee were fairly prominent biologists, so I got some indication that someone else in the world thought this wasn't a totally nutty thing to do. Having a reference genome would be spectacularly important. Robert Sinsheimer had held a workshop in May 1985, but nothing had come of it. They couldn't convert that interest into policy. At DOE, we decided to sponsor a workshop [in Santa Fe] of leading molecular biologists and geneticists to get our own sense of what the community felt.

Q: This was initially just a DOE project?
A: When we held the Santa Fe workshop, we invited every federal agency to it, and no one was interested! All these agencies that now have genomics as their underpinning had no idea what was going on. NIH was hesitant ... but a National Academy of Sciences committee strongly endorsed the project, so that induced NIH to move forward. I was pretty comfortable by then that the project had a life of its own. I moved to BU in 1990 and basically returned to structurally based immunology, and didn't look to genomics until about six years ago. When I started reorienting my own research, it was like waking up after 20 years ... Everything that was an important problem 20 years ago was unrecognizable, just unbelievable.

Q: How important to the pace of the genome project was Celera?
A: I think industrial involvement has been very important to setting the pace. If [J. Craig] Venter and Celera hadn't come on the scene, the genome project would still be going on. When I left Washington in 1987, I had estimated 2000-2001 as the completion date. It was, in my opinion, unnecessarily stretched out to 2006. The feds accelerated their schedule in response to Celera. The reason it went as well as it did is that the economy was spectacular. If not, venture money would not have been as plentiful, and Celera might never have been formed.

Q: What are you working on now?
A: I'm still interested in immunology, including infectious diseases. I'm starting to work with teams in India, Maui, and Thailand to develop a vaccine for AIDS. The research goes from basic chemistry to clinical testing, with several points where you can't do things without high-intensity computing, especially at the end of the project, which involves a lot of integration. We'd like to develop a vaccine component for cellular immunity, but an effective vaccine is probably a decade away, assuming it's achievable.

Q: How is computation involved?
A: Our goal is to develop an epitope vaccine: we look for immunogenic peptides encoded by HIV, carrying out an exhaustive experimental search of the genome. The virus mutates under selective pressure, so a lot of immunogenic sites have mutated away. In order to get a good vaccine, you may need to go backwards in time ... to examine the genome not just as it now is, but as it once was. It's not going to be easy. Therapy (as opposed to a vaccine) is possible, but when you have a disease with essentially no cases of natural immunity, you have to be doubtful whether a vaccine will be found.

One aspect of the computational component is in designing immunogenic peptides. You need to screen computationally because there are too many combinations to explore them all experimentally. They have to be designed so that they bind with high affinity ... it's not likely we'll find a single vaccine that covers the whole human population. Selecting a set of immunogenic peptides — from a group of candidates that have been obtained by an exhaustive search — that optimizes cost, efficacy, and population coverage is a very complex computational problem.
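The selection problem DeLisi describes resembles a budgeted set-cover problem. The sketch below is an editorial illustration only, not the method used by his group: it greedily picks, under a fixed budget, the peptide that covers the most not-yet-covered population groups per unit cost. The peptide names, group labels, costs, and the `select_peptides` function are all hypothetical, and efficacy and binding affinity are left out for brevity.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical candidates: name -> (cost, population groups the peptide covers).
# All names, costs, and group labels are invented for illustration.
Candidates = Dict[str, Tuple[float, Set[str]]]


def select_peptides(candidates: Candidates, budget: float) -> Tuple[List[str], Set[str]]:
    """Greedy heuristic: repeatedly take the affordable peptide that adds
    the most newly covered groups per unit cost. Costs must be positive.
    This is an approximation and is not guaranteed to be optimal."""
    chosen: List[str] = []
    covered: Set[str] = set()
    remaining = budget
    while True:
        best_name, best_ratio = None, 0.0
        for name, (cost, groups) in candidates.items():
            if name in chosen or cost > remaining:
                continue  # already picked, or too expensive for what is left
            ratio = len(groups - covered) / cost
            if ratio > best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            break  # nothing affordable adds new coverage
        cost, groups = candidates[best_name]
        chosen.append(best_name)
        covered |= groups
        remaining -= cost
    return chosen, covered


candidates: Candidates = {
    "pep1": (2.0, {"A", "B", "C"}),
    "pep2": (1.0, {"C", "D"}),
    "pep3": (1.5, {"A"}),
    "pep4": (3.0, {"D", "E"}),
}
chosen, covered = select_peptides(candidates, budget=4.0)
print(chosen, sorted(covered))
```

With these made-up numbers the heuristic spends 3.0 of the 4.0 budget on two peptides and leaves group E uncovered, which illustrates the trade-off between cost and population coverage. A realistic formulation would also weight each peptide by predicted efficacy and would likely need an exact or integer-programming solver rather than a greedy pass.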

Q: What is the focus of the BU bioinformatics program?
A: We're doing systems work, not just bioinformatics. We cover the gamut. It's primarily computational/mathematical, but with some engineering of new high-throughput technologies, and numerous collaborations with experimental biologists, clinicians, and engineers. We're really beginning to map out the circuitry of the cell and model it. We look at all genomes, and network orthologies. We have our own clusters, and the Center for Computational Science supports high-performance computing, using IBM. The bioinformatics program spans engineering, arts and sciences, and medicine. We have over 100 students in our program, about 70 Ph.D. students. I wanted to plateau at around 50 Ph.D. students, but we've had some spectacular students coming through this program. We share students with faculty at Harvard and MIT, and have a number of industrial partners, including Pfizer, Serono, and IBM.

Q: What would you consider a milestone in systems biology?
A: Mapping the transcriptional network of a eukaryotic cell: understanding its functional organization, along with a predictive understanding (probably at some higher-order supramolecular level) of response to changes in environment. Data alone won't get us there; it's going to require much better methods than we currently have for data integration and analysis.






PICTURE OF DELISI BY: MICHAEL MANNING



For reprints and/or copyright permission, please contact RMS, 1808 Colonial Village Lane, Lancaster, PA;

(717) 399-1900 ext 100 or via email to bio-itworld@theygsgroup.com.