Dr. Mitchell Gail Oral History
National Cancer Institute
Division of Cancer Epidemiology & Genetics
National Institutes of Health
Oral History Project
Interview with Mitchell Gail
Conducted on May 31, 2022 by Holly Werner-Thomas for
History Associates, Inc., Rockville, MD
HWT: My name is Holly Werner-Thomas and I’m an oral historian at History Associates, Inc. in Rockville, Maryland. Today’s date is Wednesday, May 31, 2022. And I’m speaking with Dr. Mitchell Gail for the National Institute of Cancer, Division of Cancer Epidemiology and Genetics (NCI-DCEG), part of the National Institutes of Health, or NIH. The NIH is undertaking this oral history project as part of an effort to gain an understanding of the National Cancer Institute’s Division of Cancer Epidemiology and Genetics. This is one in a series of interviews that focus on the work of five individuals at the NCI-DCEG including their careers before and during their time with the institute. This is a virtual interview over Zoom. I am at my home in Los Angeles while Dr. Gail is in Shady Grove, Maryland. Before we get started, can you please state your full name and also spell it?
MG: Yeah. My name is Mitchell Henry Gail. M-I-T-C-H-E-L-L, G-A-I-L.
HWT: So, before I read your bio, I want to make sure I have a couple of pronunciations correct. And one is for an award in statistical research, and I want to make sure I have this right. Is it Snedecor [pronounces with long e sound]?
MG: Is it what?
HWT: Snedecore? S-N-E-E
MG: No, it’s Snedecor [pronounces with three syllables].
HWT: Snedecor. See? There you go. And then the Howard Temin Award for AIDS Research, is that [pronounces Temmin or Teamin?]
MG: [pronounces Teamin]
HWT: Temin. Snedecor and Temin. I’m glad I asked. So, here’s the bio: Dr. Gail received an MD from Harvard Medical School in 1968, and a PhD in statistics from George Washington University in 1977. He joined NCI in 1969 and served as chief of the Biostatistics Branch from 1994 to 2008. Dr. Gail is a fellow and former president of the American Statistical Association, a fellow of the American Association for the Advancement of Science, an elected member of the American Society for Clinical Investigation, and an elected member of the Institute of Medicine of the National Academy of Sciences. He has received the Spiegelman Gold Award for Health Statistics, the Snedecor Award for applied statistical research, the Howard Temin Award for AIDS Research, the NIH Director’s Award, the PHS Distinguished Service Medal, and the Nathan Mantel Lifetime Achievement Award. He was elected as the chair elect of the section on statistics of the American Association for the Advancement of Science, AAAS, and was selected to deliver the prestigious 2013 NIH Robert S. Gordon, Jr. Lecture in Epidemiology. He was named an NIH distinguished investigator in 2019. Does that all sound about right?
MG: Who is that?
HWT: (laughs) That might be you. So, you probably saw the questions that I sent. Did you? I’m not sure.
MG: I saw the questions. Yeah.
HWT: Okay. Great. So you know that, you see the beginning. It doesn’t have to be completely chronological at all. But I do like to start by asking people to describe their backgrounds in relation to their career paths. So, in other words, if you had mentors, if you had teachers, what did you see or notice that influenced you?
MG: Well, I’ve been very fortunate in my home environment, which was very supportive. My parents, I don’t think they pushed me in particular directions, but they gave me encouragement and I think that they always made it clear that a high priority for them was supporting my education. That was very important. And I was growing up in Kentucky and went to something called the University Training School, where a lot of faculty kids went. So, it was a small school but a pretty good quality for Lexington. I had a good math teacher. I had a teacher who made me less rambunctious and more interested in music and studies in the fifth grade. I think that they sort of shaped my attitude toward learning. Then I went to Andover in the last two years of high school. And that was a very rigorous scholastic experience and gave me some confidence in mathematics and in physical science, chemistry, things like that. And I think I took that with me to college. So those were advantages that I had.
And in college, I didn’t have to worry about finances. My parents were able to support me. And I had more or less decided to go to medical school. And so, I had a certain focus in college, and it was science-oriented. Although many of our most distinguished medical people had more of a liberal arts background, it turns out. But in any case, I majored in physics and chemistry in college. And I did some singing in college, too, that probably carried over from my early days in Kentucky. And that’s been one of the most constant sources of memory and continued pleasure in association with my college years.
Medical school was a closer environment. It was a smaller group. And I got to know people quite well and work closely with some of the other students there. And that I think was very formative. But I couldn’t really find a field of medicine that I wanted to practice. I thought I wanted to specialize in some aspect of medicine and do research together with patient care in that area. But I really couldn’t find the field that I wanted to do that in. But I think it was very valuable because it gave me a chance to think about the kind of work that I would like to do eventually. And just to have the time to think about various possibilities. And the breadth of exposure to not only basic science, but some more clinically oriented science. I think that was a great background. But really, I had the luxury of time.
But then the Vietnam War came up. And I was in, I did a medical internship in Boston at the Peter Bent Brigham Hospital. But I was thinking, some of my peers and my brother-in-law would continue taking residency training, but then they would go to Vietnam in the Berry Plan. And an alternative came up for me, which was to join the Public Health Service at NIH. And I became part of a cohort of people who during the Vietnam War found that very attractive. It was a Uniformed Service. I thought I’d be going for a couple of years. And I struck it off with a fellow named Charles Boone who was working in a cell biology laboratory at the National Cancer Institute. And I think he was attracted by my physics training, and he offered me a position there. So, what started out as a two-year military commitment became a 50-plus-year career. (laughs) But that, it shows the role of chance and history in getting people started.
Another aspect of that was that I was really doing basic cell biology. We were studying how cells move in tissue culture. And they move in a random walk in two dimensions on a petri dish. And I figured out that you could characterize how fast they’re moving by using something called the diffusion constant for a random walk. And I, you know, was just sort of plopped into this laboratory with a biological setup already there. The cells were there. The time-lapse photography was there to watch the cells move. Charles Boone had set up all this stuff. But I brought to it a kind of quantitative perspective.
And then I knew how to estimate this diffusion constant. But to perform statistical inference on how much uncertainty there was in the estimates, I wasn’t exactly sure how to do that. But there was in the same building, Building 37, Dr. John Gart, who was a fine statistician. And he said, “Well, a two-dimensional random walk, that’s exponentially distributed. The squared displacement is exponentially distributed. We can do certain kinds of regression on this.” So, he taught me how statistics could be used to enhance the understanding of a physical process like that. And I had never taken a course in statistics in college. But in medical school, I took a course with Ted Colton teaching doctors, really, how to read the literature and the fundamentals of statistics. And I was kind of interested in statistics. But Dr. Gart’s advice on this problem suggested to me that would be an area that I’d like to go into. And one of the great things about NIH at that time, and I think probably today, too, was that if you needed continuing education to learn a skill that could enhance your research, NIH would fund it.
So, I started taking some night courses at George Washington University. And eventually got a PhD in mathematical statistics and made a transition more to clinical applications of statistics rather than laboratory applications. That’s how I got started, really.
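The estimation problem described above can be made concrete with a short sketch. It assumes pure two-dimensional diffusion, under which the squared displacement over a time step dt is exponentially distributed with mean 4 * D * dt, so the diffusion constant D can be estimated from the average squared displacement. The function names, the simulated data, and the large-sample confidence interval on the log scale are illustrative assumptions, not the analysis Dr. Gail and Dr. Gart actually carried out.

```python
import math
import random


def simulate_squared_displacements(d_true, dt, n, rng):
    """Simulate n independent squared displacements over time dt (2-D diffusion)."""
    sigma = math.sqrt(2.0 * d_true * dt)  # standard deviation along each axis
    return [rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma) ** 2 for _ in range(n)]


def estimate_diffusion_constant(sq_disp, dt, z=1.96):
    """Return (estimate, lower, upper) for the diffusion constant D.

    Assumes each squared displacement is exponential with mean 4 * D * dt.
    The interval uses the large-sample result that log(estimate) has
    standard error of about 1 / sqrt(n) for exponential data.
    """
    n = len(sq_disp)
    d_hat = sum(sq_disp) / (4.0 * dt * n)
    half_width = z / math.sqrt(n)
    return d_hat, d_hat * math.exp(-half_width), d_hat * math.exp(half_width)


# Hypothetical experiment: 2000 tracked steps, true D = 2.0, dt = 0.5.
rng = random.Random(0)
data = simulate_squared_displacements(d_true=2.0, dt=0.5, n=2000, rng=rng)
d_hat, lower, upper = estimate_diffusion_constant(data, dt=0.5)
print(f"estimated D = {d_hat:.3f}, approximate 95% interval = ({lower:.3f}, {upper:.3f})")
```

The interval narrows as more cell displacements are tracked, which is the kind of uncertainty statement the point estimate alone cannot give.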
HWT: Thank you for that wonderful overview. And you’ve already anticipated some questions. But I want to go back just a little bit to get into the weeds a little bit more. And we don’t have to spend too long there. But I’m curious because you were drawn to physics, chemistry, medicine. Can you describe why that direction? This is previous to even applying statistics to what you do. Or not. Sometimes people don’t know why they’re attracted to what they’re attracted to.
MG: Well, you know, I always liked the physical sciences even before I went to Andover. And my father was a physician. And he, I think he was very pleased when I decided to go to medical school. But he did not really, you know, he didn’t say, “You should go to medical school.” But I knew he was pleased. So maybe there was a little of that. But I think when I left Andover, I said, “Gee, I’m going to do a double major.” I was kind of ambitious. I’m going to do history and literature. And I went and saw a tutor in history and literature, and I realized how much reading I was going to have to do. And how much memorization I was going to have to do. And I also realized that I had natural strengths in the sciences. And it just took me about a month of looking at all those books before I decided that I’d rather do my history later in life.
HWT: Can you tell us one story about Andover?
MG: Well, Andover, well, one story about Andover is that it’s an environment where, in those days, if you smoked a cigarette, you could be thrown out. It was an environment where you were expected to work hard and do well in your studies, but also have a busy couple of hours a day in athletics. And I, you know, tried my best at athletics, but that was not my strength. I could play baseball. I could do some cross country, some soccer, and things like that. But I was never in the A league. But Andover offered a chance to be in the chorus, be in a band, and I think I had a leadership role in the Andover Chorus. And I enjoyed my coursework. Now I was a little diffident because you know, coming from Kentucky into the junior year at Andover could be quite a challenge. I think the most challenging thing that happened to me was studying English. I met a Professor Owen who taught me that florid writing is not necessarily good writing. (laughs) And that was very valuable. And those were my lowest grades when I was there, when I first got there. I did quite well at Andover grade-wise. But just learning a little bit about reality and how good other people are and other things, I think, was very valuable. And I’ve made some friends that I’ve stayed friends with over the years, and still stay in contact with.
One of the most exciting things that happened at Andover was after I graduated, I took a trip to Europe with one of my friends there. And we’ve stayed friendly over the years since then.
HWT: I’m really curious about this Vietnam angle. Because I’ve talked to several people now at the NIH who have said the same thing. So there seems to be this cohort of people who were young men in the late 1960s who became involved in Public Health Service in various ways partly due to the draft, it seems. Is that a correct assessment, would you say?
MG: Yeah. And you’ve probably heard the term “yellow beret”. Because you don’t necessarily have to be that brave to go into the Public Health Service. Some people got sick in the Public Health Service because in fact, if you work in a laboratory, you run certain risks. It’s not like being shot at. And the Public Health Service has many facets to it. But what I think you’re referring to here might be the NIH aspect. And research is one aspect of the Public Health Service. And there was a cohort of very talented people who were given that opportunity. And I know several of them, professors at Harvard, professors here and there. Or people who became excellent clinicians in private practice and maybe spent twenty years at NIH and then went on to another career. But I think there was a huge amount of talent.
And some of the leaders, for example, Dr. Joseph Fraumeni, who sort of established the groundwork for the Division of Cancer Epidemiology and Genetics, he was in the Public Health Service. A number of my peers, in what became DCEG, I would say a fair number were in the Public Health Service. And actually, a nice thing about it is once you got into the Public Health Service, you didn’t have the kinds of pressures that a young person faces today concerning will I get on tenure track? Will I get a position here? If you’re in the Public Health Service, you have basically a career path for 20 or 30 years if you want to stay in it and if you do reasonably well. Your recognition and your promotion and everything depends on your accomplishments. But there’s a certain security level there in being in the Public Health Service.
The worst thing that could happen to somebody if they were really bad in the Public Health Service would be that they might be sent to serve in Alaska. Or they might be sent to serve in the Bureau of Prisons or something like that. Because there are a lot of jobs in the Public Health Service that a researcher would not find attractive but are very valuable services to the country.
HWT: Just one follow-up question, then we’ll move on from that particular subject. So, was that something that you found that other young men were doing deliberately, in other words? Joining the service in order not to be drafted?
MG: You know, I didn’t, I never really asked people, “Are you doing this to stay out of Vietnam?” But I think if a person accepted the moniker of yellow beret, (laughs) they acknowledged that that was an incentive. And to the extent that it was a real cohort, you got to say that that was a motivation. Yeah.
But you know, NIH, for the really dedicated researcher, NIH in those days was a kind of rite of passage for many people who then went back to academic medicine. It was a chance to really learn how to do fundamental research and not have to worry about patient responsibilities necessarily, although clinical researchers had both aspects. But it was a chance to really see how research was done. And before the Vietnam era, this was already a well-established path. People could come to NIH, do some research for a few years, then go into academia and develop a great career in academia but with a basis and a network that had a basis at NIH. So, some people probably went during the Vietnam era just because this was part of a career path leading to a medical research career.
HWT: So, when you—well, let me back up and ask you this first. How did you come to work for NCI? And what were your initial goals?
MG: Right. Well, of course, when I came, I would have probably gone to any lab that would have taken me. But Charles Boone was in the Cancer Institute. And he took me. Now, and I had a very successful three years working with him. And some of those papers are still cited from that period. But after I studied how to measure cell motility and certain factors that affected it, I had to really make a basic decision, which was did I want to become a cell biologist, or did I want to move in a more clinical direction?
So, I began to look around for possibilities. Was there a place at NIH that I could find a home in and work on more medical or clinically-oriented statistics? And I read a paper in a journal, it was about liver disease, and contacted the author, who I think he also was a yellow beret, and said, “Why don’t we go to lunch, because there are some comments I have about your paper.” And at that luncheon was another fellow named David Byar. I didn’t know anything about David Byar. But he was at the luncheon, and he was kind of interested in the conversation. Then he said he was starting up a new section called the Clinical and Diagnostic Trials Section. And he was in the Cancer Institute. And he said, “Would you be interested in joining?” And I thought this was a good chance to move toward biological, let’s say medical and population sciences. And really, it was this chance luncheon. And Dave saying, “Why don’t you join us?” And there was a very talented group of people that gathered around Dave. He had some experience in clinical trials. But he gave big responsibilities to people in his section. And you know, before I knew it, I was the chief statistician for an international clinical trial in lung cancer, and also working for something called the Immunodiagnosis Committee in NCI. So, he assigned us important projects.
And I was still learning to be a statistician while having these responsibilities. And then I started taking these night courses and became a full-fledged statistician in 1977.
HWT: And we’ll get into that a little bit, too. I’m sorry, go ahead.
MG: Yeah. So, there’s a lot of just chance events. But I had a feeling of the kind of thing I would like to do. And then something happened.
HWT: This is often the case. It’s amazing. Can you describe what NCI was like when you first began there? And I know this is broad, but also how it’s changed.
MG: Yeah. Well, NCI had some very talented people. And there was a smaller feel to it. And of course, I was just working in Dave Byar’s group. I didn’t know much about what NCI was or how vast it was and so forth. But one thing about it at that time, you started out as kind of a research associate. And if you did pretty well, you could be pretty sure that there would be a way forward for you and you might have a chance of developing a career there. It was not as competitive and as exacting, the promotion process was not as difficult, it seemed to me at that time. Now maybe it was just because they chose a lot of good people, and good people did well. But I didn’t really sense the pressure or the burden of constant assessment. So, I think, now probably higher up, probably there were people working in other places where there may have been more pressure. But where I was working, it was a little less, it was not stressful.
And another nice aspect of it would be, there was kind of informality. The shops were not that big, and our group would often, practically every day we would go out to lunch and talk about what had happened that day. And often ideas would come up and people would use paper napkins and bring those ideas back to the office and work on them. So, there was a certain informality and a little bit less pressure in those days.
The other thing is that studies were—well, there were some large-scale studies, I shouldn’t say that. There were some very large-scale studies. But there were also smaller scale studies. So, individuals had a sense of controlling what they were working on. But I think the nature of science has changed over the years. The review processes have changed over the years. And the need for large-scale studies has increased to answer certain kinds of questions. And the degree of formal assessment, especially in what’s called the Intramural Research Program of NCI where I work, has become an ordeal for some people. Every four years you are strenuously evaluated, and if you don’t do well, a lot depends on it. And especially for people on tenure track. They really have to do well. Otherwise, they won’t be put up for tenure. So, things have gotten harder for younger people these days. But it’s probably true in academia, also. But you know, there are some large-scale changes that shape the nature of research that have an impact on this, too. And the need for periodic assessment. And I could talk about that if you’d like.
HWT: I would like to. But first, just staying with the early days, I have a couple more questions. So, for example—sorry, something’s popping up in my, what is it with technology today? I don’t know. (MG laughs) Very strange. But anyway, so I wanted to stay a little bit with the early days and ask you just a couple more questions. You know, for example, I know this is going back, but can you describe an average day as a research associate? What did you do when you first arrived in the morning? What was the average day like?
You know, especially technology was so different, right? So, anything you have there.
MG: Well, when I came into the laboratory, my very first experience, as I mentioned, Charles Boone had set up a laboratory and he knew he wanted to study how cells moved in tissue culture. And he had established all these facilities, including photographic systems so you could take a picture every two minutes and then you could watch the cells move around. And he had done all that work.
So, my day was, I was doing one thing and I had a laboratory assistant, Clint Thompson, who was doing another thing. Clint was actually in the laboratory tending to the cells, making sure they were healthy. Setting things up. Setting up the cameras and everything. I was in the library, or maybe in the cafeteria, drinking coffee and thinking about stuff. About how to interpret this and how to formalize, how to quantitate what we were trying to do. So, I would be working alone, more or less, in the library or in the cafeteria. Clint would be doing laboratory things. Then I would come in around lunchtime and we would start watching movies. Because we were watching how the cells were moving, and we were actually plotting their motion and measuring things. So, it was like going to the movies for a couple of hours. And then I would, then I would go back and try to analyze these data and do some work. Maybe go to the computer center at NIH and then, and maybe start writing these things up. And then think about other experiments. And then Clint and Dr. Boone would talk about that, and see if it was feasible, and try to get things done. So those two years, it was kind of a mixture of library work, because I had help in the laboratory, some laboratory time. But kind of a blend of laboratory and, you might say, theoretical work.
But when I started working with Dave Byar, there the statistician became more of a consultant to other people. So, there was more interaction with people down the hall who had a particular problem, and in some of your – my methodologic work was motivated by those interactions. But there was a combination of collaborative work with other scientists, epidemiologists, say. And then a theoretical issue would come up and then I would work on that.
And one nice thing about a day in Dave Byar’s lab is that we sometimes had outside visitors who would stay with us and do their sabbaticals and things. And some of my most enjoyable and productive experiences came through contact with some of those people. And that led to nice theoretical work. But I had a blend of consulting and collaborative responsibilities as a statistician, and also freedom, at least maybe half my time, to work on theoretical problems that arose in connection with those applications.
HWT: Fantastic. And just building on that a little, can you describe just a couple of examples, perhaps, of you’re talking about conversations that might have influenced your methodology even? Or outside influences and how productive those became in your career? Can you give us a couple of examples?
MG: Yeah. I’ll give you two examples. Well, maybe three examples, of the things that may have been our most important public health contributions in some ways. But one thing happened was that we had a visitor to our laboratory named Ronald Brookmeyer. And he was not yet tenured at Hopkins. But he wanted to spend some time in our Division. And he and I were talking about various problems. And this was just about the time that the HIV epidemic was becoming a serious problem. And Ron and I noticed, and Ron was quite a leader here. He did some work trying to figure out how long does it take between the time you get infected with HIV and the time symptoms develop? The so-called incubation distribution. And it takes a long time. Sometimes nine or ten years before a person who’s been infected has symptoms. And the CDC was counting people with symptoms. And Ron and I said, “Well, maybe we can figure out, since we know how long it takes to develop symptoms, maybe we can go backwards and figure out from the number of people that have symptoms how many people must have been previously infected, and when, in order to account for the people who are developing symptoms now.” Something that we called “back calculation”.
And when we did these calculations, even though there were hundreds or maybe at most a couple of thousand people who had developed symptoms at the time we did these calculations, we calculated there must have been hundreds of thousands infected to give rise to the thousands that were being counted. And we wrote a paper in the Lancet. It’s something about the minimum size of the HIV epidemic. And it was, you know, like 200,000 cases, we said this is the absolute minimum. And that was very controversial. But turned out to be, of course, lower than what eventually happened. Whereas some classical epidemiologists were used to seeing epidemics rise and then fall rapidly, we knew that because the incubation distribution was so long, this wasn’t going to happen. This epidemic was going to continue to rise for quite a while.
And Ron and I became sort of advisors to the CDC in projecting HIV incidence. Because if you know how many have been infected in the past, you can project forward when people are going to get it in the future, get symptoms in the future. So, I think we were helpful, at least in the early days of the epidemic, in making projections of what the public health needs would be and so forth. That was a pretty exciting time, but it grew out of a visit.
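The idea of back calculation can be sketched in a few lines. Expected symptomatic cases in a given year are a sum over earlier infection years, each weighted by the probability that the incubation period has the matching length, and the task is to work backwards from case counts to infections. The sketch below uses a standard expectation-maximization update for this kind of Poisson deconvolution, which is a swapped-in technique and not the procedure from the Lancet paper. The incubation probabilities, infection counts, and function names are hypothetical.

```python
def expected_cases(infections, incubation):
    """Expected new symptomatic cases per year, given infections per year.

    incubation[d] is the assumed probability that symptoms appear d years
    after infection.
    """
    n = len(infections)
    return [
        sum(infections[s] * incubation[t - s] for s in range(t + 1))
        for t in range(n)
    ]


def back_calculate(cases, incubation, iterations=500):
    """Estimate infections per year from case counts by an EM iteration."""
    n = len(cases)
    # Chance that someone infected in year s shows symptoms within the window.
    detect = [sum(incubation[: n - s]) for s in range(n)]
    infections = [1.0] * n  # positive starting guess
    for _ in range(iterations):
        mu = expected_cases(infections, incubation)
        infections = [
            infections[s] / detect[s]
            * sum(incubation[t - s] * cases[t] / mu[t]
                  for t in range(s, n) if cases[t] > 0)
            for s in range(n)
        ]
    return infections


# Hypothetical yearly incubation probabilities (long incubation, so most
# infected people have not yet developed symptoms within eight years).
incubation = [0.01, 0.02, 0.04, 0.06, 0.08, 0.10, 0.10, 0.10]
true_infections = [2000, 5000, 10000, 20000, 30000, 40000, 45000, 50000]
cases = expected_cases(true_infections, incubation)

estimated = back_calculate(cases, incubation)
print(f"total cases observed:       {sum(cases):.0f}")
print(f"total infections estimated: {sum(estimated):.0f}")
```

Because the assumed incubation period is long, the estimated number of infections is many times the number of observed cases. Estimates for the most recent years are the least reliable, since few of those infections have yet produced symptoms.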
Another example is the so-called Gail model, or the Breast Cancer Risk Assessment Tool for projecting breast cancer risk in women. And that started at lunch with John Mulvihill, who was in our Division of Cancer Epidemiology and Genetics. Because he was doing clinical work and advising women who had family histories of breast cancer concerning how drastic of measures they might want to consider in order to prevent breast cancer. And some women who might, let’s say, have a mother and maybe a sister with breast cancer, they were advised or otherwise learned or thought that they had at least a 50 percent chance of getting breast cancer. That would be true if the family was carrying something called an autosomal dominant mutation. But most people with family histories like that don’t carry those genes. So, their actual risk was much lower than they thought. And John wanted to know whether we could project the risk of developing breast cancer over five years, over ten years, over a lifetime or to age 90.
And so, I started working on that problem. And I was, as I mentioned in Dave Byar’s group. Dave, himself, and a fellow named Sylvan Green and some others had developed some related mathematics that would be helpful to me. And I used that. And I put the absolute risk into it. Some people are interested in how big your risk is compared to somebody with no risk factors. But I’m really interested in what is your risk? Not what is it compared to someone else. And in order to calculate your risk, you not only have to know about the risk of the disease, you also have to know the chance that this person might die of something else before they develop breast cancer, say. Because that reduces the risk of breast cancer.
So, this became the theoretical groundwork for the Breast Cancer Risk Assessment Tool. And the other thing that made it happen was that previously there had been a large-scale study called the Breast Cancer Detection Demonstration Project, which was done by another division of NCI. And the purpose was to see whether women would come in for mammographic screening and learn a little bit about what happened to them. But it was not originally set up as a research project. It was set up as a demonstration that people could come in for mammography and this would have a preventive effect, or at least an early detection effect.
But meanwhile, they followed those women. And there was a lot of follow-up information, and it was put into 20 or 30 reels of computer tape and stored. Some good work came out of it, describing the Breast Cancer Detection Demonstration Project and everything. But there was a bit of a goldmine there of relative risk information and absolute risk information, if someone in Dave Byar’s group had the ability to read those old tapes and extract certain information that I needed. It was quite a project of data management in those days. But that was a large-scale project. But there was a team effort there to create the foundations of the model. And the model’s been updated over the years, and certain aspects of the data have been updated. But it was an example of one person coming with a question and a team of people being able to assemble the theoretical and empirical data needed to make the model.
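The distinction between relative and absolute risk, and the role of death from other causes, can be illustrated with a small calculation. The sketch assumes hazards that are constant within each one-year interval and multiplies a baseline disease hazard by a relative risk. All rates and names are hypothetical, and it omits refinements of the published model, such as adjusting the baseline hazard for the distribution of risk factors in the population.

```python
import math


def absolute_risk(disease_hazards, other_death_hazards, relative_risk=1.0):
    """Probability of developing the disease over the listed yearly intervals.

    disease_hazards: baseline yearly hazard of the disease in each interval.
    other_death_hazards: yearly hazard of dying from other causes.
    """
    surviving = 1.0  # chance of being alive and disease-free at interval start
    risk = 0.0
    for h_base, h_other in zip(disease_hazards, other_death_hazards):
        h_disease = relative_risk * h_base
        h_total = h_disease + h_other
        if h_total > 0.0:
            p_any_event = 1.0 - math.exp(-h_total)
            # Share of events in this interval that are the disease itself.
            risk += surviving * (h_disease / h_total) * p_any_event
            surviving *= math.exp(-h_total)
    return risk


# Hypothetical five-year projection with made-up yearly rates.
disease = [0.002] * 5
other_death = [0.004] * 5
print(f"relative risk 1.0: {absolute_risk(disease, other_death):.4f}")
print(f"relative risk 2.0: {absolute_risk(disease, other_death, 2.0):.4f}")
```

Raising the hazard of death from other causes lowers the projected risk, because some people die before the disease has a chance to develop.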
HWT: So, I wanted to ask you, and I can imagine it was quite a feat regarding the tools, I was wondering what the tools of data management that you used were, because I’m sure that’s evolved. What exactly was that date or those years that you were developing that model?
MG: Well, let’s see. The paper came out in 1989. But the work began in probably around 1985, I would say. Yeah, just getting access to the basic BCDDP data, there was a little bit of institutional history. People knew people in the other division. People were willing to provide the tapes. People were willing to mount the tapes at the computer center and everything. And I didn’t really have to do that much of that, of what I would call the fundamental data management. But I knew what I needed from those data. And they were able to extract it for me. And then I got a subset of the information that I needed and was able to proceed. But it was the day of the mainframe. Fortunately, we didn’t have to use punch cards, which a few years previous we would have had to. But the mainframes could read the tapes and they could produce useful material.
Of course, things were totally different in those days. There was a premium on getting it right the first time, because every time you made a mistake, you’d have to wait a day to get another run through the mainframes. So, there’s a totally different mentality today. Today, computing is so cheap and fast that a good approach to many problems is to make mistakes and correct your mistakes along the way. You can correct them in one minute. So, there’s less a premium on exact algorithmics the first time. There’s a little bit more experimental approach to programming these days, I think. But in those days, it was a bit slower.
But it’s just amazing what has happened in terms of computing over the years. And now we have access to valuable large datasets that are online. You don’t have to mount those tapes. They’re already accessible. But there are whole teams of people working at NCI, for example, and throughout the country, to assemble information, let’s say, on the incidence of cancer in each age group. So, we know for at least a quarter of the population what the age-specific incidence rates are and how they change over time and everything. And all this information, some of which is valuable for updating the Breast Cancer Risk Assessment Tool, it’s available at your desktop. Because a lot of work that’s gone into developing these systems, principally in the Division of Cancer Control and Population Sciences, let’s see, DCCPS, is one aspect.
The other aspect that’s changed over time, because of the computer, is the library. I used to spend a lot of time in the stacks. Or I used to spend a lot of time going to the art department to see if they could make a graph for me. Now you never have to leave your home office to have access to a vast library through the NIH library systems. And most graphics are easily done using computer software. So, you know, so I would say that not only has computational speed increased, but a number of ancillary research resources are something that you can do at home these days that used to, you know, take part of your week (laughs) by going delivering the requests and picking up the goods.
I remember a fellow named Jerry Cornfield who was at NIH and was one of the people that I really admired so much. And even though I never worked for him, I did take a course with him at George Washington University. He had a big impact on me through his writings. But one story that he told was about the early days of a mathematical invention he had made that required to invert a matrix. That means find another matrix which when you multiply them together give you the identity matrix. And today, you know, people think nothing of inverting an 80-by-80 matrix. But when he was doing this, he needed to invert a six-by-six matrix. This is child’s play today. But in those days, he said, “You know, I really need to invert this six-by-six matrix. I’m going to send it to MIT. They have a computer up there that can do this.” But because it was a mathematical matrix, the procurement office would not let him send this request to MIT to invert the matrix. So, what he did was he sent a procurement order through an engineering department. And he said, “I want one six-by-six inverted matrix.” And through the engineering department, they thought this was some piece of metal or something, they were happy to support the inversion of this matrix. So, he was able to get MIT to invert this six-by-six matrix and send him the answer. But it was because the procurement allowed for a mechanical matrix—that’s fine, we can order one of those—but we can’t order the inversion of a mathematical matrix. But anyhow, I don't know if you consider that funny. But to me, it’s very funny. And he got his six-by-six matrix. Today, on my computer, the computational difficulty goes up more or less as the square of the size of the matrix. Today, to do an 80-by-80 matrix, I can just press a button and I’ll get it in no time.
HWT: You know, I wanted to ask you later on, but this is such rich information about technology and technological changes. But I think it’s super important for institutional history, you know, and just to see the distance and how researchers went about going about their work. So for me, this is all really rich, and I appreciate it. I wanted to ask you before we go forward, also going back, and then I promise we will go forward, in the first say, five years that you started work at NCI, what were the questions that you were asking at that point? I know that you were part of the lab that Charles Boone set up. But for you personally, what was your motivation? What were you asking? What did you want to know? What did you feel compelled to need to know? That kind of thing. What were your goals?
MG: Well, I guess my goal was to find interesting problems and do a good job on them. I didn’t have a big plan, let’s say. And when I first, I really, I think I lucked out in the laboratory work because I found an interesting problem. And once I had developed the tools to measure motility, then there were a lot of ancillary questions that could be easily answered. Like you know, why do normal cells slow down, whereas cancer cells keep moving? Or what is the effect of this anti-cancer drug, colchicine, on motility? You could answer a bunch of biologically interesting questions, or at least describe them, once the system was working. And so, my immediate goal when I was working in a laboratory was to develop the tools. And once we had the tools, to exploit those, to describe certain phenomena that were of interest in the cancer field. But to understand at a deeper level why this particular pharmacologic agent slowed the cells down, or why the cancer cells kept moving over each other would have required a deeper molecular basis for understanding. So, I would have had to become a real cell biologist learning a lot of biochemistry and specializing in subcellular features to elucidate this line of research, to go from description to understanding. And some people had great careers doing that. But I felt that I wanted to move a little bit more toward the clinical side and not into deep molecular understanding.
But when I joined Dave Byar’s group, I didn’t have time to worry about where was I going. I had my hands full. And one big trial was trying to see whether people with an early stage of lung cancer would benefit from injecting something called BCG, a type of tuberculosis bacterium that makes an immunologic response. Whether at the time of surgery if you gave them BCG, this would ignite an immune response that would help prevent the cancer from coming back. And there was a reason for this. A small trial had got a quote “statistically significant” result showing that it worked. So, we had to set up a whole structure to do this study. And that was my introduction to clinical trials. And we had about 500 patients in this study. And unfortunately, we got one of the most negative results I’ve ever seen. There was absolutely no difference between those who got the BCG and those who didn’t, even though there had been a statistically significant result published. But there are good arguments for saying a small trial, even if statistically significant, can be a false positive. And this really proved it. So, I was busy with that and a number of ancillary studies and learning about clinical trials.
And then there was some work on evaluating immunologic tests to see if they could detect cancer early. That was another large committee work assignment. And both of these areas gave rise to some methodologic work that I published in the statistics literature in addition to trying to support these groups.
But I did not have a grand plan. And actually, my plan was to find interesting problems and work on them. And follow the science. And actually, you know, I have, you might say that I’ve had a very prosaic career because I’ve always been at NCI. But actually, I’ve had experience in laboratory work, in evaluating diagnostic procedures, in clinical trials, in cancer prevention, in cancer etiology. And all the while, trying to keep track of the evolving technologies that support those studies. So, there’s a lot going on even though you’re in one place ostensibly.
Now there were a couple of reorganizations along the way that sort of refreshed my exposures. But really, it’s been a very rich evolution without a lot of advance planning, I must say.
HWT: So, let’s talk about some of that evolution. So, obviously you’ve seen the questions. But from 1985 to 1994, you were a medical statistical investigator in the clinical and diagnostic trial section. And then you went on to be head of epidemiologic methods section at NCI. But I wanted to ask you, before you talk about those roles, is there anything you want to add about 1969 to 1985? Because that’s a long period of time. And again, it doesn’t have to be completely chronological. But if there’s anything you want to talk about. And then I would love for you to address those particular roles that you played as leader.
MG: Well, I, let’s see, in 1972 I think I joined the Clinical and Diagnostic Trials Section. And there was a reorganization that took our section, our division, and put us for a while in something, I think it was the Division of Cancer Prevention. My original Division, DCCP, it was called, Division of Cancer Cause and Prevention, was divided. And some of us had to go one way and some had to go another way. And Dave Byar wanted to go into prevention. And I followed Dave. And there were some interesting new projects that came along with that, including a big trial to see whether you could accelerate cessation of smoking. And there were other very interesting studies that came along.
But after being in the Division of Cancer Prevention for a while, I think it was around 1984 or 1985, I was invited by Dr. Fraumeni back to his division, which was the incipient Division of Cancer Epidemiology and Genetics. And he said, “Would you like to head up a section on epidemiologic methods?” And so before then, I had been a researcher, a worker, but without real leadership responsibilities. Once I was the head of the Epidemiologic Methods Section, I had to think about additional things like recruiting people who might become staff members eventually and evaluating people. But still, it was a very productive research time for me. And part of it was that I was located within something called the Biostatistics Branch, which was headed by a fellow named Bill Blot, William Blot. Bill Blot had trained as a statistician, but he was really at heart a superb epidemiologist. And he had a Descriptive Studies Section, he had an Analytical Studies Section within the branch. And there were two statistics units, also, in the Biostatistics Branch. But it was at least half epidemiology.
So, I didn’t have to worry about running a large branch. I just had to make sure that people were happy and productive in a relatively small group of maybe eight people or something like that. And I think that’s a pretty ideal group size. That’s, let’s see, now I’m trying to think. Yeah. So, from about 1985 to 1994, so some of the work that I described, like the work with Ron Brookmeyer, some basic work on absolute risk, the paper on the so-called Gail Model, all sort of came to fruition during this period. But some earlier work with Dave Byar that I enjoyed had to do with designing these group-randomized trials for smoking cessation. And there were others—so I had a great run with Dave. But then I had to take on a bit of a leadership role. And then I was kind of surprised when Bill Blot decided to retire in about 1994. Or not really retire. He began to set up his own International Epidemiology Institute. And eventually he became a leading member of the faculty at Vanderbilt. So he never retired. But he left NCI, and they suggested that I take over as Branch Chief. So in 1994, I believe, I became the Acting Branch Chief and in 1995, I was appointed Branch Chief. And I did that for 12 years. And there was a period of transition there because I was not really equipped to run the epi [epidemiological] side of the Biostatistics Branch because that was not my strength. And eventually we became more of a biostatistics branch, and the epidemiologists went elsewhere in the Division. And two of them became branch chiefs themselves a little later. So, they did quite well. But we became more oriented toward statistical theory and supporting and collaborating with epidemiologists throughout the Division. But not necessarily leading epidemiology studies.
But there was one study that Bill Blot sort of left hanging that seemed intriguing to me. And he had done some early work in China. In a region called Linqu County in Shandong Province where gastric cancer was a leading killer. And he decided to do some work with a fellow named Wei-Cheng You to do epidemiologic or observational studies to see what might be causing this. And he came up with some hypotheses. And one hypothesis was [that] there was a lot of Helicobacter pylori, this bacterium, in the population. Maybe that was causing the cancer. Another hypothesis had to do with maybe these people were not getting enough so-called allium vegetables, like garlic and garlic products that could have a preventive effect. And he identified one other area, a certain vitamin deficiency. He said, “We really should do a trial in Linqu that studied three treatments: a two-week treatment for Helicobacter, just give them amoxycillin and omeprazole for two weeks. Or seven years of supplements with a garlic preparation. Or seven years of supplements with Vitamin E, Vitamin C, and selenium.” So three different treatments, and studied in something called a factorial design so you could learn in one study about three preventive interventions.
And with my background in lung cancer, I knew that none of this was going to work. But I thought it would be worth a try. I thought the Helicobacter would work, I must say. But I was very skeptical of garlic and vitamins. And thus began a trial, the Shandong Intervention Trial, which NCI supported vigorously, and we collaborated with the Beijing Institute of Cancer Research, who was doing the fieldwork. We were doing the data management, the work on design and analysis and so forth. And some laboratory studies back here based on their samples. But that was a real team effort. And one of my helpers, Linda Brown, was in the Public Health Service. She would go to China with me and with another group of people at a support institution. A woman named Linda Lannom I remember quite well. And several others. So, this was a real team effort.
And sure enough, after seven years we started to get a signal [that] the treatment of H. pylori was doing something favorable. After 15 years, it was clear they were getting a large reduction in gastric cancer incidence and mortality from just two weeks of treatment for H. pylori. And believe it or not, after 22 years, we got results. Already at 15 years there was a hint that some of these vitamins and garlic stuff might be beginning to work. But it was 22 years later we found that all three treatments were effective in reducing gastric cancer mortality. Which is unbelievable to me, because it’s very seldom that I am privileged to work on a positive clinical trial. (laughs) But I think that there are really wonderful leads for prevention that came out of this trial.
And of course, whether it would generalize to Western countries where the diet is different, or whether it would be useful in places where people have developed antibiotic resistance to Helicobacter, these are the kinds of questions that remain. But still, I think we got a so-called hat trick in this trial. Three interventions reduced gastric cancer mortality, typically by about like 40 percent or 50 percent. And so, I hope that these leads are picked up and people who evaluate how practical it is to do these interventions pick up the ball and prevent a lot of gastric cancer in certain areas.
HWT: Great story. I also wanted to mention because you had talked about, a little bit, about AIDS research. And I know you won an award. So, I don’t know if there’s anything that you wanted to add to what you’ve already said about your work on AIDS research.
MG: Well, the only thing I would say is that I was in a lucky spot because, of course, you know, I think the “back calculation” that I discussed earlier was not the only thing going on in my setting. But it turns out that Jim Goedert and Bill Blattner and other epidemiologists were working with the people who had developed the assays that allow you to decide who’s HIV-infected or not. So, they had access to early data, epidemiological data, on risk factors for getting infected and so forth. So, I had a chance to work with a lot of very talented people on the ground floor determining what aspects of behavior were risky, whether the blood supply was safe, a number of interesting questions just because of being at NCI where – and this is an interesting point. You know, a few years before HIV, people didn’t know what a retrovirus was. And Gallo and his laboratory had done a lot of fundamental work on retroviruses. So, when this one came along, they were able to apply those tools, which were developed as part of a national program to combat cancer, which it was hypothesized was caused by viruses. So, a lot of money was spent developing the infrastructure and science behind virus, the viral causes of cancer. They didn’t find much in those days. They found maybe rare types of leukemia and lymphoma that were associated with certain viruses. But they, those tools were crucial in developing an early understanding of how HIV was transmitted and could be prevented. And I was fortunate to work with a number of those people.
And again, with Ron Brookmeyer, we wrote a book called AIDS Epidemiology: A Quantitative Approach, that discussed a number of the special issues that arise when you’re studying populations for which you didn’t design the study, you took what you got. And what you got is usually a biased sample of what you want. How can you interpret those data? So, there’s a lot in this book of general interest to epidemiologists, who are often faced with samples of opportunity. And so, it was a very enriching experience, and I was, you know, fortunate to work with a number of extremely talented people. And some of my often-cited papers came from that era. So, that just shows the kind of environment that I was in and the role of serendipity.
One thing I’d like to comment on is this viral carcinogenesis program, some people might have thought it was a flop. But all this basic understanding of how viruses work, and especially retroviruses, had an important role in AIDS research. But you know, over the years, if you ask, “What have been some of the astounding findings for cancer prevention?”, a lot of them have to do with viruses. You know, we now have a vaccine that prevents people from getting cervical cancer. Because the HPV virus causes cervical cancer. We know the importance of preventing and possibly treating certain types of hepatitis viruses, because they cause cancer. And so sometimes it takes years and years before the full realization of a research agenda becomes apparent. And I would say that viral carcinogenesis is in good repute these days, even though there were times when it wasn’t.
HWT: Can you take a moment to talk about the relationship between statistics and epidemiology?
MG: Well, epidemiology is trying to study disease phenomena in relation to exposures. But it’s not a designed experiment. Okay. So, suppose I really wanted to show that smoking causes lung cancer. Well, I could do this in rats, because I could randomly assign the rats. Some would be given smoke. Others wouldn’t be given smoke. And because of the randomization, other factors that might influence the risk of lung cancer would be balanced out in the two groups, the smoking rats and the nonsmoking rats. And I could, if I saw that the smoking rats got all the cancer and the nonsmoking rats didn’t, I could infer that smoking causes cancer in rats. Lung cancer.
In epidemiology, what we have are different kinds of studies. You have studies where some people smoke, and some people don’t smoke. But you don’t assign at random who smokes and who doesn’t. Then a lot of those smokers develop lung cancer. And the question is, what kind of studies can we do ethically in human populations, and what can we learn from them?
Now the simplest study of this type would be to take a bunch of physicians, as Hill and Doll did, some of whom smoked and some of whom didn’t, and then follow them forward in time, and see which group has the higher lung cancer rates. And they did that. And they found much higher lung cancer rates in those who smoked.
But then the question, where does the statistician come in? Even in a study like that. Well, you have to know how large a group you have to study in order to find, you have to know whether the differences that you observe in lung cancer rates are due to chance or are real. In other words, because there’s noise, it’s like flipping a coin, but the coin is more heavily biased toward lung cancer in the smoker than it is in the nonsmoker. But still, there’s a lot of random variation. So just to understand that this is something that you should pay attention to requires statistical thinking.
But then there are more fundamental issues. And they have to do with, well, that this was not a randomized trial. Could something else that was associated with smoking and lung cancer be causing what appears to be a smoking effect, when it’s really something else? Epidemiologists call those factors “confounders”. And statisticians, including [Jerry] Cornfield, have said you know, there could be a confounder. But if it’s there, it has to be so strongly related to smoking and so strongly related to lung cancer that otherwise it couldn’t possibly explain this association. And so statistical thinking goes into the interpretation of observational studies.
And then just, the other big problem with a lot of epidemiologic studies, especially case control studies, where you compare people with disease to people without disease and you ask, you try to figure out what exposures in their past might have been associated with their disease status, looking backwards. Those studies are subject to recall bias and other kinds of measurement error. And statisticians can be helpful in interpreting what the impact of such measurement error might be and how you can ameliorate or do ancillary studies to minimize that impact.
Moreover, every aspect of design is important. Because once you’ve done a study, you’re stuck with what you’ve done. And you can’t expect a statistician to bail you out in retrospect. It’s like doing a postmortem rather than a clinical trial. And a lot of thinking has to go into the design of studies. And one area where statisticians have contributed a lot recently is this modern era of omics research. I don’t know if you’ve heard that term. But what it is, is these days, I don’t have a hypothesis that this gene causes this disease. I study 500,000 genetic variations on each patient and try to figure out which one of those 500,000, or which ones, are associated with disease.
Now if you look at 500,000 things, something’s going to pop up. And a lot of statistics recently has been focused on trying to make sure that we don’t get too excited when we study many, many features, that we don’t spend a lot of money going down rabbit holes. And trying to make sure that the results are reproducible. That if you do find something, you at least mount another study to show that what you found can be replicated. And then you can spend a lot of money in a laboratory trying to figure out why it’s associated. So high-dimensional research and suitable restraints are attributes that a statistician can contribute toward making sure that we don’t go off the rails.
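[Illustration, not part of the interview: the arithmetic behind "something’s going to pop up" can be sketched in a few lines of Python. The 500,000 figure is the one mentioned above; the 0.05 significance level and the Bonferroni correction are assumptions chosen for illustration, not methods named by Dr. Gail.]

```python
# Hypothetical illustration: how many "hits" chance alone produces
# when many genetic variants are tested at once.
n_tests = 500_000   # variants tested per study (figure from the interview)
alpha = 0.05        # assumed conventional per-test significance level

# If no variant is truly associated with disease, each test still comes
# out "significant" with probability alpha, so the expected number of
# false positives is simply n_tests * alpha.
expected_false_positives = n_tests * alpha

# Bonferroni correction (one common safeguard, assumed here): divide the
# threshold by the number of tests so that the chance of even one false
# positive across the whole study stays at or below alpha.
bonferroni_threshold = alpha / n_tests

print(f"Expected false positives: {expected_false_positives:,.0f}")
print(f"Bonferroni per-test threshold: {bonferroni_threshold:.1e}")
```

[Under these assumptions, about 25,000 variants would look significant by chance alone, which is why a much stricter threshold and an independent replication study are usually required.]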
And even in, you know, people are talking about big data these days. Big data. If the data are biased, if the way the data are collected or the way the data are measured are subject to systematic error, not random error but systematic error, then no matter how big the data set is, all you’ll see is a clearer and clearer signal of bias. And so, I think there’s a lot that classical statistics can bring to the table in making sure that the use of big data is proper. And a lot of companies like Google and others, they live off big data. But they also conduct, they conduct trials, or even randomized trials, in their data sources, using some classical techniques to try to make sure that they’re not being fooled by a large signal of bias. So you know, statisticians have quite a role to play.
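[Illustration, not part of the interview: a small simulation of the point that a bigger data set sharpens a biased answer rather than correcting it. The true value, the size of the systematic error, and the noise level are all hypothetical numbers chosen for the sketch.]

```python
import random

def biased_sample_mean(n, true_mean=10.0, bias=1.0, noise_sd=5.0, seed=0):
    """Average of n measurements that all share the same systematic error
    `bias`, plus independent random noise. All values are hypothetical."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += true_mean + bias + rng.gauss(0.0, noise_sd)
    return total / n

for n in (100, 10_000, 200_000):
    estimate = biased_sample_mean(n)
    # Random error shrinks roughly like noise_sd / sqrt(n);
    # the systematic error of 1.0 does not shrink at all.
    print(f"n={n:>7,}: estimate={estimate:.3f}  (truth 10.0, truth + bias 11.0)")
```

[As n grows, the estimate settles ever more tightly around 11.0, the biased value, and not around the true value of 10.0.]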
Now these days, the link between biology and numbers is quite complicated. You know, there are people that specialize in something called bioinformatics. And people that know bioinformatics, they know a lot about computing, and they know a lot about biology. And they can handle very large datasets and make them accessible to other people that do certain kinds of analysis. So they’re an important part of a team that works with a statistician. I view the statistician as somebody who’s very comfortable with computers and even large samples but would not be particularly adept at decoding a certain part of the genome and grabbing a certain piece of information. Whereas the bioinformatician can grab that information and then, working with the statistician, try to make something of it.
HWT: That’s super fascinating. You know, I mentioned we talked about technological advances over the last several decades and how they changed the field. But of course, there are other inputs. So this kind of development of bioinformatics is super important. I would add that. Is there anything else that you want to add about other kinds of influences or technological advances before, I want to ask you some questions about the, of course, the Breast Cancer Risk Assessment tool. But anything at all? Or shall we move forward?
MG: The other big change since I came to NCI is just unbelievable developments in medical and clinical science that I wouldn’t necessarily call technological, though some of them have been facilitated by technology. But immunology is a totally different field from when I was a medical student. You know, when I was a medical student, there were two or three kinds of immune cells. And I knew that one made antibodies and one acted some other way. Today there are books written on hundreds of features of the immune system. And one of the great accomplishments in medical therapeutics, cancer therapeutics, has been the rise of immunologically-mediated treatments. You know, I talked about a failed BCG study of lung cancer. Well, now many people are living because of ways of modulating the immune system and getting it to attack the cancer. So, it’s become a totally new arm of treatment. It used to be surgery, radiation, and chemotherapy. Now you have to add immunotherapy. But that’s not the only thing. Practically every facet of cell biology and therapeutics has evolved, and genomics and understanding of the genetic changes in cancer are intensely studied. All this knowledge is hard to keep up with and really necessitates a team approach to many problems. A multidisciplinary approach to many problems. And I think that’s had a big impact on epidemiology, as well.
HWT: Thank you for that. I will ask a question related to that a little bit later. And about this kind of team approach and what science really looks like. But we can get to that a little bit later. Focusing now a little bit more on, of course, we’ve talked about the breast cancer risk assessment tool with the Gail model. But can you describe, if there’s anything you want to add first just in terms of describing its development. But also, I’m really interested in what you have to say about its impact.
MG: Yeah. Well, the original, well, I’ve been very interested in how should risk models like the Breast Cancer Risk Assessment Tool be used either in counseling women or in more general public health applications. And the original motivation, as I mentioned for the Breast Cancer Risk Assessment Tool was to give women a more realistic idea of what their real risk is. If you think your risk is 50 percent, you might be willing to have a prophylactic mastectomy. If you think your risk is 10 percent, you might say, “Well, let’s watch, let’s have mammograms, more periodic mammograms.” So, it can change a person’s attitude.
It could also, some models have modifiable risk factors in them, and they can, you can say, “Well, I don’t know for sure you should stop drinking four drinks a day. But the model suggests that if you cut it back, it might reduce your risk of breast cancer a certain amount.” So some models have modifiable risk factors.
I think that one area, then is counseling. There are two other aspects of the counseling. But one is, suppose you’re in your forties and you’re trying to decide whether to start having mammograms. The mammograms are recommended for women who are fifty years old and older. But if you’re a forty-eight-year-old woman, you may have other risk factors that give you a higher risk than a fifty-year-old woman. So maybe you should consider starting mammograms a little earlier.
Another big area of application is should a woman start taking a drug to prevent breast cancer? And one of the first drugs that people were interested in studying and the first drug that was really proven to prevent or reduce the risk of breast cancer was tamoxifen. Tamoxifen has a lot of side effects, unfortunately. Including it increases your risk of stroke. It increases certain other risks. Deep vein thrombosis, endometrial cancer. And so, a woman should only take it if she has a very high risk of breast cancer so that the reduction in breast cancer risk—it reduces that by about 50 percent—outweighs the increases of some of these other risks. And so, I’ve written a lot on that, how you weigh the risks and benefits. And the tool for weighing risks and benefits are the absolute risks of each of these outcomes. So, risk models provide needed information for personal decision making.
Now models like this can also inform public prevention programs. For example, suppose you don’t have enough money to give a magnetic resonance image [MRI] to every woman in the population. Or you don’t have enough instruments to do that. Can you give it to the women at highest risk? So using risk to allocate scarce medical resources is another area.
In setting up the trial that proved tamoxifen prevented breast cancer—by the way, I’m not talking about tamoxifen for women who have had breast cancer. For them, if they have estrogen receptor positive disease, they should definitely take tamoxifen or some other drug like tamoxifen. But for women who have not yet had breast cancer, the question whether you should take it or not depends on this balance of risk and benefits.
But some of the other applications of the risk modeling in public health were in setting up this trial to see whether the woman should take tamoxifen or not to prevent breast cancer. We had to figure out how big the trial should be and that depends a lot on how many breast cancers are going to develop during the trial. And that’s a function of the absolute risk of breast cancer. So these models can be used to help design medical studies.
Another aspect is, suppose you want to figure out if I have a modifiable risk factor, like alcohol consumption. If I could only get women to reduce their alcohol consumption, what impact would that have on a population level in reducing breast cancer incidence? So you can use these models in various ways, some having to do with public health like allocation of resources, designing clinical trials, estimating the effect of reducing exposure to modifiable risk factors, and deciding on screening programs, the frequency of screening, who should get screened, and things like that. So these risk models have potential. But working against their full implementation is sometimes the complexity of models. And for some applications, the ability of the models to really indicate who is and who isn’t going to develop breast cancer is not great enough for some of the applications. But there have been improvements. And I’ve been very interested in the question of how good does a model have to be for various applications?
HWT: Very interesting. One more question in that sort of area. I don’t know how much you want to focus on this, but I was curious in terms of testing the Gail model across different populations of women. Can you talk about how that was conducted and why?
MG: Yeah. Whenever you have a risk model, it’s important to validate it before you promote it, I think. And the validation, the most stringent type of validation, is to study its use in another population. A different population from the one you developed it in. To see how transportable it is, namely, to see how it performs in a variety of circumstances. And we, early on the model was pretty well validated in something called the Nurses’ Health Study at Harvard. But there are two things you want to validate. First you want to see how well calibrated the model is. And what that means is if in a certain population the model says, if we follow these women for five years, we’re going to see 150 breast cancers, then you hope that in an independent validation where 150 were predicted that you would get, you would observe 150. Not 300 and not 75. That you would observe something close to what you’re predicting. And that’s called good calibration. You want the model to predict what will actually be observed in terms of the numbers. Now there will be some fluctuation because of random variation. But you want the prediction to be within random variation of what you observe.
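[Illustration, not part of the interview: calibration as described here is often summarized by comparing the number of cases a model expects with the number actually observed. The sketch below assumes that summary (an expected-to-observed ratio); the cohort, the 1.5 percent risk, and the 140 observed cases are hypothetical, chosen so that 150 cases are predicted as in the example above.]

```python
def expected_to_observed(predicted_risks, outcomes):
    """Ratio of expected cases (sum of predicted risks) to observed cases.
    A ratio near 1.0 suggests good calibration; above 1.0 the model
    overpredicts, below 1.0 it underpredicts."""
    expected = sum(predicted_risks)
    observed = sum(outcomes)
    return expected / observed

# Hypothetical validation cohort: 10,000 women, each with a predicted
# five-year risk of 1.5 percent, so the model expects about 150 cases.
predicted = [0.015] * 10_000

# Hypothetical follow-up: 140 women develop breast cancer (1), 9,860 do not (0).
outcomes = [1] * 140 + [0] * 9_860

ratio = expected_to_observed(predicted, outcomes)
print(f"Expected/observed ratio: {ratio:.2f}")
```

[In a real validation the ratio would be reported with a confidence interval, since some gap between predicted and observed counts is expected from random variation alone.]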
The other thing you want to validate is something called discriminatory accuracy, or discrimination. How different are the risks in, let’s say, if we’re talking about breast cancer, women who actually developed breast cancer compared to women who didn’t develop breast cancer? How different were the predicted risks? You’d like them to be very different. And that’s measured by discrimination. So, these are often validated or checked in independent populations.
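One common way to put a number on discrimination is the area under the ROC curve (AUC): the probability that a randomly chosen woman who developed cancer was given a higher predicted risk than a randomly chosen woman who did not. The sketch below computes it by direct pairwise comparison. The function name `auc` and the risk values are made up for illustration and do not come from any real validation study.

```python
def auc(case_risks, noncase_risks):
    """Area under the ROC curve by pairwise comparison.

    Counts the fraction of (case, non-case) pairs in which the case has
    the higher predicted risk; ties count as half. 0.5 means the model
    cannot tell cases from non-cases, 1.0 means perfect separation.
    """
    wins = 0.0
    for case in case_risks:
        for noncase in noncase_risks:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(case_risks) * len(noncase_risks))


# Hypothetical predicted 5-year risks.
cases = [0.030, 0.020, 0.015]            # women who developed breast cancer
noncases = [0.010, 0.020, 0.012, 0.025]  # women who did not

print(f"AUC = {auc(cases, noncases):.3f}")
```

The pairwise loop is quadratic in cohort size, which is fine for a small illustration; large validation cohorts would normally use a rank-based formula instead.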
And the Breast Cancer Risk Assessment Tool has done pretty well in most validations. In most validations. Now it was designed for a general population. It was not really designed for women carrying BRCA mutations or having special risk factors for breast cancer. And in fact, if you go to the website, it says don’t use this model if you know that you have a BRCA mutation. Use BOADICEA, another model that is out there.
But it’s kind of interesting that recently there was a publication from a high-risk population. A population that had over six percent of the women in it had BRCA mutations. Normally it’s much less than one percent of a population will have BRCA mutations. But this was a high-risk population. And other models are better suited for high-risk populations. But nonetheless, the Breast Cancer Risk Assessment Tool was compared to some of these other models in this population. And sure enough, the Breast Cancer Risk Assessment Tool underpredicted risk in this population.
But then, I realized that what if you excluded the women in this population who had a BRCA mutation? Because the model says don’t use it if you carry a BRCA mutation. And it turns out that when you exclude the women who had BRCA mutations, the model was well-calibrated overall, and worked as well as other models overall. Although it was a little bit, I think, underpredicting in young women and overpredicting in older women. So I would still recommend those other models for a high-risk population like that. But it was surprising to me that even though the Breast Cancer Risk Assessment Tool is designed more for the general population, it’s pretty well calibrated in high risk populations if you exclude women who have BRCA mutations.
Now there are new models being developed all the time. And new risk factors are being put into these models. And two of the most important are mammographic density, which you can measure on a mammogram, and certain genetic changes called polygenic risk scores that measure changes in the DNA. And they’re going to be able to improve the discriminatory accuracy of these models, measured by something called the area under the ROC curve, from about 0.6 to about 0.7, which doesn’t sound like a big improvement, but it’s a good improvement for some applications. And so, I think other models are being used. But against the advantages that they offer, you have to think sometimes of the complexity of using those models, because you have to measure the mammographic density and you have to measure the genotypes.
And of course, this information may be very readily available for all women someday. But right now, if you had to use the Breast Cancer Risk Assessment Tool, you would just have to answer four or five questions on a questionnaire. And not get a report back from the genotype lab and not get a mammographic reading. So, you have to weigh what the model’s going to be used for and how hard it is to use it.
For some applications, like allocating scarce resources, if it takes too much money to do a risk assessment, you’re taking money away from the actual intervention. So, if the models require data that are too expensive to obtain, that will detract from their usability not only because it’s practically difficult to implement them, but because you’ll be using up money that could otherwise be used for the intervention. So, there is something of a tradeoff between improving these models using more discriminating information and simplicity.
HWT: So, unless there’s anything you want to add, I wanted to ask you about successes and setbacks. Of course, you’ve been talking about successes. But is there anything else that you want to mention? Anything else in terms of that? And then, I also always love to ask about setbacks, or sometimes you can say bad ideas, because they’re learning experiences. So, if there’s anything that comes to mind?
MG: Yeah, well, I thought about that a little bit. And there’s nothing in my CV that I’d like to apologize for. So, I don’t think I’ve made a publishable boo boo of importance. But I have wasted time. I have wasted time and gone to dead ends on some projects. And even a year or two ago, I was interested in the question if you studied these genetic variations along the genome, often several of them are associated with disease. Let’s say breast cancer. And you know there’s several of them. But they’re all together close on the genome. And you don’t know which one of these variants is the actual causal one. Maybe one of them is causing the association with the disease, but the others are highly correlated with it. So, you’re seeing a lot of signals there, several different ones. And you’d like to know which one is causing the association, if there is one, which one is it.
Now I know from general regression theory that that’s a very hard problem to do. If you do a multiple regression, you’ll see signals for several of the independent variables or regressor variables—in this case, genetic variants—and you won’t know exactly where the signal is coming from. But I thought I would try my hand at it. And I developed a technique. And I spent months testing this out on my own invented data. In other words, I knew the structure. I knew which one was causal and I knew how they were all correlated. And my method was working beautifully. I was really getting very excited. Then I talked to somebody in our Division who has real data of this type. And he was able to provide a real dataset. And then I could test out this method on the real dataset. And I was extremely disappointed because I’d practically written the paper just waiting for the little example to show how great it was. And it failed completely. And it had to do with subtleties of how the things are correlated. So that was a learning experience. And it also took a lot of time and energy. And I’m sure I’ve had a lot of dead ends along the way. But that was a pretty dramatic one. That’s the kind of thing that comes to mind.
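The difficulty described here, that several correlated variants all show a signal even when only one is causal, can be seen in a small simulation. The sketch below is not Dr. Gail's method, which the interview does not describe. It is a toy setup under stated assumptions: five hypothetical "variants" are modeled as continuous values sharing a common component (real genotypes are discrete), only variant 0 affects the outcome, and the function names, effect size, and noise levels are invented for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_marginal_signals(n=5000, n_variants=5, causal=0,
                              effect=0.5, seed=0):
    """Simulate highly correlated variants with a single causal one.

    Returns the marginal correlation of each variant with the outcome.
    All settings are hypothetical choices for this illustration.
    """
    rng = random.Random(seed)
    variants = [[] for _ in range(n_variants)]
    outcome = []
    for _ in range(n):
        shared = rng.gauss(0, 1)  # component common to all variants
        row = [shared + 0.2 * rng.gauss(0, 1) for _ in range(n_variants)]
        for j, value in enumerate(row):
            variants[j].append(value)
        # Only the causal variant enters the outcome directly.
        outcome.append(effect * row[causal] + rng.gauss(0, 1))
    return [pearson(v, outcome) for v in variants]


signals = simulate_marginal_signals()
for j, r in enumerate(signals):
    label = "causal" if j == 0 else "non-causal"
    print(f"variant {j} ({label}): correlation with outcome = {r:.3f}")
```

In this setup every variant is clearly associated with the outcome and the associations are of similar size, so the marginal signals alone do not reveal which variant is causal. Real data add further complications in how variants are correlated, which this sketch does not attempt to reproduce.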
HWT: So, you of course have received many honors and awards in your career. I’m wondering what is or has been the most meaningful to you. I don’t know if that’s a fair question because you might not want to just pinpoint one, two, or three. But if anything comes to mind.
MG: I would say – excuse me, I’m getting a little hoarse – that being a member of the National Academy of Medicine I think probably was, based on my Breast Cancer Risk Assessment Tool, I think that was meaningful to me. An acknowledgement that even though I’m not practicing medicine, I’m contributing to medicine. And being president of the American Statistical Association I think acknowledges, well, partly acknowledges the role of statistics in government. Because they rotate that between industry, academia, and government. But it is a measure of recognition also as a statistician. And I would say you mentioned the USPHS [U.S. Public Health Service] Distinguished Service Medal and the Distinguished Investigator designation at NIH. And I think that those are very important to me. And I’ve had a few other awards. But I did get an award for research excellence in cancer epidemiology and prevention, which I think reflects that especially work in risk modeling does have an impact on public health issues as well. So those are probably the most important to me.
HWT: So why did you decide to spend your career at the NIH?
MG: Well, I think I, I think, I tried to illustrate that there’s a rich environment here. That things are, even when you’re staying in the same place, things are moving around you. And I would say, so there’s a rich scientific opportunity there and a chance to continue to learn. And that’s probably the most important. There are other aspects of it, though, that I think are important. And one is evaluated these days in retrospect. In other words, I just went through a site visit with the site visitors who are mainly from academia, or practically exclusively from academia. What they are instructed to look at is how productive was this person in the last four years. And maybe think a little bit about the future plans. But really, the idea is that if this person has had a productive four years and has reasonable plans, then they should be supported. In academia, some people know how to do that. They know how to do the work and then write the grant. But for most people, it is, there’s a lot of effort to write the grant, describe what they’re going to do in the future and then wait to see if it gets funded. And I think that that’s one of the great advantages of working at NIH. Even though the site visit can be quite stressful for some people.
HWT: So I know we’ve been talking for a couple of hours. I just have a handful more questions for you if you have time.
MG: Yeah, okay.
HWT: Okay. One moment. [dogs barking, pause] I apologize. So, I have some COVID-related questions and some sort of mentoring-related questions. But before I ask those, I want to ask you is there anything you feel is important to add right now in terms of looking back over your career in total, or the NCI in general? Anything at all.
MG: Well, I think the opportunity to work with talented people, many talented people, is something that strikes me. And you know, unfortunately some of them have died that I’ve worked with. But I was very taken with the experience that I’ve had in working with other talented people, both in fieldwork and epidemiology and in theory. You know, it’s a real privilege to be in an environment like that.
HWT: So I noticed on your CV that in terms of COVID now, you’ve been advisor and collaborator on SARS-CoV-2 observational studies for RESPIRA in Costa Rica, 2021-present, and advisor to a Trans-NIH group working on post-acute—I’m going to get this wrong. Sequelae?
MG: Yeah, that’s right.
HWT: Okay, of SARS-CoV-2 protocol, or RECOVER, also 2021 to the present. Can you talk about these roles and experiences?
MG: Well, the RECOVER study, you know, I’m sort of like the fly on the wall. Or a gadfly. That’s a huge effort, billion-dollar effort, being funded by NIAID and by the National Heart, Lung, and Blood Institute. It is, the administrative structure was set up at NIH, but the work is being done nationwide and through coordinating centers at universities. But I was asked and another statistician was asked, Michael Proschan, to comment on protocols as they were in development to make sure that the questions that people want to ask can be answered by the protocols, to identify weaknesses in the type of data that will be available and their ability or inability to answer certain questions, and to provide a little bit of technical advice on particular types of analyses that were proposed. So, I can’t say that I made a huge contribution here, but I tried to as a member of the NIH community to sharpen the studies and try to increase their chances of success. A lot depends on this. This is, you know, potentially there are a huge number of people who have been infected who will have some kind of long-term effects. And this study is designed to look at what are the risks of long-term effects, and what treatments maybe have been effective in ameliorating those. But there’s a lot of work to be done on this.
HWT: Exactly. It’s all rather new. So, in terms of your own experience and the response to COVID at NCI, do you want to talk about that just for a couple of minutes? I’m very interested in the first couple of months of shutdown, how you got the news, obviously, and how it affected the research on the team, for example.
MG: Well, of course, I felt very lucky to be able to work at home. And NCI provided additional equipment so that we could work better from home. I’m talking to you from home right now. The interactions that one normally has at work were definitely impeded, even though we’ve become more adept at using remote conferencing. And one of the most difficult situations, I think, is for postdocs who come for a training experience. I had a postdoc who came in February of 2021. And she was from France, she is from France. And she and I did not meet in person for months and months and months. It’s only in recent months that we sort of met for coffee and talked about things. And now we’re starting to gradually get back into the office. But I would say for a person who has a network of collaborators, it’s still possible to carry on, although with some difficulty. But for a young postdoc, it’s hard to establish good ties with other postdocs and learn from other postdocs the way it’s much easier to do in person. So, it’s had an impact. But fortunately, this woman is very smart and also very adept at using computer tools. And has made some nice contributions.
HWT: And then, of course COVID-19 has had a disproportionate impact on marginalized communities. But people talk about health screenings, cancer screenings, the same way. Can you take a moment just to talk about health equity from your point of view?
MG: Now I – health equity. I have looked at some aspects of risk modeling from the perspective of various sub-populations. And I have looked at the need to specialize risk models for subpopulations, such as African-American women or Asian-American women or Hispanic-American women. And the Breast Cancer Risk Assessment Tool does that. There is some controversy about whether the development of tools that are ethnic or racially specific can contribute to health inequity. An example has to do with allocation of kidney transplants based on models for kidney function. And if the models for kidney function indicate, for example, that a Black man has higher kidney function than a White man, comparable White man, the Black man might have less chance to get a transplant because he’s not quite as sick according to this risk assessment, or this laboratory assessment of status of health. So, there’s been some people advocating not using measures like that to allocate resources like transplants.
Now I feel that that’s a misuse of the information. I’d like to take it back to breast cancer risk modeling. If I do a risk assessment on an Asian woman who’s just come to this country and I don’t specialize and just use a general model, maybe a model that was developed in White women, I am going to very much overestimate the risk that this Asian woman has. And therefore, I’d be giving her bad advice if I give her a bad number. So my point of view is, do the best you can. Present the facts as clearly as possible. And hope that people who have to make ethical or resource allocation judgments have a broader purview than just these particular facts, but that they know the facts. But I know this is a controversial area.
HWT: Yeah, it’s tricky. I also read that in your strategic plan for 2020 to 2025, that major goals include developing and implementing strategies for workforce equity. And of course, women and people of color are typically underrepresented in science, very generally speaking. Are you involved in any of those efforts? Or what is your point of view on that?
MG: Well, I think that within our Branch, there’s an awareness of the need and desirability of hiring talented people from various backgrounds. And I’ve certainly, you know, I’ve had some sensitivity training courses and things like that and realized that I do have certain assumptions and feelings that are not always realistic. But I’ve been very fortunate that the last three postdocs I had were female and are doing well. One of them actually was given an opportunity to go on tenure track in the Biostatistics Branch, which is pretty unusual because it’s a global search for those positions. But so, at least in my experience with working with female applicants and postdocs, I’ve had a very positive experience. And I’m looking forward to promoting talented people wherever they come from and especially women.
I have not worked yet with an African-American statistician or someone from certain other groups. I did write a paper with a postdoc who was working in another division on models for Hispanic women. And he has gone on to a good career at Kaiser Permanente. Very interested in diversity issues. And he has special expertise in risk modeling for Hispanic women. So, part of the solution is the ability to train people from various backgrounds and see them succeed, wherever they succeed. And occasionally there will be an opportunity to recruit. And that should be a consideration.
HWT: And then just a couple more questions. They’re all relational sort of questions, or relationship-oriented questions. And my first one, we talked about briefly before in terms of working with people working on teams. So I’m wondering what role collaboration plays in your process?
MG: Well, there are two kinds of studies that I’m usually involved in. The smaller-scale, there are smaller-scale studies, like a study that I’m working on with my current postdoc, and another that I worked on with my previous postdoc, which are technical, statistical, methodologic kinds of questions. Some theory, some computer work, things like that. But still of a relatively small-scale but intense. And I benefit from having two or three people on the team with the postdoc and me with special expertise in various facets of the theoretical problem. And Dave Byar, who I have mentioned earlier, who hired me, he felt that it’s okay to do some theory, but he never trusted the theory of a person who worked alone. And so, these small-scale studies, I think it’s very helpful to have three or four people on a project. And it’s very manageable and it’s very enjoyable.
Then there’s some large-scale studies. And I’ve mentioned things, like the Shandong Intervention Trial, that you couldn’t even imagine doing without fifty people working on it. And as long as there’s a well-defined goal, well-defined hypothesis and so forth, that can be fun. There are other situations where, and I haven’t been too involved in these, where people are part of a team, but their particular contribution is hard to define. Partly because it is such a large team. But suppose you’re studying genetic variations in breast cancer risk and there are, and you need [participants], and there are going to be 300,000 breast cancers and 300,000 women without breast cancer providing DNA from 50 institutions throughout the world. That takes a lot of organizational skill. It takes a lot of political skill. It takes a lot of data management. It takes a lot of laboratory expertise. And people are having to make their way in the context of a huge study like that and get recognition for their contributions. I haven’t done that much, but I can see that the tension between big science and individual advancement for the principal investigator can be difficult at times. So finding your niche and making sure that people know what you have contributed is essential. Sometimes people are on these big papers in high-impact journals, and they’ve contributed patient information and things like that. But not much leadership. And sometimes people on those same papers have done a lot in the area of leadership and analysis and interpretation. And I think that in the committees that run projects of that type, it becomes known who’s doing what. But I think a lot of, some people probably get lost in the wake.
HWT: And then, slightly different question but similar about relationships. Mentoring. Mentoring seems to play a very big role in the scientific community. Can you talk about that a little bit and perhaps your own role as a mentor? Or if you, you’ve mentioned several names. But if there’s somebody else who you want to talk about. Anything like that?
MG: Well, I’ve enjoyed working with a number of people over the years. And some of them actually joined the Biostatistics Branch. And I’ve worked with Phil Rosenberg, who did a lot of AIDS work with me originally. I’ve worked with Ruth Pfeiffer, who’s done a lot of work on the risk modeling. Some of them, like Jacques Benichou that I worked with on absolute risk ideas and other ideas, I tried to get him to stay in the Biostatistics Branch, but he went back to France. I guess I enjoy it from several perspectives. I enjoy the work. I enjoy seeing them do well.
And I benefit a lot from the mentoring experience. Because these days, I wouldn’t dare to do some of the computing that a postdoc can do because of their computing skills and changes in the language and so forth. I can program, but that’s not the best use of my time. But one has to be careful not to overuse the postdoc for computing support. We have other kinds of computing support. But it is advantageous to me, not only from the theoretical side but also because of some of the computing aspects, that working with a postdoc is very rewarding for me. (phone rings) Excuse me. Excuse me for one second. [pause] I have to take a two-minute break. I’ll be right back. (quick phone conversation) Sorry, I had to take a break.
HWT: That’s okay. Yes. So we were talking about mentorship.
MG: Yeah, so I really view the mentee as part of the team. And I try to involve them deeply in the research. I try to find a good problem for them to work on. And then I try to give them a lot of freedom. And then when they come back to me with something, I try to be critical but not unkind in assessing what they’ve done. And if I can help with a difficulty that they’re having technically, try to give them some hints about how to do that. And then when the work is done, I often spend a lot of time editing so that their work comes out to best advantage. Because, for example, the previous person that I worked with is from South Korea. And she wrote pretty well. But to make things perfectly clear required somebody writing in the native language. The woman that I’m working with now from France does write very well. Surprisingly well.
HWT: I’m sure there are many different roles. So my final question to you is what advice would you give to encourage young scientists to continue pursuing their goals?
MG: Well, I wish I could tell them how to get a secure environment with maybe some stress but not too much stress so that they could prosper and pursue their ideas. But I think that a little bit of looking around for opportunities of that type and thinking about where they would like to go is useful. So, I think the place that you go to is pretty important. And hopefully you’ll find a welcoming environment and a chance to thrive with moderate pressures. Not huge pressures for success at grantsmanship and so forth. So, picking a place is important. But really, if you’ve found a place like that, some people are very careful to think about a specialized research direction; I know that the research direction I want to go to, I want to be an expert in causal inference for crossover trials, and I’m going to pursue this. There’s a niche there. I can do that and establish my career. And maybe that’s a good path forward for some people. But as I tried to indicate, opportunities come up. And you never know when they’re going to come up. And if the place you’re at gives you the freedom to seize an opportunity or a particularly good collaborator or a particularly interesting problem, and you can have some time to do that, I think that’s pretty important. Because a lot of good things come from places that you never anticipate.
But you know, there are different ways of succeeding. And some of the very best scientists are monomaniacal. They have an idea and it’s going to take all their time, and it works out. And nobody ever thought it would. And they get a Nobel Prize. Some people, and I think statisticians possibly more than some other fields, are more eclectic. Because their tools apply to a variety of disciplines. The same tools can be used in various areas. And I know some of the people, including this fellow Cornfield that I mentioned, always felt that it was a combination of breadth of interests and specific tools that made him such an important figure in his field. But I can’t say that that’s what everybody should do; but certainly keeping your eyes open to new opportunities is important.
HWT: Well, I think that’s a fine place to stop, unless you have anything to add at all. Anything at all.
MG: No. I just wish I could have smiled at you once or twice. (laughs)
HWT: It is strange, I feel like having this conversation, I can’t see you. But at least the audio is there. And that’s the most important thing.
HWT: We’ll be back in contact with your transcript if you want to go over it. Mostly we do that for names, spellings of names. But you can always add something in a footnote or clarify something that needs to be clarified. Whatever. So. But like I said, we’ll be in contact with that probably sometime in the next few weeks.
MG: All right. Thanks a lot.
HWT: All right. Have a wonderful day. Appreciate your time.
MG: You’re welcome.
HWT: Okay. Bye.