Panel Discussion: Paul Swinburn, Jason Lundy, Paul O'Donohoe

May 20, 2015

Full Transcript

[00:00]

MODERATOR

So thank you, Paul, Paul, and Jason, for the great talk. I had a question, just to get us started here. Are we suggesting then that paper-to-ePRO equivalence doesn't matter anymore? And is there anyone, maybe in the audience, who would like to challenge that?

PAUL O’DONOHOE

I’m suggesting it doesn’t matter anymore, with various significant caveats. I think it doesn’t matter as much as maybe people make out that it matters.

JASON LUNDY

Well, I would, I guess, partially agree with that. I think one of the interesting things is that as the evidence has grown and we've become better at implementing questionnaires on the electronic modes, we—and Paul was touching on this in the first presentation—we know the things to avoid, we know some of the things that don't work well. So we've kind of gotten through the rough patch, and there's a little bit more consistency in how things are being implemented. I think that really helps, and it alleviates some of the concerns about equivalence. I think the other thing, though, and this is something I heard relatively recently from the FDA, and I think, Sue, you were at this meeting, is that Ashley was saying, well, you know, we're actually concerned with paper. That is, if you're using paper in the same study as an ePRO mode, or, in the context of the PRO Consortium instruments being developed with electronic modes as the source modes, where we have some sponsors that want a paper version available. An interesting thing about that is, when you optimize an instrument to be collected electronically as its source mode, you now have a very poor paper measure when you start to disentangle that, because you put one item per screen and it looks really clunky on paper. In the context of the IBS working group, there's an instrument being developed there that you couldn't administer on paper because the items are so concatenated, so it's something you could only do on ePRO. And so Ashley was suggesting, we're more worried about people wanting to go backward than we are about electronic modes. And they would surely be much more comfortable seeing mixed modes within a study, mixed electronic modes. So paper is still kind of this touchpoint, but as we develop more instruments on electronic platforms initially, and as we continue to grow the body of evidence with paper, I think this issue does get put to bed.

PAUL O’DONOHOE

I think actually Paul summed it up brilliantly by saying this is a problem of its time. I think as technology permeates society more and more, as technology gets better, and as questionnaires are developed more and more on electronic platforms, this issue is just going to go away.

PAUL SWINBURN

Thanks. Yeah, I think it is. I think it's just a temporal thing. But it's something that is important and we need to do it, because it is a huge barrier to the conduct of clinical trials. And so we need something that's workable. We don't need to come up with the be-all and end-all solution to establish equivalence with absolutely everything. We just need something that's going to get us from here to there, really.

PAUL O’DONOHOE

And I guess my argument is that we’ve got to that point with the majority of provisioned devices.

[04:12]

AUDIENCE MEMBER:  I have a comment and a question. So you know, we've had lots of discussion, Jason, through C-Path and everything, and we've been asked for many years to put a paper out on equivalence for the SF-36, and that's why we funded it and are doing it. But I have to agree with you guys that it almost seems to be a moot point with the data that's out there already. The fact still remains, though, that when people license the tool they still want a citable paper that shows either it is or it isn't equivalent. So I guess there's a difference between whether, logically, we think we've done all the research we need to show that it is, and the fact that we still have to do it anyway. And I think it's a really great point that you're making about newer things being designed for electronic, with going back to paper being a much more difficult situation. And I think also, you have to think about some of the late-phase studies too, where people are trying to get into the field really quickly, and sometimes they may say, oh, something isn't programmed yet, we're going to start on paper and switch over. So you do have a lot more of this combined data post-approval that people have to deal with. My question: I want to hear more about your thoughts about paper being less good, so to speak. I find it fascinating, the idea that we should just sort of do away with it altogether. You've mentioned it a little bit, so I wanted to hear your thoughts.

PAUL O’DONOHOE

I think all I can do is partly concur with what Jason said, that it blows my mind that we are trying to demonstrate equivalence to a less good form of data capture. Why do we want to be equivalent with that? We want to not be equivalent to that. I think the limitations and issues with paper are well recognized, the benefits that electronic brings are well recognized, and to that extent it does somewhat undermine that whole idea of, well, we need to be exactly equivalent to the paper version. I feel that's setting back the potential that's inherent within the electronic systems to a degree.

JASON LUNDY

So I was just remembering a poster that I saw that was put out by the EuroQol group, and this was some years ago, but they were showing the variation in some of the things that they get back on what they call the VAS—which wasn't really a VAS, but anyway—the EQ VAS on the paper version. And I specifically remember there was one where you draw this line from the box, right, and the person drew the line from the box and went around the scale and back down and around the box and then over to the scale. And they showed all these different ways that people sort of write in, and one person wrote, well, on Tuesday this is where I'm at and on Thursday—and so those are the things that we are able to avoid in the electronic version. And when you get stuff like that, what do you do with it? It just becomes a challenge that's avoidable. So I feel like that's one of my arguments for why I dislike paper, just in general. I don't think paper is going to go away, though. Actually, I think people are still going to use it, and I think it might even make some sense. If you've got a small study and you need to get it out really quick, it might be your best option. And I don't think there's anything wrong with that. And I think that sponsors—some sponsors, not all sponsors—haven't jumped on the electronic bandwagon. They're not there yet. And so they're still going to look for paper forms. It's unfortunate, but I think that's the situation.

MODERATOR

Yes, that is the unfortunate case, I think, especially in cases where we know that we're not necessarily going to go for a PRO claim, right, so where we don't absolutely need the evidence and a supervised setting for collecting the information. So I wonder if it was a PRO scientist that actually filled out that EQ-5D.

PAUL SWINBURN

I actually did some cognitive debriefing on that very instrument adaptation, the feeling thermometer, as I believe Paul Kind calls it. And yeah, it's rubbish. It's really, really bad. Really bad. All of the participants, when you sit down and say, you know, what are you supposed to do here, they're like, what is this? I've never done this before, nobody does this, who made this up? It's an insane exercise; just write the number in instead. Ask me out of a hundred. But it's crazy. But yeah, the automatic assumption that paper is in some way a kind of gold standard is madness. You know, we can do better. But there's that practical limitation of what we want to do right now in order to satisfy people, so that we can get to that point where we can develop purely with respect to electronic platforms, which is obviously what we want.

[10:00]

PAUL O’DONOHOE

I do love the delicious irony of struggling to demonstrate equivalence from electronic back to paper, though.

AUDIENCE MEMBER:  Thank you. I'm dying to say something. Thanks, guys. I just want to add to the commentary. When this sort of discussion took place at Novartis, where I was for years and years and years, one of the things I needed to point out was that the variation associated with the six-minute walk test, which nobody questions, or the blood pressure measurements, or visual acuity tests; I mean, these are huge compared to the topic we are talking about. And given that most of the time we are dealing with randomized controlled studies, small variations get washed out anyway. And the point that I think Paul made was that the error variation in paper compared to between-device variation, even taking into consideration individual variation, is tiny. So proportionally, even if you have no data, it's reasonable to assume that it doesn't matter. And I think that goes back to the comment Ashley made, that they're more worried about people going back to paper than about going forward on the electronic platforms. So when you take the spectrum of measurements that we have in a typical clinical trial, you're talking about a very, very insignificant part of the problem.

JASON LUNDY

I just want to add to that, because I think that's an excellent point. Something has driven me crazy ever since we started having the conversation within the task force for the first report: we're holding ePRO, in my opinion, to a higher standard than we're holding translations, for instance, where I'm actually much more concerned that we're adding in potential variation and things that potentially bias the scores, because oftentimes we might actually have to change the concept, because the words that we used in English aren't available, they don't translate, they're not equivalent, that concept just doesn't exist. So we have to come up with something new. So now I'm actually thinking, well, do we even have conceptual equivalence here, let alone score equivalence? And this isn't something that gets brought up, yet we're concerned about putting this thing on a handheld device. It's kind of a paradox I've never really been able to get my head around, and I don't understand where it came from, but in some respects, again, being complicit in this task force, I feel like we've backed ourselves into a corner and we're fighting to get out, and hopefully the times change. But I have heard some recent stories about people finding cross-cultural DIF and saying, well, now what do we do? And I think there may be an example that's going to be presented at an upcoming PRO Consortium workshop. But I think that's a fascinating issue that's going to be coming to light maybe sooner than later.

MODERATOR

So just a quick digression. For the non-psychometricians in the room, what is DIF?

JASON LUNDY

Well, so it's differential item functioning, and in layman's terms it means that the items aren't performing the same way in certain subgroups. So if you were to administer the instrument to males and females who both have the same severity level of the condition, you would expect their responses to be in the same range, but you potentially see shifts in the scores where they don't use the scale in the same way. That's maybe a very naive explanation of it.
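
As a rough, hypothetical illustration of the kind of check Jason describes (a minimal sketch only: the simulated data, column names, and the logistic-regression approach are assumptions for demonstration, not any group's actual analysis), one common way to flag uniform DIF is to regress an item response on the total or severity score plus group membership; a significant group effect suggests the item behaves differently across subgroups at the same severity level.

```python
# Minimal, illustrative sketch of a logistic-regression DIF check.
# All data are simulated; column names and effect sizes are invented
# purely to demonstrate the idea, not to reproduce any real analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Simulated severity ("total score") and a grouping variable (e.g., sex).
total = rng.normal(50, 10, n)
group = rng.integers(0, 2, n)  # 0 = reference group, 1 = focal group

# One dichotomous item: endorsement depends on severity, plus an
# artificial group shift so the DIF term has something to detect.
logit_p = 0.1 * (total - 50) + 0.8 * group
item = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

df = pd.DataFrame({"item": item, "total": total, "group": group})

# Uniform DIF check: does group membership still predict the item
# response after conditioning on overall severity?
model = smf.logit("item ~ total + group", data=df).fit(disp=False)
print(model.summary())
# A significant 'group' coefficient flags potential DIF for this item;
# adding a total:group interaction would probe non-uniform DIF.
```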

MODERATOR

A very good explanation, thank you. So I had a question. If we're suggesting that paper-to-ePRO equivalence doesn't matter anymore, what would we suggest as guidelines around ePRO-to-ePRO equivalence for sponsors that want to show that, to bring the evidence to, say, the regulators?

PAUL O’DONOHOE

I guess I'd go back to the point I made that paper to electronic is a bigger change than one electronic system to another similar electronic system. And we have a lot of evidence showing paper to electronic is good, so why are we now panicking about electronic to electronic? In regards to BYOD, there's definitely a bit more work to be done there, but I don't think it's something we need to worry about any more than we worry about paper to electronic.

 

[15:08]

MODERATOR

So Paul would you suggest to a sponsor that for a Phase III study where there is a PRO, that they could use BYOD?

PAUL O’DONOHOE

I would suggest they could consider it, but I don't feel the regulatory landscape is ready for that. But as was discussed in the BYOD workshop yesterday, I think there's again a certain amount of burden on the industry to be pushing that a bit more and having that discussion with the regulators, bringing this out and saying, you know, we've done these equivalence studies for decades at this stage, I guess; we are starting to gather evidence that there's no difference between devices, so what can we do to really make this work?

MODERATOR

Great. Are there any questions?

AUDIENCE MEMBER:  This is Jason. I've got a concern that we're conflating why we run some kind of equivalence study with the benefit of ePRO versus paper. I think that if we're developing a new measure and we're validating it, who cares about paper, you know? If you're going ePRO, validate it in ePRO; there's no reason to have it on paper. And everything else I'm about to say is going to make me sound totally pro-paper, so I thought I'd start with that, because it's not the case. But I think that when we're talking about why we would worry about going back to paper, we're not worried about saying, is this as good as paper, because we know that there are great benefits. What we're worried about saying is, according to Vandenberg and Lance, who wrote the treatise on this, can we interpret the scores similarly? And largely I'm thinking about multi-item domains, so your comment about diaries I completely agree with. But if we have these multi-item domains, we want to know that our conceptual framework, and therefore our interpretation of scores, can remain the same. Why? Because we want to be able to refer back to previous literature about how to interpret the scores, what the important differences are going to be, and that type of information. If instead we're saying we don't care about paper because we've revalidated it on ePRO and have the full psychometric understanding, then yeah, I agree. But I think the important difference is not that we're trying to say that ePRO is just as good or better because of response rates. The issue of the EQ-5D item goes back to the whole thing of, have you designed your items well, have you built your test well? And if we've done that, then the benefit of equivalence is being able to draw from the SF-36's 23,000+ articles in the past. And that's pretty impressive, to be able to say, yeah, I've got 23,000 articles backing me up here. Or at least I've got five articles backing me up on previous work that was probably done on paper. And so I think that knowing why we're doing the equating is the big thing. If we know the why of it, then it really helps us understand that it may not be dead in all contexts. The context of being able to draw from that understanding, I think, still has huge merit.

JASON LUNDY

Yeah, I think that in general I agree with all of that. One of the challenges we have is that sometimes the literature isn't as rich as it is for, say, the SF-36. And so we don't know what type of psychometric properties the instrument has because they're not published, we don't know what the responder definitions look like, so we find ourselves having to back into that. You know, you can make the argument, well, if you're going to do it anyway, then why are you—as you said—why are you even worried about paper? So I totally agree. And in situations where that evidence exists, that's great; that's a good approach. I think we just struggle with the other end of that spectrum, which is when we don't have that evidence, and that's a tough spot for the sponsors to be in, because the FDA is going to be asking you to produce that in your dossier. And so you have no choice. And if you are starting your trial out with paper because you needed to get it running and then switching over, that's terrible. I understand why, but I would never recommend doing that. So yeah, Jason, I think I concur.

[20:28]

AUDIENCE MEMBER

Hi. Sue Ellen Kline, Nucleus X Market Access. Thank you all for your thoughts. This is a really brilliant seminar. I sort of mirror Jason's comments. I struggle with the fact that the instruments are validated psychometrically via paper, and then we move to electronic platforms; I understand the equivalence work and all the differences, obviously, but I still struggle with that as a scientist. And to Jason's point, we want to look back in the literature and use the data that were based on paper. The second item is, regarding a trial on pain, for example, we absolutely had to get the paper version validated onto the electronic platform because it's a primary endpoint for our pivotals. So not going back to the paper and saying, even though it's not a good gold standard, we are equivalent, was not an option. And then a third question I had for the panel, again to Jason's point: if we are developing new survey instruments, are you recommending that we don't even start with paper? Because we're dealing with reviewing literature right now, item generation, focus groups, cognitive debriefing. We should be doing all of that on electronic, it sounds like, not on paper. Is anybody doing these on paper anymore, and developing that way?

SPEAKER:

Okay, so there is a practical issue, right? When you do your concept elicitation and go through your item generation, you know, you write your stuff down on paper, or you're scribbling it into your iPad or whatever. So you end up with this sort of de facto paper form that you migrate, I guess. We say that, but we're not really migrating, because when we then take those items and response sets and put them on an electronic mode, we're optimizing it for that electronic mode; we don't care about how it looks and feels on paper anymore. And we're going to go forward and do all of the cognitive interviews on the electronic mode. So my answer to that question would be a resounding yes. But the caveat is that I doubt that opinion is shared by a majority of people. I think there are instrument developers that would gladly develop a paper form for you and then say, go talk to Paul about getting that put on a handheld device.

Why not do it in one step and think downstream, since the product will launch in the future. So in the future, to your point, it will be electronic. So we should be pushing this particular small pharma company and saying, do this now. Okay.

PAUL O’DONOHOE

I think you should be coming with a very very good argument of why you’re not doing it electronically. And I can’t really think of one. Just one thing I wanted to dig into a bit more was this idea of being able to compare back to the previously paper-based data. Are you suggesting we do that because you actually think there is a difference, or because you feel we have to?

AUDIENCE MEMBER:  I think it depends on the benefit that you're trying to pull from it. If you're basically saying, look, there are some psychometrics that I'm not going to analyze with my electronic version because I know it's already validated on paper, then you're setting yourself up for doing a proper equivalence study. If instead you're saying, I don't care about the paper psychometrics, I'm going to create a whole new set of interpretations and basically an entirely new way of understanding this PRO, which may or may not comport with the previous literature, then you don't need to refer back to it. It depends on essentially what your stance is on how you should interpret the scores and the full spectrum of psychometrics. Whenever you say those psychometrics have already been looked at, well, then de facto you're saying, I am referring back to the paper literature.

[24:40]

AUDIENCE MEMBER:  I'm going to take it in a little bit different direction. I was fascinated by your comment about the linguistic validation. And it made me think of the adoption of new technologies in general, right? So 20, 30 years ago everyone assumed that if anyone went out and translated a form, it was going to be equivalent, and then suddenly somebody said, no, no, no, we don't know that for sure; how do we test that? And there was World Health Organization work that was done to come up with a process, there was the ICOA study 15 to 20 years ago, and there was the ISPOR group, and they came out with recommendations and entire processes. For a while everyone had to prove everything, and now the work is still done, but it's not new and novel published stuff; it's pretty well taken for granted, right? And so, you know, it's almost like that whole thing is happening over again with mode, right? Everyone at one point in time assumed there was no difference, and now all this work is being done. So I'm going to take it back, when you said there is some DIF on some items in different countries, to something that you said about modes, which is, shouldn't we expect there to be some? For example, we know with the SF-36 in Japan, on the mental health items, there are cultural differences in how people feel about responding to certain questions. You would not expect there to be equivalence there. And I think that's a lesson that all of the mode work can take, that you're right, there shouldn't necessarily always be equivalence. I don't think it's as much of a difference for mode as it is for cross-cultural, though. But anyway, to me it makes me hopeful for the future that some of this debate might lessen a bit over time, as it has with linguistic validation. And then last, I wanted to echo one of the speakers' comments about whether it is necessary to test five different types of screens and all that sort of stuff. The way we ended up going with ours is the least common denominator model that we had discussed, which is, if it works on the smaller screen on the handheld back to paper, we would assume that the rest would work. So I think that's a reasonable approach, and so far no one has come back and questioned it.

JASON LUNDY

Michelle, do you have an IVR version?

AUDIENCE MEMBER:  We don’t have a separate version. People take it and program it in, but we do—I mean, we don’t have like a—here’s a version. There’s an interviewer script version, so people would, I would guess, use that in an IVR because it explains a little more.

JASON LUNDY

Yeah. And that's maybe a subset of this equivalence argument, right, and people have always said, well, that's a bigger change than just moving it to a handheld, because you're moving it from a visual process to an aural process. I've been a part of a few IVR studies, and we see extremely high correlations; if you develop a good script and the system is set up to repeat the response to the subject, I don't see an issue. So I hope that debate actually goes away as well. I don't see a lot of people using IVR; I actually hear about it more in GI than any other space.

AUDIENCE MEMBER:  To me the comparison there is between an interviewer over the phone and a computer over the phone, as compared to a person sitting with a paper form. I think we're doing the wrong comparison with IVR if you go all the way back to paper.

PAUL O’DONOHOE

So for the updated meta-analysis we ran, we also pulled the IVR studies. And the correlations were consistently high, but they actually did seem to be lower compared to screens in general, which was interesting. I don't know if that's particularly meaningful.
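
For readers curious how correlations like these are typically combined, here is a minimal sketch assuming the common Fisher-z, inverse-variance approach to pooling; the example correlations and sample sizes are invented and are not the panel's data or the method used in the meta-analysis mentioned above.

```python
# Minimal sketch of pooling mode-equivalence correlations with Fisher's z.
# The method (fixed-effect Fisher-z pooling) and the (r, n) pairs below are
# assumptions for illustration only, not results from any real study.
import numpy as np

# (correlation, sample size) pairs from hypothetical IVR-vs-paper studies.
studies = [(0.86, 60), (0.90, 120), (0.82, 45), (0.88, 75)]

z = np.array([np.arctanh(r) for r, _ in studies])      # Fisher z-transform
w = np.array([n - 3 for _, n in studies], dtype=float)  # inverse-variance weights

z_pooled = np.sum(w * z) / np.sum(w)
se = 1 / np.sqrt(np.sum(w))
r_pooled = np.tanh(z_pooled)
ci_low, ci_high = np.tanh([z_pooled - 1.96 * se, z_pooled + 1.96 * se])

print(f"Pooled r = {r_pooled:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```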

MODERATOR

Okay, any other question? I do have a question from the audience. Even if equivalence is proved, is there an assumption of a learning curve? For example, does the patient behave differently the first time using a new mode? If so, should patients be restricted to the same mode throughout a study to control intra-patient variability, or does it matter?

PAUL O’DONOHOE

I think there's got to be some assumption of learning. With the work we did with the app, a number of participants struggled to an extent with the first device they were using, the first time they did it, and then flew through the other two devices. And we received a number of comments saying, this was tricky but I picked it up very quickly. So I think there is an element of learning; how that impacts equivalence is another question.

PAUL SWINBURN

But I think you need to look at it more generally as well. If you look at, say, the efficiency of site staff in terms of their administration of pen-and-paper questionnaires, they're not very standardized without a good degree of training, and they get better at it and develop methods for doing it. So there are all these different sources of variation. There's always a learning process; it's just that with ePRO there seems to be a tendency to think that we can experimentally investigate this, that we can quantify it, because we've got access to the numbers. Elsewhere it's not so easy. And so, because it's there, we'll look at it. And that's not a terrible thing to do, but you need to view it in the context of all the other potential sources of variation, I think.

[30:25]

JASON LUNDY

I don't think it matters. I think you have this learning, this kind of run-in period, regardless of what mode they're completing. It's easily handled with training. Some people do a little three-day burn-in if they're administering a daily diary, so the subject gets used to setting the alarms and completing the diary and knows what to expect when they get the alert, and things like that. So I don't see that this matters at all. No one answered the question about whether we should restrict the subject to completing in the same mode. I think that's ideal, but I also don't know if it's necessary, as long as it doesn't include paper. If it's all electronic modes, I'm fine with it; you know, use your iPad tonight and your iPhone tomorrow. I think the issue is when we are mixing paper with the electronic modes, and that's going to raise a flag for the regulators.

PAUL O’DONOHOE

Actually, I have nothing, I just agree with that.

MODERATOR

Okay, one last question. So isn't the most important thing at the end of the day that the outcomes don't differ from one mode to the next? And if that's the case, if we as a research community are looking to update those guidelines from 2008 on equivalence from paper to ePRO, what would you suggest be in those guidelines?

JASON LUNDY

I mean, in a clinical trial the fundamental issue is that we want to make sure that what we're seeing as a responder on one mode is what we see as a responder on the other mode. I think that's the fundamental issue. And Lori points this out in her editorial as well: if you're using this as an endpoint in your clinical trial, that's the heart and soul of what we should be looking at. If you find that that's not the case, then we need to start adjusting scores and defining the responders based on those adjusted scores, and it gets complicated, but we can do it. But that is the gap, you know; we have these wide-ranging recommendations in the task force report and there's no mention of that. And that's really what we care about, and it's an issue.
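
As a small, hypothetical sketch of the responder-concordance idea Jason raises (the simulated change scores, the 10-point responder threshold, and the agreement statistics are invented for illustration and are not drawn from any guideline or editorial), one could classify each subject as a responder under each mode and check how often the two classifications agree:

```python
# Hypothetical sketch: do two modes classify the same subjects as responders?
# The simulated change scores and the 10-point responder threshold are
# invented for illustration; they are not from the task force report.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(1)
n = 200

# Simulated within-subject change scores under each mode, built to be
# highly but not perfectly correlated.
true_change = rng.normal(8, 6, n)
change_paper = true_change + rng.normal(0, 2, n)
change_epro = true_change + rng.normal(0, 2, n)

THRESHOLD = 10  # hypothetical responder definition (points of improvement)
resp_paper = change_paper >= THRESHOLD
resp_epro = change_epro >= THRESHOLD

agreement = (resp_paper == resp_epro).mean()
kappa = cohen_kappa_score(resp_paper, resp_epro)
print(f"Raw agreement on responder status: {agreement:.2%}")
print(f"Cohen's kappa: {kappa:.2f}")
# Poor agreement would mean the modes identify different responders,
# which is exactly the scenario where score adjustment gets complicated.
```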

MODERATOR

I think we are finishing up, unless there are any last questions. Everybody's eager to get to the break, I think. Great. Thank you so much.

[END AT 33:20]
