SPEAKER 0 Who's going to talk about AI and radiology? I don't have Justin's CV in front of me, so I'm going to have to kind of wing it. So Justin, I have had the pleasure of knowing since he was going through medical school, and he worked with me and Tom on a web-based order entry platform. He decided not to go into cardiology and instead went into radiology, went to the University of Utah, where he worked in both clinical radiology as well as informatics, and has been back on faculty. Are you an associate professor now, Justin? Yup. Congratulations. Associate professor in radiology, and he has been very interested in artificial intelligence and its application to imaging. So with that, Justin, I'll turn the computer and the computer screen over to you. OK. SPEAKER 1 All right, I'm going to share here. Should be good to go. All right. Yeah. Thanks for the introduction again. Yeah, I've been on staff for about five years, and my topic is AI in radiology. So this is a pretty broad talk, kind of covering everything imaging AI. This is the outline of where I'm going to go with everything. We'll probably get done a little bit early as well and have some time for questions. Basically, I'll start with some definitions: AI versus machine learning versus deep learning. Then I'll briefly describe some of the newer deep learning techniques that are available. Then I'll take a step back and talk about more traditional machine learning techniques, kind of where we've been, and some of the more traditional radiology image analysis software that was built on those techniques. Then we'll come back to the new deep learning techniques and talk a little bit about the architecture of how a deep neural network is constructed and what its capabilities are. Then we'll talk about some of the newer radiology AI software. And finally, we'll discuss the nuts and bolts of how to actually do a radiology AI project, and then I'll speculate a bit on the future of AI as it pertains to radiology. So to start with, you've got AI, machine learning, and deep learning. And I was going to add a caveat: this is a pretty broad audience, and this talk is still a work in progress. For the computer scientists out there, if you're hearing things that are off or are just falsehoods, please let me know. I think I have an understanding of this stuff, but I'm sure I'm misquoting a thing or two, so please correct the things that are off. So anyway, we'll start with AI versus machine learning versus deep learning. These terms kind of get lumped together almost interchangeably, but they actually have more specific meanings and aren't exactly the same thing; they're subsets of each other. Artificial intelligence is really any algorithm designed to mimic human behavior, which is essentially all software. Machine learning is a subset of AI where there is a component of learning on preexisting data, and then deep learning is a subset of machine learning where you are using deep neural networks. So, around 2016, there was this explosion of interest in radiology AI, and I think it's useful just to start by pointing out what the different tasks are that we do with radiology AI. There's really just a handful of them. It's not some huge array of different things you can do; there are some fundamental building blocks that most of the software is built on.
So the first and most basic task is called classification, where you're basically giving a binary category to an image. This is an example of trying to classify pediatric chest x-rays: it's either normal or there's pneumonia present. It's not outlining where the pneumonia is; it's just saying, I've seen a lot of these, I think this is bacterial pneumonia. A lot of the first AI products to market were classifiers, because I think they're relatively easy to implement. Then the next level of AI imaging task is called segmentation, and that's where you group every voxel in a radiology image into a category. Here you're basically outlining the left lateral ventricle. This is also done with convolutional neural networks, as is classification. Segmentation is more powerful; it lets you do more things. Once you segment the ventricle you can determine, say, the volume, you can calculate the average value within the ventricle, you can compare the size over time. And it's also the first step for radiomics, which we'll talk about in just a second. Image generation is another task. That's weird, the slide just isn't working. Well, image generation is another task; this slide is supposed to show a network generating images, but it's not playing. Anyway, generative adversarial networks are a different type of neural network that can create fake images. The way it works is that you have a generator creating random images and a discriminator deciding if it can tell the difference between the generated image and a real image. It's sort of an artificial Turing test, so to speak, and then the two networks battle it out until the discriminator can no longer tell the difference. It's used in the real world in a lot of different settings. I just pulled this from Wikipedia: you can use it to simulate fashion models (this isn't a real person), upscale old video games to 4K, age someone's photo, or reconstruct a person's face after listening to their voice. Some more specific radiology examples: there's a group at Stanford that's working on ultra-low-dose contrast imaging. We inject intravenous contrast with MRIs to see what structures light up, and it would be nice if we could give less contrast than we do. So they have experimented with giving a really low dose of contrast, scanning the patient, giving a higher dose of contrast, scanning them again, and then using a generative adversarial network to map the low dose to the high dose, with some promising early results. Different companies are using various denoising algorithms to try to improve their reconstruction, like GE's TrueFidelity for their CT scanners, to simulate a more traditional method of image reconstruction. And then you can imagine you can get creative: if you had a big database of images and corresponding MRIs, you could try to make a virtual MRI from a head CT. But I think it's important to emphasize that these GANs are not magic; there has to be some existing data in the voxels to derive an image from. So interesting stuff, but a lot of limitations. Radiomics, like I alluded to earlier, is kind of a different subset of imaging AI. Basically, once you segment something and identify where it is, then you use radiomics to try to dig deeper and see things about the image that maybe a human couldn't see. You can look at things like edge sharpness, texture, intensity, and shape, and people have used it to determine things like the genotype of a tumor or the grade of a tumor.
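As a rough illustration of the first-order radiomics idea (intensity and shape features computed inside a segmentation), here is a minimal NumPy sketch; the arrays, names, and values are made up, and real projects typically use a dedicated library such as pyradiomics.

```python
import numpy as np

def first_order_features(volume, mask, voxel_volume_mm3):
    """Toy first-order 'radiomics' features from a segmented region.

    volume: 3D NumPy array of image intensities (e.g., CT in HU)
    mask:   3D boolean NumPy array, True inside the segmented structure
    voxel_volume_mm3: physical volume of one voxel
    """
    voxels = volume[mask]  # intensities inside the segmentation only
    return {
        "volume_mm3": mask.sum() * voxel_volume_mm3,  # simple shape feature
        "mean_intensity": voxels.mean(),              # intensity
        "std_intensity": voxels.std(),                # crude texture / heterogeneity
        "p10": np.percentile(voxels, 10),
        "p90": np.percentile(voxels, 90),
    }

# Hypothetical usage with random data standing in for a CT and a mask:
ct = np.random.normal(0, 30, size=(64, 64, 64))
roi = np.zeros_like(ct, dtype=bool)
roi[20:40, 20:40, 20:40] = True
print(first_order_features(ct, roi, voxel_volume_mm3=0.5 ** 3))
```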
And finally, natural language processing is the other major task that also uses neural networks. It's kind of beyond the scope of this talk because we're talking more about imaging AI, although it benefits all aspects of medicine. And honestly, it's about the only AI that I actually use every day right now, when I use voice dictation for my reports. All right. So that's the toolset we're working with. Now I'm going to take a step back and look at where things have come from: traditional machine learning techniques. Broadly speaking, with any type of machine learning you have supervised and unsupervised algorithms. Supervised implies that you are labeling the data ahead of time into different classes and then training an algorithm to differentiate between the labeled classes. Unsupervised is unlabeled data: you just throw a bunch of data at it and say, try to group it into two or three or four groups. So, some examples of each of these. The simplest supervised machine learning technique is linear regression. It gives you a continuous output; you try to fit the relationship of the inputs to the outputs, say average temperature to average housing price. Contrast that with logistic regression, which is a classification algorithm. It's not a continuous output; it's more of a binary one-or-zero output, so you can map, say, hair length and height to a discrete output like gender. A decision tree is another supervised traditional machine learning technique that maybe is a little more intuitive than a lot of techniques: you have different decision nodes and you work your way through based on them. And then unsupervised, again, you're not pre-labeling the data; you're just throwing data at it and seeing what it can do with it. Unsupervised machine learning would be great, because labeling the data is a big pain, but it tends to be less powerful. This is an example of a technique called K-means clustering that can work in certain situations, especially when there's high image contrast. So you can use K-means clustering to separate the lung from the rest of the CT. You basically just say, here's a chest CT, try splitting these voxels up into two different classes, and since the lungs are so much different from everything else, it's actually able to do that pretty well. So I'm just going to discuss a couple of examples of software built on these traditional machine learning techniques. The first one I'm going to discuss isn't necessarily machine learning; it's more that the software is critically important for AI research in general. 3D Slicer is free software. You can just download it right now, it runs on any operating system, and it can do pretty much anything you want when it comes to image analysis. It's what I use to annotate all our images for research projects, and it's very powerful; I'm very grateful for 3D Slicer. FreeSurfer is some interesting software that was developed at Harvard and Oxford in the early 90s, and it has the ability to automatically segment every structure on a 3D T1 brain MRI sequence, which is pretty amazing. You can see here it can segment out the cortex, the white matter; you can see the putamen, the caudate, the thalamus. Pretty impressive using traditional machine learning techniques. I tried to figure out how it works and what machine learning techniques it uses, but I couldn't really figure it out. It's pretty complex.
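As a toy illustration of the K-means lung separation mentioned above, here is a minimal scikit-learn sketch on fake chest-CT-like data; a real pipeline would obviously need real images and more pre- and post-processing.

```python
import numpy as np
from sklearn.cluster import KMeans

# Fake chest-CT-like slice in Hounsfield units: mostly soft tissue (~40 HU)
# with two air-filled "lungs" (~-800 HU). Purely illustrative data.
ct = np.full((256, 256), 40.0) + np.random.normal(0, 20, (256, 256))
ct[60:200, 30:110] = -800 + np.random.normal(0, 50, (140, 80))   # "right lung"
ct[60:200, 146:226] = -800 + np.random.normal(0, 50, (140, 80))  # "left lung"

# Cluster every voxel into 2 classes on intensity alone (unsupervised).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    ct.reshape(-1, 1)
).reshape(ct.shape)

# Because air and soft tissue are so different, the low-intensity cluster
# corresponds to lung; pick it by comparing the cluster means.
lung_cluster = int(ct[labels == 0].mean() > ct[labels == 1].mean())
lung_mask = labels == lung_cluster
print("Lung fraction of slice:", lung_mask.mean())
```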
But either way, segmenting the brain with FreeSurfer can take hours, usually up to about eight hours. So it's very slow but very powerful. One of the more popular, or higher-profile, I guess you would say, uses of traditional machine learning techniques that actually made it to market was breast computer-aided detection, which kind of makes sense, right? It seems like a simple task to identify a spiculated nodule on a mammogram; the breast is pretty simple, just a lump of fat with glandular tissue in it. But the products leaned in the direction of higher sensitivity and lower specificity, and so they kind of fell victim to a boy-who-cried-wolf phenomenon, pointing out all these different things, none of which were actually important, and it quickly fell out of favor and went by the wayside. All right, so that's more traditional machine learning. It's been nice in the research space, but not a lot of commercial applications really took hold and succeeded. So now we'll move on to deep learning, and that's what's generating all the excitement. Basically, deep learning sprang from a biologic understanding of how the nervous system is actually organized. In the late 1800s, we started getting an appreciation for how neurons were anatomically organized: you have a neuron cell body with an axon that reaches out and synapses with multiple dendrites, and each neuron can have numerous connections with other neurons. Computational scientists started developing models of those neural networks in the 1940s, and the concept of deep neural networks came in the 1960s and showed a lot of promise. But Minsky, in 1969, published a paper saying, yes, there's a lot of promise, but computers aren't powerful enough to actually harness deep neural networks yet. And so research kind of stopped at that point, until the early 2010s; that's when interest in deep neural networks really started picking up. And some of the computer science people can correct me, but my understanding is the key innovation was the implementation of neural networks on a GPU, which contains thousands of cores, versus a CPU, which contains tens of cores, and that actually made it possible to put these huge neural networks into memory. And the key paper, at least from an imaging perspective, that really started the deep learning revolution, or at least helped start it, was in 2012. There's this ImageNet classification challenge that's run every year, where you try to identify what different images are, to say that's a bus or that's an airplane. Typically, every year you'd see about a one percent improvement using traditional machine learning techniques. And then in 2012, Geoffrey Hinton and his team published a paper where they improved by 10 percent on the image classification challenge. So an order-of-magnitude improvement, and it really got everybody's attention. So, broadly speaking, what are the pros of deep learning as opposed to more traditional machine learning techniques? I would say in general, deep networks tend to be more accurate, and they tend to be faster. Like I said, it takes eight hours to segment a brain with FreeSurfer; if you train a neural network to do the same thing, it will take minutes. And they're also, I think, just intuitively easier to implement: you have a bunch of training data, you train a model, and it goes forth and does what you trained it to do.
The big con is that it's a black box. You train this really complicated deep neural network, and if it doesn't work, it's not like you can go in and figure out exactly why it isn't working. You kind of have to intuit what's going on and then tweak it. So how are these things actually constructed? I'll go into the architecture a little bit. A deep neural network just implies that there is more than one hidden layer; that's what deep learning means: a neural network with more than one hidden layer. Every neural network has three types of layers. You have an input layer, say the pixels of an image; you have a bunch of hidden layers; and then you have an output layer, and that's your output, so pneumonia or no pneumonia. And importantly, every neuron in one layer is connected to every neuron in the next layer, so you can see that with even just a few neurons you start to get numerous connections. And then once you have the structure of the neural network, you actually have to train it on an existing labeled data set. The training process is basically assigning weights to all of these neuronal connections. And then once you have a trained neural network, you can predict: you just feed inputs through it, it works its way down like Plinko through all of these neurons, and then you get an output. From a high level, this is how it works. If you're trying to classify pneumonia or not pneumonia, you're feeding a bunch of pixels into the input layer of a neural network, going through a bunch of hidden layers, and then you get an output of pneumonia or not. That's image classification. And this is actually way oversimplified. This is an example of an image that's 28 by 28, so that's 784 pixels, and here's an oversimplified version of that neural network. You can see they're showing pixel 1 through 20 and then jumping ahead to 784, and you can appreciate how these neural networks just get huge really fast. Keep in mind that a typical radiology image is 512 by 512 as well, so these get really big, really fast. Now, I've said the words convolutional neural network multiple times in this talk, and I want to clarify what that is. It's basically the deep neural network that's most commonly used for image analysis, both for classification and segmentation. A convolution is an image filter, like smoothing or blurring or sharpening, and typically a convolutional neural network uses convolutions in combination with what are called max pooling steps to form the architecture of the network. So this is an example of a U-Net; it's called a U-Net because of the shape of the architecture. You can see the blue arrows here are convolutions, so you apply different filters to the image, and then you have a max pooling step where you decrease the spatial resolution of the image, say from 512 by 512 to 256 by 256. Then more convolutions, then another max pooling step, and you work your way along, and then you reassemble back up to the original spatial resolution. The thinking, from a high level, is that the early, high-spatial-resolution layers are finding low-level features like points and lines and edges, and the later layers are finding high-level, low-resolution features like, say, a face.
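As a minimal sketch of the convolution-plus-max-pooling building block just described (the downsampling half of a U-Net-style network), here is a toy PyTorch module; the layer sizes and the random input are arbitrary and purely illustrative.

```python
import torch
import torch.nn as nn

class DownBlock(nn.Module):
    """One 'level' of a U-Net-style encoder: convolutions, then max pooling."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),  # convolution = learned image filter
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(2)  # halves spatial resolution, e.g. 512x512 -> 256x256

    def forward(self, x):
        features = self.convs(x)               # high-resolution features (edges, lines) early on
        return self.pool(features), features   # pooled output goes deeper; features feed the skip connection

# Hypothetical single-channel 512 by 512 image batch
x = torch.randn(1, 1, 512, 512)
down = DownBlock(1, 16)
pooled, skip = down(x)
print(pooled.shape, skip.shape)  # (1, 16, 256, 256) and (1, 16, 512, 512)
```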
So this is a U-Net, called that because it's kind of shaped like a U, and that's what's most commonly used for segmentation; it's what I use for segmentation. So that's the how of how a deep neural network is constructed. What are some examples of the new software out there using these deep learning techniques? One of the first to market was brain volume software. I'm just showing an example from NeuroQuant. I probably should have said at the beginning that I have no disclosures; I don't have any interest in these companies. This is just one of the two market leaders right now. They make a product called NeuroQuant that reports age-normalized brain volumes, so you can say, oh, this 65-year-old male's left parietal lobe is greater than two standard deviations below normal for a 65-year-old male. It's heavily derived from FreeSurfer. Like I was saying earlier, if you train a neural network on FreeSurfer segmentations, it's much quicker, and I think that's essentially what they did; then they created a normative database and issue reports from that. They also make a product called LesionQuant, where they segment out the hyperintense lesions on MS (multiple sclerosis) studies and try to find new lesions over time. Rapid is another company that's been very successful; they're called RapidAI. They basically automate CT perfusion processing. I think the interesting thing is there's not a whole lot of AI involved; we've been processing CT perfusion studies for 25 years. They use a neural network to automatically identify an artery and a vein as inputs for processing, and then they use traditional techniques to process out the perfusion study. I think the real secret to their success was that they were at Stanford, and they partnered with all these clinical trials to develop increased time windows for stroke treatment. That really opened up a lot more business and kind of changed the stroke treatment game. So all of these big papers came out saying, we got these results and we used this software to get them, and so all the neurologists are going to their hospitals and requesting the Rapid AI software. I think they've probably been the biggest success story of any company, and the key was validating it clinically in huge trials and then driving the demand for the software. This is just an example from another company, Aidoc. So the brain volume products and Rapid are kind of well-established neuro applications, and then you have your pathology detection companies, and everybody is trying to come out with software that can triage and detect acute findings. Intracranial hemorrhage is the first task that was tackled by academics and industry, using largely classification algorithms. They're also doing large vessel occlusion detection, so strokes, and cervical spine fracture detection. In body imaging, they're working on pulmonary embolism detection. And in nuclear medicine, say PET CT, they're working on basically segmenting out hypermetabolic lesions on PET, which, you can see, have a lot of image contrast, so not a terribly difficult task. So that's some of the new software that's out there. A lot of people talk about beyond-image-interpretation or so-called upstream AI, which is basically using software to improve radiology workflow, and that's great.
I would love to see that, and I'm honestly more interested in a lot of that than I am in the image interpretation, necessarily, up front. All right. So those are some examples of newer software. Now I'm going to go into a little bit of how you do an AI project, if you're interested in doing one. What are the nuts and bolts of actually making that happen? First, as with any research project, you've got to frame your question: identify a goal or a hypothesis and the structure you're interested in analyzing with AI, or the pathology that you're interested in. This is an example here of segmenting out the right ventricle and the right atrium. Basically, to train a neural network, you're going to need about 200 or 300 studies, and you want to make sure, if you can, that it's a good mix of demographics and image quality, et cetera. And then this can be the pain point: somebody has to manually segment the structure or pathology you're interested in on every study. You can see here, this is the segmentation process in 3D Slicer. It's pretty time-consuming and laborious; typically, any structure is going to take five or ten minutes per study. But I would argue that it's not bad if you find four to seven people who are interested, medical students, graduate students, residents; if each of them spends 16 hours, you can get to 200 to 300 pretty quickly. And then once you have a labeled dataset, you train a neural network, like a U-Net, on those manually created segmentations to create a model, and then you can use that model to find those structures automatically on new images. The whole reason I got interested in imaging AI was because I was coming at it from a 3D printing background and had been doing a lot of manual segmentation myself. And in 2016, when I heard there were these things that could automatically segment structures, which I had been doing manually for quite a while, it was pretty exciting. So as I've alluded to, my primary interest is in using U-Nets to segment different structures. The first project we took on was segmenting out the ventricles on a head CT. You can see here, this is a segmented ventricle. So just like I was alluding to, to do the project we had about 250 head CTs, and we all segmented out the ventricles on each head CT and then used that to train a U-Net and then make predictions. This is just an example of a segmented ventricle. Now, when you're training these models, you need to evaluate how well they're performing, and so as the model is training, it will spit out a statistic to say how well it's doing on the training data. The two major statistics that people talk about are Jaccard indices and Dice coefficients. Without getting into the details too much, they're both essentially a measure of overlap. In general, a Dice score of one would be absolutely perfect overlap with your training data; a score of zero would mean that you're not finding anything. In reality, what we found with that first project, the ventricle segmentation, was a Dice score of 0.9, which I was actually pretty surprised by. On subsequent projects, I've been happy with a Dice score of anywhere from 0.7 to 0.8, and those are for projects where precise delineation of the structure isn't as important; we're just trying to identify landmarks. But for the ventricle, where we really wanted to determine the exact volume, it was nice to get such a high score.
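For reference, here is a minimal NumPy sketch of the two overlap measures mentioned above, computed between a manual mask and a predicted mask on made-up data.

```python
import numpy as np

def dice(pred, truth):
    """Dice coefficient: 2*|overlap| / (|pred| + |truth|); 1 = perfect overlap, 0 = none."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    overlap = np.logical_and(pred, truth).sum()
    return 2.0 * overlap / (pred.sum() + truth.sum())

def jaccard(pred, truth):
    """Jaccard index: |overlap| / |union|."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    overlap = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return overlap / union

# Hypothetical 3D masks: the prediction is shifted slightly off the ground truth
truth = np.zeros((64, 64, 64), dtype=bool)
truth[20:40, 20:40, 20:40] = True
pred = np.zeros_like(truth)
pred[22:42, 20:40, 20:40] = True
print(f"Dice {dice(pred, truth):.2f}, Jaccard {jaccard(pred, truth):.2f}")
```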
This is just an example that I think raises a couple of interesting points. The green here is the manual segmentation of the ventricle, and the blue is the neural network prediction of that same ventricle. And I thought it was interesting; there was actually kind of a student-becomes-the-master phenomenon that happened with this, where sometimes the neural network segmentation was actually better than the ground truth label, the manual segmentation. I think what happens is, to segment out the ventricles, you can see that the head CT is kind of noisy, and so as you segment it out, you get a lot of little spikes along the edges. So we applied a smoothing step to smooth it out and make it what I felt would be more anatomic, but in the smoothing, it would slightly reduce the size of the ventricle. So I think with the manual segmentations, we just couldn't make them as accurate as they needed to be. But when you trained the neural network, it kind of figured out what you were trying to do, which was to precisely delineate the CSF-attenuation structure just left of the midline. So it's interesting when the neural network does better than your manual labels. So what are some challenges with getting imaging AI research going? I'd say to begin with, you've got to get a computer set up that works, and I found that to be very challenging. You have to get the graphics drivers all working just perfectly; you have to have the right version of one graphics driver, and that depends on the version of another driver. And if you can find a how-to online that walks you through exactly how to do it, that's generally how I get it done. Access to data can be a challenge as well. Like I said, you're using 200 or 300 head CTs; well, if you're a startup company, you've got to somehow get access to those studies. There are publicly available data sets out there, and I think that's how some companies got started, but eventually you probably have to partner with a university or academic hospital to get more data. And then, as I've alluded to several times, labeling and segmentation is a pain point as well. It's very time-consuming, but my counterpoint is usually that it's not that bad. In general, another thing you try to do is augment your data. So you've got the manual segmentations, but you can do things to squeeze more out of your existing labels: you can rotate the images, flip them, warp them, sharpen them, et cetera. So there are ways to get more out of the existing data. And then the available code out there to manipulate radiology images is a little limited. There are libraries; there's pydicom and things like that that can extract the image data from DICOM datasets, and that's great. But there are a lot of really basic tasks that we want to do all the time where you've got to spend a lot of time mastering how to do them, and then once you do, you can really build on that. One of the things I really struggled with was rotating images while maintaining physical location in space. Every radiology image has a physical location; as you scroll through it, you'll see reference lines on different planes saying, you're here. Well, when you rotate an image, you don't want that position to get lost. I had to figure out how to do this, and there was not an example anywhere on the internet of how to do it. So if somebody had just shown me, it would have saved me a lot of time.
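As a small illustration of the kind of basic task described above, here is a minimal pydicom sketch that pulls the pixel data plus the DICOM geometry tags that anchor a slice in patient space, which is the information that has to stay consistent when you rotate or resample; the file name is hypothetical and the tags shown are the standard CT attributes.

```python
import numpy as np
import pydicom

# Hypothetical path to one CT slice
ds = pydicom.dcmread("slice_0001.dcm")

pixels = ds.pixel_array.astype(np.float32)                           # raw stored values
hu = pixels * float(ds.RescaleSlope) + float(ds.RescaleIntercept)    # convert to Hounsfield units

# Geometry tags that define where the slice sits in patient space:
origin = np.array(ds.ImagePositionPatient, dtype=float)              # (x, y, z) of the first voxel, in mm
row_dir, col_dir = np.array(ds.ImageOrientationPatient, dtype=float).reshape(2, 3)
spacing = np.array(ds.PixelSpacing, dtype=float)                     # (row, column) spacing in mm

# Physical location of pixel (r, c); any rotation or resampling has to update
# these quantities too, or cross-reference lines between planes will be wrong.
r, c = 100, 200
location = origin + r * spacing[0] * col_dir + c * spacing[1] * row_dir
print(hu.shape, location)
```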
Another challenge, or benefit, is that it's a pretty rapidly changing landscape. When we built our first U-Net, it was kind of a manual creation, and it would take 20 to 30 hours to train a model for ventricle segmentation. There's a newer Python library called MONAI; it's open source, it's based on PyTorch, and it uses cached datasets, and we saw an order-of-magnitude improvement in performance, both in training the model, which now typically takes about one to two hours instead of 20, and in the predictions: predictions used to take about 70 seconds, and now they take less than 10 seconds. Another challenge is the black-box nature of neural networks. I've alluded to that a few times: if it doesn't work, you've got to try to figure out why it's not working. And I think that is actually one of the roles for radiologists. We're used to looking at the images; I think we can serve a role in intuiting why neural networks aren't working sometimes. I was trying to train a neural network to recognize the cochlea in the inner ear, and I was applying smoothing and noise augmentations to my data. Once I applied the smoothing augmentation and trained the model, it started finding the cochlea all over the place, in the skull, et cetera. Then I realized that in the smoothing step, I was basically obliterating any recognizable cochlea and ruining the model. So you've got to think about the imaging appearance and use that to train your model. One of the classic examples, and I think it's kind of amusing when it gets rolled out, is a group that was trying to train a classification algorithm for pneumonia, and it was performing perfectly, like 100 percent specificity and sensitivity. You can generate heat maps from classification algorithms so that they'll tell you what they're looking at to make the determination, and when they did that, they found they had gotten all their normal chest x-rays from one hospital and all the pneumonia examples from a different hospital, and the hospitals used different markers on the images. The network was actually identifying the markers, not the pneumonia itself. That's a popular example; I think it always gets some laughs. In reality, I think they would have figured that out pretty fast once they started trying to use it. So in terms of research, what are some future directions? Well, the reason I've been emphasizing what we have (classification, segmentation, GANs, radiomics, natural language processing) is that I kind of see it like when the iPhone first came out: you had a screen and a camera and a microphone and speakers, and people are constantly finding new ways to combine them. One of the messages I'm trying to spread to clinicians is that the classic radiology research project is: we measured a thing three hundred times in normal people and in diseased people, and we found a difference. A lot of times they're pretty difficult measurements to perform, so they don't actually end up getting used clinically, because you have to do reformats to get the perfect measurement. I've been telling them, if you're going to do that kind of project, come to me; we'll segment it instead. And if you do actually find something useful, then we can automatically measure those things in the future using a trained neural network.
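For concreteness, here is a minimal sketch of the kind of MONAI pipeline alluded to above: a cached dataset feeding a 3D U-Net with a Dice loss. The file names, transform chain, and hyperparameters are placeholders and would vary by project.

```python
import torch
from monai.data import CacheDataset, DataLoader
from monai.losses import DiceLoss
from monai.networks.nets import UNet
from monai.transforms import (Compose, EnsureChannelFirstd, LoadImaged,
                              RandFlipd, ScaleIntensityd)

# Hypothetical list of image/label NIfTI pairs (the manual segmentations)
files = [{"image": f"ct_{i}.nii.gz", "label": f"ventricles_{i}.nii.gz"} for i in range(250)]

transforms = Compose([
    LoadImaged(keys=["image", "label"]),
    EnsureChannelFirstd(keys=["image", "label"]),
    ScaleIntensityd(keys=["image"]),
    RandFlipd(keys=["image", "label"], prob=0.5, spatial_axis=0),  # simple augmentation
])

# CacheDataset keeps the preprocessed volumes in memory, a big part of the speedup
ds = CacheDataset(data=files, transform=transforms, cache_rate=1.0)
loader = DataLoader(ds, batch_size=2, shuffle=True)

model = UNet(spatial_dims=3, in_channels=1, out_channels=1,
             channels=(16, 32, 64, 128), strides=(2, 2, 2))
loss_fn = DiceLoss(sigmoid=True)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

for batch in loader:  # one pass shown; real training runs many epochs
    opt.zero_grad()
    loss = loss_fn(model(batch["image"]), batch["label"])
    loss.backward()
    opt.step()
```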
So we try to harness the traditional effort that's been put into research projects and use that to generate labeled datasets. We're working on a head CT anonymization algorithm currently that we actually have working. There's nothing out there that directly anonymizes head CTs; people kind of say, well, you can try the existing tools built for MRI and see if they work, but we're trying to do it directly for head CT, for sharing for education and research purposes. And then I think there's a lot of potential in the future for age-normalized volumes. NeuroQuant is a product built on age-normalized brain volumes, and I think there are a lot of other structures that it would be useful to have age-normalized data for, like the pituitary, for example. So I'm just going to wrap up with some speculations about the future of radiology in general. Back in 2016, Andrew Ng and Geoffrey Hinton both basically said the sky is falling, stop training radiologists, game over in five years: it's completely obvious that within five years deep learning is going to do better than radiologists. They've since walked back those statements, which is appropriate, because five years have passed and I still have a job. But I would just say that this is a really difficult challenge. These new tools are very powerful for a few very specific things, but there are so many basic things that have still yet to be overcome. For example, there's a diffusion sequence that gets acquired on every brain MRI, and across our system there's no standard for how it's named: I've seen it named DWI, or iso, or diffusion weighted b1000, and other variations. That is such an easily fixable problem. You could train a classifier to map everything to a standardized naming system and have a middleware step rename it; easily solvable. Years and years have gone by, and no system has rolled that out. So things just happen really, really slowly in health care. The other point I would make is that these are very laser-focused tools. We train a neural network to segment out the ventricle, and that works great. But there's a huge difference between segmenting the ventricle, or identifying intracranial hemorrhage, and saying: I have looked at this head CT and I know that I have found everything that you could possibly hope to comment on about that head CT. And just given the tool sets that are out there, I fail to see how you're going to get independent reads of radiology studies any time soon, until we get, I guess, general AI. But I think it plays into the Gartner hype cycle of everything being harder than you think. You have a new thing, everybody gets excited about it and then realizes it's harder than you thought it would be, and then you slowly start to learn how to use it and be productive. The Gartner web page listed deep learning near the top of the hype cycle in 2019, and you can see it coming down in 2020. So I think people are learning its limitations and learning how we can actually implement it. In 2020, two of our national organizations came out with a statement of recommendations to the FDA, basically saying: if a company comes to you and tries to get FDA approval for autonomous imaging AI (which is to say, we will interpret that chest x-ray, no radiologist necessary, don't even review it), there's no way they're actually doing that. I'm sure there's some self-interest in that, but I'm positive that it's also true.
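As a rough illustration of the series-naming idea mentioned a moment ago, here is a minimal scikit-learn sketch that maps free-text series descriptions to a standardized name; the example descriptions, labels, and model choice are all hypothetical placeholders, not an existing product.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up examples of the many names the same sequences show up under
descriptions = ["DWI", "dwi ax", "diffusion weighted b1000", "AX DWI b=1000",
                "ADC", "apparent diffusion coefficient", "adc map",
                "T1 MPRAGE", "sag t1 3d", "t1 pre"]
standard_name = ["DWI", "DWI", "DWI", "DWI",
                 "ADC", "ADC", "ADC",
                 "T1", "T1", "T1"]

# Character n-grams handle abbreviations and odd spacing reasonably well
clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                    LogisticRegression(max_iter=1000))
clf.fit(descriptions, standard_name)

# A middleware step could rename incoming series to the standard name
print(clf.predict(["ax diffusion b1000 iso", "ADC (10^-6 mm2/s)"]))
```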
So, yeah, I think the reality is there are a lot of new and powerful, very targeted tools that are going to help us do our job more efficiently and maybe do some more exotic measurements than we've been able to in the past. But in the end, I think our jobs will be pretty similar. So that's everything. Like I said, I finished a little bit early, but if you guys have any questions... SPEAKER 0 Thank you very much. You know, I always think about cardiograms, which are far simpler, and we've had automated ECG reading for 30 years. At one point, around 2008 or 2009, CMS said, oh, it's so good, we're just going to eliminate paying doctors to over-read them. And then, of course, they backed away from that, and we still use the human-plus-computer combination. One of the things I really like about this talk is it goes back to why, in the Center for Intelligent Health Care, we said we really need to have human and computer cognition, because I think the future is in augmenting the clinician, not in replacing them, and in having the computer do the heavy lifting where it can. So you talked a little bit about workflow, but how can the center help you with those sorts of workflow issues? What sort of things do you see that you need to facilitate that? SPEAKER 1 I think, you know, like the head CT anonymization project: I worked with David Ellis, and he used the DGX for some computationally intensive tasks, and so I think the supercomputer is going to be particularly useful for imaging AI research. And I'm not kidding myself. I can implement a U-Net, but I'm sure there are constantly going to be better and newer techniques, and I think that collaboration with undergraduate institutions and computer science departments is going to be important to stay relevant. SPEAKER 0 True. Other questions for Justin? SPEAKER 2 This is Jimmy Chang. That was a great talk with lots of content there; I really appreciate the careful review. You know, in the Center for Intelligent Health Care, two of the other cores are a core for good data and a core for good design. And getting back to John's query about workflow, I do want to push on that just a little bit. One of the things that's been notable about the radiology world is what I would call a lack of ability to move forward with structured reporting. That is, how do we convert the radiology world into one where we start off with good data? Where we start off with interpretations that are categorical, that are classified in ways that can ultimately, hopefully, drive towards a way to marry or match the interpretation from an AI model or algorithm to what, if you will, ground truth is. Can you comment on both structured reporting and what you think it's going to take to advance the field of radiology towards that approach? SPEAKER 1 You mean structured reporting of our text reports? SPEAKER 2 Yeah, that's a big part of it. What is it about workflow, what is it about structured reporting, what's needed? And how would that then play into the work that, for example, you're doing with AI? SPEAKER 1 I think that as you're getting more quantitative data from automatically segmenting structures and just getting automated results, you can populate those into reports in a predictable way. I think structured reporting is tough.
You know, everybody agrees it's a good thing to do, and getting common categories for your report is one thing, but actually getting people to report in a rigorously structured way, and obviously you guys know this better than I do, is just really difficult. So I think AI is going to help in the sense that you're automatically getting a lot of these measurements and you can auto-populate reports with them, so maybe the collection of data, if you can take the human element out of it, will be more homogeneous. SPEAKER 0 I saw that Dr. Chambers was on earlier; I don't know if he's still on, but I know Ward has a lot of interest in imaging informatics. Well, question? Hello. Yeah, go ahead. Yeah, it's Steve Robert. Great talk. Can you comment: how much does uncertainty in the ground truth complicate the development of a model? I mean, ventricles are easy to create a model for, because it's easy to tell the ventricle from not-the-ventricle. Other things may actually require some judgment, and there may be some difference of opinion. For example, what's pneumonia and what isn't pneumonia may not be as easy. How much, if you have multiple experts defining ground truth, can that uncertainty complicate creating a model, and how is it best handled? SPEAKER 1 Yeah. No, I think that's a great point. I mean, we're not always right. Even segmenting the third ventricle was difficult; finding the anterior margin of the third ventricle requires guesswork, and I think our model was kind of sloppy at predicting it. So how is it best handled? I'm sure it's going to be an ongoing challenge. I think finding ground truth, first of all, is important. If you're going to have a pneumonia classifier, you should probably confirm clinically that the patient actually had pneumonia, and not just that it looks like pneumonia on the chest x-ray. So clinical correlation with imaging findings, especially when you're trying to identify pathology, is probably important; like, if you're doing radiomics to try to determine the grade of a brain tumor, you're digging into the clinical data as well. So I don't have a perfect answer, but I guess I'd say that careful, very careful curation of the data, and realizing what your limitations are, is a step in the right direction. SPEAKER 0 Yeah, not necessarily knowing what the right answer is, is sometimes good judgment. And it begs sort of the next question, which is: suppose you see something and you know it's not normal, and what you think, as the expert, is: I think it's probably pneumonia, but if it isn't pneumonia, I think it's an infarct, but I don't think it's a cancer, and there are reasons for all of that. Is it possible to create a model that in fact comes up with a hierarchical differential diagnosis with some kind of weighting? SPEAKER 1 I'd say, in the sense that anything's possible, yes, but I can't speak to how that would be implemented. You know, most of the tasks I see are sort of giving you a probability of it being something. So, I don't know, if you train a classifier to identify bacterial pneumonia versus viral pneumonia versus normal, maybe it gives you outputs of 0.7 for bacterial pneumonia and 0.5 for viral pneumonia and 0.3 for normal, and you say, oh, it's probably bacterial pneumonia. But to go from that to...
I guess it'd be carefully training the classifier, but even then, differentials are very extensive. And part of the challenge, too, is, you know, you hear the quote about hearing hoofbeats and thinking horses, not zebras. Well, really, we kind of live in a zoo, and we see zebras all the time. And when you're trying to train neural networks on uncommon pathology, it's tough to get enough training examples to do that. So I think that's a challenge too. SPEAKER 0 I haven't tested this, but I'm reading a book about the fallacies of AI, and the author is building the case that human cognition really is about intuition and computers can't do that. AI is really good at ingenuity, and he's not saying that humans don't have both intuition and ingenuity, but what he's saying is AI can't do intuition. I think what he is really saying by that is it can give you hard numbers, but it can't do what you just said, which is think through these things. And that's where I think the combination of human and computer cognition comes in. I see this in the automated EKG reads, and Justin actually kind of pointed to it: in some cases, it catches things that I would have missed. Maybe I was a little tired, I was distracted, and it holds you to looking at things a little more thoroughly. So I think there's been a lot of failure, starting with MYCIN, in trying to replace the clinicians. But I think there's a whole lot of room for understanding expertise and saying, as you're working, let's say you're working on a report and you say, to your point, I think this is pneumonia, I don't think this is cancer, it might be heart failure. I could foresee an AI system at that time giving you percentages or references to say, well, there's a 2 percent to 25 percent chance that this could be cancer based on the image. So I think we're really still at the front end of things. SPEAKER 1 Yeah, I agree. It is really just a laser-guided tool at this point, and getting into the more subtle nuances of even a specific imaging appearance is a ways off from that. SPEAKER 0 I don't know if Dr. Scott is available for comment, but as an AI scientist, do you see other ways we ought to be looking at imaging from an algorithmic perspective? If any of the AI scientists want to comment on approaches you think are innovative ways we ought to be looking at solutions. SPEAKER 2 So, this is Steven Scott here. The talk basically covered it. I mean, there are tons of variations on the architectures, but the convolutional architectures, especially really deep, modularized ones, are, from what I've seen, pretty much the norm. Garrett might want to add to this; he did a lot of digging into x-ray image analysis. One thing I was going to mention is that if you're scarce on labeled data, then an unsupervised or semi-supervised approach is pretty common, where you'll do the U architecture, where you're scaling down and then scaling back up. You train that as an autoencoder, where you feed the input in and you try to replicate it at the output, and then you throw out the decoder, and then you use your labeled data to train the actual classifier, for instance.
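Sketched below, purely as an illustration of the pretraining recipe just described (not code from the speakers), is a minimal PyTorch version: train an encoder-decoder to reconstruct unlabeled images, then discard the decoder and bolt a small classification head onto the trained encoder. All layer sizes, shapes, and the random stand-in data are hypothetical.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(                       # "scaling down" half
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(                       # "scaling back up" half, only used for pretraining
    nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 2, stride=2),
)

# Step 1: autoencoder pretraining on unlabeled images (reconstruct the input)
unlabeled = torch.randn(8, 1, 128, 128)        # stand-in for many unlabeled studies
recon_loss = nn.MSELoss()(decoder(encoder(unlabeled)), unlabeled)
recon_loss.backward()                          # in practice, loop over the whole unlabeled set

# Step 2: throw out the decoder, bolt a classifier head onto the trained encoder
classifier = nn.Sequential(encoder, nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2))
labeled = torch.randn(4, 1, 128, 128)
labels = torch.tensor([0, 1, 0, 1])            # e.g., normal vs pneumonia
cls_loss = nn.CrossEntropyLoss()(classifier(labeled), labels)
cls_loss.backward()                            # fine-tune with the (smaller) labeled set
```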
And the advantage is that you can use millions or billions of unlabeled images to train the network to identify some of the important features for reconstructing an image, and then those same features will probably be pretty important in classifying. Then you throw out the right half, you bolt on a classifier, a couple of extra layers for classification, and use your labeled data to train that. That's a not uncommon technique. But as far as really more advanced kinds of approaches, I've seen some things that are still kind of fringy approaches that might end up proving useful, but I haven't really seen them have any more success in the mainstream than the classic convolutional architectures, and they usually come with their own difficulties: they might take longer to train, or maybe they're a little more unstable numerically, and they might not even converge to a good solution at all. So I would tend to think that an off-the-shelf convolutional network can give you plenty to work with, especially if you vary how you piece it together, because you can. I tell my students it's like Legos: you can plug them together however you want. Thank you, very helpful. Garrett, I don't know if you had anything to add to that, since you're participating. Yeah. One thing that comes to mind: I think one of the biggest unexplored frontiers for image classification models, and this kind of goes to what Dr. Redwood was talking about, is leveraging hierarchical information, because most radiology applications of AI are just doing computer vision on individual images in a bubble, and that's not how medicine is practiced. Images are ordered for a reason. So training conditional models based on, say, the ordering criteria or the suspicions of ordering physicians is a huge source of information that's currently left completely untapped. There's a lot of difficulty in releasing large, anonymized datasets that contain that information, but that is a direction I think should be explored moving forward. SPEAKER 0 Thank you. And that's one of the reasons we've put the supercomputer inside the firewall and are working with the IRB, so that we can get clinicians to both find ground truth and review charts in a non-anonymized fashion to get at the answers, to the original point. If you have a sputum culture that grows staph, and blood cultures that are positive for it, and a white count that's elevated, and you have an infiltrate on a chest x-ray, I suspect you're at a very high level of confidence that it's pneumonia. SPEAKER 1 One thing we've played around with, too, is a couple of different techniques that have proven useful. One of the challenges is we can't predict at full resolution; 512 by 512 just makes the graphics card die, so you've got to predict at 128 by 128. One of the things that David Ellis suggested, which I thought was pretty clever, was an iterative prediction technique, where you predict at 128, at low resolution, find the structure, and then crop the image and zoom in and predict again at full resolution. So you sort of fake having a ton of memory in your graphics card and predict at full resolution.
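Below is a minimal, purely illustrative sketch of the coarse-to-fine prediction trick just described (not David Ellis's actual implementation): segment a downsampled volume to localize the structure, map the bounding box back to full resolution, and re-run the model on just that crop. It assumes `model` is a trained, fully convolutional segmentation network that accepts variable input sizes; all names and parameters are hypothetical.

```python
import torch
import torch.nn.functional as F

def two_pass_segment(model, volume, low_res=128, margin=8):
    """Coarse-to-fine segmentation of a (1, 1, D, H, W) volume."""
    # Pass 1: downsample so the whole volume fits on the GPU, get a rough mask
    small = F.interpolate(volume, size=(low_res,) * 3,
                          mode="trilinear", align_corners=False)
    rough = torch.sigmoid(model(small)) > 0.5

    # Map the rough bounding box back to full-resolution coordinates, with a margin
    idx = rough[0, 0].nonzero()                       # voxel indices of the rough mask
    scale = torch.tensor(volume.shape[2:], dtype=torch.float) / low_res
    lo = ((idx.min(0).values.float() * scale).long() - margin).clamp(min=0).tolist()
    hi = ((idx.max(0).values.float() * scale).long() + margin).tolist()

    # Pass 2: predict again on the full-resolution crop only
    crop = volume[..., lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    return torch.sigmoid(model(crop)) > 0.5, (lo, hi)
```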
And the other thing he did was trying to identify the face in head CTs. I didn't want to label a face 300 times, that would have taken forever, and he had the idea of, well, this isn't an exact thing, so how about we label the face on one, and then we warp 200 head CTs to match that one, and then warp the label back onto each of them. So we had something like 600 labeled head CTs and trained the neural network, and it worked just fine, because we're not trying to precisely define anything; it's just sort of general. So non-linear registrations are pretty amazing if you're not trying to exactly delineate something. SPEAKER 0 And what do you think, is there a need for a full 512 by 512 system? Do you think you're missing things? SPEAKER 1 I think the issue is, and the computer scientists could probably speak to this better, but it's exponential. You're at 128 by 128 and you're hitting the 12 gig limit on your graphics card, but then you go to 256 by 256 and you're going up to, I don't know, a ton of memory, and then 512, it's not even in the realm of possibility. So you kind of have to get creative in ways like that. People have talked about 2D-3D neural networks too, where you don't predict the whole volume; you just predict a slab of 10 slices at a time. So there are different ways around it. But to just do a full head CT at full resolution, it's going to be a while. Thank you. SPEAKER 2 I mean, the graphics cards get bigger and better slowly, and we'll eventually be able to tackle higher-resolution images. With a lot of natural image classification, usually, if it's a higher-resolution image, you just aggressively downsample it, either before you feed it into the network or in the first layers of the network, because for the features you need, like to figure out whether or not there's a dog in an image, you don't need a full HD kind of image; you can downsample it to, you know, 32 by 32 and have plenty of information in there to classify it. You probably can't say the same thing for radiology, so that could be problematic. But yeah, there's really not a good way around it. You have to either have bigger hardware, or use techniques that downsample the image, or do some kind of hierarchical approach; that might work. If you're doing segmentation, you can fake it till you make it there: you can take, say, nine overlapping 128 by 128 sub-images from your 512, and for segmentation, assuming the feature you're segmenting is going to fall within a 128 by 128 box, that'll probably get you pretty good results. SPEAKER 1 So you just dice it up into multiple different subsets. SPEAKER 2 Yeah, and then you have your full-resolution 128 by 128 slices. SPEAKER 1 Yeah, and that's kind of what I was doing with the cropping: you find it at low resolution and then zoom in. But yeah, you can just crop up the initial one beforehand, too. SPEAKER 2 I had some students in class do that. They were working with extremely high resolution pathology scans, scans of slides, and there's just no way; I can't even remember what the numbers were, but they were ridiculous. So they split it down to, I think, maybe 32-pixel squares or something like that, and then analyzed each one of those, and then took some kind of a weighted vote across the sub-images.
And the idea was that they were looking for one specific pathology that might appear within one of these sub-images. So basically, if they found it in any one of them with a lot of confidence, they would say yes on the whole slide. Some of it depends on what it is you're trying to do: if what you're looking for might span multiple sub-images, then you need some more clever way to combine them rather than just a vote. A lot of the time I think it might be application specific. Yeah. SPEAKER 1 David, I didn't see that you were on here; I should let you talk about that. SPEAKER 2 I've actually tried sort of interleaving the image, like having multiple resolutions, so the downsampling is sort of interleaved. I've tried that, and it didn't actually give me any better results than just downsampling. But it's an interesting idea. Well, yeah, once again, I think a lot of it's application specific. If the features that would successfully identify or successfully classify an image are still clearly visible when you downsample, then downsampling might be just the easiest way to go about it. But if you need something where you're looking for really fine features that just blur out the moment you downsample, then you're probably not going to have much success with downsampling by itself. SPEAKER 1 I was actually kind of surprised it could find the face as quickly and as well as it did, but I think it was using a lot of context, because it's always in the same location on every image. Really fine pathology, something that's in different locations, would be tougher. SPEAKER 2 Well, I was amused by your example with the chest x-rays and identifying the marker on the shoulder. Garrett saw some things like that; there are some classic chest x-ray data sets where sometimes the systems that do well are not learning to, say, diagnose pneumonia or some other condition, they're just classifying the pose of the x-ray, like whether it was lateral versus frontal or something like that. And so there's a selection bias in the data, because the physician's not going to order a particular position without suspecting that that condition is actually going to be present. So you can just detect, say, a lateral x-ray and then just predict yes all the time, and you're probably going to be right part of the time. SPEAKER 0 Well, we've reached 5:59. I really think this has been an excellent discussion. Justin, thank you, because we certainly had a lot of conversation along the way. Next month's speaker will be talking about his work with classification of hypertension and work towards a classification of hypertension. We will send out a reminder a couple of days or a week beforehand to everyone. I wish the best for the holidays for everyone, and thank you. UNKNOWN Thank you. Thank you.