Protein Absorption in Cells

1 Introduction

What follows is a proof of concept for using persistent homology to establish the effectiveness of medication in treating sick cells. Cells are sick in the sense that they cannot absorb a certain protein. The medication is suppose to allow the protein to be absorbed. The protein has been dyed so as to show up in images. The cells are grouped and each group is given a medication, next the protein is introduced to the cell. Effective medication have the protein distributed throughout the cell. Ineffective medication have the protein concentrated along the cell boundary. The problem is to find a way to compute the effectiveness of the various medications.

2 Imports

Here we import the needed libraries. We are suppressing the messages from loading the libraries to keep the resulting html document cleaner. We first check if the package `install.load` is installed, if not we install and load it, if it is installed we simply load it. Next we use this library to install (if needed) and load the renaming libraries.

suppressMessages({
  if(!require("install.load")) {
  install.packages("install.load")
  }
})
suppressMessages(install_load("tiff"))      # Using: readTIFF
suppressMessages(install_load("rtiff"))     # Using: autoThreshold
suppressMessages(install_load("RSAGA"))     # Using: grid.to.xyz
suppressMessages(install_load("TDA"))       # Using: ripsDiag, 
suppressMessages(install_load("plotly"))    # Using: plot_ly
suppressMessages(install_load("plot3D"))    # Using: image2D

3 Helper Function

This function applies auto-thresholding to its input, which is an image. If a pixel value is less than the threshold, it replaced by 0. The algorithm for determining the threshold value is in the package `rtiff`.

autoTH <- function(x) 
{
  th <- autoThreshold(x, est = .002)
  x[x < th[5]] <- 0
  return(x)
}

4 Image Processing Function

This function takes a file path, an optional trim amount, an optional sample size, and an optional z-distance (the distance between slices in the cell). It then reads in the stacked image, and trims out the noisy slices (if a trim amount is supplied). It then applies a thresholding algorithm to each slice individually (see the helper function above). Next the image slices are turned into 3D-points, with a 4th dimension containing the original pixel intensity. A sample is then taken from this point cloud, and the persistence diagram is computed from this sample.

The function returns a list containing the original file path of the image, the raw slices, the trimmed raw slices, the thresholded slices, the point cloud, the sample point cloud, a 3D plot of the sample point cloud, and the persistence diagram.

processImageStack <- function(fp, trmAmnt = NULL, smpSize = NULL, zDist = NULL)
{
  ## Read in the stacked image, then trim out the noise (if told to).
  rawSlices <- readTIFF(fp, indexed = TRUE, native = FALSE, all = TRUE)
  if(is.null(trmAmnt)) trmAmnt <- 1:length(rawSlices)
  rawSlicesTrimmed <- rawSlices[trmAmnt]
  
  ## Autothreshold each image slice seperatly
  thSlices <- lapply(rawSlicesTrimmed, autoTH)
  
  ## Convert all image slices (matricies) to point sets, rename z -> intensity
  pntSlices <- lapply( thSlices
                     , function(x) grid.to.xyz(data=x,varname = "intensity"))
  
  ## Add in z values
  if(is.null(zDist)) zDist <- 3.87595 
  for(i in 1:length(pntSlices))
  {
    pntSlices[[i]]$z = rep((i-1)*zDist)
  }
  
  ## Concat the lists of points, then switch order so z is before intensity
  pointCloud <- do.call("rbind", pntSlices)
  pointCloud <- pointCloud[, c(1,2,4,3)]
  
  
  ## Get a point sample
  if(is.null(smpSize)) smpSize <- 500
  sPointCloud <- pointCloud[ sample( nrow(pointCloud)
                                   , smpSize
                                   , prob = pointCloud$intensity), ]
  
  ## Generate the plot
  x <- sPointCloud$x
  y <- sPointCloud$y
  z <- sPointCloud$z
  cloudPlot <- {plot_ly(
    x=x, y=y, z=z, 
    type="scatter3d", 
    mode="markers", 
    text=paste("Intensity: ", sPointCloud$intensity),
    marker=list(opacity=.6, color=sPointCloud$intensity)
  ) %>%
  layout(title = fp)
  }
  
  ## Compute the persistence diagram, only for the spacial coordinates
  sRipsDiag <- {
    ripsDiag(
      sPointCloud[ ,1:3],
      maxdimension = 2,
      maxscale = 40, 
      library = "GUDHI"
    )
  }
  
  ## Generate our return list
  retList <- {list(
    "filePath"         = fp,
    "rawSlices"        = rawSlices, 
    "rawSlicesTrimmed" = rawSlicesTrimmed,
    "thSlices"         = thSlices,
    "pointCloud"       = pointCloud,
    "sPointCloud"      = sPointCloud,
    "cloudPlot"        = cloudPlot,
    "ripsDiag"         = sRipsDiag
  )}
  return(retList)
}

5 Test Image

Now lets run this on a test image.

5.1 Read and Process the Image

testFP  <- "test.tif"
testT   <- 14:36
testR   <- processImageStack(fp = testFP, trmAmnt = testT, smpSize = 600)

5.2 Results

5.2.1 Raw Image Slices

This is a heat map of each slice of the cell in its original state, no filtering, thresholding, or trimming has been applied at this point.

image2D(testR$rawSlices,asp = 1, axes=FALSE, colkey = FALSE, xlab="", ylab="")
4e37ec94b1be55a695349f73b8f01f88841854fd.png
0fb00e44b437ed37b8d474744e0aa4898c57fdb8.png
9de30cd19ffc2055a145f1a78d597b1f82c44d79.png
5fcf38770633aaf2f12a24225b3fc5ade2cd292e.png
82fe62a88a25059648f541fa6c79a371ab008b69.png

5.2.2 Raw Trimmed Image Slices

Since the first few and last few slices appear to be nothing but noise, they have been trimmed out, and the results are shown below. The results from computing the homology are much better when this is done. At this point the trimming has been done manually based on each image, but it would be nice to have an automated way to do it.

image2D(testR$rawSlicesTrimmed,asp = 1, axes=FALSE, colkey = FALSE, xlab="", ylab="")
3b73763340e672c5332d2e3637b59182b63bad8a.png
dd530524401550024d7b00227d4fbfa52f908173.png
40c58d231513855eeb14cc1d98327793182fbb14.png

5.2.3 Thresholded Image Slices

Next we show the results after an automatic thresholding function has been applied to each slice individually. This function comes from an R-package.

image2D(testR$thSlices,asp = 1, axes=FALSE, colkey = FALSE, xlab="", ylab="")
313d2b0b229d460552aaeb46bc04de9d9ad13390.png
466e15f46d42003a6f6c549834c847ca40aac60d.png
80da1612f3a1e057ad402f79dbbb87f841d043aa.png

5.2.4 3D Point Cloud

Below is an interact 3D point cloud of a random sample taken from the full point cloud generated from the stacked image. The sample is random, but points with higher intensity have a higher probability of being chosen. Hence if there is a high intensity region in the cell, there is a high probability that the sample will include points from this area.

testR$cloudPlot

test-3d-point-cloud.html

5.2.5 The Persistence Diagram

Here is the persistence diagram for the test image. The black dots are the connected components, the red are the 1-holes, and the blue are the 2-holes. For a cell with high intensity around the boundary, one would expect a blue point far away from the diagonal. For a cell with intensity dispersed throughout, one would expect any blue points to be near the diagonal.

par(mfrow=c(1,1))
plot(testR$ripsDiag[["diagram"]])
test-persistence-diagram.png

6 High Intensity on the Cell Boundary

Here will will look at a cell with high intensity around the cell boundary. For this example, one would expect the persistence diagram to have one blue point far away from the diagonal.

6.1 Read and Process the Image

strongFP <- "Boundary-Strong.tif"
strongT  <- 15:38
strongR  <- processImageStack(fp = strongFP, trmAmnt = strongT)

6.2 Raw Image Slices

image2D(strongR$rawSlices,asp = 1, axes=FALSE, colkey = FALSE, xlab="", ylab="")
4e37ec94b1be55a695349f73b8f01f88841854fd.png
0fb00e44b437ed37b8d474744e0aa4898c57fdb8.png
9de30cd19ffc2055a145f1a78d597b1f82c44d79.png
5fcf38770633aaf2f12a24225b3fc5ade2cd292e.png
82fe62a88a25059648f541fa6c79a371ab008b69.png

6.3 3D Point Cloud

strongR$cloudPlot

high-intensity-boundary.html

6.4 The Persistence Diagram

par(mfrow=c(1,1))
plot(strongR$ripsDiag[["diagram"]])
high-intensity-boundary-persistence-diagram.png

7 Medium Intensity on the Cell Boundary

Here will will look at a cell with medium intensity around the cell boundary. For this example, one would expect the persistence diagram to have one blue point a moderate distance away from the diagonal.

mediumFP <- "Boundary-Medium.tif"
mediumT  <- 7:29
mediumR  <- processImageStack(fp = mediumFP, trmAmnt = mediumT)
image2D(mediumR$rawSlices,asp = 1, axes=FALSE, colkey = FALSE, xlab="", ylab="")
6a72aeb4cfa922b1a13d8ce307c4c35f4ff9e1fe.png
23d6e5932d96225e8e90b62ee260c9f81f367bdd.png
9f64bfb84a3da40403ae2aa24bb56114b7bbb36c.png
0f9aa8506b7a7e987089b7ce8a5b7e111d9ef6b9.png
c68966819c74c8a9e898b1afab649b542d8d7e6e.png
mediumR$cloudPlot

medium-intensity-boundary.html

par(mfrow=c(1,1))
plot(mediumR$ripsDiag[["diagram"]])
medium-intensity-boundary-persistence-diagram.png

8 Low Intensity on the Cell Boundary

Here will will look at a cell with low intensity around the cell boundary. For this example, one would expect the persistence diagram to have no blue points except near the diagonal.

weakFP <- "Boundary-Weak.tif"
weakT  <- 13:29
weakR  <- processImageStack(fp = weakFP, trmAmnt = weakT)
image2D(weakR$rawSlices,asp = 1, axes=FALSE, colkey = FALSE, xlab="", ylab="")
72575b13c488ace9b29a98865952954bcc756d08.png
3a568d6cf8e7ec4fb461f3d60eabf45aa466b73c.png
83d94720740ebd19bd84c6ae8251ebf22c4ac13c.png
48393dc450d10457a99295fae2ae25c9de7bdbef.png
83d4ebef82cb3c1dec43e774c3815824f072b087.png
weakR$cloudPlot

weak-intensity-boundary.html

par(mfrow=c(1,1))
plot(weakR$ripsDiag[["diagram"]])
weak-intensity-boundary.png

9 Results Comparison

  par(mfrow=c(3,1))
  plot(weakR$ripsDiag[["diagram"]], main="Weak Intensity Boundary")
  plot(mediumR$ripsDiag[["diagram"]], main="Medium Intensity Boundary")
  plot(strongR$ripsDiag[["diagram"]], main="Strong Intensity Boundary")
comparison.png

The weak intensity boundary is indicative of the medication working effectively. The blue dots all begin close to the diagonal in the persistence diagram indicates that the protein distribution has no voids and is evenly distributed in the cell.

The strong intensity boundary is indicative of the medication not working effectively. The blue dot far from the diagonal in the persistence diagrams indicates that the protein is distributed along the cell boundary, meaning little protein made it into the cell.


Author: Ryan Jensen (ryan.jensen@mathnotes.cc)

Modified: 2020-06-01 Mon 14:59

Generated with: Emacs 26.3 (Org mode 9.3.7)