Science Can Now Judge A Book Through Its Cover

Ever read the synopsis of a book wrapped in a transparent cover and wished you could glance through the prologue to get a better feel for it? Many of us have been tempted to open a book and read a few pages before buying. If that isn’t allowed in your favorite bookstore, fret not! MIT researchers and their colleagues have designed a prototype imaging technology that can read closed books.

In September 2016, a team of researchers at MIT published a paper in Nature Communications describing a prototype imaging device they had developed. In layman’s terms, it can be thought of as a camera that reads closed books. Using terahertz radiation, the device can extract content from densely layered structures such as a closed book.

In physics, terahertz (THz) radiation refers to electromagnetic waves with frequencies in the terahertz range, i.e. between the high-frequency edge of the microwave band (300 GHz) and the long-wavelength edge of far-infrared light (3,000 GHz, or 3 THz). Terahertz radiation, also known as sub-millimeter radiation, T-rays or T-light, is directional in much the same way as laser light and can pass through a variety of materials such as ceramics, wood, plastics, paper and textiles. Given these characteristics, terahertz radiation, with wavelengths of hundreds of micrometers, can produce detailed images when the waves are detected in two or three dimensions after being reflected by or transmitted through the target object. This also makes it possible to use terahertz radiation to analyze the internal structure of materials like wood and plastics.
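The link between the frequency limits and the "sub-millimeter" label can be checked with a short calculation. The sketch below is only an illustration: it assumes propagation in free space, and the function name `wavelength_m` is made up for this example.

```python
C = 299_792_458.0  # speed of light in vacuum, in m/s


def wavelength_m(frequency_hz: float) -> float:
    """Free-space wavelength for a given frequency (lambda = c / f)."""
    return C / frequency_hz


# The two band edges quoted in the article.
for label, frequency_hz in [("300 GHz", 300e9), ("3 THz", 3e12)]:
    micrometers = wavelength_m(frequency_hz) * 1e6
    print(f"{label}: about {micrometers:.0f} micrometers")
```

The band therefore runs from roughly 1 millimeter down to roughly 100 micrometers, which is where the name sub-millimeter radiation comes from.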

Terahertz radiation has found applications in many fields for 3D imaging and for probing the structure of complex samples. In medical imaging, T-light is used to detect the presence and growth of cancerous tissue. Moreover, since T-rays are non-ionizing, they do not damage tissue or DNA. They can also be used to check the water content and density of tissues. Other applications include manufacturing, where terahertz radiation is used for quality control and process monitoring; communications, where it has been proposed for high-altitude telecommunications, above the altitudes at which atmospheric moisture absorbs the signal; and security, where T-light is used in surveillance. Despite this wide array of applications, terahertz light is most commonly used in imaging and scientific research. Recently developed methods such as THz time-domain spectroscopy and THz tomography show that terahertz radiation can image materials that are opaque in the near-infrared range.

How does THz technology penetrate through a closed book?


Current imaging technologies face three challenges when used for non-invasive inspection of dense, complex samples like a closed book: spatial resolution, spectral contrast and occlusion. More specifically, conventional THz time-domain spectroscopy (TDS) fails at extracting content from deep within layered structures for three main reasons: the signal-to-noise ratio (SNR) drops with depth (the number of layers), the contrast of the content is much lower than the contrast between dielectric layers, and content in deeper layers is occluded by content in the layers in front of it. Recent research has combined the time-of-flight capabilities of conventional THz TDS with its spectral capabilities to overcome these bottlenecks computationally. Conventional time-of-flight imaging techniques have already been used to image fast phenomena (as in femto-photography) and complex geometries. Since time and distance are related through a constant, the speed of light (x = ct), a finer time resolution yields a finer spatial resolution. This concept forms the basis of time-gated spectral imaging for content extraction through layered structures.
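The relation x = ct can be turned into a quick calculation. This is a minimal sketch under stated assumptions, not the authors' processing chain: it assumes a reflection setup in which the pulse travels to a layer and back (hence the factor of 2), and the refractive index parameter `n` (1.0 for air) and both function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, in m/s


def depth_from_delay(delay_s: float, n: float = 1.0) -> float:
    """Depth of a reflecting layer, given the round-trip delay of its echo."""
    return C * delay_s / (2.0 * n)


def delay_from_depth(depth_m: float, n: float = 1.0) -> float:
    """Round-trip delay needed to resolve a layer at the given depth."""
    return 2.0 * n * depth_m / C


# Illustrative value: a page-scale spacing of 30 micrometers in air.
gap_m = 30e-6
delay_fs = delay_from_depth(gap_m) * 1e15
print(f"A {gap_m * 1e6:.0f}-micrometer gap needs about {delay_fs:.0f} fs of time resolution")
```

Under these assumptions, separating layers 30 micrometers apart calls for a time resolution of roughly 200 femtoseconds, which shows why sub-picosecond sampling matters for telling pages apart.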


While conventional TDS samples pulses of high-frequency light (100 GHz to 3 THz) at a very high time resolution, giving a spatial resolution of close to 30 micrometers (fine enough to separate the pages of a closed book), time-gated spectral imaging lets the researchers go further and read what is on those pages. The new method uses statistics of the sub-millimeter electric field (the THz E-field) to lock onto the position of each layer, i.e. each page. The PPEX (Probabilistic Pulse Extraction) algorithm used to localize the pages reportedly outperforms standard alternatives such as CLEAN, edge detection or wavelet-based peak finding, because it computes an energy value from the amplitude and velocity extracted from the waveform of the THz time-domain pulses. The layers are then localized at the high-energy (low-velocity) parts of the curve.
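The idea of scoring a waveform by amplitude and velocity can be illustrated on synthetic data. The sketch below is not the PPEX algorithm from the paper. It is a simplified stand-in that assumes clean Gaussian echoes, takes "velocity" to be the sample-to-sample slope of the waveform, and uses a made-up score (squared amplitude, damped where the slope is large). All function names and parameters are hypothetical.

```python
import math


def synthetic_waveform(n_samples, echoes, width=3.0):
    """Sum of Gaussian echoes; `echoes` is a list of (sample_index, amplitude)."""
    return [
        sum(a * math.exp(-((i - c) ** 2) / (2.0 * width ** 2)) for c, a in echoes)
        for i in range(n_samples)
    ]


def energy_score(waveform):
    """High where the amplitude is large and the slope ("velocity") is small."""
    n = len(waveform)
    velocity = [0.0] * n
    for i in range(1, n - 1):
        velocity[i] = (waveform[i + 1] - waveform[i - 1]) / 2.0  # central difference
    v_max = max(abs(v) for v in velocity) or 1.0  # avoid dividing by zero
    return [a * a / (1.0 + (v / v_max) ** 2) for a, v in zip(waveform, velocity)]


def locate_layers(waveform, rel_threshold=0.05):
    """Indices of local maxima of the score above a fraction of its peak value."""
    score = energy_score(waveform)
    floor = rel_threshold * max(score)
    return [
        i
        for i in range(1, len(score) - 1)
        if score[i] > floor and score[i] > score[i - 1] and score[i] >= score[i + 1]
    ]


# Three echoes that get weaker with depth, as deeper pages reflect less energy.
waveform = synthetic_waveform(180, [(50, 1.0), (90, 0.6), (130, 0.35)])
print(locate_layers(waveform))
```

On this noise-free example the score recovers the three echo positions. Real THz data is noisy and the echoes overlap, which is the harder case the article says PPEX is designed for.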

Character extraction from pages of a closed book


Next, time-gated spectral analysis computes a Fourier transform to extract the frames with the highest contrast between the paper and the content material. The presence of two distinct reflective materials on each layer (the surface of the paper and the ink) shows up in the kurtosis of the histograms in the frequency domain. This technique amplifies the contrast and enables content extraction from deeper layers. A final stage of the algorithm recognizes the partly occluded letter on each page. The convex cardinal shape composition (CCSC) algorithm automatically matches regions of relatively high intensity against combinations of shapes (letter templates) at different locations and orientations. It succeeds despite significant shadowing and occlusion in the deeper layers.
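To see why kurtosis can flag a frame in which ink stands out from paper, consider a toy example. This is a sketch under loud assumptions, not the paper's pipeline: it assumes the frames have already been extracted by the time-gated Fourier transform, it models each frame as a flat list of pixel intensities with Gaussian noise, and it assumes ink covers a small fraction of the page. The names `make_frame` and `best_frame` and all numeric values are invented for illustration.

```python
import random


def kurtosis(values):
    """Pearson kurtosis: fourth central moment divided by the squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


def make_frame(rng, ink_contrast, n_pixels=400, ink_fraction=0.1, noise=0.05):
    """Synthetic frame: paper pixels near 1.0, ink pixels darker by `ink_contrast`."""
    n_ink = int(n_pixels * ink_fraction)
    frame = []
    for i in range(n_pixels):
        base = 1.0 - ink_contrast if i < n_ink else 1.0
        frame.append(base + rng.gauss(0.0, noise))
    return frame


def best_frame(frames):
    """Index of the frame with the highest kurtosis, plus all the scores."""
    scores = [kurtosis(frame) for frame in frames]
    return max(range(len(frames)), key=scores.__getitem__), scores


rng = random.Random(0)  # fixed seed so the example is repeatable
# Hypothetical frames at three frequencies, with weak, strong and modest ink contrast.
frames = [make_frame(rng, contrast) for contrast in (0.02, 0.8, 0.1)]
index, scores = best_frame(frames)
print(index, [round(s, 2) for s in scores])
```

A frame that is mostly noise has a roughly bell-shaped histogram, while a frame with a minority of clearly darker ink pixels has a heavy tail and a higher kurtosis, so the strong-contrast frame is the one selected here.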

The algorithm was then tested experimentally against the theoretical predictions. The study demonstrated the first successful character extraction from the top nine pages of a stack. Although the algorithm can correctly deduce the distance from the camera to the top 20 layers (i.e. it can localize the top 20 pages in a stack), beyond the first nine pages the energy of the reflected signal is so low that the differences between frequency signatures are swamped by noise. Since terahertz technology is relatively new and is attracting a lot of research attention, the accuracy of the detectors and the intensity of the radiation sources are expected to improve, which should help extract content from deeper layers.

Future Potential


The MIT researchers worked with researchers from Georgia Tech, who developed the algorithm that identifies individual letters from distorted and incomplete images. The two teams’ novel idea has drawn wide interest. Barmak Heshmat, a research scientist at the MIT Media Lab and a corresponding author of the new paper, points to the technology’s usefulness by noting that the Metropolitan Museum in New York showed a lot of interest in it. The museum could use it to examine antique books that its staff dread touching. Many objects of archaeological interest and high cultural value could be studied this way, making the technology valuable to museums, libraries and other institutions holding documents so fragile that the slightest touch can damage them.

Heshmat also says that the new THz-based technology could be used in manufacturing to inspect materials organized in thin layers, such as coatings on machine parts or pharmaceuticals. The opportunities to harness this technology seem plentiful, and many industries could eventually find a use for it.
