By big data we mean any enormous and multifaceted collection of data (texts, numbers, documents, images, videos, etc.) that cannot be analyzed with ordinary computing devices and algorithms. Because of their sheer volume and inherent variety, big data are extremely challenging to manage and hence difficult to understand.
Ever more users and machines generate unstructured data on a daily basis. Capturing the essence of such a massive flow of data is beyond traditional computing facilities and their classical methodologies. Large data centres and intelligent algorithms, embedded within capable and flexible distributed computing environments such as Hadoop, are necessary to make sense of big data. One of the major fields “suffering” from big data, which has been widely neglected so far, is medical imaging. Approximately two trillion medical images are captured worldwide each year, and a large number of them must be stored for several years. These images and their annotations (notes on diagnosis, biopsy, treatment, etc.) contain a huge amount of information, yet at present this colossal pool of human knowledge goes untapped. Employing machine-learning algorithms may help us overcome this barrier and define the frontier of 21st-century medical imaging.
In this talk we will investigate the following questions: What is big data? What is big data analytics? What is Hadoop, and what is its relationship to big data? Why do medical images constitute big data? What is the role of machine learning in big data analytics?